China's Z.ai GLM-5.2 Outperforms Claude on Cybersecurity Benchmark
Model Release | June 29, 2026 | 2 sources
Zhipu AI's open-weight GLM-5.2 surpassed Claude in vulnerability detection, narrowing the US-China AI gap.
Analysis
[Zhipu AI (Z.ai)] released open-weight model GLM-5.2 [1][2]
- MIT-licensed open-weight model
- Launched June 13, 2026 for GLM Coding Plan members
- Weights and release notes published on June 16
- Can be downloaded, run, and fine-tuned on proprietary hardware
[GLM-5.2] outperformed Claude Code on an IDOR (insecure direct object reference) vulnerability detection benchmark [2]
- Achieved 39% F1 score on IDOR detection
- Surpassed Claude Code at 32% and Claude Opus 4.8
- Approximately $0.17 cost per vulnerability
- Fell short of Semgrep multimodal pipeline (53-61%)
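
For readers unfamiliar with the metric, F1 is the harmonic mean of precision (the share of flagged findings that are real) and recall (the share of real vulnerabilities that were flagged). The following is a minimal sketch of that calculation. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the benchmark.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall (0.0 when undefined)."""
    predicted = true_positives + false_positives  # everything the detector flagged
    actual = true_positives + false_negatives     # everything that is truly vulnerable
    if predicted == 0 or actual == 0 or true_positives == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (NOT benchmark data): 8 correct flags, 2 false alarms, 6 misses.
example_f1 = f1_score(true_positives=8, false_positives=2, false_negatives=6)
print(f"F1 = {example_f1:.2%}")
```

Because F1 penalizes both false alarms and missed vulnerabilities, a detector cannot score well by flagging everything or by flagging almost nothing.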
[Chinese AI models] still lag US models on general tasks [1]
- Performance gap remains versus Anthropic Mythos and OpenAI GPT-5.6 on general tasks
- Gap dramatically narrowed in bug detection and cybersecurity domains
- Open-weight nature provides accessibility and flexibility
[US Government] policies restricting Chinese AI access had their limits exposed [1]
- Attempts to restrict access to high-performance models like Mythos and Fable
- Export controls on AI training hardware
- Trump administration views vulnerability detection AI as a national security threat
- Concerns over circumventing controls via open-weight models
[Semgrep] conducted experiment on harness vs. model performance contribution [2]
- Compared with frontier models using identical IDOR dataset and prompts
- Tested in a simple Pydantic AI harness environment
- Separated model and harness contributions to vulnerability detection performance
- Achieved 53-61% F1 with dedicated harness
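
One way to picture how model and harness contributions can be separated is to score every (model, harness) pairing against the same labeled dataset and compare results while holding one factor fixed. The sketch below illustrates that idea only. It is not Semgrep's code, and every name and data point in it (`model_x`, the harness labels, the endpoint identifiers) is a hypothetical placeholder.

```python
from typing import Dict, Set, Tuple


def f1(predicted: Set[str], actual: Set[str]) -> float:
    """F1 over sets of flagged items; 0.0 when there are no true positives."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


def score_runs(
    runs: Dict[Tuple[str, str], Set[str]], actual: Set[str]
) -> Dict[Tuple[str, str], float]:
    """Score each (model, harness) run against the same ground truth."""
    return {key: f1(found, actual) for key, found in runs.items()}


# Hypothetical ground truth and findings (NOT benchmark data).
ground_truth = {"endpoint_a", "endpoint_b", "endpoint_c", "endpoint_d"}
runs = {
    ("model_x", "simple_harness"): {"endpoint_a", "endpoint_e"},
    ("model_x", "dedicated_harness"): {"endpoint_a", "endpoint_b", "endpoint_c", "endpoint_e"},
}

scores = score_runs(runs, ground_truth)
# Same model, different harness: the difference is attributable to the harness.
harness_effect = scores[("model_x", "dedicated_harness")] - scores[("model_x", "simple_harness")]
print(scores, harness_effect)
```

Keeping the dataset and prompts identical across runs is what makes such a comparison meaningful: any score difference then comes from the model or the harness rather than from the inputs.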