Wisdom Gate AI News [2026-02-13]
⚡ Executive Summary
Zhipu AI's release of GLM-5, a 744B-parameter open-source model, sets a new state of the art in open-weight agentic capabilities and marks a milestone for China's domestic AI hardware stack. The model's adoption of DeepSeek's Sparse Attention architecture delivers exceptional efficiency for its scale, achieving top-tier performance on complex reasoning benchmarks like BrowseComp while running entirely on Chinese-made Huawei Ascend chips.
🔍 Deep Dive: Zhipu AI GLM-5 & The New Open-Source SOTA
Zhipu AI launched its fifth-generation flagship model, GLM-5, on February 11-12, 2026. This isn't just another parameter count bump; it's a strategic overhaul targeting production-ready agentic intelligence and hardware independence.
Architectural Leap: At 744 billion total parameters with only 44 billion active per token (roughly 5.9% of the total), GLM-5 doubles the scale of its predecessor GLM-4.7. The core innovation is its integration of DeepSeek Sparse Attention (DSA), a natively trainable mechanism that slashes the quadratic O(n²) attention complexity to O(n·k). This two-stage "indexer + top-k" pipeline uses coarse-grained token compression followed by fine-grained selection, enabling efficient handling of its 200,000-token context window. The result is an 11.6x decoding speedup and 99x forward-pass acceleration on long contexts.
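To make the mechanism concrete, here is a minimal NumPy sketch of the "indexer + top-k" selection step. In real DSA the indexer is a small learned module that scores compressed token blocks; a plain dot product stands in for it here, so this toy shows the selection mechanics rather than the measured speedups, and the shapes and choice of k are illustrative assumptions rather than anything from Zhipu's or DeepSeek's kernels.

```python
# Minimal sketch of two-stage sparse attention: score all cached tokens
# cheaply, keep the top-k, then run exact softmax attention over only
# those k tokens. Illustrative only; not the production implementation.
import numpy as np

def sparse_attention_step(q, keys, values, k=64):
    """One query attends to only its top-k cached tokens instead of all n."""
    # Stage 1 (indexer): a cheap relevance score for every cached token.
    scores = keys @ q                             # (n,)

    # Stage 2 (top-k selection): keep the k highest-scoring tokens, so the
    # expensive softmax attention below touches k rows instead of n.
    top = np.argpartition(scores, -k)[-k:]        # indices of the k best tokens
    sel_k, sel_v = keys[top], values[top]         # (k, d) each

    # Exact softmax attention, restricted to the selected tokens.
    logits = (sel_k @ q) / np.sqrt(q.shape[-1])   # (k,)
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ sel_v                              # (d,)

# Toy usage: a 20,000-token cache standing in for GLM-5's 200k-token window.
rng = np.random.default_rng(0)
d, n = 128, 20_000
out = sparse_attention_step(rng.standard_normal(d),
                            rng.standard_normal((n, d)),
                            rng.standard_normal((n, d)), k=64)
print(out.shape)  # (128,)
```

The design bet is that the indexer only needs to be roughly right: exact attention still runs over the tokens that survive selection, which is why accuracy can hold up even though most of the context is skipped.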
Hardware Sovereignty: Perhaps the most geopolitically significant detail is that GLM-5 was trained exclusively on Chinese-made Huawei Ascend chips, achieving what Zhipu calls "full independence" from US hardware (NVIDIA). The model also maintains compatibility with other domestic chips like Moore Threads and Cambricon.
Benchmark Dominance: GLM-5 claims state-of-the-art open-source performance on key agentic benchmarks. It tops BrowseComp (a challenging web-based reasoning evaluation), leads on MCP-Atlas and τ2-Bench, and posts a record-low hallucination rate on the Artificial Analysis Intelligence Index. It also scores 77.8% on SWE-bench Verified, surpassing Google's Gemini 3 Pro and demonstrating tangible progress in automated software engineering: a shift from "vibe coding" to scalable "agentic engineering."
Market Impact: The launch triggered a 28-34% surge in Zhipu's share price and prompted an immediate 30%+ price hike for its GLM Coding Plan, signaling both commercial confidence and surging enterprise demand for capable, open-weight agentic models.
📰 Other Notable Updates
- DeepSeek Sparse Attention (DSA) Explained: The architecture powering GLM-5's efficiency is DeepSeek's DSA, detailed in their V3.2 paper. It uses a hierarchical strategy of token compression and top-k selection, trained via an initial dense warm-up phase (1,000 steps) followed by sparse training (15,000 steps) with a detached indexer; a minimal sketch of this recipe follows this list. This hardware-aligned design delivers massive speedups without degrading performance on reasoning or long-context tasks.
- BrowseComp as the New Agentic Benchmark: BrowseComp is emerging as a critical benchmark for evaluating AI agents on complex, multi-turn web reasoning and information retrieval tasks. GLM-5's SOTA performance here underscores its strength in practical, tool-using scenarios beyond simple question-answering.
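As promised above, here is a small, runnable PyTorch sketch of the detached-indexer idea: during warm-up, a tiny indexer is distilled toward the model's dense attention distribution using stop-gradient targets, so its training never pushes gradients into the main model. The layer sizes, the KL distillation loss, and the toy data are assumptions for illustration; this is not DeepSeek's actual training code.

```python
# Toy version of the "detached indexer" warm-up: the indexer learns to
# imitate a dense attention distribution that has been detached from the
# autograd graph, so only the indexer's parameters are updated.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d, n = 32, 256
WARMUP_STEPS, SPARSE_STEPS = 1_000, 15_000   # schedule reported for DSA

# Stand-ins: the model's attention projections and a lightweight indexer
# that assigns one relevance score per cached token.
q_proj = torch.nn.Linear(d, d)
k_proj = torch.nn.Linear(d, d)
indexer = torch.nn.Linear(d, 1)
opt = torch.optim.Adam(indexer.parameters(), lr=1e-3)

for step in range(200):            # a few illustrative warm-up steps
    x = torch.randn(n, d)          # toy hidden states for n cached tokens

    # Dense attention distribution over all n tokens, computed without
    # gradients so the indexer's loss never touches the main model.
    with torch.no_grad():
        q = q_proj(x[-1])                      # query from the newest token
        dense_attn = F.softmax((k_proj(x) @ q) / d ** 0.5, dim=-1)

    # Distill the indexer toward the dense distribution (KL divergence),
    # the role it plays during the 1,000-step dense warm-up phase.
    idx_log_probs = F.log_softmax(indexer(x).squeeze(-1), dim=-1)
    idx_loss = F.kl_div(idx_log_probs, dense_attn, reduction="sum")

    opt.zero_grad()
    idx_loss.backward()
    opt.step()

# After warm-up, training would switch to the sparse phase for SPARSE_STEPS:
# attention reads only the indexer's top-k tokens while the indexer keeps
# learning from detached attention scores.
print(f"indexer distillation loss after {step + 1} steps: {idx_loss.item():.4f}")
```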
🛠 Engineer's Take
The GLM-5 announcement is impressive, but let's cut through the hype. A 744B model that's allegedly faster than its smaller predecessors because of sparsity? We've been burned by "efficient" giants before. The DSA architecture is legit—DeepSeek's papers show real math—but the real test is in your inference stack. Can you actually run this "open-source" behemoth without a Zhipu enterprise contract and a data center full of Ascend chips? The hardware independence narrative is a massive deal for China, but for the global open-source community, it might just trade one vendor lock-in (NVIDIA/CUDA) for another (Huawei/CANN).
The BrowseComp SOTA is meaningful; it shows a shift from chatbot benchmarks to practical, web-enabled agent work. But until we see reproducible inference code, real latency numbers on consumer hardware, and more transparency around the claimed 28.5T training tokens, this remains a very promising research release, not a plug-and-play production model. The immediate 30% price hike for Zhipu's API is the most telling detail: they know they've built something valuable, and it's not going to be cheap.
🔗 References
- https://www.scmp.com/tech/article/3343239/chinas-zhipu-ai-launches-new-major-model-glm-5-challenge-its-rivals
- https://evrimagaci.org/gpt/zhipu-ai-unveils-glm5-model-redefining-global-ai-race-528618
- https://news.futunn.com/en/post/68841584/the-open-sourcing-of-zhipu-glm-5-has-sparked-a
- https://www.emergentmind.com/topics/deepseek-sparse-attention-dsa
- https://arxiv.org/abs/2512.02556
- https://arxiv.org/abs/2502.11089
- https://news.smol.ai/issues/2026-02-11-glm-5
- https://www.latent.space/p/ainews-zai-glm-5-new-sota-open-weights
- https://www.moomoo.com/news/post/65471860/zhipu-has-released-its-new-flagship-model-glm-5-with