
GLM-5 Breaks the NVIDIA Monopoly on Frontier Training

Zhipu AI's 744B model, trained entirely on 100,000 Huawei Ascend 910B chips, demolishes the assumption that competitive frontier AI requires NVIDIA hardware. The procurement and geopolitical implications are immediate.

2026-06-09 · 5 MIN READ
#Huawei · #Ascend · #NVIDIA · #Zhipu AI · #GLM-5 · #export controls · #AI training · #China · #semiconductor · #procurement

The Assumption Is Gone

For two years, a single technical claim anchored NVIDIA's pricing power, data center procurement cycles, and the logic of US semiconductor export controls: you cannot train a frontier-class AI model without NVIDIA hardware. On February 11, 2026, that claim expired.

Zhipu AI, now rebranded as Z.ai, released GLM-5, a 744-billion-parameter language model that lands within single-digit percentage points of GPT-5.2 and Claude Opus 4.5 on major benchmarks. It was trained entirely on Huawei's Ascend 910B processors. Not a single NVIDIA chip was involved.

The entire GLM-5 family was trained on approximately 100,000 Huawei Ascend 910B chips using the MindSpore framework. No NVIDIA GPUs were used at any point in the training process.

Zhipu then open-sourced it under the MIT license, meaning anyone on Earth can download it, modify it, and deploy it commercially with zero restrictions.

This is not a benchmark exercise. It is a structural break.

[Table: GLM-5 by the Numbers. Sources: AI Flash Report, Let's Data Science, LushBinary (verified May 2026)]

Why This Is Different From Prior Claims

The Ascend ecosystem has been declared viable before. It was not. DeepSeek, the lab behind R1 that rattled global markets in January 2025, reportedly attempted to train its successor R2 on Huawei Ascend hardware. The effort failed. DeepSeek encountered stability issues that made large-scale Ascend training runs unreliable, and ultimately reverted to NVIDIA GPUs for R2. If China's most technically accomplished AI lab could not make Huawei hardware work for training at scale, the Ascend ecosystem appeared unready for frontier development.

GLM-5's successful training on 100,000 Ascend chips directly contradicts that conclusion.

The engineering required was substantial. Zhipu developed custom optimization techniques for Huawei's chip architecture, implementing dynamic graph multi-level pipelined deployment to run different training stages concurrently. The company built high-performance fusion operators compatible with Ascend and employed multi-stream parallelism to overlap communication and computation during distributed training — optimizations aimed at extracting maximum performance from hardware that operates differently from the NVIDIA GPUs most AI frameworks target by default.
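Why overlapping communication and computation matters can be shown with a toy timing model. This is a minimal sketch, not Zhipu's implementation: the function names and the per-chunk millisecond figures are hypothetical, and the model assumes one compute stream and one communication stream with no other contention.

```python
def serial_step_time(compute_ms, comm_ms):
    """Single stream: each chunk computes, then communicates, in sequence."""
    return sum(c + m for c, m in zip(compute_ms, comm_ms))


def overlapped_step_time(compute_ms, comm_ms):
    """Two streams: compute runs back to back, while each chunk's
    communication starts as soon as its compute has finished and the
    previous chunk's communication has cleared the link."""
    compute_done = 0.0
    comm_done = 0.0
    for c, m in zip(compute_ms, comm_ms):
        compute_done += c
        comm_done = max(comm_done, compute_done) + m
    return comm_done


if __name__ == "__main__":
    # Hypothetical figures: four chunks, 10 ms compute and 6 ms communication each.
    compute = [10, 10, 10, 10]
    comm = [6, 6, 6, 6]
    print("serial    :", serial_step_time(compute, comm), "ms")
    print("overlapped:", overlapped_step_time(compute, comm), "ms")
```

In this toy case all but the last chunk's communication hides behind compute. When communication is slower than compute, the link becomes the bottleneck and overlap recovers less.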

The 28.5-trillion-token training run was executed on Huawei Ascend AI processors using the MindSpore framework. The result was a working frontier model.

The Architecture and the Benchmarks

GLM-5 uses a Mixture-of-Experts architecture. The model has 744 billion total parameters, but only 40 to 44 billion are active per inference pass — the same architectural approach that made DeepSeek V3 and Mixtral successful.
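The sparsity implied by those figures is easy to work out. The sketch below uses only the parameter counts quoted above; the function name is illustrative, and the comparison ignores routing overhead and memory, since all experts still have to be stored.

```python
TOTAL_PARAMS_B = 744        # total parameters, in billions (figure quoted above)
ACTIVE_RANGE_B = (40, 44)   # active parameters per pass, in billions (figure quoted above)


def active_fraction(active_b, total_b):
    """Share of the model's parameters that participate in one pass."""
    return active_b / total_b


if __name__ == "__main__":
    for active in ACTIVE_RANGE_B:
        share = active_fraction(active, TOTAL_PARAMS_B)
        print(f"{active}B of {TOTAL_PARAMS_B}B active: {share:.1%}")
```

Roughly 5 to 6% of the weights do the work on any given pass, which is why a 744B Mixture-of-Experts model can have per-token compute closer to that of a much smaller dense model.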

The performance picture is mixed. On coding, GLM-5 scores 77.8% on SWE-bench Verified, approaching Claude Opus 4.5's 80.9%. On mathematics, it scores 92.7% on AIME 2026 I. On Terminal-Bench, a benchmark for autonomous command-line task completion, GLM-5 trails both Claude and GPT-5.2. Some of GLM-5's self-reported scores, particularly on HLE with tools, have not yet been independently verified. Operators should treat unverified self-reported numbers with appropriate skepticism.

[Chart: SWE-Bench Verified Scores: GLM-5 vs. Frontier Models. Source: LushBinary GLM-5 Developer Guide; The Neuron (February 2026)]

Throughput is a real constraint. GLM-5 generates roughly 17 to 19 tokens per second, compared to 25 to 30-plus for competitors. Training on non-NVIDIA silicon is proven. Raw inference speed at scale remains a gap.

What Actually Changed for Procurement

The export control thesis was simple: deny NVIDIA H100 access, deny frontier capability. Zhipu AI has been on the US Entity List since January 2025, meaning the company has no legal access to NVIDIA's data center GPUs — H100, H200, B200 — that power training runs at virtually every other frontier AI lab globally. The controls worked as a forcing function, not as a ceiling.

The company's success challenges the assumption that US export controls can effectively limit Chinese AI development. If anything, the restrictions may have accelerated domestic chip development.

For non-Chinese operators, the implication differs but carries real weight. Single-vendor GPU procurement was previously defensible on technical grounds: NVIDIA was necessary for frontier-class work. That argument is now gone. For multinational enterprises operating in China, GLM-5's training on domestic hardware provides evidence that Chinese AI infrastructure can support state-of-the-art model development. Companies with Chinese operations may need to evaluate whether to qualify domestic platforms such as Huawei's Ascend and frameworks such as MindSpore.

More broadly, every procurement team that deferred multi-vendor qualification on the grounds that "nothing else works" needs to revisit that decision.

The Honest Limits

Ascend did not match NVIDIA across all dimensions. The Ascend 910B is competitive at mainstream precisions but has clear shortcomings: 64 GB of HBM per chip risks training interruptions for ultra-large models, its 1800 GB/s HBM bandwidth is lower than NVIDIA's, and without NVLink it relies on 64 GB/s PCIe for multi-card communication.
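A back-of-the-envelope calculation puts the memory and interconnect figures in context. This is a rough sketch under stated assumptions: weights stored at 2 bytes per parameter (BF16/FP16, an assumption not confirmed by the sources), decimal gigabytes, and no allowance for optimizer state, gradients, or activations, all of which push real requirements far higher. The function names are illustrative.

```python
import math

PARAMS = 744 * 10**9        # total parameters (figure quoted in this article)
HBM_GB_PER_CHIP = 64        # Ascend 910B HBM capacity quoted above
HBM_BANDWIDTH_GBPS = 1800   # on-chip HBM bandwidth quoted above
PCIE_BANDWIDTH_GBPS = 64    # multi-card PCIe bandwidth quoted above


def min_chips_for_weights(params, bytes_per_param, hbm_gb):
    """Lower bound on chips needed just to hold the weights in HBM."""
    weight_gb = params * bytes_per_param / 1e9
    return math.ceil(weight_gb / hbm_gb)


def bandwidth_gap(local_gbps, link_gbps):
    """How many times slower the inter-card link is than local memory."""
    return local_gbps / link_gbps


if __name__ == "__main__":
    # Assumption: 2 bytes per parameter (BF16/FP16 weights only).
    print("chips for weights alone:", min_chips_for_weights(PARAMS, 2, HBM_GB_PER_CHIP))
    print("HBM vs PCIe gap:", bandwidth_gap(HBM_BANDWIDTH_GBPS, PCIE_BANDWIDTH_GBPS))
```

Under these assumptions the weights alone span two dozen chips, and any data that crosses between cards moves about 28 times slower than data read from local memory. That gap is what the communication-overlap work is trying to hide.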

SMIC's reported 30 to 50% yield at 7nm DUV means Zhipu likely needed roughly 200,000 to 330,000 total dies to obtain 100,000 working chips, an inefficiency that translates into higher costs and slower scaling.
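The die estimate follows directly from dividing the number of working chips by the yield. A minimal sketch, assuming yield is simply the fraction of dies that work and ignoring binning, packaging losses, and spares (the function name is illustrative):

```python
import math


def dies_needed(working_chips, yield_rate):
    """Dies that must be fabricated to obtain the target number of good chips."""
    if not 0 < yield_rate <= 1:
        raise ValueError("yield_rate must be in (0, 1]")
    return math.ceil(working_chips / yield_rate)


if __name__ == "__main__":
    for y in (0.50, 0.30):
        print(f"yield {y:.0%}: {dies_needed(100_000, y):,} dies")
```

At 50% yield the run needs 200,000 dies. At 30% it needs about 333,000, which the article rounds to 330,000. Either way, somewhere between half and 70% of fabricated dies are discarded.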

What changed is the binary. "Frontier-class on non-NVIDIA" has moved from impossible to demonstrated, with known engineering overhead. NVIDIA's moat now rests on cost efficiency, ecosystem maturity, and inference speed — a narrower and more contestable position than technical necessity.

What to Watch

DeepSeek's next move. DeepSeek failed on Ascend for R2. Watch whether they attempt another run now that Zhipu has published a working playbook, or continue on NVIDIA hardware obtained through opaque supply chains.
Other Chinese labs on Ascend. Chinese AI labs — Z.ai, Alibaba (Qwen), DeepSeek, and Moonshot AI (Kimi) — now hold most of the top positions among open-weight models on major leaderboards. If Qwen or Kimi replicate GLM-5's Ascend-native training, the data point becomes a pattern.
Huawei's roadmap execution. Huawei is working on three new Ascend chip series over the next three years: the Ascend 950 series, the Ascend 960 series, and the Ascend 970 series. Track whether yield and interconnect bandwidth close the gap to H100-class specs. That variable determines whether Ascend becomes a genuine global alternative or remains a geopolitically constrained workaround.
Non-US enterprise evaluation. If geopolitical unpredictability around US chip access persists, European and Asian enterprises not subject to US export rules may quietly begin Ascend qualification. Watch for early signals in procurement disclosures and cloud partnership announcements.
NVIDIA's pricing response. A demonstrated alternative is the precondition for pricing pressure. Monitor whether NVIDIA adjusts H100/H200 contract pricing in markets where Ascend is available. That is the first concrete signal that the competitive dynamic has shifted from monopoly to contest.
