TLDRs:
- China Mobile’s AI expansion faces bottlenecks due to HBM memory shortages, threatening its 2028 computing targets.
- Domestic chips will power China Mobile’s AI network, reflecting broader national technology self-sufficiency initiatives.
- Liquid cooling solutions see growing demand as high-density GPU clusters strain traditional air cooling capabilities.
- Scaling AI infrastructure drives the need for high-speed networking and Tier 4 data center designs.
China Mobile, the state-owned Chinese telecom giant, has ambitious plans to expand its artificial intelligence (AI) capabilities, aiming for a 100,000-GPU cluster and a total of 100 EFLOPS of computing power by 2028.
However, the company’s push to rely entirely on domestically produced chips has run into a significant obstacle: a shortage of high-bandwidth memory (HBM).
While the company has already made substantial strides in AI, current memory limitations threaten to slow progress toward its long-term targets.
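A quick back-of-envelope check shows what those two headline targets imply together. The sketch below simply divides the compute goal by the cluster size; the Ascend 910C throughput mentioned in the closing comment is an outside assumption for context, not a company disclosure.

```python
# Back-of-envelope check: what per-accelerator throughput does the
# 2028 goal imply? Target figures are from the article; the Ascend
# 910C throughput in the comment below is an assumption.
TARGET_EFLOPS = 100      # China Mobile's 2028 computing target
CLUSTER_GPUS = 100_000   # planned cluster size

implied_pflops_per_gpu = TARGET_EFLOPS * 1_000 / CLUSTER_GPUS
print(f"Implied average throughput: {implied_pflops_per_gpu:.1f} PFLOPS per accelerator")
# -> 1.0 PFLOPS each, roughly the dense FP16 throughput commonly
#    attributed to a dual-die Ascend 910C package, so the two targets
#    are at least in the same ballpark.
```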
Domestic Chips Drive National AI Plans
At the company’s Global Partners Conference in Guangzhou, chairman Yang Jie outlined China Mobile’s strategy to double AI investments and establish a fully domestic AI computing network.
By the end of 2024, the company had achieved 29.2 EFLOPS at FP16 precision, establishing a strong foothold in the AI supercomputing race. The move to exclusively domestic chips reflects broader national initiatives to reduce reliance on foreign hardware and strengthen China’s technological self-sufficiency.
HBM Shortages Pose Scaling Challenges
Despite these ambitions, the availability of HBM, a crucial component for AI chips, limits large-scale deployment. China stockpiled roughly 13 million HBM stacks prior to U.S. export restrictions, enough for approximately 1.6 million Huawei Ascend 910C packages.
CXMT’s planned production is expected to add about 2 million stacks in 2026, enough to support roughly 250,000–300,000 chips. Analysts note that without access to foreign memory, scaling to a network of one million AI chips by 2025 appears infeasible. While a 100,000-GPU cluster remains achievable, the full 100 EFLOPS target for 2028 carries real uncertainty.
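The feasibility question is largely arithmetic, and the analyst figures above can be reproduced in a few lines. In the sketch below, the stacks-per-package ratio is inferred from those figures rather than taken from any official Huawei specification.

```python
# Reproducing the HBM supply arithmetic from the analyst figures above.
STOCKPILED_STACKS = 13_000_000       # HBM stacks stockpiled pre-restrictions
PACKAGES_FROM_STOCKPILE = 1_600_000  # Ascend 910C packages they support
CXMT_STACKS_2026 = 2_000_000         # projected CXMT output in 2026

# ~8.1 stacks per package, inferred from the two figures above
stacks_per_package = STOCKPILED_STACKS / PACKAGES_FROM_STOCKPILE
print(f"Implied HBM stacks per 910C package: {stacks_per_package:.1f}")

chips_from_cxmt = CXMT_STACKS_2026 / stacks_per_package
print(f"Chips supported by 2026 CXMT output: {chips_from_cxmt:,.0f}")
# ~246,000, consistent with the cited 250,000-300,000 range and far
# short of a million-chip network without foreign memory.
```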
Cooling and Infrastructure Opportunities
The challenge of scaling is compounded by power and cooling requirements. Each Ascend 910B or 910C chip consumes around 310 W, pushing rack power densities into the 15–30 kW range, well beyond what traditional air cooling can handle.
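The rack-level math makes that cooling pressure concrete. In the sketch below, the chips-per-rack configurations are illustrative assumptions rather than a specific China Mobile deployment, and only accelerator power is counted (CPUs, fans, and networking would add more).

```python
# Why air cooling runs out of headroom at the quoted 310 W per chip.
# Chips-per-rack counts are illustrative assumptions.
CHIP_POWER_W = 310
AIR_COOLING_LIMIT_KW = 15  # rough ceiling for conventional air-cooled racks

for chips_per_rack in (32, 64, 96):
    rack_kw = chips_per_rack * CHIP_POWER_W / 1_000
    status = "air-coolable" if rack_kw <= AIR_COOLING_LIMIT_KW else "needs liquid cooling"
    print(f"{chips_per_rack:3d} chips -> {rack_kw:5.1f} kW ({status})")
# 32 chips -> 9.9 kW, 64 -> 19.8 kW, 96 -> 29.8 kW: dense racks land
# squarely in the 15-30 kW band described above.
```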
This has created opportunities in the liquid cooling sector, where China’s market reached $2.37 billion in 2024 and is projected to surge to $16.2 billion by 2029. Cold plate designs dominate the market due to lower retrofitting costs, offering near-term opportunities for suppliers of cooling distribution units, heat exchangers, and specialized piping.
Supporting Systems See Rising Demand
As AI deployment grows, so too does the need for supporting infrastructure. Vendors are increasingly focused on high-speed Ethernet and RDMA over Converged Ethernet (RoCE) gear to meet the bandwidth and latency requirements of large AI clusters.
Huawei’s UnifiedBus, for example, aims for terabytes-per-second throughput with latencies as low as 2.1 microseconds. Meanwhile, AI’s energy demands are pushing operators toward Tier 4 data center designs, the highest standard for uptime and reliability, creating further opportunities in power systems and network equipment.
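A minimal latency-plus-bandwidth ("alpha-beta") transfer model shows why AI fabrics chase both numbers at once. In the sketch below, only the 2.1-microsecond latency comes from the UnifiedBus figure quoted above; the 1 TB/s link speed and the message sizes are illustrative assumptions.

```python
# Alpha-beta model of a point-to-point transfer: t = latency + bytes/bandwidth.
LATENCY_S = 2.1e-6    # quoted UnifiedBus latency
BANDWIDTH_BPS = 1e12  # assumed 1 TB/s-class link

def transfer_time(num_bytes: float) -> float:
    """Time for one transfer under the alpha-beta model."""
    return LATENCY_S + num_bytes / BANDWIDTH_BPS

for size in (1e3, 1e6, 1e9):  # 1 KB, 1 MB, 1 GB messages
    print(f"{size/1e6:>9.3f} MB -> {transfer_time(size)*1e6:8.1f} us")
# Small messages are latency-bound (a 2.1 us floor); large gradient
# shards are bandwidth-bound, which is why large AI clusters need both
# microsecond latency and terabytes-per-second links.
```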
Looking Ahead
China Mobile’s drive toward domestic AI chip deployment illustrates both ambition and constraint. While memory shortages and cooling challenges present near-term obstacles, the company’s efforts align with a broader push for technological self-reliance in China.
Suppliers and infrastructure providers stand to benefit from this growth, while industry watchers will closely monitor whether China Mobile can meet its 100 EFLOPS goal amid existing HBM limitations.