TL;DR:
- DeepSeek trained its R1 AI model for just $294K using Nvidia H800 chips, a fraction of what US rivals spend.
- The company admitted for the first time that it used Nvidia A100 GPUs during R1’s early development phase.
- US-China chip tensions continue, as export controls limit access to high-end GPUs but don’t stop China’s innovation.
- DeepSeek is also developing a new AI agent model, aiming to rival OpenAI and Microsoft by late 2025.
China’s DeepSeek has confirmed for the first time that it used Nvidia A100 chips during the early development of its R1 model, even as it relied primarily on the China-focused Nvidia H800 chips to train the full system.
The disclosure comes as part of a paper published in Nature on Friday, where DeepSeek reported that training its R1 large language model cost just $294,000, a fraction of the hundreds of millions often estimated for similar efforts by US developers such as OpenAI.
The figure underscores China’s growing ability to innovate within strict US export restrictions while maintaining competitive momentum in AI.
Training R1 for Just $294K
According to DeepSeek, training the R1 model required 512 Nvidia H800 GPUs, chips designed specifically for the Chinese market after Washington restricted the export of higher-performance hardware. Despite these limitations, the company successfully trained a model that matched benchmarks seen in much costlier US projects.
By comparison, industry insiders suggest that US firms typically spend tens of millions of dollars training models with comparable performance. While OpenAI has never disclosed precise training costs for models like GPT-4, estimates put the figure in the tens of millions of dollars or more.
The discrepancy raises questions about how DeepSeek achieved such efficiency and whether a “leaner AI training strategy” could shift the dynamics of global AI development.
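To put the reported figure in perspective, here is a minimal back-of-envelope sketch of the implied per-GPU spend. The $294,000 total and the 512-GPU count come from the article; the $2.00/GPU-hour rental rate is a hypothetical assumption used only for illustration, not a figure from DeepSeek's paper.

```python
# Back-of-envelope arithmetic for DeepSeek's reported R1 training run.
total_cost_usd = 294_000   # reported training cost
num_gpus = 512             # reported number of Nvidia H800 GPUs

# Implied spend per GPU over the whole run.
cost_per_gpu = total_cost_usd / num_gpus
print(f"Implied spend per GPU: ${cost_per_gpu:,.2f}")  # ≈ $574.22

# Hypothetical cloud rental rate (an assumption, not from the paper).
assumed_rate_per_gpu_hour = 2.00
implied_hours = cost_per_gpu / assumed_rate_per_gpu_hour
print(f"At an assumed ${assumed_rate_per_gpu_hour:.2f}/GPU-hour, that implies "
      f"roughly {implied_hours:,.0f} GPU-hours per card (~{implied_hours / 24:.0f} days)")
```

Under that assumed rate, the budget works out to only a couple of weeks of cluster time, which is why the figure has drawn so much scrutiny compared with US training budgets.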
Nvidia A100 Role Acknowledged
For the first time, DeepSeek also acknowledged ownership of Nvidia A100 chips, a class of GPUs that was widely used in early large-model training before export controls tightened access.
The company admitted to using these chips during R1’s early development stages, particularly to attract top AI talent through its A100-powered supercomputing cluster.
This disclosure is significant given that US officials have repeatedly raised concerns over China’s access to advanced AI hardware. In June, reports suggested DeepSeek had access to large volumes of Nvidia H100 chips, but Nvidia clarified that the company’s R1 training was conducted with lawfully acquired H800 units.
US-China Chip Tensions Persist
DeepSeek’s reliance on the H800 underscores how export controls have reshaped the competitive landscape.
While US policymakers seek to limit China’s access to cutting-edge GPUs, firms like DeepSeek are proving that innovation is not solely defined by hardware availability but also by optimization strategies and cost efficiency.
The revelation that DeepSeek successfully trained its R1 model for under $300,000 suggests that US firms may face not only geopolitical competition but also a cost-efficiency challenge. If Chinese startups can produce competitive models at a fraction of the price, global AI leadership could tilt in unexpected directions.
New Agent Model in the Works
Beyond R1, sources familiar with DeepSeek’s strategy revealed that the company is working on a new AI agent model slated for release in late 2025.
Unlike traditional chatbots, this system will reportedly carry out complex, multi-step actions with minimal user input and the ability to learn from prior behavior.
This project places DeepSeek in direct competition with OpenAI and Microsoft, both of which have recently launched AI agent frameworks. With Chinese giants like Alibaba pledging over $52 billion in AI and cloud investments, DeepSeek faces mounting pressure to accelerate development while maintaining its cost-efficient edge.