TL;DR:
- DeepSeek warns its R1 and V3 models can be jailbroken, raising safety concerns for open-source AI.
- Benchmark tests show DeepSeek's models post slightly above-average safety scores against rivals but remain vulnerable without external safety measures.
- Experts caution that open-sourcing enables misuse, making AI systems easier to manipulate for harmful outputs.
- DeepSeek highlights cost-efficiency with its $294K R1 training, while preparing to launch a new AI agent model.
Hangzhou-based startup DeepSeek has raised alarms over potential vulnerabilities in its open-source artificial intelligence systems.
The company disclosed that its flagship R1 and V3 models could be manipulated, or “jailbroken”, by malicious actors to bypass built-in safety controls.
DeepSeek’s findings add fresh urgency to ongoing debates around AI safety, transparency, and the risks of releasing powerful models to the public.
Benchmark Tests Reveal Mixed Results
According to the company’s evaluation, R1 and V3 posted slightly above-average safety scores when measured against industry leaders such as OpenAI’s o1 and GPT-4o, as well as Anthropic’s Claude 3.7 Sonnet.
However, the R1 model in particular was deemed “relatively unsafe” if used without additional layers of risk management.
The study also compared DeepSeek’s models to Alibaba’s Qwen2.5. Under controlled jailbreak attempts, all systems, including those from Chinese and U.S. rivals, exhibited a notable increase in harmful or unsafe responses. Yet, open-source models proved most vulnerable, underscoring the trade-off between accessibility and control.
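For readers who want a concrete picture of what such a comparison involves, the sketch below shows the general shape of a jailbreak-robustness evaluation: measure the unsafe-response rate on a benchmark of held-out prompts, then measure it again after a controlled red-team transformation. Every name here (`query_model`, `is_unsafe`, `wrap_with_attack`) is a hypothetical stub for illustration; the article does not describe DeepSeek's actual harness, and no real attack strings are included.

```python
# Minimal sketch of a jailbreak-robustness evaluation harness.
# All functions are hypothetical stand-ins, not DeepSeek's methodology.
from typing import Callable, List

def query_model(prompt: str) -> str:
    """Stub: replace with a call to the model under test."""
    return "I can't help with that."

def is_unsafe(response: str) -> bool:
    """Stub: replace with a safety classifier or human review."""
    refusal_markers = ("i can't", "i cannot", "i won't")
    return not response.lower().startswith(refusal_markers)

def wrap_with_attack(prompt: str) -> str:
    """Stub: a controlled red-team transformation would go here."""
    return prompt  # intentionally a no-op in this sketch

def unsafe_rate(prompts: List[str], transform: Callable[[str], str]) -> float:
    """Fraction of prompts that yield an unsafe response."""
    hits = sum(is_unsafe(query_model(transform(p))) for p in prompts)
    return hits / len(prompts)

benchmark = ["<held-out test prompt 1>", "<held-out test prompt 2>"]
baseline = unsafe_rate(benchmark, lambda p: p)       # no attack applied
attacked = unsafe_rate(benchmark, wrap_with_attack)  # under jailbreak attempt
print(f"baseline unsafe rate: {baseline:.0%}, under attack: {attacked:.0%}")
```

The gap between the two rates is the figure that matters: a model can look safe at baseline yet degrade sharply under attack, which is the pattern the study reportedly observed in open-source systems.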
Experts Warn on Open-Source Risks
Industry experts caution that open-sourcing model code and weights enables researchers and developers to innovate, but it also allows bad actors to strip away or weaken safety guardrails.
“Once a model’s parameters are publicly available, there’s no effective way to stop someone from re-engineering it without safeguards,” one analyst noted.
This concern comes at a time when global regulators are pressing companies to prioritize responsible deployment of AI. With jailbreaking methods proliferating online, the stakes of weak defenses are rising.
A Cost-Efficient Challenger
Despite the safety concerns, DeepSeek’s research highlighted one major achievement: the cost-efficiency of its R1 model. Training the system required just US$294,000, far below the budgets often reported by U.S. developers for comparable models.
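To put that figure in rough perspective, here is a back-of-the-envelope conversion into compute time. Only the US$294,000 cost comes from the article; the GPU rental rate and cluster size below are assumptions chosen purely for illustration, not reported numbers.

```python
# Back-of-the-envelope context for the reported training cost.
reported_cost_usd = 294_000          # figure cited in the article
assumed_rate_usd_per_gpu_hour = 2.0  # hypothetical H800-class rental rate
assumed_cluster_gpus = 512           # hypothetical cluster size

gpu_hours = reported_cost_usd / assumed_rate_usd_per_gpu_hour
wall_clock_days = gpu_hours / assumed_cluster_gpus / 24

print(f"{gpu_hours:,.0f} GPU-hours, roughly {wall_clock_days:.0f} days "
      f"on {assumed_cluster_gpus} GPUs")
```

Under those assumed rates, the budget corresponds to about 147,000 GPU-hours, or on the order of two weeks on a mid-sized cluster, which is why the figure drew attention against training budgets often quoted in the tens or hundreds of millions.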
This efficiency has helped the startup gain attention as a lean challenger in the AI race, even while larger firms like OpenAI, Anthropic, and Alibaba pour billions into scaling their infrastructure.
Earlier this month, sources revealed that DeepSeek is also developing an AI agent model set to debut in late 2025. The new system is designed to perform complex, multi-step tasks with minimal user input, signaling the company’s ambition to compete directly with Western players pushing into autonomous AI.
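In broad terms, systems marketed as "AI agents" wrap a model in a plan-act-observe loop that lets it chain tool calls toward a goal. The sketch below shows that generic pattern only; nothing in it reflects DeepSeek's unreleased design, and the model call and tool dispatch are hypothetical stubs.

```python
# Generic plan-act-observe agent loop, for illustration only.
def call_model(history: list[str]) -> str:
    """Stub: replace with a real chat-completion call."""
    return "FINISH: task complete"

def run_tool(action: str) -> str:
    """Stub: dispatch to search, code execution, etc."""
    return f"observation for {action!r}"

def run_agent(task: str, max_steps: int = 8) -> str:
    history = [f"TASK: {task}"]
    for _ in range(max_steps):
        step = call_model(history)
        if step.startswith("FINISH:"):   # model signals completion
            return step.removeprefix("FINISH:").strip()
        history.append(step)
        history.append(run_tool(step))   # feed tool result back to the model
    return "stopped: step budget exhausted"

print(run_agent("summarize three sources on open-source AI safety"))
```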
As DeepSeek balances innovation with safety challenges, its trajectory reflects the broader tensions shaping the global AI landscape: openness versus control, speed versus responsibility, and efficiency versus scale.