China Just Matched GPT-5. For Free
How a hedge fund manager's obsession with cost-per-trade led to the most disruptive AI release of 2025 —
and what it reveals about who's actually winning the AI race.
By R. Lauritsen • December 2, 2025 • 11 min read
The most important AI release of 2025 didn't come from San Francisco. It came from Hangzhou, from a company most Americans have never heard of, built by a man who spent his career optimizing for a single metric: cost per trade.
On December 1st, Liang Wenfeng's DeepSeek released two models under an MIT license. The high-compute variant, V3.2-Speciale, won gold at the 2025 International Mathematical Olympiad. It matches GPT-5 on reasoning. It beats GPT-5 by 15 percentage points on real-world software debugging.
The price: $0.70 per 128,000 tokens. OpenAI charges $15.
That's not a typo. That's a 95% cost reduction in frontier AI reasoning—achieved under U.S. chip export controls specifically designed to prevent it. And understanding why DeepSeek succeeded requires understanding something most coverage has missed: Liang didn't build DeepSeek like a tech company. He built it like a trading floor.
The Trader's Edge
Here's what you need to know about quantitative hedge funds: they are pathologically obsessed with two things—latency and cost-per-operation. A quant fund that's 10% slower or 10% more expensive than its competitors doesn't underperform by 10%. It gets arbitraged into oblivion.
Liang Wenfeng ran High-Flyer, a quant fund that peaked at $14 billion in assets under management. His entire professional life was spent asking one question: how do I get the same output for less compute?
This matters because it explains the single most important decision DeepSeek made—one that Western labs, flush with unlimited compute budgets, never had to consider.
OpenAI optimized for capability first, then worried about cost. DeepSeek optimized for cost-per-capability from day one. They built for deployment economics before they built for benchmarks.
When you have $100 billion in Microsoft backing and a near-monopoly on frontier AI, you can afford to brute-force your way to performance. When you're operating under chip restrictions with a finite stockpile of pre-ban Nvidia hardware, you cannot. You have to be smarter.
The result is DeepSeek Sparse Attention (DSA)—not a minor optimization, but a fundamental rearchitecting of how transformers process context. And it emerged directly from the constraints Liang knew from trading: every unnecessary computation is money left on the table.
The Receipts
Before the analysis, the benchmarks. Because claims are cheap and receipts are expensive.
Competition performance (V3.2-Speciale):
• 2025 International Mathematical Olympiad: Gold medal
• 2025 International Olympiad in Informatics: Gold medal, 10th place
• ICPC World Finals 2025: 2nd place
• China Mathematical Olympiad 2025: Gold medal
Head-to-head vs. GPT-5 and Gemini 3 Pro:
• AIME 2025: DeepSeek 96.0% vs GPT-5 High 94.6% vs Gemini 3 Pro 95.0%
• HMMT 2025: DeepSeek 99.2% vs Gemini 3 Pro 97.5%
• SWE Multilingual (real GitHub bugs): DeepSeek 70.2% vs GPT-5 55.3%
That last number is the one that should scare OpenAI. Benchmarks can be gamed. Fixing real bugs in real codebases cannot. DeepSeek isn't just matching GPT-5 on math puzzles—it's outperforming it on the actual work developers need done.
The Architecture of Efficiency
Standard transformer attention has a problem: it scales quadratically. Process 128,000 tokens, and every token must attend to every other token. That's 16 billion attention computations—most of which are wasted on irrelevant context.
DeepSeek Sparse Attention asks a trader's question: what if we only computed the attention that matters?
DSA works in two stages. First, a "lightning indexer" scores every token pair for relevance—fast, approximate, cheap. Second, a selection mechanism pulls only the top-k most relevant tokens into full attention. The result: attention complexity drops from O(L²) to O(Lk), where k is a small fraction of context length.
In practice: 2-3x faster inference, 30-40% less memory, 50% better training efficiency. And because the model learns which tokens matter during training, quality loss is minimal—typically 1-2 points on benchmarks.
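To make the mechanism concrete, here is a minimal sketch of the two-stage idea in Python/NumPy: a cheap indexer scores relevance, a top-k selection keeps only the strongest candidates, and full attention runs over that subset only. The shapes, the dot-product "lightning indexer," and k = 64 are illustrative assumptions, not DeepSeek's actual implementation (which also handles causal masking and runs as fused GPU kernels).

```python
# Minimal two-stage sparse attention sketch in the spirit of DSA.
# Everything here (dimensions, indexer, k) is illustrative, not DeepSeek's code.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sparse_attention(Q, K, V, idx_q, idx_k, k=64):
    """Stage 1: a cheap indexer picks the k most relevant keys per query.
    Stage 2: full attention runs only over that subset -- O(L*k), not O(L^2)."""
    L, d = Q.shape
    k = min(k, L)
    # Stage 1: "lightning indexer" -- low-dimensional, so each score is cheap.
    scores = idx_q @ idx_k.T                                  # (L, L) relevance scores
    top = np.argpartition(-scores, k - 1, axis=1)[:, :k]      # top-k key indices per query
    # Stage 2: exact attention restricted to the selected keys.
    out = np.empty_like(Q)
    for i in range(L):
        sel = top[i]
        att = softmax(Q[i] @ K[sel].T / np.sqrt(d))           # (k,) attention weights
        out[i] = att @ V[sel]
    return out

L, d, d_idx = 1024, 64, 8
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((L, d)) for _ in range(3))
idx_q, idx_k = (rng.standard_normal((L, d_idx)) for _ in range(2))
out = sparse_attention(Q, K, V, idx_q, idx_k, k=64)  # each token attends to 64 keys, not 1,024
```

The point of the toy version is the shape of the tradeoff: the expensive part of attention now scales with k rather than with the full context length, and the indexer is small enough that its cost barely registers.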
This is the insight Western labs missed while optimizing for peak performance: efficiency and capability aren't tradeoffs. Train for efficiency, and you can afford more training. More training produces better models. Better models, deployed cheaply, generate more real-world feedback. Feedback improves the next model. DeepSeek turned a constraint into a flywheel.
"When we first met him, he was this very nerdy guy with a terrible hairstyle talking about building a 10,000-chip cluster to train his own models. We didn't take him seriously."
— One of Liang's business partners, to the Financial Times
The Nerdy Guy With the Terrible Hairstyle
Liang Wenfeng was born in 1985 in a village in Guangdong province. His parents were primary school teachers. He studied engineering at Zhejiang University—the same school that produced Alibaba co-founder Jack Ma and Pinduoduo founder Colin Huang—and wrote his master's thesis on AI for surveillance systems.
During the 2008 financial crisis, while still a student, he formed a team to model financial markets. By 2015, that work had become High-Flyer, a quantitative hedge fund. By 2021, High-Flyer managed over $14 billion and Liang had developed a conviction: the same machine learning driving his trading models would eventually match human intelligence.
That year, he made the bet that would define DeepSeek. He began stockpiling Nvidia GPUs—10,000 A100 chips purchased before the Biden administration's export restrictions took effect. When U.S. controls finally hit, DeepSeek had the hardware it needed. What it lacked was permission to buy more.
"Money has never been the problem for us," Liang told an interviewer in 2024. "Bans on shipments of advanced chips are the problem."
This is the context that makes DSA make sense. Liang didn't invent sparse attention because it was theoretically elegant. He invented it because he had no choice. Efficiency wasn't a nice-to-have—it was the only path to frontier performance with a fixed compute budget. The U.S. designed export controls to slow Chinese AI. Instead, they created the conditions for architectural innovation that American labs, drowning in cheap compute, had no incentive to pursue.
The Backfire
Washington's theory of the case was simple: control the chips, control the AI. Deny China access to H100s, and they can't train frontier models.
DeepSeek's success reveals three flaws in this logic—and one that nobody is talking about.
Flaw one: stockpiling. Chinese labs accumulated hardware before restrictions took effect. DeepSeek's 10,000 A100s were purchased legally. The impact of export controls won't fully materialize until these stockpiles need replacing.
Flaw two: scarcity drives innovation. RAND's analysis is blunt: algorithmic advances have halved the compute needed for equivalent AI performance every eight months for over a decade. That's a 262,000-fold reduction in compute requirements since 2012. DeepSeek's efficiency breakthrough isn't an anomaly—it's part of a trend that chip restrictions accelerate rather than prevent.
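A quick back-of-the-envelope check of that figure, assuming the measured window runs roughly 2012 through 2024:

```python
# Compounding check: compute requirements halve every 8 months.
# Assuming a ~144-month window (roughly 2012 through 2024):
months = 144
halvings = months / 8        # 18 halvings
print(2 ** halvings)         # 262144.0 -- the ~262,000-fold reduction
```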
Flaw three: the efficiency advantage compounds. When Huawei and SMIC eventually produce competitive chips—and they're closer than most Americans realize—China will have both domestic silicon and efficiency-optimized architectures. The U.S. may be creating a future competitor that's better at both.
But here's the flaw nobody mentions:
Export controls assumed capability was the moat. DeepSeek proves deployment economics are the moat. A model that's 5% worse but 95% cheaper wins in the real world—because the real world runs on margins.
OpenAI can claim benchmark leadership all it wants. If a competitor offers 90% of the capability at 5% of the cost, enterprise customers will defect. Not because they don't care about quality—because the gap isn't wide enough to justify 20x the spend. DeepSeek didn't just match GPT-5. It made matching GPT-5 affordable.
What This Means (Depending on Who You Are)
IF YOU'RE BUILDING ON AI APIS:
Use cases that were cost-prohibitive are now viable. Processing 10 books of text costs $17 on GPT-5 and $3 on DeepSeek. At $0.70 per 128K tokens, running an agent 24/7 costs less than a Slack seat. Test DeepSeek on your actual workloads this week. Not to switch—to know what 95% cost reduction feels like on your data.
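Running that test takes minutes, since DeepSeek exposes an OpenAI-compatible API and the standard client works against it. The base URL, model name, and environment variable below are assumptions to verify against DeepSeek's current documentation before relying on them.

```python
# Minimal A/B: send one of your real prompts to DeepSeek and inspect cost.
# Endpoint, model id, and env var are assumptions -- check DeepSeek's docs.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],   # assumed environment variable
    base_url="https://api.deepseek.com",      # assumed OpenAI-compatible endpoint
)

resp = client.chat.completions.create(
    model="deepseek-chat",                    # assumed model id; a reasoning variant may also exist
    messages=[{"role": "user", "content": "Find the bug: for i in range(len(xs)): xs.pop(i)"}],
)
print(resp.choices[0].message.content)
print(resp.usage)  # token counts let you compare real per-request cost against your current provider
```

Run the same prompts through your current provider, multiply the usage numbers by each price sheet, and you have your own version of the 95% figure instead of someone else's benchmark.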
IF YOU'RE AN EXECUTIVE SETTING AI STRATEGY:
The "best model" is no longer automatically the right choice. DeepSeek V3.2 is MIT-licensed—self-hosting frontier AI is now realistic for regulated industries. Healthcare, finance, and government can process sensitive data without external APIs. The compliance advantages of on-premises deployment may now outweigh the operational costs.
IF YOU'RE EVALUATING AI INVESTMENTS:
The moat thesis just got weaker. If a Chinese startup can match GPT-5 at 5% of the cost while operating under chip restrictions, proprietary model advantages are more fragile than valuations imply. Watch for enterprise customers publicly migrating to open-source deployments. By Q2 2026, at least one Fortune 500 will announce they've moved production workloads to self-hosted DeepSeek weights.
The Caveats You Should Actually Worry About
DeepSeek's API routes through Hong Kong. For enterprises with strict data governance, that's not a minor detail. Self-hosting mitigates this but requires significant infrastructure investment—the full model needs ~700GB VRAM for FP8 inference.
The V3.2-Speciale endpoint—the one that won gold at IMO—expires December 15, 2025. It's a research preview, not permanent infrastructure. The token efficiency is also significantly worse than Gemini 3 Pro; solving Codeforces problems required 77,000 tokens versus Gemini's 22,000.
And Liang himself acknowledged in a 2024 interview that DeepSeek's researchers have ties to Chinese government institutions. For some use cases, that's a dealbreaker regardless of technical merit.
The Paradigm Shift
For two years, the AI industry assumed closed-source meant better. If you wanted frontier reasoning, you paid OpenAI's premium. Open-source was for hobbyists, edge cases, or privacy-constrained deployments where you'd accept worse performance for data control.
That assumption just died.
DeepSeek V3.2-Speciale isn't "good for open-source." It's just good. Gold medal at the Math Olympiad. Superior performance on real-world debugging. And anyone can use it—commercially, without permission, forever.
"For years, Chinese companies monetized innovations developed elsewhere," Liang said in July 2024. "But this isn't sustainable. Our goal isn't quick profits—it's advancing the technological frontier."
He continued: "What we lack isn't capital but confidence. Open-sourcing and publishing papers don't result in significant losses. For technologists, being followed is rewarding. Open-source is cultural, not just commercial. Giving back is an honor."
This is the part that should unsettle Sam Altman. DeepSeek isn't trying to beat OpenAI at OpenAI's game. It's trying to make the game irrelevant. If frontier AI becomes a commodity—freely available, infinitely forkable, improvable by anyone—the value shifts from model providers to application builders. The model layer commoditizes. The action layer captures the value.
Where This Goes
Here's my prediction: within 18 months, the primary differentiator between AI products won't be which model they use. It'll be how they deploy it, what data they fine-tune on, and how efficiently they serve inference. The model itself becomes table stakes—like databases in the 2010s. Everyone has access to PostgreSQL. The winners are determined by what you build on top.
DeepSeek didn't just match GPT-5. It proved that matching GPT-5 can be done by a team with fewer resources, under active sanctions, optimizing for deployment economics from day one. If they can do it, so can others. The frontier just got crowded.
The window to experiment is still open. The window to wait is closing.
A hedge fund manager from Guangdong just changed the math. The question is what you're going to do about it.
• • •

