Anthropic Just Hit $965B. You Are Overpaying 7x For AI.
Anthropic is now worth more than OpenAI. On May 28, 2026, it closed a $65 billion Series H at a $965 billion post-money valuation. That edges past OpenAI's $852 billion. The engine behind the number is Claude Code, the coding agent whose run-rate revenue crossed $47 billion earlier that month.
Here is the part nobody puts on the slide. The exact same monthly AI workload that costs you around $2,500 on Claude Opus and $3,000 on GPT-5.5 costs about $348 on DeepSeek.
You are paying the premium. They are becoming a trillion-dollar company.
This is the AI API pricing war, and it is the single most important line item on your 2026 infrastructure bill. Let's do the actual math, because the gap is wider than you think.
The $965B number, and where it comes from
Anthropic's Series H raised $65 billion. Roughly $15 billion of that was previously committed capital from hyperscalers, including $5 billion from Amazon announced in April. It was co-led by Altimeter, Dragoneer, Greenoaks, Sequoia, Capital Group, Coatue, and D1 Capital Partners. Most analysts read it as the last private round before an IPO.
OpenAI's last raise was a $122 billion round in March at an $852 billion valuation. So Anthropic didn't just catch up. It passed the company that defined the category.
What changed between Anthropic's Series G in February and now? One thing, mostly: developers kept paying for tokens. Claude Code adoption climbed across enterprise customers, and run-rate revenue hit $47 billion. The round landed the same day Anthropic shipped Claude Opus 4.8, tuned for agentic tasks and coding.
Translation: the valuation is built on output tokens. Your output tokens.
DeepSeek just made the math impossible to ignore
On May 23, 2026, DeepSeek locked in a permanent 75% price cut on its V4-Pro model. Not a promo. A new floor. After the discount window closed on May 31, the standing rate became one quarter of the old price.
The numbers that matter: V4-Pro output now sits at $0.87 per million tokens, down from $3.48. Cache-hit input dropped to fractions of a cent. The headline is the output price, because for any agent that writes code, drafts content, or returns long responses, output is where your bill actually lives.
InfoWorld put it bluntly in its coverage: "high-margin, high-consumption token pricing models from Anthropic and OpenAI are becoming harder to justify." When the cheapest credible frontier-class model was already 7x below the premium one before the cut, "harder to justify" is generous.
The per-token math, with no marketing in the way
Here is the published list pricing as of late May 2026, per million tokens:
- DeepSeek V4: ~$1.74 input / $3.48 output (V4-Pro output drops to $0.87 after the cut)
- Claude Opus 4.7: $5 input / $25 output
- GPT-5.5: $5 input / $30 output
- Gemini 3.1: mid-tier, around $2.50 output
Now scale it to a real workload. Say your product generates 100 million output tokens a month — a mid-size agent in production, nothing exotic.
- DeepSeek V4-Pro: ~$348/month at the pre-cut $3.48 rate (~$87/month at the new $0.87 rate)
- Claude Opus 4.7: ~$2,500/month
- GPT-5.5: ~$3,000/month
That is a 7x gap to Claude and roughly 9x to GPT. Annualized, you are looking at $4,176 versus $30,000 versus $36,000 for the identical token count.
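The arithmetic above can be reproduced in a few lines. This is a minimal sketch using only the output list prices quoted in this article; the dictionary keys and the function name are illustrative labels, not official model identifiers. Note that the $348 figure corresponds to the $3.48 rate, so the post-cut $0.87 rate is listed separately.

```python
# Output list prices in USD per million tokens, as quoted in this article.
# The keys are illustrative labels, not official API model names.
OUTPUT_PRICE_PER_M = {
    "deepseek-v4-pro (pre-cut)": 3.48,
    "deepseek-v4-pro (post-cut)": 0.87,
    "claude-opus-4.7": 25.00,
    "gpt-5.5": 30.00,
}


def monthly_output_cost(model: str, output_tokens: int) -> float:
    """Return the monthly output-token cost in USD for one model."""
    return OUTPUT_PRICE_PER_M[model] * output_tokens / 1_000_000


if __name__ == "__main__":
    tokens = 100_000_000  # 100M output tokens per month
    baseline = monthly_output_cost("deepseek-v4-pro (pre-cut)", tokens)
    for model in OUTPUT_PRICE_PER_M:
        cost = monthly_output_cost(model, tokens)
        print(
            f"{model:28s} ${cost:>9,.2f}/month  "
            f"${cost * 12:>10,.2f}/year  {cost / baseline:.1f}x"
        )
```

At the post-cut $0.87 rate the same 100 million tokens come to about $87 a month, which widens the gap to Claude Opus 4.7 from roughly 7x to roughly 29x. This sketch ignores input tokens and caching discounts, so treat it as a lower bound on a real bill.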
Zoom out across the whole market and the spread is almost comical. Between the cheapest open models and the priciest frontier APIs, the gap now hits 300x on input and 450x on output. Two products doing fundamentally the same thing — turning a prompt into text — priced 450 times apart.
So why does anyone pay the premium?
Because sometimes it's worth it. Let's be honest about that, because pretending otherwise is how you ship a worse product to save $2,000.
Frontier models still win on the hardest agentic tasks. Claude Opus 4.8 holds an edge on multi-step coding, long-horizon planning, and self-correction — the stuff where a 3% accuracy bump prevents a production incident that costs far more than the token spread. If your agent is refactoring a payments system, you do not optimize for $0.87.
But here's the trap: most workloads are not that. Classification, summarization, data extraction, first-draft generation, routing, internal tooling — the bulk of real API traffic is routine. Paying frontier rates for routine work is how the $965B valuation gets funded.
The pattern that wins in 2026 is routing by task: cheap model for the 80% that's routine, frontier model for the 20% that's hard. Teams doing this cut their bills 60–80% without users noticing a quality drop. The ones still hard-coding a single premium model are the ones writing the checks that build trillion-dollar startups. If you live in Claude Code, frameworks like Claude Forge make multi-model and cost-aware workflows far easier to wire up.
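The 60–80% figure can be sanity-checked with a blended-price calculation. The sketch below assumes the 80/20 split is measured in output tokens, that a request uses the same number of tokens whichever model serves it, and that input tokens are ignored; all three are simplifying assumptions, and the function names are hypothetical.

```python
def blended_price(cheap_price: float, frontier_price: float, routine_share: float) -> float:
    """Average price per million output tokens when `routine_share` of the
    traffic goes to the cheap model and the rest goes to the frontier model."""
    if not 0.0 <= routine_share <= 1.0:
        raise ValueError("routine_share must be between 0 and 1")
    return routine_share * cheap_price + (1.0 - routine_share) * frontier_price


def savings_vs_frontier(cheap_price: float, frontier_price: float, routine_share: float) -> float:
    """Fraction of the all-frontier bill saved by routing (0.0 to 1.0)."""
    blended = blended_price(cheap_price, frontier_price, routine_share)
    return 1.0 - blended / frontier_price


if __name__ == "__main__":
    # Prices per million output tokens, taken from the list in this article.
    for label, cheap in [("DeepSeek at $3.48", 3.48), ("DeepSeek at $0.87", 0.87)]:
        saved = savings_vs_frontier(cheap, frontier_price=25.00, routine_share=0.80)
        print(f"{label}: {saved:.0%} saved vs. all-Claude-Opus")
```

Under these assumptions an 80/20 split against Claude Opus 4.7's $25 rate saves about 69% with the cheap model at $3.48 and about 77% at $0.87, which sits inside the 60–80% range quoted above.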
How the price war actually started
This didn't happen overnight. For two years the frontier labs operated on a simple assumption: capability commands a premium, and developers will pay it because the alternative is shipping a worse product. That held while open models trailed badly on coding and reasoning.
Then the gap closed. Open-weight models stopped being toys. By early 2026, DeepSeek's V4 line was scoring within a few points of frontier models on mainstream coding and reasoning benchmarks — close enough that for everyday tasks, users couldn't tell the difference in a blind test.
Once capability converged, price became the only lever left. DeepSeek pulled it hard. A 75% cut isn't a discount; it's a declaration that inference is becoming a commodity. And commodities don't sustain 450x markups for long.
The frontier labs know this. It's why Anthropic raced to ship Opus 4.8 the same day it announced the round — capability is the only thing that justifies the premium, so the premium players have to keep widening the capability gap faster than the cheap models close it. That's an expensive race, and it's exactly why they need your $2,500-a-month checks.
The hidden cost nobody budgets for
Here's what makes the gap worse than the sticker price suggests: token consumption is not fixed. Agentic workloads burn tokens in loops. A single complex task can fire dozens of model calls — plan, act, observe, retry. Multiply your per-token rate by that loop count and the premium isn't 7x, it's 7x on a number that's already inflated.
This is where context efficiency matters more than raw price. A model that re-reads your entire codebase on every query burns 10x the tokens of one fed only the relevant slice. That's why teams pair a cheaper model with retrieval infrastructure — the cheap rate and the lower token volume compound into savings the frontier loyalists never see.
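To see how loop count and context size compound, here is a rough per-task cost model. The prices come from the list earlier in this article; the call count (30), the context sizes (200,000 vs. 20,000 input tokens per call), and the 1,000 output tokens per call are made-up illustrative numbers, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """List prices in USD per million tokens."""
    input_per_m: float
    output_per_m: float


def task_cost(pricing: Pricing, calls: int, input_tokens_per_call: int,
              output_tokens_per_call: int) -> float:
    """Cost in USD of one agent task that makes `calls` model calls,
    each sending and receiving the given number of tokens."""
    input_cost = calls * input_tokens_per_call * pricing.input_per_m / 1_000_000
    output_cost = calls * output_tokens_per_call * pricing.output_per_m / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    opus = Pricing(input_per_m=5.00, output_per_m=25.00)     # Claude Opus 4.7
    deepseek = Pricing(input_per_m=1.74, output_per_m=3.48)  # DeepSeek V4, pre-cut

    # Hypothetical agent task: 30 plan/act/observe/retry calls.
    full = task_cost(opus, calls=30, input_tokens_per_call=200_000, output_tokens_per_call=1_000)
    trimmed = task_cost(opus, calls=30, input_tokens_per_call=20_000, output_tokens_per_call=1_000)
    cheap = task_cost(deepseek, calls=30, input_tokens_per_call=20_000, output_tokens_per_call=1_000)
    print(f"frontier, full context:    ${full:.2f}")
    print(f"frontier, trimmed context: ${trimmed:.2f}")
    print(f"cheap, trimmed context:    ${cheap:.2f}")
```

With these illustrative inputs the three cases come to roughly $30.75, $3.75, and $1.15 per task. In this example the input side dominates once a large context is resent on every call, which is why trimming context and switching models stack rather than overlap.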
What this means for your stack right now
Three concrete moves.
1. Audit your output-token spend, not your input. Output is 5–6x the price of input at the list prices above ($25 vs. $5 on Claude Opus 4.7, $30 vs. $5 on GPT-5.5) and it's where the bill compounds. If you don't know your output-to-input ratio, you don't know your real cost structure.
2. Benchmark the cheap model on your actual tasks. Not on a leaderboard — on your prompts, your data, your eval set. DeepSeek V4-Pro and other open-weight models clear the bar for a shocking share of production work. The only way to know is to run it.
3. Build a router, not a religion. Loyalty to a single lab is the most expensive habit in AI engineering. The cost-effective architecture sends each request to the cheapest model that passes your quality gate. Tools like semantic-caching layers and codebase-context servers — for example the Claude Context MCP server — cut the token volume itself, which multiplies every dollar you save on rates.
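The third move, sending each request to the cheapest model that passes a quality gate, might look something like the sketch below. Everything here is hypothetical: the `Model` wrapper, the `generate` callables (stand-ins for whatever client library you use), and the `passes_gate` check are placeholders, not any provider's real API.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Model:
    name: str
    output_price_per_m: float        # USD per million output tokens
    generate: Callable[[str], str]   # hypothetical stand-in for a real client call


def route(prompt: str, models: Sequence[Model],
          passes_gate: Callable[[str, str], bool]) -> tuple[str, str]:
    """Try models from cheapest to priciest and return (model_name, answer)
    for the first answer that passes the quality gate. The most expensive
    model is the fallback and is returned without a gate check."""
    if not models:
        raise ValueError("at least one model is required")
    ordered = sorted(models, key=lambda m: m.output_price_per_m)
    for model in ordered[:-1]:
        answer = model.generate(prompt)
        if passes_gate(prompt, answer):
            return model.name, answer
    fallback = ordered[-1]
    return fallback.name, fallback.generate(prompt)


if __name__ == "__main__":
    # Stub models so the sketch runs without network access.
    cheap = Model("cheap-model", 0.87, lambda p: "short answer")
    frontier = Model("frontier-model", 25.00, lambda p: "a longer, more careful answer")

    # Toy gate: accept any non-empty answer. A real gate would run your own evals.
    name, answer = route("Summarize this ticket", [frontier, cheap], lambda p, a: bool(a))
    print(name, "->", answer)
```

One design trade-off to be aware of: this escalate-on-failure approach pays for the cheap call even when the request ends up on the frontier model. If most hard requests are recognizable up front, classifying the prompt before the first call may be cheaper than trying and retrying.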
The pricing war isn't slowing down. DeepSeek's cut forces a response. When the floor drops 75%, the premium players either justify the gap with capability or quietly follow the price down. Either way, the developer who's paying attention wins.
Anthropic earned its valuation. The question is whether you need to keep funding it 7x over for work a $348 model could do — and whether the margin you reclaim is better spent turning your own AI agents into a product instead.
Frequently Asked Questions
What is the AI API pricing war?
The AI API pricing war is the 2026 race among model providers to undercut each other on per-token costs. It escalated when DeepSeek locked in a permanent 75% cut on V4-Pro, pushing output pricing from $3.48 to $0.87 per million tokens. The old rate was already roughly 7–9x below frontier APIs like Claude Opus 4.7 ($25) and GPT-5.5 ($30); the new rate is roughly 29–34x below them.
How much cheaper is DeepSeek than Claude or GPT?
For 100 million output tokens a month, DeepSeek costs about $348, versus roughly $2,500 on Claude Opus 4.7 and $3,000 on GPT-5.5. That's a 7x gap to Claude and about 9x to GPT for the identical token volume.
Why is Anthropic worth $965 billion?
Anthropic closed a $65 billion Series H at a $965 billion post-money valuation in May 2026, overtaking OpenAI's $852 billion. The valuation is driven primarily by Claude Code, whose run-rate revenue crossed $47 billion — meaning developer API spend is the engine behind the number.
Should I switch from Claude to DeepSeek to save money?
Not blindly. Frontier models still win on the hardest agentic and coding tasks, where a small accuracy gain prevents costly failures. The smarter move is task-based routing: a cheap model for routine work (the bulk of traffic) and a frontier model for the hard 20%, which typically cuts bills 60–80% with no visible quality loss.
How big is the price gap across AI models in 2026?
The spread between the cheapest open models and the most expensive frontier APIs now reaches roughly 300x on input tokens and 450x on output tokens — two products doing essentially the same job, priced hundreds of times apart.
Key Takeaways
- Anthropic closed a $65B Series H at a $965B post-money valuation, passing OpenAI's $852B. The valuation rests largely on Claude Code, whose run-rate revenue crossed $47B.
- DeepSeek locked in a permanent 75% price cut on V4-Pro: $0.87 per million output tokens, one quarter of its old $3.48 rate.
- 100M output tokens a month costs about $348 on DeepSeek at the pre-cut rate, $2,500 on Claude Opus 4.7, and $3,000 on GPT-5.5: a 7–9x output gap.
- The spread between the cheapest and priciest models now reaches 300x on input and 450x on output.
- For most production workloads, the frontier premium buys marginal quality gains you may not need. Route by task, not by brand.
Skila AI Editorial Team
The Skila AI editorial team researches and writes original content covering AI tools, model releases, open-source developments, and industry analysis. Our goal is to cut through the noise and give developers, product teams, and AI enthusiasts accurate, timely, and actionable information about the fast-moving AI ecosystem.