MiniMax M2.5 Matches Claude at 5% the Cost. The AI Pricing War Just Escalated.
A Shanghai startup just posted an 80.2% score on SWE-Bench Verified — within 0.6 points of Claude Opus 4.6 — while charging $0.30 per million input tokens. That's roughly one-twentieth the price of Anthropic's flagship model. MiniMax, the company behind M2.5, isn't just competing on benchmarks anymore. It's dismantling the assumption that frontier AI performance requires frontier pricing.
The math is stark: a typical SWE-Bench coding task costs about $0.15 with M2.5. The same task on Claude Opus 4.6 runs about $3.00. For a team running 100 such tasks a day, that's the difference between a $450/month bill and a $9,000 one.
MiniMax M2.5: What the Benchmarks Actually Show
M2.5 is a Mixture-of-Experts model with 230 billion total parameters, but only 10 billion active during inference — a design that keeps costs low while maintaining strong performance. Released on February 12, 2026, it was trained with CISPO, a reinforcement learning algorithm, across hundreds of thousands of real-world environments on Forge, MiniMax's in-house agent-native RL framework, which achieves roughly a 40x training speedup.
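The economics of that design are easy to sketch. As a rough rule of thumb (an assumption, not MiniMax's published figure), a decoder's per-token inference cost scales with its *active* parameters at roughly 2 FLOPs per parameter per token — so a 230B-total / 10B-active MoE pays inference costs closer to a 10B dense model:

```python
def flops_per_token(active_params: float) -> float:
    """Rough forward-pass cost: ~2 FLOPs per active parameter per token."""
    return 2 * active_params

moe_cost = flops_per_token(10e9)      # M2.5: 10B active parameters
dense_cost = flops_per_token(230e9)   # hypothetical dense 230B model

print(f"MoE inference cost vs dense 230B: {moe_cost / dense_cost:.1%}")  # → 4.3%
```

That ~4% compute ratio is what makes the aggressive per-token pricing structurally possible rather than a loss-leader.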
Here's where it gets interesting. On SWE-Bench Verified, M2.5 scores 80.2%, trailing Claude Opus 4.6 (80.8%) and Opus 4.5 (80.9%) by a hair. But on Multi-SWE-Bench — which tests complex multi-file engineering tasks — M2.5 actually overtakes Opus 4.6 with 51.3% versus 50.3%. When code changes span multiple files and require coordinated reasoning, MiniMax's model holds up better.
The gap widens dramatically on tool calling. In the BFCL Multi-Turn benchmark, M2.5 scored 76.8% while Claude Opus 4.6 managed just 63.3% — a 13.5-point lead for the Chinese model. For developers building agentic workflows where the model needs to call APIs, execute functions, and chain tools together, that difference is enormous.
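To make the BFCL-style workload concrete, here is a minimal tool-dispatch sketch in the OpenAI-style function-calling format such benchmarks exercise. The `get_weather` tool, its arguments, and the hard-coded tool call are invented for illustration — in a real agent, the tool calls come back from the model API and each result is fed back as a `tool` message so the model can chain further calls:

```python
import json

# Tool schema the model is shown (OpenAI-style function-calling format).
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> str:
    return f"22C and clear in {city}"  # stubbed tool result

REGISTRY = {"get_weather": get_weather}

def dispatch(tool_call: dict) -> str:
    """Route a model-emitted tool call to the matching Python function."""
    fn = REGISTRY[tool_call["name"]]
    return fn(**json.loads(tool_call["arguments"]))

# Simulating one model-emitted call; a multi-turn loop repeats this step.
result = dispatch({"name": "get_weather", "arguments": '{"city": "Shanghai"}'})
print(result)  # → 22C and clear in Shanghai
```

Multi-turn benchmarks like BFCL measure how reliably a model keeps this loop going — choosing the right tool, producing valid JSON arguments, and chaining results — which is where M2.5's 13.5-point lead matters in practice.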
On BrowseComp (web browsing tasks with context management), M2.5 hit 76.3%. And it completes SWE-Bench tasks 37% faster than its predecessor M2.1, matching Claude Opus 4.6's 22.9-minute average runtime.
MiniMax M2.5 Pricing Breaks the Cost Curve
Two versions are available through the API:
M2.5 (50 tokens/sec): $0.30 per million input tokens, $1.20 per million output tokens. At its 50 tokens/second output rate, an hour of continuous generation (180,000 tokens) costs about $0.22 in output tokens.
M2.5-Lightning (100 tokens/sec): Double the speed at $0.30 input / $2.40 output per million tokens — an hour of continuous generation at full speed still runs under $1. Even the premium tier costs a fraction of what Western competitors charge.
For context, Claude Opus 4.6 charges $15 per million input tokens and $75 per million output. GPT-5 and Gemini 3 Pro sit in similar ranges. MiniMax is delivering 95%+ of the coding performance at 5% of the price. The 205K token context window means you can feed it large codebases without worrying about truncation.
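Per-task cost depends entirely on the token mix, so the figures below are an illustration using the per-million-token prices quoted above and an assumed coding task of 150K input tokens and 10K output tokens (a mix chosen to roughly reproduce the ~$3.00 Claude figure cited earlier):

```python
PRICES = {  # USD per million tokens: (input, output), as listed above
    "minimax-m2.5": (0.30, 1.20),
    "claude-opus-4.6": (15.00, 75.00),
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one call at the published per-million-token rates."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# Assumed coding task: 150K tokens of context in, 10K tokens out.
for model in PRICES:
    print(f"{model}: ${call_cost(model, 150_000, 10_000):.2f}")
# → minimax-m2.5: $0.06
# → claude-opus-4.6: $3.00
```

Shift the mix toward longer outputs and the ratio changes somewhat (output tokens carry the bigger price gap on a percentage basis), but the order-of-magnitude difference holds across realistic workloads.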
From Startup to $30 Billion: MiniMax's Rapid Rise
Founded in 2021 by Yan Junjie, Yang Bin, and Zhou Yucong, MiniMax has a trajectory that reads like a compressed version of OpenAI's story. Hillhouse Capital was the first investor, reportedly offering a term sheet with a blank valuation field after a three-hour pitch. The founders wrote down $200 million pre-money for a $30 million raise.
By early 2023, they'd closed a $260 million round at a $1.15 billion valuation, bringing in Tencent, Xiaomi, and Xiaohongshu. In March 2024, Alibaba led a $600 million round at $2.5 billion. Then came the Hong Kong IPO in January 2026, raising HK$4.8 billion ($614 million). Since listing, MiniMax shares have more than quadrupled, pushing the company's market cap to roughly $30 billion — approaching JD.com and Kuaishou territory.
Revenue growth tells the same story: a reported 159% increase following the IPO, driven by developer adoption of the M-series API.
What M2.5 Can Actually Do (Beyond Coding)
While the coding benchmarks grab headlines, M2.5's capabilities extend well beyond software engineering. The model handles office productivity tasks — generating and manipulating Word documents, Excel spreadsheets, and PowerPoint presentations. It switches between software environments, works across agent-human teams, and performs financial modeling.
Search grounding is built in, with M2.5 requiring 20% fewer search rounds than M2.1 to reach equivalent results. The model supports multilingual coding across Python, TypeScript, Go, C++, Rust, and Java, making it practical for polyglot codebases.
The 205K context window supports complex reasoning chains. M2.5 is a reasoning model that uses extended thinking (chain-of-thought) to work through problems before answering — similar to how Claude's extended thinking works, but at a fraction of the token cost.
What This Means for the AI Market
MiniMax M2.5 is the latest signal that the cost of frontier AI is collapsing faster than anyone predicted. When a Chinese startup can match the performance of the frontier models behind tools like Cursor at 5% of the cost, every AI-powered product's margin structure comes into question.
For developers using tools like Aider or building custom coding agents with frameworks like CrewAI, the economics of switching to MiniMax's API are compelling. A team running 50,000 API calls per day would save roughly $8,500/month compared to Claude Opus pricing.
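The savings figure above is sensitive to per-call token counts, which the article does not specify. The sketch below uses assumed averages of ~300 input and ~20 output tokens per call — the kind of short agentic calls that make the rough $8,500/month estimate work out at 50,000 calls/day:

```python
CALLS_PER_DAY = 50_000
IN_TOKENS, OUT_TOKENS = 300, 20   # assumed per-call averages
DAYS = 30

def monthly_cost(price_in: float, price_out: float) -> float:
    """USD per month given per-million-token input/output prices."""
    tokens_in = CALLS_PER_DAY * IN_TOKENS * DAYS
    tokens_out = CALLS_PER_DAY * OUT_TOKENS * DAYS
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000

m25 = monthly_cost(0.30, 1.20)     # MiniMax M2.5
opus = monthly_cost(15.00, 75.00)  # Claude Opus 4.6
print(f"M2.5: ${m25:,.0f}/mo  Opus: ${opus:,.0f}/mo  savings: ${opus - m25:,.0f}/mo")
# → M2.5: $171/mo  Opus: $9,000/mo  savings: $8,829/mo
```

Longer calls push the absolute savings up fast — the per-token price gap is roughly 50x on input and 60x on output, so the monthly delta scales linearly with traffic.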
The counterargument is straightforward: benchmarks aren't everything. Claude's strength lies in instruction following, nuanced reasoning, and safety guardrails that Chinese models haven't been independently audited for. Enterprise customers with compliance requirements won't switch based on price alone. And Anthropic's ecosystem — from MCP servers to the Claude Code CLI — creates switching costs that raw API pricing doesn't capture.
But for cost-sensitive applications like batch processing, code review automation, and internal tooling, M2.5 just became the obvious choice. The question isn't whether MiniMax will take market share — it's how much, and how fast Anthropic, OpenAI, and Google will respond on pricing.
Key Takeaways
- ✓ MiniMax M2.5 scores 80.2% on SWE-Bench Verified, within 0.6 points of Claude Opus 4.6, at roughly 1/20th the price
- ✓ On Multi-SWE-Bench (complex multi-file tasks), M2.5 actually outperforms Opus 4.6: 51.3% vs 50.3%
- ✓ Tool calling performance shows a 13.5-point lead over Claude (76.8% vs 63.3% on BFCL Multi-Turn)
- ✓ Pricing starts at $0.30/1M input tokens — a typical coding task costs $0.15 vs $3.00 on Claude
- ✓ MiniMax IPO'd in Hong Kong in January 2026 and is now valued at approximately $30 billion
- ✓ The 230B parameter MoE architecture uses only 10B active parameters during inference, keeping costs structurally low
- ✓ Enterprise adoption will hinge on safety auditing, compliance certification, and ecosystem maturity — not just benchmarks
Skila AI Editorial Team
The Skila AI editorial team researches and writes original content covering AI tools, model releases, open-source developments, and industry analysis. Our goal is to cut through the noise and give developers, product teams, and AI enthusiasts accurate, timely, and actionable information about the fast-moving AI ecosystem.