
Perplexity Ditched MCP. The Numbers Behind the Breakup Are Brutal.

March 17, 2026
Perplexity's CTO just pulled the plug on MCP internally. Cloudflare found 81% context window waste. 43% of servers have injection flaws. The protocol Anthropic called 'USB-C for AI' is hitting a wall in production — and the companies saying so are getting louder.

On March 11, Perplexity CTO Denis Yarats walked on stage at the Ask 2026 conference and said what thousands of production engineers have been muttering over Slack threads for months: they're moving away from MCP internally.

This isn't some fringe startup complaining about a protocol spec. Perplexity — a company that shipped its own MCP Server barely six months ago — just told the world that the protocol Anthropic billed as "USB-C for AI" isn't cutting it where it actually matters: production.

And the data backing up that decision? It's worse than you'd expect.

81% of Your Context Window, Gone Before You Ask a Question

Here's the number that should alarm every developer building with MCP: Cloudflare's research found that complex agents can waste up to 81% of their context window on MCP overhead alone. That's tool schemas, parameter definitions, authentication hints, call traces, and error metadata — all consuming tokens before your agent processes a single user query.

The GitHub MCP server burns roughly 50,000 tokens just initializing. A database server with 106 tools? That's 54,600 tokens of overhead. MCPGauge measured input-token budgets inflating by up to 236x in production environments.

You're paying for context window capacity you can't use. At $15 per million tokens for a frontier model's context, that's not a rounding error — it's a design tax on every agent interaction.
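The arithmetic is easy to check. Using the figures cited above (and the article's assumed input pricing of $15 per million tokens), a quick sketch of what schema overhead costs when it rides along on every call:

```python
# Back-of-the-envelope cost of MCP schema overhead, using the token
# counts cited above. The $15-per-million input price is the article's
# frontier-model figure, not a quote for any specific provider.

PRICE_PER_TOKEN = 15 / 1_000_000  # $15 per 1M input tokens

def overhead_cost(overhead_tokens: int, calls: int) -> float:
    """Dollar cost of re-sending tool-schema overhead on each call."""
    return overhead_tokens * calls * PRICE_PER_TOKEN

# The GitHub MCP server's ~50,000-token initialization:
per_call = overhead_cost(50_000, 1)            # ~$0.75 per call
per_10k_calls = overhead_cost(50_000, 10_000)  # ~$7,500 across 10k calls
```

At that rate, the overhead bill scales linearly with traffic while delivering zero user-visible value.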

The Authentication Mess Nobody Warned You About

Yarats cited authentication friction as his second core complaint. Each MCP server manages its own auth flow — API keys, OAuth tokens, session cookies — creating a distributed identity management nightmare as you connect more services.

In Perplexity's case, managing token distribution, refreshes, permissioning, and incident response across multiple MCP protocol layers became operationally heavier than the problems MCP was supposed to solve. The "universal standard" was creating a unique configuration headache per server.

Contrast this with Perplexity's new Agent API: one endpoint, one API key, OpenAI-compatible syntax. You get GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro, Grok 4.1 Fast, and Sonar models through a single POST https://api.perplexity.ai/v1/agent call. Web search costs $0.005/call. URL fetch is $0.0005. Function calling is free. Zero protocol overhead.
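To make the contrast concrete, here is a minimal sketch of what "one endpoint, one API key" looks like. The endpoint URL and OpenAI-compatible shape come from the article; the exact field names (`model`, `messages`) follow the OpenAI chat convention and are an assumption here, as is the placeholder model identifier:

```python
# A sketch of a single Agent API request. Payload schema is assumed to
# mirror the OpenAI chat format, per the "OpenAI-compatible" claim; the
# model name and key are placeholders.
import json
from urllib.request import Request

ENDPOINT = "https://api.perplexity.ai/v1/agent"

def build_agent_request(api_key: str, model: str, prompt: str) -> Request:
    """Assemble one POST with one bearer key -- no per-server auth flows."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return Request(
        ENDPOINT,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_agent_request("pplx-...", "sonar", "Summarize today's AI news")
```

One credential to rotate, one auth flow to audit: that is the operational simplification Yarats is pointing at.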

43% of MCP Implementations Have Injection Flaws

The context window problem is expensive. The security picture is frightening.

An audit of MCP server implementations found that 43% had command injection vulnerabilities. 30% were vulnerable to server-side request forgery. 22% allowed arbitrary file access. And of 2,000 internet-exposed MCP servers surveyed, every single one had zero authentication.

The mcp-remote npm package, with over 558,000 downloads, had a CVSS 9.6 vulnerability. Invariant Labs demonstrated a malicious MCP server that silently exfiltrated WhatsApp conversation histories. Even Anthropic's own MCP Inspector had an unauthenticated remote code execution flaw.

MCP's security model essentially trusts every server you connect. When your AI agent has access to your file system, your databases, and your messaging apps, "trust by default" isn't a philosophy — it's a liability.

Perplexity Isn't Alone — The Rebellion Is Growing

Yarats didn't break ranks solo. Y Combinator President Garry Tan built a CLI instead of using MCP, publicly calling it inadequate for production. Pieter Levels, the indie hacker behind Nomad List, declared the protocol dead. Nx, the build system company, deleted most of their MCP tools in February 2026, replacing them with "Skills" that showed measurably better accuracy.

The pattern is consistent: teams that move from prototyping to production hit the same walls. Cumulative latency from multi-hop routing (agent runtime → protocol adapter → tool server → external API). Stale sessions that force restarts. Zombie processes with no timeout detection. One developer documented 14 forced Claude Code restarts over 7 days due to unresponsive MCP connections.

The Bull Case: MCP Still Has 5,800+ Servers and Big-Name Backing

Before writing MCP's obituary, consider the institutional momentum. OpenAI, Google DeepMind, and Microsoft have committed to MCP support. The Linux Foundation adopted the protocol. Microsoft is embedding MCP features into Windows 11 and Copilot Studio. Gartner predicts 75% of API gateway vendors will add MCP features by the end of 2026.

The ecosystem numbers are real: 5,800+ verified servers, 17,000+ across all registries, roughly 50,000 GitHub repositories, and 300+ MCP clients. Google just auto-enabled their BigQuery MCP server for all users starting today, March 17. That's not a dying protocol — it's one with serious growing pains.

MCP advocates argue that dynamic tool discovery — agents finding and using tools at runtime without explicit programming — is the protocol's irreplaceable value. Hardcoded API integrations can't replicate that capability. And Anthropic's 2026 roadmap, published March 9, directly addresses the production complaints: transport scalability, enterprise-grade auth, and governance improvements.
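That advocated capability is worth pinning down. Below is a toy illustration of the idea: the agent enumerates tools at runtime and picks one by description, rather than compiling in a fixed integration. This is a conceptual sketch only, with made-up tool names — not the MCP wire protocol:

```python
# Toy model of dynamic tool discovery: the agent inspects a registry at
# runtime instead of hardcoding which tools exist. Tool names and
# descriptions here are illustrative, not real MCP servers.

registry = {
    "web_search": {"desc": "search the web", "fn": lambda q: f"results for {q}"},
    "url_fetch": {"desc": "fetch a URL", "fn": lambda u: f"contents of {u}"},
}

def discover(keyword: str):
    """Return the first tool whose description mentions the keyword."""
    for name, tool in registry.items():
        if keyword in tool["desc"]:
            return name, tool["fn"]
    raise LookupError(keyword)

# The agent finds "web_search" at runtime without it being hardcoded.
name, fn = discover("web")
```

A hardcoded integration can't do this: if a new tool appears in the registry tomorrow, the same agent code can find and use it with no redeploy.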

What This Actually Means for You

If you're building production agents today, Yarats and the data are telling you something specific: MCP works great for experimentation and local development. It's a genuine headache at scale.

The pragmatic approach? Use MCP for development and prototyping where its plug-and-play discovery is genuinely useful. Switch to direct API calls for production workloads where you know exactly which tools you need and can't afford 81% context waste or authentication complexity.
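One way to encode that split is a small dispatcher that routes through MCP in development but pins direct API calls in production. This is a hypothetical sketch — the function names and the `APP_ENV` variable are illustrative conventions, not anything Perplexity ships:

```python
# Hypothetical dev/prod dispatcher for the strategy described above:
# MCP's plug-and-play discovery locally, a known direct API in production.
import os
from typing import Callable

def search_via_direct_api(query: str) -> str:
    """Production path: one known endpoint, no protocol overhead."""
    return f"direct:{query}"

def search_via_mcp(query: str) -> str:
    """Dev path: route through an MCP client for runtime tool discovery."""
    return f"mcp:{query}"

def pick_search_backend() -> Callable[[str], str]:
    """Choose the tool backend from the deployment environment."""
    env = os.environ.get("APP_ENV", "development")
    return search_via_direct_api if env == "production" else search_via_mcp
```

The point of the indirection is that the agent code above the dispatcher never changes: only the plumbing underneath does.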

Perplexity didn't abandon AI agents. They streamlined the plumbing. Their Agent API still connects to six frontier models with built-in search and function calling. The difference is they absorbed the complexity internally instead of distributing it across protocol layers.

The question isn't whether MCP will survive — its institutional backing virtually guarantees it will. The question is whether Anthropic can fix the production-readiness gaps before more teams follow Perplexity's lead. The 2026 roadmap says they know the problems. The numbers say they're running out of time to fix them.

Key Takeaways

  • Perplexity CTO Denis Yarats announced at Ask 2026 that the company is moving away from MCP internally
  • Cloudflare research shows MCP overhead can consume up to 81% of available context window tokens
  • 43% of MCP implementations have command injection vulnerabilities
  • Y Combinator's Garry Tan and Pieter Levels have also publicly moved away from MCP
  • Despite criticism, MCP has 5,800+ servers and backing from OpenAI, Google, and Microsoft
  • Perplexity's Agent API offers one endpoint with six frontier models at direct provider rates
  • Practical strategy: use MCP for development, direct APIs for production workloads

Skila AI Editorial Team

The Skila AI editorial team researches and writes original content covering AI tools, model releases, open-source developments, and industry analysis. Our goal is to cut through the noise and give developers, product teams, and AI enthusiasts accurate, timely, and actionable information about the fast-moving AI ecosystem.

