Everyone Thinks AI Makes Coders Faster. 22,000 Devs Say Otherwise.

June 4, 2026
Uber burned its entire 2026 AI budget in four months with no measurable productivity gain. Amazon killed its AI leaderboard after engineers gamed it. The data from 22,000 developers says the thing everyone believes about AI coding tools is wrong: you feel faster, the system gets slower.

Uber burned through its entire 2026 AI coding budget in four months. The measurable productivity increase? None.

Amazon quietly shut down an internal leaderboard that tracked AI tool usage after engineers started gaming it, racking up AI calls to look productive. And in February 2026, researchers at METR tried to run a study on AI's effect on developers, only to discover that developers refused to participate without their AI tools. The researchers had to switch to surveys instead.

That last detail is the whole story in miniature. Developers are now so attached to AI coding tools that you cannot pry the tools away long enough to measure whether they actually help. They feel essential. The data says something much weirder.

The belief everyone holds

Ask almost any developer in 2026 and you will hear a version of the same claim: AI made me a 10x engineer. Adoption backs up the enthusiasm. Roughly 90 to 93% of developers now use AI coding tools, according to DORA's 2025 report, JetBrains surveys, and a DX study of 121,000 developers. By some estimates, around 27% of production code is now AI-generated.

So the productivity numbers should be exploding. Universal adoption of a tool everyone swears makes them faster should show up as a massive jump in shipped software, fewer bugs, shorter lead times. Something.

It doesn't. And the gap between what developers feel and what the telemetry shows is the most important — and most uncomfortable — story in software right now.

The study that should stop the room

Start with METR, a research lab that ran the rare thing this debate desperately needs: a randomized controlled trial. They took experienced open-source developers working on their own repositories — people who knew the codebases cold — and measured them on 246 real tasks, with and without AI assistance.

The developers predicted AI would make them about 20% faster. After the tasks, they reported it had made them roughly 20% faster.

They were measured to be 19% slower.

Read that again. Not slightly off. A 39-point swing between perception and reality. People who believed AI sped them up by a fifth were actually slowed down by nearly a fifth. (METR later published a February 2026 update revising the measured slowdown to around 4% after correcting for selection bias — still negative, still nowhere near the 20% gain everyone felt.)
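The 39-point figure is plain arithmetic on the percentages quoted above. Here is a minimal Python sketch that treats both numbers as signed percentage changes, the way this article does. The function name is illustrative and not something METR published.

```python
def perception_gap(perceived_pct: float, measured_pct: float) -> float:
    """Gap in percentage points between felt and measured change in speed.

    Positive inputs mean "faster", negative inputs mean "slower".
    """
    return perceived_pct - measured_pct


# Figures as quoted in this article: developers felt about 20% faster.
original_gap = perception_gap(20.0, -19.0)  # measured 19% slower
revised_gap = perception_gap(20.0, -4.0)    # February 2026 revision, about 4% slower

print(original_gap)  # 39.0
print(revised_gap)   # 24.0
```

Even with the revised figure, the gap between feeling and measurement is still about 24 points.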

This is the part that should unsettle you. The slowdown is invisible from the inside. You feel productive because the AI is doing visible work — generating code, filling the screen, answering instantly. The time you lose reviewing its output, fixing its mistakes, and re-prompting it doesn't register as friction. It registers as progress.

22,000 developers, two years of data

One trial is a data point. The Faros AI Engineering Report 2026 is a flood.

Faros analyzed telemetry from 22,000 developers across more than 4,000 teams over two years — not surveys, not vibes, actual engineering-system data. They called the pattern the "Acceleration Whiplash," and the numbers explain why.

The output side looks great. Per developer, epics completed rose 66%, task throughput rose 33.7%, and PR merge rate rose 16.2%. If you stopped reading there, AI would look like a triumph.

Keep reading.

  • Bugs per developer: up 54%. In the 2025 report this was up 9%. As teams matured their AI programs, the defect rate didn't flatten — it steepened.
  • Incidents per PR: up 242.7%. More of what ships breaks something in production.
  • Median time in code review: up 441.5%. Reviewers now spend more than five times as long wading through PRs.
  • Code churn: up 861%. Code written and then rewritten or deleted shortly after — the signature of throwing output at the wall.
  • PRs merged without review: up 31.3%. The safety valve is being bypassed exactly when the code is least trustworthy.

Here is the math that matters. Output went up by tens of percent. Incidents, churn, and review time went up by hundreds, and bugs per developer rose by more than half. You are generating code faster than your system can safely absorb it.
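Percent increases are easy to misread, so it helps to turn "up N%" into "N times the baseline". The sketch below does that for the Faros figures as quoted in this article. The helper name and the dictionary are illustrative assumptions, not part of the Faros report.

```python
def multiplier(percent_increase: float) -> float:
    """Convert an 'up N%' change into a multiple of the baseline."""
    return 1 + percent_increase / 100


# Percent changes as quoted in this article (Faros AI Engineering Report 2026).
faros_changes = {
    "epics completed": 66.0,
    "task throughput": 33.7,
    "PR merge rate": 16.2,
    "bugs per developer": 54.0,
    "incidents per PR": 242.7,
    "median time in review": 441.5,
    "code churn": 861.0,
    "PRs merged without review": 31.3,
}

for metric, pct in faros_changes.items():
    print(f"{metric}: {multiplier(pct):.2f}x baseline")

# Metrics that more than doubled relative to baseline.
more_than_doubled = [m for m, pct in faros_changes.items() if multiplier(pct) > 2]
print(more_than_doubled)
```

None of the output metrics doubled. Three of the cost metrics did: incidents per PR, median time in review, and code churn.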

So where did the productivity go?

This is the core of the myth-bust. "More code" is not "more productivity." They are different things, and AI is brilliant at the first and indifferent to the second.

Think of your engineering org as a pipeline. AI dramatically widens the front — code generation. But the back of the pipe is human-paced: review, testing, debugging, incident response, maintenance. Flood the front with more output and the back becomes the bottleneck. The extra code piles up in review queues, surfaces as incidents, and comes back as rework.
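The pipeline picture can be made concrete with a toy model in which delivery is capped by the slowest stage. The stage names and weekly capacities below are hypothetical numbers chosen for illustration, not measurements from any of the studies cited here.

```python
def delivered_per_week(stage_capacity: dict[str, float]) -> float:
    """In a serial pipeline, throughput is capped by the slowest stage."""
    return min(stage_capacity.values())


# Hypothetical capacities, in changes per week (illustrative only).
before = {"generate": 10.0, "review": 12.0, "test": 12.0, "operate": 15.0}

# Assume AI triples code generation and leaves every other stage unchanged.
after = dict(before, generate=30.0)

delivered_before = delivered_per_week(before)
delivered_after = delivered_per_week(after)

# Work generated each week that the rest of the pipeline cannot absorb.
backlog_growth = after["generate"] - delivered_after

print(delivered_before)  # 10.0
print(delivered_after)   # 12.0
print(backlog_growth)    # 18.0
```

In this toy setup, tripling generation lifts delivery by only 20%, because review and testing become the cap, and 18 changes a week pile up behind them. The real numbers will differ, but the shape matches what the telemetry describes.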

The result: six independent studies (Faros, METR, DX, DORA, and others) converge on roughly a 10% system-level productivity gain despite near-universal adoption. Not the 10x everyone feels. About a tenth more, at the level that actually pays the bills.

TechCrunch captured the downstream cost in its May 2026 reporting: one startup founder estimated companies now spend around 44% of their AI tokens on fixing bugs their own AI generated. You are paying the AI to clean up after the AI.

The quality problem nobody priced in

The bugs aren't just more frequent. They're more dangerous.

Veracode found that 45% of AI-generated code samples introduced a security vulnerability. CodeRabbit's analysis of open-source pull requests found AI-produced code carried 1.7x more problems than human-written code, and other measures put the security-vulnerability multiple as high as 2.74x. Meanwhile, Stack Overflow's 2025 survey found 66% of developers cite "AI solutions that are almost right, but not quite" as their single biggest frustration — the exact failure mode that costs the most, because subtle wrong is harder to catch than obviously broken.

Researchers at Singapore Management University warned in April 2026 that AI-generated code introduces long-term maintenance costs into real projects — debt you take on today and pay down for years. One programmer quoted by TechCrunch called the trade "permanent indenture": short-term speed bought with long-term burden.

The contrarian take you should actually hold

Here is where most hot takes go wrong, so let's not. The answer is not "AI coding tools are bad, stop using them." The data doesn't support that, and neither do I.

AI coding tools are genuinely useful for specific jobs: boilerplate, well-specified functions, test scaffolding, learning an unfamiliar API, exploring a new codebase. The problem is the story we tell about them — that they make you uniformly, dramatically faster — and the behavior that story produces: shipping AI output with less scrutiny, not more.

The teams that get the ~10% gain without the bug explosion do three things differently.

They treat AI output like a junior developer's pull request. Useful, fast, and absolutely not trusted by default. Every line gets reviewed as if a first-week hire wrote it, because functionally one did.

They measure system outcomes, not individual speed. Lines of code and PR count are vanity metrics now — AI inflates them trivially. Lead time, incident rate, and change-failure rate are what actually moved, and usually not in the direction people expected.

They invest in the back of the pipeline. If AI widens code generation, you have to widen review and testing to match. That's where tools like an AI-powered code reviewer earn their place — not generating more code, but catching what the generators got wrong. Our roundup of the best AI coding tools separates the generators from the guardrails, and the dedicated reviewer Qodo exists precisely because someone has to catch the bugs the AI writes.
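Two of the system outcomes named above, lead time and change-failure rate, can be computed from deployment records. The sketch below assumes a hypothetical Deployment record with a commit time, a deploy time, and an incident flag. Your own tooling will expose this data differently, and the sample values are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deployment:
    """Hypothetical record; field names are assumptions for this sketch."""
    committed_at: datetime
    deployed_at: datetime
    caused_incident: bool


def change_failure_rate(deploys: list[Deployment]) -> float:
    """Share of deployments that led to a production incident."""
    if not deploys:
        raise ValueError("need at least one deployment")
    return sum(d.caused_incident for d in deploys) / len(deploys)


def median_lead_time(deploys: list[Deployment]) -> timedelta:
    """Median time from commit to running in production."""
    if not deploys:
        raise ValueError("need at least one deployment")
    return median(d.deployed_at - d.committed_at for d in deploys)


# Made-up sample: four deployments with lead times of 2, 6, 10, and 30 hours.
start = datetime(2026, 6, 1, 9, 0)
sample = [
    Deployment(start, start + timedelta(hours=2), False),
    Deployment(start, start + timedelta(hours=6), False),
    Deployment(start, start + timedelta(hours=10), True),
    Deployment(start, start + timedelta(hours=30), False),
]

print(change_failure_rate(sample))  # 0.25
print(median_lead_time(sample))     # 8:00:00
```

Neither number goes up just because more code was generated, which is the point of tracking them instead of PR counts.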

What this means for your job

If you're a developer, the takeaway isn't guilt. It's awareness. The slowdown is invisible from the inside, so trust the system metrics over the feeling. Use AI where it's strong, review its output where it's weak, and stop measuring your week in lines shipped.

If you run a team, the warning is sharper. Uber and Amazon are not fringe cases — they're the leading edge of a pattern thousands of teams are about to hit. Budgets get torched, leaderboards get gamed, and the dashboard says "more PRs!" while production gets shakier. Watch incidents and lead time, not throughput.

And if you're evaluating whether to go all-in on autonomous coding agents, go in with eyes open. Open-source agents like OpenHands can genuinely out-produce a human on the right task — which means they can also out-bug one. The tooling to keep AI coding sane, from context-continuity systems to dedicated reviewers, is now its own fast-growing category for exactly this reason.

The myth was that AI coding tools make developers faster. The truth, from 22,000 developers and two years of data, is that they make developers produce more — and producing more is only the same thing as being faster if your system can absorb it. Most can't. Yet.

Frequently Asked Questions

Do AI coding tools actually make developers faster?

Not the way most people believe. Individual output (PRs, lines of code, epics) rises sharply, but the Faros AI Engineering Report 2026 found bugs per developer up 54%, incidents per PR up 242.7%, and review time up 441.5%.

What did the METR study find about AI coding tools?

METR ran a randomized controlled trial with experienced developers on 246 real tasks. Participants believed AI made them about 20% faster, but they were measured to be 19% slower — a 39-point gap between perception and reality.

Why do developers feel faster with AI if the data says otherwise?

The AI does visible, instant work — generating code that fills the screen — which feels like progress. The time lost reviewing output, fixing subtle errors, and re-prompting doesn't register as friction.

Should I stop using AI coding tools because of this?

No. The data argues for using them more carefully, not abandoning them. AI is strong for boilerplate, test scaffolding, and exploring unfamiliar code.

How much do the bugs from AI-generated code actually cost?

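Veracode found 45% of AI-generated code samples introduced a security vulnerability, and CodeRabbit found AI code carried 1.7x more problems than human code. On direct spend, one startup founder quoted by TechCrunch estimated that around 44% of AI tokens go to fixing bugs the AI itself generated.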

Key Takeaways

  • The Faros AI Engineering Report 2026 (22,000 developers, 4,000+ teams, two years of telemetry) found bugs per developer up 54%, median time in PR review up 441.5%, incidents per PR up 242.7%, and code churn up 861%.
  • METR's controlled trial found experienced developers were 19% slower with AI tools while believing they were 20% faster, a 39-point gap between perception and measured reality. A February 2026 update revised the slowdown to around 4%.
  • Six independent studies converge on roughly a 10% system-level productivity gain despite 90%+ adoption of AI coding tools.
  • Output rises (more PRs, more lines, more epics) but bugs, review time, incidents, and code churn rise faster, so the net system-level gain stays small.
  • The fix is not abandoning AI but treating its output as a junior developer's: review hard, test harder, and measure system outcomes instead of individual speed.
Skila AI Editorial Team

The Skila AI editorial team researches and writes original content covering AI tools, model releases, open-source developments, and industry analysis. Our goal is to cut through the noise and give developers, product teams, and AI enthusiasts accurate, timely, and actionable information about the fast-moving AI ecosystem.

