
Claude Found 22 Firefox Bugs in 14 Days. 14 Were High-Severity. Here's the Exploit.

March 7, 2026
8 min read
Claude Opus 4.6 found 22 vulnerabilities in Firefox in 14 days — 14 high-severity — then Anthropic spent $4,000 trying to exploit them. The results reshape AI security research.

Twenty minutes. That's how long it took Claude Opus 4.6 to find its first Use After Free vulnerability in Firefox's JavaScript engine — a bug type that lets attackers overwrite memory with arbitrary malicious content. Over the next two weeks, the model would uncover 21 more.

Anthropic's Frontier Red Team partnered with Mozilla in February 2026 to stress-test Firefox's codebase with AI-assisted security analysis. The results shook both organizations: 22 confirmed vulnerabilities, 14 classified as high-severity by Mozilla, and 112 total unique bug reports across nearly 6,000 C++ files. Those 14 high-severity findings represented nearly one-fifth of all high-severity Firefox vulnerabilities remediated in 2025.

But here's the part that should make every software team pay attention: Claude didn't just find the bugs. It wrote the patches. And then Anthropic spent $4,000 trying to make it exploit them — and mostly failed.

The 20-Minute Discovery That Started Everything

Anthropic's security evaluation began in late 2025 with CyberGym benchmark testing before shifting to real-world vulnerability hunting against Firefox's current release. The team pointed Claude Opus 4.6 at Firefox's SpiderMonkey JavaScript engine first — the same codebase that Mozilla has subjected to decades of fuzzing, static analysis, and manual security review.

Within 20 minutes, Claude identified a Use After Free memory vulnerability. The model then expanded its search across the rest of the browser codebase, ultimately scanning nearly 6,000 C++ files over two weeks.

What makes this significant isn't just the volume. Mozilla's engineers confirmed that Claude found logic errors that traditional fuzzers had never uncovered. Fuzzers excel at finding crashes and assertion failures through random input mutation. But logic errors — where code executes without crashing but does the wrong thing — require the kind of semantic understanding that AI models bring to the table.

Each bug report included minimal test cases, detailed proofs-of-concept, and candidate patches — all written and validated by Claude. Mozilla's security team called the submission quality exceptional, noting they could verify and reproduce issues within hours of receiving reports.

Inside CVE-2026-2796: The JIT Miscompilation Exploit

Anthropic published a full technical deep-dive on one vulnerability that reveals just how sophisticated AI-driven security research has become. CVE-2026-2796 is a JIT miscompilation bug in Firefox's WebAssembly component — and the exploitation chain Claude constructed reads like a graduate-level systems security paper.

The root cause sits in MaybeOptimizeFunctionCallBind(), a SpiderMonkey optimization function. When JavaScript instantiates a WebAssembly module with imports, this function unwraps Function.prototype.call.bind() wrappers to extract the underlying target function. The critical flaw: it validates the wrapper pattern but never checks that the extracted function's type signature matches the module's declared import type.

This creates a type confusion vulnerability. Two code paths access the import's callable field — one safe (routing through ToJSValue conversion) and one unsafe (returning WebAssembly function references directly via ref.func and call_ref, bypassing the interop layer entirely). When a WebAssembly function calls the imported reference through the unsafe path, parameter bytes pass between modules without conversion.

Claude decomposed the exploitation into three classical browser exploit primitives:

Phase 1 — Information Leakage: Claude created an addrof primitive (pass JavaScript object, receive as i64, leak memory address) and a fakeobj primitive (pass i64 address, receive as object reference, forge fake object). The model immediately recognized it needed i64 types for 64-bit pointer leaks.

Phase 2 — Memory Access: Rather than attempting direct ArrayBuffer corruption (which requires a prior arbitrary write — a chicken-and-egg problem), Claude pivoted to WebAssembly GC structs. It leveraged struct.get operations, which compile to machine-level memory loads at fixed offsets, turning them into read64 and write64 primitives.

Phase 3 — Code Execution: With read/write primitives operational, Claude constructed a fake ArrayBuffer object and corrupted its backing store pointer to achieve arbitrary memory access. The entire exploit chain uses only standard JavaScript and WebAssembly APIs — no heap spray or layout manipulation required.

The $4,000 Question: Can AI Exploit What It Finds?

Here's where the story takes a reassuring turn — for now. Anthropic ran several hundred exploitation attempts against the discovered vulnerabilities, burning approximately $4,000 in API credits. The success rate was dismal: just two exploits succeeded.

Even those two successes came with a massive caveat: they only worked in test environments with Firefox's security features deliberately disabled. The browser's sandbox architecture would have blocked both attacks in any real-world deployment.

As Anthropic's team noted, exploiting vulnerabilities requires chaining multiple weaknesses together. Finding a single high-severity bug isn't enough to compromise a modern browser — you need a sandbox escape, a privilege escalation, and a reliable exploit primitive, all working in concert.

This asymmetry — AI finds bugs far better than it exploits them — currently favors defenders. But Anthropic was candid that this advantage "may not persist as models improve."

Mozilla's Response: Fixes Shipped to Hundreds of Millions

Mozilla's security team moved fast. Engineers began landing fixes within hours of receiving Anthropic's reports, and most vulnerabilities were patched in Firefox 148.0, which shipped in February 2026 to hundreds of millions of users. A handful of remaining fixes are scheduled for the next release.

Mozilla called the collaboration "a powerful new addition in security engineers' toolbox" and has already started integrating AI-assisted analysis into its internal security workflows. The key insight from Mozilla's side: despite decades of extensive fuzzing, static analysis, and regular security review, AI found what traditional tools missed.

The Firefox team particularly valued three qualities in Claude's submissions: accompanying minimal test cases that made reproduction instant, detailed proofs-of-concept that showed the attack path, and candidate patches that were ready for review. This wasn't a model dumping raw findings — it was doing the full job of a senior security researcher.

500 Zero-Days and Counting: The Bigger Picture

The Firefox collaboration is part of a larger campaign. Anthropic disclosed that Claude found more than 500 previously unknown zero-day vulnerabilities across well-tested open-source software during Opus 4.6 testing; 112 of those reports went to Mozilla alone.

Anthropic also announced Claude Code Security, now in limited research preview, which packages vulnerability discovery and patching capabilities for developers. The company is expanding its cybersecurity efforts to include Linux kernel vulnerability research and actively recruiting for security-focused positions.

This signals a fundamental shift in how security research operates. The economics have changed: a two-week AI engagement found nearly one-fifth of a year's worth of high-severity bugs in one of the most scrutinized codebases on the internet. The API cost for the entire vulnerability discovery phase was a fraction of what a human security audit would cost.

For open-source projects running on volunteer time, AI security analysis could close the gap between the security standards they aspire to and the resources they actually have. For commercial software teams, the message is starker: if you're not using AI to audit your code, someone else will use AI to find what you missed.

What This Means for Your Codebase

The Firefox findings expose a blind spot in traditional security tooling. Fuzzers catch crashes. Static analyzers catch patterns. Neither catches the logic errors that Claude found — bugs where code runs perfectly but does the wrong thing.

If Firefox — with its decades of security investment, its dedicated fuzzing infrastructure, and its bounty program — had 14 high-severity logic errors lurking in production code, what's hiding in your codebase?

Anthropic's coordinated disclosure with Mozilla followed standard industry norms, but the company acknowledged that processes may need adjustment as model capabilities advance. The implicit message: the window where only well-resourced teams can do AI-assisted security research is closing fast.

Key Takeaways

  • Claude Opus 4.6 found 22 Firefox vulnerabilities in 14 days, with 14 classified as high-severity by Mozilla — nearly one-fifth of all high-severity Firefox bugs remediated in 2025.
  • The first vulnerability — a Use After Free in SpiderMonkey — was discovered in just 20 minutes of exploration across nearly 6,000 C++ files.
  • Claude found logic errors that decades of fuzzing, static analysis, and manual security review had missed — a class of bugs traditional tools are blind to.
  • Anthropic spent $4,000 on exploitation attempts but only succeeded twice, and both exploits required disabling Firefox's sandbox — defenders currently have the advantage.
  • CVE-2026-2796 involved a JIT miscompilation in WebAssembly that Claude exploited using only standard JavaScript and WebAssembly APIs, no heap spray needed.
  • Mozilla shipped fixes in Firefox 148 to hundreds of millions of users, with engineers landing patches within hours of receiving Claude's reports.
  • Anthropic disclosed 500+ zero-day vulnerabilities across open-source software during Opus 4.6 testing, with 112 reports going to Mozilla alone.

Skila AI Editorial Team

The Skila AI editorial team researches and writes original content covering AI tools, model releases, open-source developments, and industry analysis. Our goal is to cut through the noise and give developers, product teams, and AI enthusiasts accurate, timely, and actionable information about the fast-moving AI ecosystem.

