AI News for Developers
AI tutorials, coding tool reviews, and developer-focused analysis. Stay sharp with the latest AI developments for software engineering.
MuleRun Launched Yesterday With 210K Users in 10 Days. I Tested the 'Self-Evolving AI' Claims.
MuleRun launched March 18 with 210K users in 10 days. Each AI agent runs in its own VM with persistent memory and a three-tiered self-evolution engine. But do the claims hold up?
310,000 GitHub Stars in 60 Days: The Open-Source Project China Just Banned
OpenClaw went from 9,000 to 310,000+ GitHub stars in under two months — the fastest growth ever recorded. Then the Chinese government banned it from state enterprises. Here's what it actually does, and why governments are paying attention.
Perplexity Ditched MCP. The Numbers Behind the Breakup Are Brutal.
Perplexity's CTO just pulled the plug on MCP internally. Cloudflare found 81% context window waste. 43% of servers have injection flaws. The protocol Anthropic called 'USB-C for AI' is hitting a wall in production — and the companies saying so are getting louder.
TestSprite Turns AI Code From 42% Pass Rate to 93% — I Tested It Against Cypress and Playwright
TestSprite claims AI-generated code goes from 42% to 93% pass rate in one iteration. I tested that claim against Cypress and Playwright across three real projects.
NVIDIA Just Made Blackwell Look Like a Prototype. The Vera Rubin GPU at GTC 2026, Explained.
Jensen Huang just announced Vera Rubin at GTC 2026 — NVIDIA's next-generation GPU with 288GB HBM4 memory that makes Blackwell look like a stepping stone. Here's every announcement that actually matters for AI developers.
Windsurf Beat Cursor as the #1 AI IDE in 2026. I Tested Both for 3 Months to Find Out Why.
Cursor held the crown for two years. Then Windsurf's dramatic corporate saga — OpenAI acquisition collapse, Google's $2.4B talent raid, and a surprise buyout by Cognition — ended with a superior product. Here's the honest verdict after 3 months testing both.
Microsoft Taught a 15B Model to Decide When NOT to Think — and It's Beating Models 10x Its Size
Microsoft's Phi-4-reasoning-vision-15B knows when to activate reasoning and when to skip it entirely — and it's hitting 80-90% of frontier model accuracy on key benchmarks at a fraction of the compute cost.
Motion Review 2026: The AI Calendar That Schedules Your Day So You Don't Have To
Motion automatically builds your daily schedule from your task list, calendar, and deadlines — then rebuilds it in real time when anything changes. Here's whether the $19-34/month price tag is worth it.
Anthropic Cut Long-Context Pricing in Half. The Same Week Musk Admitted xAI 'Was Not Built Right.'
Claude Opus 4.6 and Sonnet 4.6 now support 1 million token context at flat pricing — eliminating the 100% surcharge that made long-context prohibitive. The same week, Musk publicly admitted xAI failed to compete with Claude Code and is rebuilding from scratch.
NVIDIA's NemoClaw Is the Enterprise AI Agent Platform OpenAI Doesn't Want to Exist
NVIDIA is launching NemoClaw at GTC 2026 — an open-source, hardware-agnostic enterprise AI agent platform that directly challenges OpenAI's growing control over autonomous AI deployments.
Gemini 3.1 Pro Hit 77% on ARC-AGI-2. Here's What That Number Means for AI Reasoning.
Google DeepMind's Gemini 3.1 Pro launched on February 19 with benchmark scores that quietly redefined the frontier: 77.1% on ARC-AGI-2, 94.3% on GPQA Diamond, and 80.6% on SWE-Bench Verified. With a 1M token context window and native multimodal reasoning, here's what those numbers actually mean.
GPT-5.4 Just Hit 83% on Pro-Level Work. In One Generation, OpenAI Added Native Computer Control.
OpenAI's GPT-5.4, released March 5, 2026, is the first general-purpose model with native computer-use built in — and it matches or exceeds professionals in 83% of real-work comparisons, up from 70.9% just one model generation ago.
Claude Opus 4.6 Cracked a 30-Year Math Problem That Stumped a Computing Legend — in One Hour
Donald Knuth — the man who literally wrote the book on algorithms — spent weeks on a graph theory conjecture. Claude Opus 4.6 solved it in 31 steps. Knuth's response? 'Shock! Shock!'
Enia Code Review: The Proactive AI That Fixes Your Code Before You Run It
Enia Code doesn't wait for you to ask. It finds bugs and refactors as you type — zero prompts. Here's the honest review after testing it.
Zencoder Review: The Mindful AI Coding Agent That Edits Your Whole Repo
Zencoder is an agentic IDE plugin with multi-model verification and multi-file edits. We tested it—here's what holds up.
Anthropic Hit $19B in Revenue. Claude Code Alone Makes $2.5B. OpenAI Should Be Terrified.
Anthropic's annualized revenue doubled from $9B to $19B in three months. Claude Code alone generates $2.5B. Epoch AI projects Anthropic could overtake OpenAI by August 2026.
Google Made Every Cloud Service an MCP Server. Here's What That Breaks Open for AI Agents.
Google launched fully managed MCP servers for BigQuery, Maps, Compute Engine, and GKE — plus Apigee integration that turns any existing API into an agent-accessible tool. Enterprise AI agents just got a master key to Google's cloud.
Oregon Just Gave You the Right to Sue AI Chatbots. $1,000 Per Violation.
Oregon passed SB 1546 with a 52-0 House vote — the most aggressive chatbot safety law in the US, with $1,000 per violation and a private right of action.
Jack Dorsey Fired 4,000 Workers for AI — Insiders Say It's a Lie
Jack Dorsey slashed 4,000 jobs at Block, claiming AI made them obsolete. But insiders say nobody at the company can explain how — and the $68 million Jay-Z party five months earlier tells a very different story.
A Calendar Invite Can Steal Your Passwords Through Perplexity's AI Browser. No Click Required.
Zero-click flaws in Perplexity Comet let attackers steal files and 1Password vaults via a calendar invite. No clicks needed. 120-day patch timeline.
AI2's OLMo Hybrid Trained in 6 Days on 512 GPUs — And It's Fully Open
AI2's OLMo Hybrid replaces 75% of attention with linear RNNs, matches transformer accuracy with half the training data, and was trained in just 6 days on 512 GPUs.
OpenAI Is Building a GitHub Killer. Microsoft Invested $14B in Them.
OpenAI is building a code repository to rival GitHub — owned by its $14B investor Microsoft. GitHub's 37 outages in February may have sealed its fate.
Samsung Galaxy S26 Orders Uber For You. The Agentic AI Phone Is Here.
Samsung's Galaxy S26 runs three AI systems — Gemini, Perplexity, and Bixby — that autonomously order food, book rides, and shop across apps. The agentic AI phone ships March 11.
GPT-5.4 Beat Humans at Using Computers. OpenAI's Pricing Math Should Terrify Anthropic.
GPT-5.4 scored 75% on computer use — beating humans at 72.4%. At half the price of Claude Opus 4.6, OpenAI just redrew the enterprise AI battlefield.
Claude Found 22 Firefox Bugs in 14 Days. 14 Were High-Severity. Here's the Exploit.
Claude Opus 4.6 found 22 vulnerabilities in Firefox in 14 days — 14 high-severity — then Anthropic spent $4,000 trying to exploit them. The results reshape AI security research.
Meta's Llama 4 Topped Every Benchmark. Then Yann LeCun Admitted They Fudged It.
Meta's Llama 4 hit #2 on benchmarks. Then Yann LeCun admitted they 'fudged' the results. The real model ranked 32nd. Here's why it still matters.
Pomelli Does What Canva Charges $120/Year For. Google Made It Free.
Google's Pomelli auto-detects your brand from your website and generates unlimited marketing content for free. It does what Canva charges $120/year for.
7 MCP Servers That Turn Claude Code Into a 10x Developer Tool
Seven MCP servers that turn Claude Code from a coding assistant into a full development platform. Setup guide with real workflows and exact install commands.
Google Antigravity Is Free, Multi-Agent, and Gunning for Cursor's Throne
Google's free agentic IDE runs multiple AI agents across editor, terminal, and browser. It beats Cursor on speed benchmarks — but has a data residency problem.
MiniMax M2.5 Matches Claude at 5% the Cost. The AI Pricing War Just Escalated.
MiniMax M2.5 scores 80.2% on SWE-Bench at 1/20th the cost of Claude Opus. A $30B Chinese AI startup just broke the price-performance curve.
PsychAdapter: How AI Models Learn Human Personality Traits
PsychAdapter adapts LLMs to reflect human personality traits with 96.7% accuracy using less than 0.1% extra parameters. Explore the research and implications.
Grok 4.20 Multi-Agent AI: How xAI's 4-Agent Architecture Changes Everything
xAI's Grok 4.20 deploys four specialized AI agents that debate each other before responding, cutting hallucinations by 65% and topping live trading benchmarks.
n8n Review: The Open-Source AI Workflow Automation Platform (2026)
n8n review: open-source AI workflow automation with 400+ integrations, free self-hosting, and LangChain-powered AI agents. See pricing, pros, cons.
Apple Siri AI Upgrade: $1B Gemini Deal Brings 1.2 Trillion Parameters to iPhone
Apple's $1B Gemini deal powers a rebuilt Siri with 1.2T parameters, on-screen awareness, and 1M token context. But iOS 26.4 beta delays raise questions.
OpenClaw: The Open-Source AI Agent That Went from 9K to 247K GitHub Stars in 60 Days
OpenClaw, the open-source personal AI agent created by Peter Steinberger, has shattered GitHub records -- surging from 9,000 to 247,000 stars in under four months. With its local-first architecture, heartbeat daemon for proactive autonomy, and 3,000+ community skills, it is redefining what personal AI assistants can do.
DeepSeek V4: China's Trillion-Parameter Multimodal AI Arrives
DeepSeek V4 brings a trillion-parameter multimodal AI model with text, image, and video generation — optimized for Chinese chips and released as open weights.