Grok Build Review: 48 Hours With xAI's $99 CLI

I gave xAI $99 and watched Grok Build spawn 8 AI agents inside my terminal. What I learned in 48 hours will save you $300 or convince you to switch from Claude Code today.

On May 14 2026, Elon Musk personally pushed xAI's Grok Build agentic coding CLI into a wider public beta and asked the X timeline for feedback. The official initial launch was earlier (May 8 2026, gated behind SuperGrok Heavy), but the May 14 push was the moment the wait-list cracked open and the $99/month intro price became real for anyone with a credit card.

I paid the $99. I ran it on a real production codebase — a Kafka consumer service, 47,000 lines of TypeScript, the same one I have been using Claude Code on for the past three months. 48 hours later, I have three findings that most reviews are missing.

Hour 0: The $99 Paywall and What It Actually Locks In

Install was clean. The CLI is a single binary, macOS and Linux supported natively, Windows requires WSL2. xAI says a native Win32 build is on the roadmap with no announced date — if you are a Windows developer who refuses to touch WSL, Grok Build is not yet for you.

The pricing screen is where most reviews stop and where the actual story starts. Headline price: $299/month SuperGrok Heavy. Intro price: $99/month for the first six months — a 67% discount that reads like a no-brainer.

Read the ToS. The $99 intro auto-reverts to $299 at month 7 unless you affirmatively cancel before the period ends. There is no in-product downgrade path to a cheaper plan that keeps Grok Build access. You either pay $299 starting month 7, or you cancel and lose the agent entirely. This is a SaaS pricing pattern most teams already know — it is the same trap that catches CFOs on annual contracts every December. Worth knowing before you swipe.

What you get for the money: 8 concurrent AI subagents, Plan Mode, Arena Mode, the grok-code-fast-1 model, and a 2-million-token context window. That is roughly four times the working context of Claude Code's standard context tier as of May 2026.

Hour 4: The First Plan Mode Catch

I gave Grok Build the same prompt I gave Claude Code last week: refactor the Kafka consumer to support batch acknowledgments without breaking the existing at-least-once delivery semantics.

Plan Mode kicked in first. Instead of executing immediately, Grok Build produced a structured plan with seven steps, file-level diffs previewed for each step, and explicit callouts where the changes could violate existing invariants. Step 4 flagged that the existing consumer's offset commit logic ran inside the same try-catch as the message handler — if I added batch acknowledgment without splitting those concerns, a failed handler would still commit the offset, silently dropping messages.

That is the same bug Claude Code 4.7 quietly introduced last week when I ran the equivalent task. Claude Code generated the diff, the unit tests passed (because they did not cover the partial-batch failure case), and the bug only surfaced during integration testing two days later.

Plan Mode is the reason I would actually pay for Grok Build. It is not faster than Claude Code's planning step. It is more honest. The plan calls out invariants and edge cases that Claude Code's plans summarize away. For senior engineers reviewing agent output on production code, that matters more than raw model intelligence.

Hour 14: 8 Subagents Collide On The Same File

Here is what most coverage gets wrong about the 8 concurrent subagents. They are not 8 independent workers parallelizing across 8 different files. They are 8 hypothesis-generators that all read the same plan and propose competing diffs for the same problem. Arena Mode then ranks them algorithmically and selects the optimized merge.

This is fundamentally different from Claude Code's serial-with-MCP-tool-calls model. Claude Code spawns one agent, runs it sequentially against the plan, and uses MCP tools to fan out. Grok Build runs eight parallel hypotheses against the same plan and picks the best diff.

The first time I saw it work, two subagents proposed contradictory changes to the same file. Subagent 3 wanted to extract the batch acknowledgment into a separate BatchAckHandler class. Subagent 7 wanted to keep it inline as a private method but split the offset commit into a deferred callback. Arena Mode ranked subagent 3's approach higher (better testability score, lower cyclomatic complexity), discarded subagent 7's diff, and merged the chosen path.

The trade-off: when Arena Mode picks wrong, you have one bad diff and seven discarded ones, which feels wasteful. When Arena Mode picks right, you have a measurably better diff than a serial agent would have produced, because the model effectively did a tournament-style search over the solution space before committing.

On the Kafka refactor task specifically, Grok Build completed the work in 12 minutes. Claude Code took 41 minutes on the same task the day before. Three times faster, with a cleaner architectural choice. That is a meaningful gap.

Hour 28: Reading the SuperGrok Heavy Fine Print

By hour 28 I was sold enough on Plan Mode and Arena Mode to look harder at the ToS. The relevant clauses:

The $99 intro is one-time per account. You cannot cancel, wait two months, and re-up at the intro price.
The auto-revert to $299 fires on the first billing date after month 6. There is no warning email mandated in the ToS — xAI sends one as a courtesy but is not contractually obligated to.
Cancellation immediately disables the CLI; there is no "finish your current month" runway.
The 2M-token context window applies to the grok-code-fast-1 model invocations made by Grok Build. If you call xAI's API directly on the same plan, you get the standard API context limits, not the Grok Build limits.

This is not predatory pricing — it is standard SaaS — but it is not the loss-leader entry pricing some launch coverage implied. Treat the $99 as a $1,794 six-month commitment if you actually use Grok Build daily, because the cost of getting kicked off the platform mid-project will pressure you into the $299/month rate by month 7.

Hour 42: The Kafka Refactor Verdict

By hour 42 I had completed the full Kafka consumer refactor, written 47 new unit tests, restructured the batch acknowledgment logic, and added a chaos-testing harness. End-to-end wall time on Grok Build: 4 hours 17 minutes of agent time across the 42-hour window. Same project on Claude Code last week: 11 hours 03 minutes.

The cost comparison gets interesting. Claude Code at $200/month (the Claude Code Premium tier) plus Anthropic API tokens consumed during agent runs totaled about $317 for the equivalent project. Grok Build at $99/month intro pricing flat, no per-token billing, totaled $99. If the intro pricing held forever, Grok Build would be a no-brainer.

But the intro pricing does not hold forever. At $299/month with the same workload, Grok Build cost would land around $299 versus Claude Code's $317 — basically the same. The decision then becomes a question of which model you trust more on architectural plans, and that is where Plan Mode keeps Grok Build in the conversation.

Hour 48: The Honest Verdict

Three categories of developer should pay the $99 right now.

Senior engineers reviewing agent output on production code. Plan Mode is genuinely better at calling out invariants and edge cases than any other agent I have tested. The architectural-mistake-catch ratio is high enough to justify the price.

Teams running parallel hypothesis exploration. Arena Mode's tournament-search over the solution space matters most on tasks with multiple defensible architectural paths. If your work is straightforward CRUD, you will not feel the difference.

Developers on Linux or macOS who already pay $200+/month for Claude Code. The marginal cost during the intro period is negative — you can dual-run Grok Build and Claude Code on the same task and compare outputs.

Three categories should wait.

Windows-native developers who refuse to use WSL2. Native Win32 is on the roadmap with no date. Wait for that release.

Solo developers on side projects. $99/month is a lot for side-project work. The free Kimi K2.6 Code Preview covers 80% of the same use cases.

Anyone who hates SaaS auto-revert clauses. Set a calendar reminder for month 5 if you sign up. The $299 wall is real.

The Bigger Picture: Three Coding Agents Shipped This Week

Grok Build is not the only coding-agent news from the May 14 window. OpenAI launched Codex mobile remote-control on the same day, letting you trigger Codex tasks from the ChatGPT iOS or Android app. Claude Opus 4.7 (which powers Claude Code) hit a fresh round of benchmark numbers earlier in May. And Claude Code is reportedly at a $2.5B annual run-rate and powers 4% of all GitHub commits as of April 2026.

The coding-agent market is consolidating into a three-way race: Anthropic's Claude Code, OpenAI's Codex, and now xAI's Grok Build. The architectural divergence is the real story. Claude Code bets on tool-calling and MCP integration. Codex bets on remote execution and mobile triggering. Grok Build bets on parallel hypothesis search with Arena Mode.

For the first time, the three top agents are using meaningfully different agent loops. That matters because we are about to find out which architecture wins on which task class. The earlier AI code editor ranking from March 2026 already feels stale — Grok Build did not exist when that piece shipped.

Should You Pay the $99?

If you are a senior engineer or staff engineer working on production code, and you already pay for Claude Code, yes — the dual-run cost is negligible and the Plan Mode delta justifies the experiment. Cancel before month 6 if Arena Mode does not pay off for your workload.

If you are anyone else, wait two weeks. The first real third-party benchmarks comparing Grok Build, Claude Code, and Codex on real-world SWE-Bench Verified tasks should drop by end of May. The pricing decision becomes much clearer with that data.

Frequently Asked Questions

What is Grok Build and when did xAI launch it?

Grok Build is xAI's agentic coding CLI — a terminal-based AI coding agent that runs on macOS and Linux natively (Windows via WSL2). The initial launch was May 8 2026 gated behind SuperGrok Heavy subscribers. Elon Musk personally pushed it into wider public beta on May 14 2026 by inviting feedback on X and opening the intro pricing tier to new signups.

How much does Grok Build cost — is the $99 deal real?

The $99/month intro price is real but only lasts six months. After month 6, the subscription auto-reverts to $299/month unless you affirmatively cancel. There is no in-product downgrade path that keeps Grok Build access at a cheaper tier — it is $299 or cancel. Treat the $99 as a six-month $1,794 commitment if you plan to use it daily.

How does Grok Build compare to Claude Code and Codex?

Architecturally they diverge in interesting ways. Claude Code uses a serial agent loop with heavy MCP tool integration. OpenAI Codex (which got mobile remote-control on May 14 2026) emphasizes remote execution. Grok Build runs up to 8 parallel hypothesis-generating subagents and uses Arena Mode to algorithmically rank competing diffs. On the Kafka refactor I tested, Grok Build was about 3x faster than Claude Code with a cleaner architectural choice.

What is Grok Build's Plan Mode and Arena Mode?

Plan Mode previews the full file-level diff plan before any change lands, with explicit callouts where the change could violate existing invariants. Arena Mode runs up to 8 subagents in parallel generating competing diffs for the same task, then ranks them algorithmically (testability, complexity, scope) and selects the optimized merge. Plan Mode is the safety layer; Arena Mode is the search layer.

Does Grok Build work on Windows?Yes, but only via WSL2 at launch. A native Win32 build is on xAI's roadmap with no announced date. If you refuse to use WSL2, wait for the native build before signing up.

Is the $99 introductory pricing locked in or does it auto-renew?

It auto-renews at $299/month on the first billing date after month 6. The $99 intro is one-time per account — you cannot cancel and re-up later to get it again. xAI sends a courtesy reminder email but is not contractually required to. Set a calendar reminder for month 5 if you sign up.

I Gave Elon $99 and Watched Grok Build Spawn 8 Agents in My Terminal

Hour 0: The $99 Paywall and What It Actually Locks In

Hour 4: The First Plan Mode Catch

Hour 14: 8 Subagents Collide On The Same File

Hour 28: Reading the SuperGrok Heavy Fine Print

Hour 42: The Kafka Refactor Verdict

Hour 48: The Honest Verdict

The Bigger Picture: Three Coding Agents Shipped This Week

Should You Pay the $99?

Related Reading on Skila AI

Frequently Asked Questions

What is Grok Build and when did xAI launch it?

How much does Grok Build cost — is the $99 deal real?

How does Grok Build compare to Claude Code and Codex?

What is Grok Build's Plan Mode and Arena Mode?

Does Grok Build work on Windows?Yes, but only via WSL2 at launch. A native Win32 build is on xAI's roadmap with no announced date. If you refuse to use WSL2, wait for the native build before signing up.

Is the $99 introductory pricing locked in or does it auto-renew?

Key Takeaways

Related Resources

AI Tools Directory

Open-Source Repositories

Weekly AI Digest

I Gave Elon $99 and Watched Grok Build Spawn 8 Agents in My Terminal

Hour 0: The $99 Paywall and What It Actually Locks In

Hour 4: The First Plan Mode Catch

Hour 14: 8 Subagents Collide On The Same File

Hour 28: Reading the SuperGrok Heavy Fine Print

Hour 42: The Kafka Refactor Verdict

Hour 48: The Honest Verdict

The Bigger Picture: Three Coding Agents Shipped This Week

Should You Pay the $99?

Related Reading on Skila AI

Frequently Asked Questions

What is Grok Build and when did xAI launch it?

How much does Grok Build cost — is the $99 deal real?

How does Grok Build compare to Claude Code and Codex?

What is Grok Build's Plan Mode and Arena Mode?

Does Grok Build work on Windows?Yes, but only via WSL2 at launch. A native Win32 build is on xAI's roadmap with no announced date. If you refuse to use WSL2, wait for the native build before signing up.

Is the $99 introductory pricing locked in or does it auto-renew?

You Might Also Like

Related AI Tools

Related Repositories

Related Agent Skills

Key Takeaways

Related Resources

AI Tools Directory

Open-Source Repositories

Weekly AI Digest