Project Mariner: Google's 83.5% WebVoyager Agent Ships

Everyone said Google was cooked on agents. OpenAI had the Responses API. Anthropic had computer use. Google had a 2024 research preview that nobody remembered.

Then on April 22, 2026, Sundar Pichai walked on stage at Cloud Next and shipped Project Mariner. 83.5% on WebVoyager. 10 concurrent browser tasks. Generally available today.

That number matters. WebVoyager is one of the most widely cited public benchmarks for autonomous web agents: it tests real websites, multi-step tasks, and error recovery. 83.5% puts Mariner ahead of every publicly reported score from OpenAI's Computer Use and Anthropic's computer-use tooling at comparable task difficulty.

And that is not even the headline.

What Project Mariner Actually Does

Mariner is a web-browsing agent built on Gemini 2.0 (up from the Gemini 1.5 research preview shown in late 2024). You give it a goal — "book the cheapest Tuesday flight from SFO to Tokyo, under $1,200, no Basic Economy" — and it opens a browser, navigates, clicks, types, and completes the task.

Three things set it apart:

It runs on Google's cloud, not your laptop. OpenAI's Computer Use drives your local browser. Anthropic's implementation does the same. Mariner spins up isolated VMs in Google Cloud. Your machine is free while the agent works. Your cookies are not exposed.
Ten tabs at once. You can dispatch 10 parallel tasks. One task can be comparing flights while a second drafts an email and a third scrapes three competitor websites. This is the first web agent where parallelism is a product feature, not a hack.
It's GA, not a waitlist. If you have a Gemini Enterprise subscription, you can use it right now.

The Benchmark Number Nobody Is Disputing

83.5% on WebVoyager. For context: the original Mariner research preview in December 2024 also scored around 83.5%, but on a prior version of the test set, and independent agent teams in 2025 reported numbers in the 60-75% range on updated splits. Google says the new GA build reaches 83.5% on the current public benchmark, so the headline figure matches the preview's, this time on the updated test set.

We will see independent replications over the next week. But two things are already clear from the Cloud Next keynote:

Bloomberg's Mark Bergen confirmed the number came straight from the public leaderboard run, not an internal eval.
The product demo showed Mariner completing a Kayak booking, a Workday expense submission, and a Salesforce lead-capture flow in parallel — all three finished without human intervention.

That is the kind of demo you do not ship unless the number is real.

Gemini Enterprise: The Rebrand That Actually Matters

Vertex AI is dead. Long live Gemini Enterprise Agent Platform.

This is not cosmetic. Google consolidated five products — Vertex AI, Agent Builder, Model Garden, Agentspace, and Workspace Studio — into one agent control plane. What shipped:

200+ models in the Model Garden, including Anthropic Claude Opus 4.6 and Sonnet 4.5 (yes, Google ships Anthropic models — the Anthropic investment is paying off differently than anyone predicted).
Managed MCP servers across every Google Cloud service. BigQuery, Spanner, GKE, Cloud Run, Firestore — every one of them now has a first-party MCP server that just works. No install. No token rotation. OAuth 2.1 baked in.
ADK 1.0 (Agent Development Kit) for building custom agents with Python or TypeScript.
A2A v1.0 (Agent-to-Agent protocol) — the official spec for agents to talk to each other across vendors. Salesforce, Workday, Box, and ServiceNow all adopted it at launch.
Workspace Studio no-code builder that lets non-engineers wire an agent to Gmail, Docs, Sheets, and Drive in minutes.

The Real Story: Partner Agents on Day One

Here is what people are going to miss.

Google shipped Mariner with pre-built partner agents from Box, Workday, Salesforce, and ServiceNow. These are not generic "here is an API" integrations. They are full agents that speak A2A v1.0 and can be invoked by Mariner as subroutines.

Translation: your enterprise stack is already wired. If your company runs Workday for HR, Salesforce for CRM, and ServiceNow for IT, Mariner can hand off tasks to those systems' agents without you writing a single line of glue code.

This is the kind of platform move Microsoft tried with Copilot Studio and failed to land, because Microsoft does not own the LLM. Google owns the LLM, the cloud, and now the agent protocol. That is why this ships differently.

What It Means for OpenAI and Anthropic

OpenAI's Responses API launched in March 2025. Anthropic's Claude Agent SDK shipped in September 2025. Both are excellent. Both assume you want to wire up tool-use yourself.

Gemini Enterprise is the first platform where the agent, the models, the MCP servers, the enterprise connectors, and the observability all ship together. For a Fortune 500 buyer, that is a dramatically shorter procurement cycle.

If you are a startup that built on OpenAI's agent tooling in 2025, nothing changes tomorrow. If you are a VP of Engineering at a 5,000-person company evaluating where to standardize AI agents in Q3 2026 — Google just became the easy answer.

That is the story. Not the benchmark. The easy answer.

Pricing That Is Actually Competitive

Google did not announce Mariner pricing line items at the keynote, but the Gemini Enterprise Agent Platform tiers were confirmed:

Standard: $30/user/month — includes Gemini 2.5 Flash, ADK, up to 100 agent runs/day.
Pro: $85/user/month — Gemini 2.5 Pro, Mariner access, 1,000 agent runs/day, all partner agents.
Enterprise: custom — unlimited Mariner tabs, managed MCP across every Google Cloud service, dedicated support.

For comparison, Anthropic's Claude for Enterprise starts at $60/user/month and does not include web-agent functionality. OpenAI's ChatGPT Enterprise is also $60/user/month, but its web-agent features are still metered separately.
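To put those list prices side by side, here is a rough back-of-the-envelope sketch. It uses only the per-user figures quoted above and assumes a flat seat count with no volume discounts and no metered add-ons, which real enterprise contracts rarely match. The plan labels and function names are illustrative.

```python
# List prices in USD per user per month, as quoted in this article.
PRICE_PER_USER_MONTH = {
    "Gemini Enterprise Standard": 30,
    "Gemini Enterprise Pro": 85,
    "Claude for Enterprise (starting price)": 60,
    "ChatGPT Enterprise": 60,
}


def annual_cost(price_per_user_month: int, seats: int) -> int:
    """Annual list cost in USD, ignoring discounts and metered add-ons."""
    return price_per_user_month * seats * 12


def cost_table(seats: int) -> dict[str, int]:
    """Annual list cost for every plan at a given seat count."""
    return {
        plan: annual_cost(price, seats)
        for plan, price in PRICE_PER_USER_MONTH.items()
    }


if __name__ == "__main__":
    for plan, cost in cost_table(1000).items():
        print(f"{plan:<40} ${cost:>10,}")
```

At 1,000 seats, Pro lists at $1,020,000 a year against $720,000 for either $60 plan. That $300,000 gap is what the bundled web agent has to earn back.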

The 10-Tab Parallelism Changes Agent Economics

Single-agent tools are cheap to build and expensive to scale. The bottleneck is always the same: one browser, one task, one result. If you need to compare 40 flights across 10 booking sites, you run 10 serial queries. That takes minutes.

Mariner's cloud-VM architecture flips that math. With ten concurrent tabs, a research task that took a human analyst 45 minutes can finish in about 4 to 5, provided it splits cleanly into ten independent lookups (45 / 10 = 4.5 minutes in the ideal case). A procurement specialist comparing SaaS vendors across five websites can run all five simultaneously. A recruiter sourcing candidates across LinkedIn, GitHub, AngelList, Wellfound, and Lever runs one query, not five.
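The serial-versus-parallel math can be sketched in a few lines of Python. This is not Mariner's API: `run_browser_task` is a hypothetical stand-in that sleeps instead of browsing, and the limit of 10 is simply the concurrency figure quoted in this article. The point is only to show why ten independent tasks finish in roughly the time of the slowest one once they run concurrently.

```python
import asyncio
import time

MAX_CONCURRENT_TASKS = 10  # the per-user limit quoted in this article


async def run_browser_task(name: str, seconds: float) -> str:
    """Hypothetical stand-in for one agent task; sleeps instead of browsing."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def run_serial(tasks):
    """One browser, one task at a time."""
    return [await run_browser_task(name, secs) for name, secs in tasks]


async def run_parallel(tasks, limit: int = MAX_CONCURRENT_TASKS):
    """Up to `limit` tasks in flight at once; results keep input order."""
    gate = asyncio.Semaphore(limit)

    async def guarded(name, secs):
        async with gate:
            return await run_browser_task(name, secs)

    return await asyncio.gather(*(guarded(name, secs) for name, secs in tasks))


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    sites = [(f"booking-site-{i}", 0.05) for i in range(10)]
    _, serial_s = timed(run_serial(sites))
    _, parallel_s = timed(run_parallel(sites))
    print(f"serial {serial_s:.2f}s, parallel {parallel_s:.2f}s")
```

The semaphore is what makes the cap explicit: submit 40 tasks and they run in waves of at most 10, so wall-clock time grows with the number of waves rather than the number of tasks.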

This is the kind of capability that justifies the $85/user/month Pro tier on a single use case. For ops-heavy teams, the payback is measured in days, not quarters.

Security Posture: The Quiet Enterprise Win

Cloud-VM execution also solves the security question that has haunted local web agents. When OpenAI's Computer Use drives your local browser, it has your cookies, your saved passwords, and your session tokens. Most corporate security teams block it for exactly that reason.

Mariner runs in an isolated Google Cloud VM with a fresh browser session per task. Credentials are injected through a managed secrets vault — never stored in the agent's memory. Audit logs capture every page visit, every click, every form submission. If your compliance team rejected computer-use agents in 2025 on data-exfiltration grounds, Mariner is the first architecture that answers their objections.
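The three properties described above (a fresh session per task, credentials read from a vault at use time, and an audit trail of every action) can be illustrated with a small sketch. Everything here is hypothetical: `TaskSession`, `AuditEvent`, and `type_into_page` are made-up names, the vault is a plain dictionary, and nothing reflects Google's actual implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import uuid


@dataclass(frozen=True)
class AuditEvent:
    session_id: str
    action: str   # "visit", "click", "fill", ...
    target: str   # URL or form field name, never a secret value
    at: datetime


def type_into_page(field_name: str, value: str) -> None:
    """Hypothetical stand-in for the browser driver; does nothing here."""


@dataclass
class TaskSession:
    """One task, one session: new ID, empty log, no carried-over state."""
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    audit_log: list[AuditEvent] = field(default_factory=list)

    def record(self, action: str, target: str) -> None:
        self.audit_log.append(
            AuditEvent(self.session_id, action, target, datetime.now(timezone.utc))
        )

    def visit(self, url: str) -> None:
        self.record("visit", url)

    def fill_credential(
        self, vault: dict[str, str], secret_name: str, field_name: str
    ) -> None:
        # The secret is looked up at use time and passed straight to the page.
        # It is never stored on the session or written to the audit log.
        type_into_page(field_name, vault[secret_name])
        self.record("fill", field_name)
```

The design choice worth noticing is what the log records: the field that was filled, not the value that went into it. A compliance reviewer can replay every step without the log itself becoming a credential store.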

That is the unsexy reason Fortune 500 CIOs will actually green-light this.

Related Resources

Tool listing: Rowboat — an AI work app building a knowledge graph across meetings, email, and notes (for teams that do not want full agent automation yet).
Repo: smolagents — Hugging Face's lightweight Python agent library, the counter-programming to corporate agent platforms.
MCP server: Basecamp MCP Server — 100+ tools exposing Basecamp 3 to Claude, Cursor, and Copilot via MCP v2.1.
Skills: Awesome Claude Skills (travisvn) — hand-curated skills list if you are sticking with Claude Code over Google's stack.

Verdict

Google was never behind on agents. It was behind on shipping agents. That changed on April 22.

If you run an enterprise stack on Google Workspace, you have no reason not to evaluate Mariner this quarter. If you built on OpenAI or Anthropic in 2025, the switching cost just dropped — not to zero, but low enough that a 1,000-seat customer will ask their procurement team to run the numbers.

The 83.5% number is real. The 10-tab parallelism is real. The managed MCP servers across every GCP service are the quiet kill shot.

OpenAI and Anthropic will respond. But for the next 90 days, Google owns the enterprise agent narrative.

Frequently Asked Questions

What is Project Mariner?

Project Mariner is Google's autonomous web-browsing AI agent, shipped GA at Cloud Next 2026 on April 22. It runs on Gemini 2.0, scores 83.5% on the WebVoyager benchmark, and can execute up to 10 concurrent browser tasks on Google Cloud VMs.

How does Project Mariner compare to OpenAI's Computer Use and Anthropic's Claude Computer Use?

Mariner runs on Google's cloud VMs instead of your local machine, supports 10 parallel tabs, and reports a higher WebVoyager score (83.5%) than publicly disclosed numbers from OpenAI or Anthropic agents. It is also GA, while Anthropic's computer-use feature remains in beta.

How much does Gemini Enterprise Agent Platform cost?

Standard is $30/user/month, Pro is $85/user/month (includes Mariner access and 1,000 agent runs/day), and Enterprise pricing is custom. The Pro tier is the cheapest way to get full Mariner functionality.

What is the A2A protocol and why does it matter?

A2A v1.0 (Agent-to-Agent) is the open protocol Google shipped alongside Mariner so agents from different vendors can call each other. Salesforce, Workday, Box, and ServiceNow adopted it at launch — which is why Mariner can hand off tasks to partner agents without custom glue code.

Is Project Mariner worth switching from OpenAI or Anthropic agents in 2026?

For Fortune 500 enterprises already on Google Workspace, yes — the managed MCP servers across GCP, pre-built partner agents, and integrated billing make procurement dramatically simpler. For startups happy with OpenAI Responses API or Claude Agent SDK, there is no urgent reason to migrate.
