
Google Just Shipped a Web Agent That Runs 10 Tabs at Once. It Beat OpenAI's Score.

April 23, 2026
Google was supposed to be behind on agents. Then Sundar Pichai walked on stage at Cloud Next 2026 and shipped a web-browsing agent that scores 83.5% on WebVoyager and handles 10 concurrent tabs. Here is what actually changed.

Everyone said Google was cooked on agents. OpenAI had the Responses API. Anthropic had computer use. Google had a 2024 research preview that nobody remembered.

Then on April 22, 2026, Sundar Pichai walked on stage at Cloud Next and shipped Project Mariner. 83.5% on WebVoyager. 10 concurrent browser tasks. Generally available today.

That number matters. WebVoyager is the hardest public benchmark for autonomous web agents — it tests real websites, multi-step tasks, and error recovery. 83.5% puts Mariner ahead of every publicly reported score from OpenAI's Computer Use and Anthropic's computer-use tooling at comparable task difficulty.

And that is not even the headline.

What Project Mariner Actually Does

Mariner is a web-browsing agent built on Gemini 2.0, the same model family the research preview shown in late 2024 was built with. You give it a goal ("book the cheapest Tuesday flight from SFO to Tokyo, under $1,200, no Basic Economy") and it opens a browser, navigates, clicks, types, and completes the task.

Three things set it apart:

  1. It runs on Google's cloud, not your laptop. OpenAI's Computer Use drives your local browser. Anthropic's implementation does the same. Mariner spins up isolated VMs in Google Cloud. Your machine is free while the agent works. Your cookies are not exposed.
  2. Ten tabs at once. You can dispatch 10 parallel tasks. One Mariner instance can be comparing flights while another drafts an email while another scrapes three competitor websites. This is the first web agent where parallelism is a product feature, not a hack.
  3. It's GA, not a waitlist. If you have a Gemini Enterprise subscription, you can use it right now.
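
To make the second point concrete, here is a minimal sketch of what a 10-task cap looks like from the caller's side. It is not Mariner's API: `run_browser_task` is a hypothetical stand-in that only sleeps, and the dispatcher is plain `asyncio` with a semaphore. The only number taken from the article is the limit of 10.

```python
import asyncio

MAX_CONCURRENT_TASKS = 10  # the concurrency limit the article cites


async def run_browser_task(goal: str) -> str:
    """Hypothetical stand-in for one agent run; a real client would call the service."""
    await asyncio.sleep(0.01)  # simulate remote work
    return f"done: {goal}"


async def dispatch(goals: list[str], limit: int = MAX_CONCURRENT_TASKS) -> tuple[list[str], int]:
    """Run every goal, never more than `limit` at once. Returns results and peak concurrency."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(goal: str) -> str:
        nonlocal in_flight, peak
        async with semaphore:  # blocks while `limit` tasks are already running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await run_browser_task(goal)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(g) for g in goals))
    return list(results), peak


goals = [f"task {i}" for i in range(25)]
results, peak = asyncio.run(dispatch(goals))
print(len(results), "tasks finished, peak concurrency", peak)
```

Queue 25 goals and the 11th waits until one of the first ten finishes, which is the behavior you would expect from any service with a hard per-account tab limit.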

The Benchmark Number Nobody Is Disputing

83.5% on WebVoyager. For context: the original Mariner research preview in December 2024 reported roughly the same figure on a prior version of the test set, and independent agent teams in 2025 reported numbers in the 60-75% range on updated splits. Google claims the new GA build holds that score on the current public benchmark.

We will see independent replications over the next week. But two things are already clear from the Cloud Next keynote:

  • Bloomberg's Mark Bergen confirmed the number came straight from the public leaderboard run, not an internal eval.
  • The product demo showed Mariner completing a Kayak booking, a Workday expense submission, and a Salesforce lead-capture flow in parallel — all three finished without human intervention.

That is the kind of demo you do not ship unless the number is real.

Gemini Enterprise: The Rebrand That Actually Matters

Vertex AI is dead. Long live Gemini Enterprise Agent Platform.

This is not cosmetic. Google consolidated five products — Vertex AI, Agent Builder, Model Garden, Agentspace, and Workspace Studio — into one agent control plane. What shipped:

  • 200+ models in the Model Garden, including Anthropic Claude Opus 4.6 and Sonnet 4.5 (yes, Google ships Anthropic models — the Anthropic investment is paying off differently than anyone predicted).
  • Managed MCP servers across every Google Cloud service. BigQuery, Spanner, GKE, Cloud Run, Firestore — every one of them now has a first-party MCP server that just works. No install. No token rotation. OAuth 2.1 baked in.
  • ADK 1.0 (Agent Development Kit) for building custom agents with Python or TypeScript.
  • A2A v1.0 (Agent-to-Agent protocol) — the official spec for agents to talk to each other across vendors. Salesforce, Workday, Box, and ServiceNow all adopted it at launch.
  • Workspace Studio no-code builder that lets non-engineers wire an agent to Gmail, Docs, Sheets, and Drive in minutes.

The Real Story: Partner Agents on Day One

Here is what people are going to miss.

Google shipped Mariner with pre-built partner agents from Box, Workday, Salesforce, and ServiceNow. These are not generic "here is an API" integrations. They are full agents that speak A2A v1.0 and can be invoked by Mariner as subroutines.

Translation: your enterprise stack is already wired. If your company runs Workday for HR, Salesforce for CRM, and ServiceNow for IT, Mariner can hand off tasks to those systems' agents without you writing a single line of glue code.
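
The hand-off idea can be pictured as a lookup from a business domain to the agent that owns it. The sketch below is purely illustrative: the registry, the function names, and the string-in, string-out interface are all assumptions made for this example, and it says nothing about the actual A2A wire format.

```python
from typing import Callable

# Hypothetical interface: a partner agent takes a task description and returns a result.
PartnerAgent = Callable[[str], str]


def workday_agent(task: str) -> str:
    return f"[workday] handled: {task}"


def salesforce_agent(task: str) -> str:
    return f"[salesforce] handled: {task}"


def servicenow_agent(task: str) -> str:
    return f"[servicenow] handled: {task}"


# Hypothetical registry mapping a domain to the partner agent that owns it.
REGISTRY: dict[str, PartnerAgent] = {
    "hr": workday_agent,
    "crm": salesforce_agent,
    "it": servicenow_agent,
}


def hand_off(domain: str, task: str) -> str:
    """Route a subtask to the partner agent for `domain`, or fail loudly."""
    agent = REGISTRY.get(domain)
    if agent is None:
        raise LookupError(f"no partner agent registered for domain {domain!r}")
    return agent(task)


print(hand_off("hr", "submit PTO request"))
```

The design point is that the orchestrating agent only needs to know which domain a subtask belongs to. Everything past the lookup is the partner's responsibility.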

This is the kind of platform move Microsoft tried with Copilot Studio and failed to land, because Microsoft does not own the LLM. Google owns the LLM, the cloud, and now the agent protocol. That is why this ships differently.

What It Means for OpenAI and Anthropic

OpenAI's Responses API launched in March 2025. Anthropic's Claude Agent SDK shipped in September 2025. Both are excellent. Both assume you want to wire up tool-use yourself.

Gemini Enterprise is the first platform where the agent, the models, the MCP servers, the enterprise connectors, and the observability all ship together. For a Fortune 500 buyer, that is a dramatically shorter procurement cycle.

If you are a startup that built on OpenAI's agent tooling in 2025, nothing changes tomorrow. If you are a VP of Engineering at a 5,000-person company evaluating where to standardize AI agents in Q3 2026 — Google just became the easy answer.

That is the story. Not the benchmark. The easy answer.

Pricing That Is Actually Competitive

Google did not announce Mariner pricing line items at the keynote, but the Gemini Enterprise Agent Platform tiers were confirmed:

  • Standard: $30/user/month — includes Gemini 2.5 Flash, ADK, up to 100 agent runs/day.
  • Pro: $85/user/month — Gemini 2.5 Pro, Mariner access, 1,000 agent runs/day, all partner agents.
  • Enterprise: custom — unlimited Mariner tabs, managed MCP across every Google Cloud service, dedicated support.

For comparison, Anthropic's Claude for Enterprise starts at $60/user/month and does not include web-agent functionality. OpenAI's ChatGPT Enterprise is also $60/user/month, but web-agent features are still metered separately.
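
To put those per-seat prices in procurement terms, here is a quick back-of-the-envelope calculation. It uses only the list prices quoted above and assumes a 1,000-seat deployment billed for 12 months with no volume discount. The seat count and the no-discount assumption are illustrative, and the separately metered web-agent charges are not modeled.

```python
# List prices per user per month, as quoted in this article.
MONTHLY_PRICE_PER_USER = {
    "Gemini Enterprise Standard": 30,
    "Gemini Enterprise Pro": 85,
    "Claude for Enterprise": 60,
    "ChatGPT Enterprise": 60,
}

SEATS = 1_000  # assumed deployment size for illustration


def annual_cost(price_per_user_month: int, seats: int) -> int:
    """Annual list cost in dollars, assuming 12 months and no discounts."""
    return price_per_user_month * seats * 12


annual = {plan: annual_cost(price, SEATS) for plan, price in MONTHLY_PRICE_PER_USER.items()}
pro_premium = annual["Gemini Enterprise Pro"] - annual["Claude for Enterprise"]

for plan, cost in annual.items():
    print(f"{plan}: ${cost:,}/year")
print(f"Pro premium over a $60 plan: ${pro_premium:,}/year")
```

At these numbers the Pro tier costs $1,020,000 a year against $720,000 for a $60 plan, so the question for a buyer is whether bundled Mariner access is worth a $300,000 annual premium.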

The 10-Tab Parallelism Changes Agent Economics

Single-agent tools are cheap to build and expensive to scale. The bottleneck is always the same: one browser, one task, one result. If you need to compare 40 flights across 10 booking sites, you run 10 serial queries. That takes minutes.

Mariner's cloud-VM architecture flips that math. Ten concurrent tabs means a research task that took a human analyst 45 minutes can finish in roughly 4 to 5, provided the work splits evenly across tabs. A procurement specialist comparing SaaS vendors across five websites can run all five simultaneously. A recruiter sourcing candidates across LinkedIn, GitHub, AngelList, Wellfound, and Lever runs one query, not five.

This is the kind of capability that justifies the $85/user/month Pro tier on a single use case. For ops-heavy teams, the payback is measured in days, not quarters.
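
The arithmetic behind the 45-minute example is easy to check. The sketch below assumes an idealized workload: equal-length subtasks that are fully independent, with no startup or merge overhead. The 4.5-minute subtask length is an assumption chosen so that ten subtasks add up to the 45 minutes quoted above.

```python
import math


def wall_clock_minutes(tasks: int, minutes_per_task: float, concurrency: int) -> float:
    """Idealized wall-clock time: tasks run in waves of at most `concurrency`."""
    waves = math.ceil(tasks / concurrency)
    return waves * minutes_per_task


serial = wall_clock_minutes(tasks=10, minutes_per_task=4.5, concurrency=1)
parallel = wall_clock_minutes(tasks=10, minutes_per_task=4.5, concurrency=10)
four_waves = wall_clock_minutes(tasks=40, minutes_per_task=4.5, concurrency=10)

print(serial)      # 45.0
print(parallel)    # 4.5
print(four_waves)  # 18.0
```

The third line is the caveat: once a job has more subtasks than tabs, the speedup is capped at 10x and the work runs in waves. Real tasks of uneven length will land somewhere short of the ideal.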

Security Posture: The Quiet Enterprise Win

Cloud-VM execution also solves the security question that has haunted local web agents. When OpenAI's Computer Use drives your local browser, it has your cookies, your saved passwords, and your session tokens. Most corporate security teams block it for exactly that reason.

Mariner runs in an isolated Google Cloud VM with a fresh browser session per task. Credentials are injected through a managed secrets vault — never stored in the agent's memory. Audit logs capture every page visit, every click, every form submission. If your compliance team rejected computer-use agents in 2025 on data-exfiltration grounds, Mariner is the first architecture that answers their objections.
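
A rough way to picture that design is the sketch below: a fresh session object per task, a secret looked up by name only at the moment it is needed, and an append-only audit trail that records the secret's name but never its value. Every class, field, and function name here is hypothetical. This illustrates the pattern described above, not Google's implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import uuid


@dataclass(frozen=True)
class AuditEvent:
    session_id: str
    action: str    # e.g. "visit", "click", "submit"
    target: str    # URL or element description, never a secret value
    at: datetime


@dataclass
class TaskSession:
    """Hypothetical per-task browser session: new ID, empty cookie jar, empty log."""
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    cookies: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def record(self, action: str, target: str) -> None:
        self.audit_log.append(
            AuditEvent(self.session_id, action, target, datetime.now(timezone.utc))
        )


def run_task(urls: list[str], vault: dict[str, str], secret_name: str) -> TaskSession:
    session = TaskSession()  # nothing carried over from earlier tasks
    for url in urls:
        session.record("visit", url)
    password = vault[secret_name]  # resolved at the point of use
    # A real agent would type `password` into the login form here.
    session.record("submit", f"login form (secret ref: {secret_name})")
    del password  # the value is never stored on the session
    return session


vault = {"expenses-login": "s3cr3t-value"}  # stand-in for a managed secrets vault
first = run_task(["https://example.com/login"], vault, "expenses-login")
second = run_task(["https://example.com/login"], vault, "expenses-login")
print(first.session_id != second.session_id, len(first.audit_log))
```

The property a compliance reviewer would look for is visible in the log: it shows what was done and which credential was referenced, without the credential itself ever appearing.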

That is the unsexy reason Fortune 500 CIOs will actually green-light this.

Related Resources

  • Tool listing: Rowboat — an AI work app building a knowledge graph across meetings, email, and notes (for teams that do not want full agent automation yet).
  • Repo: smolagents — Hugging Face's lightweight Python agent library, the counter-programming to corporate agent platforms.
  • MCP server: Basecamp MCP Server — 100+ tools exposing Basecamp 3 to Claude, Cursor, and Copilot via MCP v2.1.
  • Skills: Awesome Claude Skills (travisvn) — hand-curated skills list if you are sticking with Claude Code over Google's stack.

Verdict

Google was never behind on agents. It was behind on shipping agents. That changed on April 22.

If you run an enterprise stack on Google Workspace, you have no reason not to evaluate Mariner this quarter. If you built on OpenAI or Anthropic in 2025, the switching cost just dropped — not to zero, but low enough that a 1,000-seat customer will ask their procurement team to run the numbers.

The 83.5% number is real. The 10-tab parallelism is real. The managed MCP servers across every GCP service are the quiet kill shot.

OpenAI and Anthropic will respond. But for the next 90 days, Google owns the enterprise agent narrative.

Frequently Asked Questions

What is Project Mariner?

Project Mariner is Google's autonomous web-browsing AI agent, shipped GA at Cloud Next 2026 on April 22. It runs on Gemini 2.0, scores 83.5% on the WebVoyager benchmark, and can execute up to 10 concurrent browser tasks on Google Cloud VMs.

How does Project Mariner compare to OpenAI's Computer Use and Anthropic's Claude Computer Use?

Mariner runs on Google's cloud VMs instead of your local machine, supports 10 parallel tabs, and reports a higher WebVoyager score (83.5%) than publicly disclosed numbers from OpenAI or Anthropic agents. It is also GA, while Anthropic's computer-use feature remains in beta.

How much does Gemini Enterprise Agent Platform cost?

Standard is $30/user/month, Pro is $85/user/month (includes Mariner access and 1,000 agent runs/day), and Enterprise pricing is custom. The Pro tier is the cheapest way to get full Mariner functionality.

What is the A2A protocol and why does it matter?

A2A v1.0 (Agent-to-Agent) is the open protocol Google shipped alongside Mariner so agents from different vendors can call each other. Salesforce, Workday, Box, and ServiceNow adopted it at launch — which is why Mariner can hand off tasks to partner agents without custom glue code.

Is Project Mariner worth switching from OpenAI or Anthropic agents in 2026?

For Fortune 500 enterprises already on Google Workspace, yes — the managed MCP servers across GCP, pre-built partner agents, and integrated billing make procurement dramatically simpler. For startups happy with OpenAI Responses API or Claude Agent SDK, there is no urgent reason to migrate.

Key Takeaways

  • Project Mariner is now generally available and scores 83.5% on the WebVoyager benchmark, beating most OpenAI and Anthropic web-agent numbers reported so far.
  • Mariner can run up to 10 concurrent browser tasks on cloud-hosted VMs, not on your laptop — so your machine stays free while the agent works.
  • Gemini Enterprise Agent Platform replaces Vertex AI as the single control plane with 200+ models, managed MCP servers, ADK 1.0, and A2A v1.0.
  • Box, Workday, Salesforce, and ServiceNow shipped partner agents on day one — enterprise migration paths are already wired.
  • If you built on OpenAI's Responses API or Anthropic's agent tooling, Google just made the switching cost lower than you think.

Skila AI Editorial Team

The Skila AI editorial team researches and writes original content covering AI tools, model releases, open-source developments, and industry analysis. Our goal is to cut through the noise and give developers, product teams, and AI enthusiasts accurate, timely, and actionable information about the fast-moving AI ecosystem.
