New Google model to autonomously control browsers and mobile apps


Google DeepMind has introduced a new AI model capable of operating web and mobile interfaces. The Gemini 2.5 Computer Use model is now available in preview.

Developers can access it through the Gemini API. Built on Gemini 2.5 Pro, the model is designed to help agents interact directly with graphical user interfaces.

It works in a continuous loop: the system receives a screenshot of the environment, the user's request, and a record of past actions. From this, it generates UI actions like clicking, typing, or scrolling. After each action, a new screenshot is sent back to the model, and the process repeats.
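
The loop described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the Gemini API: `Action`, `run_agent_loop`, `propose_action`, `execute`, and `take_screenshot` are hypothetical names, and the model call is replaced by a scripted stand-in so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """A single UI action (hypothetical shape, not the real API schema)."""
    kind: str                      # e.g. "click", "type", "scroll", or "done"
    payload: dict = field(default_factory=dict)


def run_agent_loop(
    request: str,
    propose_action: Callable[[bytes, str, List[Action]], Action],
    execute: Callable[[Action], None],
    take_screenshot: Callable[[], bytes],
    max_steps: int = 20,
) -> List[Action]:
    """Screenshot -> model -> action -> new screenshot, until done or out of steps."""
    history: List[Action] = []
    screenshot = take_screenshot()
    for _ in range(max_steps):
        # The model sees the current screenshot, the request, and past actions.
        action = propose_action(screenshot, request, history)
        if action.kind == "done":
            break
        execute(action)
        history.append(action)
        # A fresh screenshot is captured after every executed action.
        screenshot = take_screenshot()
    return history


def demo() -> List[str]:
    """Run the loop with a scripted stand-in for the model."""
    scripted = iter([
        Action("click", {"x": 10, "y": 20}),
        Action("type", {"text": "hello"}),
        Action("done"),
    ])
    history = run_agent_loop(
        request="search for hello",
        propose_action=lambda shot, req, hist: next(scripted),
        execute=lambda action: None,        # a real client would drive a browser here
        take_screenshot=lambda: b"fake-png",
    )
    return [a.kind for a in history]
```

In a real integration, `propose_action` would wrap a call to the model and `execute` would drive a browser or device. The `max_steps` cap is an assumption added here so that a loop that never reaches a "done" state still terminates.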

Google says the model is primarily optimized for web browsers but can also handle mobile UI control. It is not yet intended for desktop operating system-level tasks.

According to Google, the model outperforms alternatives on benchmarks such as Online-Mind2Web, WebVoyager, and AndroidWorld. These results come from Google's internal tests and from evaluations by Browserbase. The model reportedly reaches over 70 percent accuracy with an average latency of about 225 seconds.

Safety mechanisms against misuse

Google identifies three main risks: intentional misuse by users, unexpected model behavior, and prompt injections on the web. The company says it has built safety features directly into the model.

A per-step safety service reviews every proposed action before execution. Developers can also use system instructions to require user confirmation or block specific high-stakes actions, such as bypassing CAPTCHAs or controlling medical devices.
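
On the client side, a developer could mirror this idea with a simple gate that runs before each action is executed. The sketch below is an assumption-laden illustration, not Google's safety service or its API: the function `gate_action`, the action names, and the two policy sets are all hypothetical.

```python
from typing import Callable, FrozenSet

# Hypothetical policy sets; a real deployment would define its own.
BLOCKED: FrozenSet[str] = frozenset({"bypass_captcha"})
NEEDS_CONFIRMATION: FrozenSet[str] = frozenset({"submit_payment"})


def gate_action(
    kind: str,
    confirm: Callable[[str], bool],
    blocked: FrozenSet[str] = BLOCKED,
    needs_confirmation: FrozenSet[str] = NEEDS_CONFIRMATION,
) -> str:
    """Decide what to do with a proposed action before it is executed.

    Returns "blocked", "declined", or "allowed".
    """
    if kind in blocked:
        # Never executed, regardless of what the user says.
        return "blocked"
    if kind in needs_confirmation:
        # Ask the user; only proceed on an explicit yes.
        return "allowed" if confirm(kind) else "declined"
    return "allowed"
```

For example, `gate_action("submit_payment", confirm=lambda kind: False)` returns `"declined"`, while an ordinary action such as `"scroll"` passes through as `"allowed"`. Checking the block list first means a confirmation prompt can never override a blocked action.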

Google is already using the model internally for UI testing, Project Mariner, the Firebase Testing Agent, and the AI Mode in Search. Gemini 2.5 Computer Use is available through Google AI Studio and Vertex AI, with a demo environment hosted by Browserbase.
