Google's Gemini 2.5 Flash upgrades AI image editing with better prompt accuracy

2 weeks ago 3

Google DeepMind is adding a new image editing model to the Gemini app that can make dramatic changes to photos on demand while keeping people and animals recognizable.

The new "Gemini 2.5 Flash Image Generation" model builds on Gemini's earlier native image generation tools but follows prompts much more precisely. Google says it often outperforms the GPT-4o model used in ChatGPT, especially at following text prompts for image edits. While many pure image models still struggle with prompt accuracy, Gemini 2.5 Flash gets it right more often.

A key feature is "character consistency": the model can keep a person, animal, or object visually consistent across multiple images, even as poses, backgrounds, or lighting change.

Gemini 2.5 Flash keeps characters consistent across new scenes. Whether it outperforms more complex fine-tuning remains to be seen. | Image: Google DeepMind

This opens up new possibilities for creating image series or product shots from multiple angles. Google says the model is ideal for generating consistent brand assets and product catalogs, and claims Gemini 2.5 Flash outperforms other image systems on a wide range of editing tasks.

Gemini 2.5 Flash outperforms previous models on several human-rated image editing benchmarks (Elo score). | Image: Google
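
For readers unfamiliar with Elo scores, the sketch below shows how a rating gap translates into an expected win rate in head-to-head human comparisons. It assumes the classic Elo formula with a 400-point scale; Google has not said exactly how its benchmark ratings are computed, and the function name and the ratings used here are hypothetical, not figures from the chart.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected share of head-to-head wins for A over B under classic Elo.

    Assumption: the standard logistic Elo formula with a 400-point scale.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # Hypothetical ratings, chosen only to illustrate the formula.
    model_a, model_b = 1200.0, 1100.0
    share = expected_score(model_a, model_b)
    print(f"A is expected to be preferred in {share:.1%} of comparisons")
```

Under this formula, equal ratings mean a 50 percent expected win rate, and a 400-point lead corresponds to being preferred in roughly ten out of eleven comparisons.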

The model also supports precise, localized edits through text prompts. Users can blur backgrounds, remove blemishes, add colors, or erase entire objects without manual selection. A template app called "PixShop" shows off these editing features with a simple interface and prompt controls.

PixShop demonstrates Gemini 2.5 Flash's text-based editing tools. | Image: Google DeepMind

Image composition, style transfer, and real-world reasoning

Gemini 2.5 Flash can blend up to three images at once. For example, you can combine a product photo and a room photo to create a realistic interior scene. Complex compositions with several elements can be generated from a single prompt. Google also offers an interactive canvas tool for multi-image fusion.

Gemini 2.5 Flash blends multiple images into one composition. | Image: Google DeepMind

The model handles style transfer too, moving patterns, colors, or textures from one object to another while keeping the shape and details intact. Typical examples include dresses with butterfly patterns or boots with floral textures.

Gemini 2.5 Flash applies patterns and styles across objects. | Image: Google DeepMind

Gemini 2.5 Flash can also visualize simple cause-and-effect, which Google calls "real-world reasoning." In one demo, the model generates an image of a balloon drifting toward a cactus, then another image showing what happens next.

The model can illustrate cause and effect, such as a balloon meeting a cactus. | Image: Google DeepMind

These semantic features draw on Gemini 2.5's world knowledge, Google says. You can try them yourself using a painting app that follows text instructions.

Available to users and developers

The Gemini 2.5 Flash image tools are now available in the Gemini app. Instead of selecting the "Imagen" image model in the chat bar, you need to switch to the "Flash" language model at the top left to use the new features. The setup might be a little confusing at first, but it makes sense given Gemini's language-based editing approach.

To use Gemini 2.5 Flash image editing, select the "Flash" language model in the Gemini app. | Image: Screenshot by THE DECODER

After picking the right model, you can upload an image and give Gemini editing instructions. Every image comes with both a visible watermark and an invisible SynthID digital watermark.

Gemini 2.5 Flash Image is also available in preview through the Gemini API, Google AI Studio, and Vertex AI. Pricing is $30 per million output tokens. Each image uses about 1,290 tokens, or roughly $0.039 per image, the same as Gemini 2.0 Flash Image.
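
The per-image figure follows directly from the token price. The sketch below reproduces the arithmetic using only the two numbers quoted above ($30 per million output tokens, about 1,290 tokens per image); the function names are illustrative, and the calculation ignores input tokens and any other charges, which the article does not specify.

```python
from decimal import Decimal

# Figures quoted in the article.
PRICE_PER_MILLION_OUTPUT_TOKENS_USD = Decimal("30")
TOKENS_PER_IMAGE = 1290  # approximate, per the article


def cost_per_image(tokens: int = TOKENS_PER_IMAGE) -> Decimal:
    """Output-token cost in USD for one generated image."""
    return Decimal(tokens) * PRICE_PER_MILLION_OUTPUT_TOKENS_USD / Decimal(1_000_000)


def batch_cost(num_images: int) -> Decimal:
    """Output-token cost in USD for a batch of images (hypothetical helper)."""
    return cost_per_image() * num_images


if __name__ == "__main__":
    print(f"Per image: ${cost_per_image()}")
    print(f"1,000 images: ${batch_cost(1000)}")
```

At these rates one image costs $0.0387, which rounds to the $0.039 quoted, and a batch of 1,000 images comes to about $38.70 in output tokens.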
