Google has announced the rollout of native image generation in Gemini 2.0 Flash, making it available for free to developers through Google AI Studio and the Gemini API. The experimental feature marks the first time a major US tech company has integrated text and image generation within the same AI model.
Unlike traditional AI image generation setups, which rely on separate diffusion models linked to LLMs, Gemini 2.0 Flash generates images natively within the same model that processes text prompts. This approach is expected to improve accuracy, consistency, and overall creative capabilities.
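For developers, the practical upshot is that a single request to one model can ask for both text and image output. The snippet below is a minimal sketch of how such a request body might be assembled. It does not send anything. The endpoint pattern, the model name `gemini-2.0-flash-exp`, and the `responseModalities` field are assumptions based on the public Gemini REST API rather than details from the announcement, and `build_request` is a hypothetical helper name.

```python
import base64
import json

# Assumed values, not taken from the article: the endpoint pattern and the
# experimental model name may differ from what Google currently exposes.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-2.0-flash-exp"


def build_request(prompt, images=()):
    """Return (url, json_body) for one combined text-and-image request.

    `images` is an iterable of (mime_type, raw_bytes) pairs, for example
    a photo to be edited. This is a hypothetical helper, not an SDK call.
    """
    parts = [{"text": prompt}]
    for mime_type, raw in images:
        # Input images travel inline as base64-encoded strings.
        parts.append({
            "inlineData": {
                "mimeType": mime_type,
                "data": base64.b64encode(raw).decode("ascii"),
            }
        })
    body = {
        "contents": [{"parts": parts}],
        # Ask the same model for both modalities in one response.
        "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]},
    }
    url = f"{API_BASE}/models/{MODEL}:generateContent"
    return url, json.dumps(body)


if __name__ == "__main__":
    url, body = build_request(
        "Change the jacket in this photo to red.",
        [("image/png", b"placeholder-bytes")],
    )
    print(url)
    print(body)
```

An actual call would also need an API key from Google AI Studio and an HTTP client to POST the body, and the response would then have to be checked for both text parts and image parts.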
First unveiled in December 2024, Gemini 2.0 Flash combines multimodal input, advanced reasoning, and natural language understanding to generate images directly alongside text. The newly available experimental version enhances how developers can create and refine visual content.
Users have already been testing the new model, and it is drawing rave reviews for the range of things it can do. Here are a few examples shared by users on X (formerly Twitter):
One user asked Gemini 2.0 Flash to dress the model in a photo in different clothes by providing an image of a jacket, and Gemini appears to have done a great job.
Another user uploaded two separate images, one of a man and one of a perfume bottle, and asked Gemini to make the man hold the bottle. Gemini complied brilliantly.
Some users have already called it the end of image-editing apps and platforms such as Photoshop and Canva, given how good Gemini's output has been. Users have also successfully changed the colours of their clothes with Gemini. Here's an example.
In another bizarre use case, a user who was running late for work asked Gemini to convert their selfie into a picture of them waiting for a subway train. The user wrote out a very detailed prompt, and Gemini complied, albeit not perfectly.
Eagle-eyed X users spotted that the man behind the user looks like a fictional character and that the user's thumb is oddly shaped, giving away the artificial nature of the image.
One interesting use case that people have found for the new Gemini model is removing watermarks from images. Users have been using Gemini to remove iStock or Getty watermarks from images, and the experimental model is doing it perfectly.
Normally, to get images without watermarks, users have to pay a premium price, either a one-time cost or a subscription to the service. However, it appears that Gemini 2.0 Flash is incredibly adept at removing these watermarks for free.