Gemini 3.1 Flash-Lite arrives as Google’s cheapest Gemini 3 model yet

After launching Gemini 3.1 Pro last month, Google is now adding Gemini 3.1 Flash-Lite, a new model aimed at developers who need lower costs and quicker responses more than they need the company’s highest-end reasoning model. It is rolling out in preview starting today through the Gemini API in Google AI Studio and Vertex AI.
Google describes 3.1 Flash-Lite as the fastest and cheapest model in its Gemini 3 family so far. Pricing starts at $0.25 per 1 million input tokens and $1.50 per 1 million output tokens, with the company pitching it for high-volume tasks like translation, moderation, and other jobs where speed and cost matter more than anything else.
According to Google, Gemini 3.1 Flash-Lite delivers a 2.5 times faster time to first token than Gemini 2.5 Flash and a 45% boost in output speed, while still holding up on quality. Google also says the model beats other systems in the same tier on reasoning and multimodal benchmarks, which is the bigger point here. This is meant to be the model developers can use constantly without paying for a heavier system.
Flash-Lite also includes adjustable thinking levels, so developers can decide how much reasoning they want the model to use for a task. That gives it a bit more range than the “Lite” name might suggest. Google is clearly pitching it for repetitive, high-frequency work, but also for tasks that need stronger instruction following without moving up to a more expensive model.
On the API side, Gemini 3.1 Flash-Lite supports a 1 million token context window and can take in text, code, images, audio, video, and PDFs, while returning text output. That helps place it pretty clearly in Google’s lineup. It is not the flagship model, but it is the one built for scale.
Google also says early testers including Latitude, Cartwheel, and Whering are already using the model. The broader pitch is fairly straightforward. Gemini 3.1 Flash-Lite is supposed to be the model developers reach for when they want something fast, inexpensive, and reliable enough to run all day, with Google also highlighting output speed advantages over Claude 4.5 Haiku in its launch comparison chart.
Y. Anush Reddy is a contributor to this blog.



