Gemini API — model lineup
The Gemini model family as of mid-2026 — Gemini 3 (current docs default, mostly preview), Gemini 2.5 (stable but scheduled for shutdown 16 October 2026), Gemini 2.0 (deprecated), naming patterns (stable / preview / latest aliases), pricing, and free-tier availability.
The lay of the land in mid-2026#
Three live generations, in order from newest to oldest:
| Generation | What’s stable | What’s preview | What’s deprecated |
|---|---|---|---|
| Gemini 3 | gemini-3.1-flash-lite (stable since 7 May 2026) | gemini-3-flash-preview, gemini-3.1-pro-preview, gemini-3-pro-image-preview, gemini-3.1-flash-image-preview | — |
| Gemini 2.5 | gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite | gemini-2.5-flash-live-preview, gemini-2.5-flash-tts-preview, gemini-2.5-computer-use-preview-10-2025, gemini-2.5-flash-image-preview (“Nano Banana”) | All scheduled for 16 October 2026 shutdown |
| Gemini 2.0 | — | — | gemini-2.0-flash, gemini-2.0-flash-001, gemini-2.0-flash-lite (shutdown 1 June 2026) |
Read first: every single code example in the official docs as of mid-2026 uses
gemini-3-flash-previewas the example model. Gemini 3 is the docs default for new code. Gemini 2.5 remains GA-stable until 16 October 2026 — that’s the safe baseline for production code shipping today, with Gemini 3 as the upgrade target.
Gemini 3 (current docs default)#
| Model | Context in / out | Status | Knowledge cutoff | Input pricing per 1M | Output pricing per 1M |
|---|---|---|---|---|---|
gemini-3.1-pro-preview | 1M / 64K | Preview, no shutdown date | Jan 2025 | $2.00 (≤200K) / $4.00 (>200K) | $12 / $18 |
gemini-3-flash-preview | 1M / 64K | Preview, no shutdown date | Jan 2025 | $0.50 (text/image/video) / $1.00 (audio) | $3.00 |
gemini-3.1-flash-lite | 1M / 64K | Stable since 7 May 2026 | Jan 2025 | $0.25 (text/image/video) / $0.50 (audio) | $1.50 |
gemini-3.1-flash-lite-preview | 1M / 64K | Preview → shutdown 25 May 2026 | Jan 2025 | $0.25 (text/image/video) / $0.50 (audio) | $1.50 |
gemini-3-pro-image-preview | 65K / 32K | Preview | Jan 2025 | Image pricing varies | |
gemini-3.1-flash-image-preview | 128K / 32K | Preview | Jan 2025 | Image pricing varies |
What’s new in Gemini 3 vs 2.5:
- New
thinking_levelparameter (replaces 2.5’sthinking_budgettoken cap) - Function call IDs are mandatory — every
functionCallreturns a uniqueidyou must echo back infunctionResponse(the SDK handles this for you, raw REST users must take care) - Thought signatures returned in response parts; SDK preserves them automatically across multi-turn function calls
Gemini 2.5 (still stable, all scheduled for 16 October 2026 shutdown)#
| Model | Context in / out | Inputs accepted | Input pricing per 1M | Output pricing per 1M | Free tier? |
|---|---|---|---|---|---|
gemini-2.5-pro | 1,048,576 / 65,536 | Audio, image, video, text, PDF | $1.25 (≤200K) / $2.50 (>200K) | $10.00 (≤200K) / $15.00 (>200K) | ❌ |
gemini-2.5-flash | 1,048,576 / 65,536 | Text, image, video, audio | $0.30 (text/image/video) / $1.00 (audio) | $2.50 | ✅ |
gemini-2.5-flash-lite | 1,048,576 / 65,536 | Same as Flash | $0.10 (text/image/video) / $0.30 (audio) | $0.40 | ✅ |
All three support: Batch API, Caching, Code Execution, File Search, Flex Inference, Function Calling, Grounding (Search + Maps), Priority Inference, Structured Output, Thinking, URL Context.
For a stable baseline today, write against gemini-2.5-flash. For more capability with structured reasoning, switch to gemini-2.5-pro. Plan to migrate to Gemini 3 before October 2026.
Gemini 2.0 (deprecated)#
| Model | Status |
|---|---|
gemini-2.0-flash | Image generation shut down; text/audio/video still appears on the pricing page as billable as of mid-2026. Verify before relying on it for new code — Google has been winding the 2.0 series down. |
gemini-2.0-flash-001 | Same status as above |
gemini-2.0-flash-lite | Same status as above |
If you have code on these, migrate now to gemini-2.5-flash (drop-in for most cases) or to gemini-3.1-flash-lite (stable Gemini 3, similar pricing, longer runway).
Naming patterns (so model IDs make sense)#
Google uses four patterns in model IDs:
| Pattern | Example | What it means |
|---|---|---|
| Stable | gemini-2.5-flash | Fixed; won’t change underneath you. Production-safe. |
| Stable with date | gemini-2.5-flash-09-2025 | A specific snapshot pinned by month. Most stable. |
| Preview | gemini-2.5-flash-preview-09-2025 | Billing enabled, ≥2 weeks deprecation notice. |
| Latest alias | gemini-flash-latest | Points at the current best Flash. Updated with 2 weeks email notice. |
| Experimental | gemini-2.5-computer-use-preview-10-2025 | Research / preview, stricter rate limits, not production-ready. |
For production, prefer the “stable with date” form (e.g. gemini-2.5-flash-09-2025) so a behaviour change in a future “Flash” doesn’t surprise you. For prototypes, the bare gemini-2.5-flash or gemini-flash-latest is fine.
Free tier (who gets what for nothing)#
Free on the pricing page for direct ai.google.dev API key access:
- ✅
gemini-2.5-flashandgemini-2.5-flash-lite - ✅
gemini-2.5-flash-live-preview(Live API) - ✅
gemini-2.5-flash-tts-preview - ✅
gemini-2.5-flash-image-preview(“Nano Banana”) - ✅
gemini-3-flash-previewandgemini-3.1-flash-lite - ✅
gemini-3.1-flash-tts-preview
Not free (paid tier required):
- ❌
gemini-2.5-pro - ❌
gemini-3.1-pro-preview - ❌
gemini-3.1-flash-image-preview - ❌
veo-3.1andveo-3.1-lite(video generation)
Free-tier RPM/TPM specifics aren’t published in the docs — Google directs you to aistudio.google.com/rate-limit where you can see your account-specific limits. They depend on the model and your usage tier.
Usage tiers (rate limits scale with spend)#
| Tier | Qualifies when | Spend cap |
|---|---|---|
| Free | Active project, no billing | N/A |
| Tier 1 | Billing account linked | $250 |
| Tier 2 | $100 paid + 3 days elapsed | $2,000 |
| Tier 3 | $1,000 paid + 30 days elapsed | $20,000–$100,000+ |
Higher tiers get higher RPM, TPM, and RPD across all models.
5. Search grounding pricing differs by generation#
Worth its own line because it’s billed differently from tokens, and the rate varies by model generation:
Gemini 3 models:
- 5,000 grounded prompts per month free (shared across Gemini 3 models)
- After that: $14 per 1,000 search queries
Gemini 2.5 models (paid tier):
- 1,500 requests/day free (shared between Flash and Flash-Lite RPD)
- After that: $35 per 1,000 grounded prompts (different rate to Gemini 3)
groundingMetadata on every response shows the queries used and the URLs cited.
Mixed-modality capabilities at a glance#
| Capability | gemini-2.5-pro | gemini-2.5-flash | gemini-3-flash-preview | gemini-3.1-pro-preview |
|---|---|---|---|---|
| Text | ✅ | ✅ | ✅ | ✅ |
| Image input | ✅ | ✅ | ✅ | ✅ |
| Video input | ✅ | ✅ | ✅ | ✅ |
| Audio input | ✅ | ✅ | ✅ | ✅ |
| PDF input | ✅ | ✅ | ✅ | ✅ |
| Image output | (separate model) | (separate model) | (separate model) | (separate model) |
| Audio output (TTS) | (separate model) | (separate model) | (separate model) | (separate model) |
Image / audio / video generation uses dedicated models — gemini-2.5-flash-image-preview for images, *-tts-preview for speech, Veo for video, Lyria for music.
What’s next#
- §GAPI.4 Patterns — multi-turn chat, streaming, system instructions, files API, context caching, safety settings
- §GAPI.5 Tool use — function calling, Google Search grounding, code execution
- §GAPI.2 Getting started — install the SDK and make your first call
- §GCL.5 Gemini CLI pitfalls — note about README “Gemini 3” vs stable
gemini-2.5-pro