Claw field notebook
last updated 2026-05-15

Gemini API — model lineup

The Gemini model family as of mid-2026 — Gemini 3 (current docs default, mostly preview), Gemini 2.5 (stable but scheduled for shutdown 16 October 2026), Gemini 2.0 (deprecated), naming patterns (stable / preview / latest aliases), pricing, and free-tier availability.

The lay of the land in mid-2026

Three live generations, in order from newest to oldest:

| Generation | What’s stable | What’s preview | What’s deprecated |
| --- | --- | --- | --- |
| Gemini 3 | gemini-3.1-flash-lite (stable since 7 May 2026) | gemini-3-flash-preview, gemini-3.1-pro-preview, gemini-3-pro-image-preview, gemini-3.1-flash-image-preview | |
| Gemini 2.5 | gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite | gemini-2.5-flash-live-preview, gemini-2.5-flash-tts-preview, gemini-2.5-computer-use-preview-10-2025, gemini-2.5-flash-image-preview (“Nano Banana”) | All scheduled for 16 October 2026 shutdown |
| Gemini 2.0 | | | gemini-2.0-flash, gemini-2.0-flash-001, gemini-2.0-flash-lite (shutdown 1 June 2026) |

Read first: every single code example in the official docs as of mid-2026 uses gemini-3-flash-preview as the example model. Gemini 3 is the docs default for new code. Gemini 2.5 remains GA-stable until 16 October 2026 — that’s the safe baseline for production code shipping today, with Gemini 3 as the upgrade target.

Gemini 3 (current docs default)

| Model | Context in / out | Status | Knowledge cutoff | Input pricing per 1M | Output pricing per 1M |
| --- | --- | --- | --- | --- | --- |
| gemini-3.1-pro-preview | 1M / 64K | Preview, no shutdown date | Jan 2025 | $2.00 (≤200K) / $4.00 (>200K) | $12 / $18 |
| gemini-3-flash-preview | 1M / 64K | Preview, no shutdown date | Jan 2025 | $0.50 (text/image/video) / $1.00 (audio) | $3.00 |
| gemini-3.1-flash-lite | 1M / 64K | Stable since 7 May 2026 | Jan 2025 | $0.25 (text/image/video) / $0.50 (audio) | $1.50 |
| gemini-3.1-flash-lite-preview | 1M / 64K | Preview → shutdown 25 May 2026 | Jan 2025 | $0.25 (text/image/video) / $0.50 (audio) | $1.50 |
| gemini-3-pro-image-preview | 65K / 32K | Preview | Jan 2025 | Image pricing varies | |
| gemini-3.1-flash-image-preview | 128K / 32K | Preview | Jan 2025 | Image pricing varies | |

What’s new in Gemini 3 vs 2.5:

  • New thinking_level parameter (replaces 2.5’s thinking_budget token cap)
  • Function call IDs are mandatory: every functionCall returns a unique id that you must echo back in the matching functionResponse (the SDK handles this for you; raw REST users must do it themselves)
  • Thought signatures returned in response parts; SDK preserves them automatically across multi-turn function calls
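For raw REST users, the first two points can be sketched as plain dictionary handling. This is a minimal sketch, not the official client: the camelCase keys (thinkingLevel, thinkingBudget) and the level value "low" are assumptions based on the usual REST JSON casing, and both helper names are hypothetical. No network call is made.

```python
from typing import Any


def build_function_response_part(
    function_call: dict[str, Any], result: dict[str, Any]
) -> dict[str, Any]:
    """Build a functionResponse part that echoes the id of the call it answers."""
    call_id = function_call.get("id")
    if call_id is None:
        # Gemini 3 is described as always returning an id, so fail loudly.
        raise ValueError("functionCall has no id to echo back")
    return {
        "functionResponse": {
            "id": call_id,                  # echoed unchanged
            "name": function_call["name"],  # the function the model asked for
            "response": result,
        }
    }


def thinking_config(
    model_id: str, level: str = "low", budget_tokens: int = 1024
) -> dict[str, Any]:
    """Pick the thinking control by generation (key names are assumptions)."""
    if model_id.startswith("gemini-3"):
        return {"thinkingLevel": level}       # Gemini 3: a level
    return {"thinkingBudget": budget_tokens}  # Gemini 2.5: a token cap


# Example: answer a call the model made (values are made up for illustration).
call = {"id": "call-123", "name": "get_weather", "args": {"city": "Oslo"}}
part = build_function_response_part(call, {"temp_c": 7})
```

Thought signatures are not touched here; when building requests by hand, the response parts that carry them would need to be sent back unmodified in the next turn.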

Gemini 2.5 (still stable, all scheduled for 16 October 2026 shutdown)

| Model | Context in / out | Inputs accepted | Input pricing per 1M | Output pricing per 1M | Free tier? |
| --- | --- | --- | --- | --- | --- |
| gemini-2.5-pro | 1,048,576 / 65,536 | Audio, image, video, text, PDF | $1.25 (≤200K) / $2.50 (>200K) | $10.00 (≤200K) / $15.00 (>200K) | No |
| gemini-2.5-flash | 1,048,576 / 65,536 | Text, image, video, audio | $0.30 (text/image/video) / $1.00 (audio) | $2.50 | Yes |
| gemini-2.5-flash-lite | 1,048,576 / 65,536 | Same as Flash | $0.10 (text/image/video) / $0.30 (audio) | $0.40 | Yes |

All three support: Batch API, Caching, Code Execution, File Search, Flex Inference, Function Calling, Grounding (Search + Maps), Priority Inference, Structured Output, Thinking, URL Context.

For a stable baseline today, write against gemini-2.5-flash. For more capability with structured reasoning, switch to gemini-2.5-pro. Plan to migrate to Gemini 3 before October 2026.
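To compare the three models on cost, the per-1M rates in the table can be turned into a rough estimator. This is a sketch under stated assumptions: it covers text/image/video input rates only (audio is priced differently), it assumes the 200K threshold for gemini-2.5-pro is decided by the prompt's input token count, and the function name is hypothetical.

```python
# (input $/1M, output $/1M) taken from the Gemini 2.5 table above.
PRICES = {
    "gemini-2.5-pro": {"short": (1.25, 10.00), "long": (2.50, 15.00)},
    "gemini-2.5-flash": {"short": (0.30, 2.50)},
    "gemini-2.5-flash-lite": {"short": (0.10, 0.40)},
}
LONG_PROMPT_THRESHOLD = 200_000  # assumption: applies to input tokens


def estimate_cost_usd(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (non-audio input only)."""
    tiers = PRICES[model_id]
    tier = tiers["short"]
    if input_tokens > LONG_PROMPT_THRESHOLD and "long" in tiers:
        tier = tiers["long"]  # higher rate for long prompts on Pro
    input_price, output_price = tier
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    for model in PRICES:
        print(model, round(estimate_cost_usd(model, 50_000, 2_000), 4))
```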

Gemini 2.0 (deprecated)

| Model | Status |
| --- | --- |
| gemini-2.0-flash | Image generation shut down; text/audio/video still appears on the pricing page as billable as of mid-2026. Verify before relying on it for new code: Google has been winding the 2.0 series down. |
| gemini-2.0-flash-001 | Same status as above |
| gemini-2.0-flash-lite | Same status as above |

If you have code on these, migrate now to gemini-2.5-flash (drop-in for most cases) or to gemini-3.1-flash-lite (stable Gemini 3, similar pricing, longer runway).
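One way to keep these deadlines visible is a small check that reports how many days a configured model has left. The sketch below uses only the shutdown dates given in this note (2.0 on 1 June 2026, 2.5 on 16 October 2026, gemini-3.1-flash-lite-preview on 25 May 2026). The helper names and the prefix matching are assumptions, and the dates should be re-verified against the official deprecations page before relying on them.

```python
from datetime import date
from typing import Optional

# Exact model IDs with their own shutdown date.
EXACT_SHUTDOWNS = {"gemini-3.1-flash-lite-preview": date(2026, 5, 25)}
# Whole generations, matched by ID prefix (an assumption of this sketch).
PREFIX_SHUTDOWNS = {
    "gemini-2.0-": date(2026, 6, 1),
    "gemini-2.5-": date(2026, 10, 16),
}


def shutdown_date(model_id: str) -> Optional[date]:
    """Return the known shutdown date for a model, or None if none is listed."""
    if model_id in EXACT_SHUTDOWNS:
        return EXACT_SHUTDOWNS[model_id]
    for prefix, when in PREFIX_SHUTDOWNS.items():
        if model_id.startswith(prefix):
            return when
    return None


def days_until_shutdown(model_id: str, today: date) -> Optional[int]:
    """Days remaining (negative if already past), or None if no date is listed."""
    when = shutdown_date(model_id)
    return None if when is None else (when - today).days


if __name__ == "__main__":
    for model in ("gemini-2.0-flash", "gemini-2.5-flash", "gemini-3.1-flash-lite"):
        print(model, days_until_shutdown(model, date.today()))
```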

Naming patterns (so model IDs make sense)

Google uses five patterns in model IDs:

| Pattern | Example | What it means |
| --- | --- | --- |
| Stable | gemini-2.5-flash | Fixed; won’t change underneath you. Production-safe. |
| Stable with date | gemini-2.5-flash-09-2025 | A specific snapshot pinned by month. Most stable. |
| Preview | gemini-2.5-flash-preview-09-2025 | Billing enabled, ≥2 weeks deprecation notice. |
| Latest alias | gemini-flash-latest | Points at the current best Flash. Updated with 2 weeks email notice. |
| Experimental | gemini-2.5-computer-use-preview-10-2025 | Research / preview, stricter rate limits, not production-ready. |

For production, prefer the “stable with date” form (e.g. gemini-2.5-flash-09-2025) so a behaviour change in a future “Flash” doesn’t surprise you. For prototypes, the bare gemini-2.5-flash or gemini-flash-latest is fine.
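The patterns are regular enough to check mechanically, for example in a config linter that flags unpinned IDs. The sketch below is a heuristic based only on the examples in the table; the function name is hypothetical. Experimental models cannot be told apart from previews by name alone, so they are reported as "preview".

```python
import re

# Month-year suffix such as "-09-2025" at the end of the ID.
DATE_SUFFIX = re.compile(r"-\d{2}-\d{4}$")


def classify_model_id(model_id: str) -> str:
    """Classify a model ID by the naming patterns in the table above."""
    if model_id.endswith("-latest"):
        return "latest alias"
    if "-preview" in model_id:
        # Covers dated and undated previews, and experimental IDs too.
        return "preview"
    if DATE_SUFFIX.search(model_id):
        return "stable with date"
    return "stable"


if __name__ == "__main__":
    for mid in (
        "gemini-2.5-flash",
        "gemini-2.5-flash-09-2025",
        "gemini-2.5-flash-preview-09-2025",
        "gemini-flash-latest",
    ):
        print(f"{mid}: {classify_model_id(mid)}")
```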

Free tier (who gets what for nothing)

Free on the pricing page for direct ai.google.dev API key access:

  • gemini-2.5-flash and gemini-2.5-flash-lite
  • gemini-2.5-flash-live-preview (Live API)
  • gemini-2.5-flash-tts-preview
  • gemini-2.5-flash-image-preview (“Nano Banana”)
  • gemini-3-flash-preview and gemini-3.1-flash-lite
  • gemini-3.1-flash-tts-preview

Not free (paid tier required):

  • gemini-2.5-pro
  • gemini-3.1-pro-preview
  • gemini-3.1-flash-image-preview
  • veo-3.1 and veo-3.1-lite (video generation)

Free-tier RPM/TPM specifics aren’t published in the docs — Google directs you to aistudio.google.com/rate-limit where you can see your account-specific limits. They depend on the model and your usage tier.

Usage tiers (rate limits scale with spend)

| Tier | Qualifies when | Spend cap |
| --- | --- | --- |
| Free | Active project, no billing | N/A |
| Tier 1 | Billing account linked | $250 |
| Tier 2 | $100 paid + 3 days elapsed | $2,000 |
| Tier 3 | $1,000 paid + 30 days elapsed | $20,000–$100,000+ |

Higher tiers get higher RPM, TPM, and RPD across all models.
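The qualification rules in the table reduce to a few threshold checks. The sketch below assumes the thresholds are inclusive and that "days elapsed" counts from the first successful payment; the source does not specify either, and the function name is hypothetical. Actual tier assignment is done by Google, so treat this as a planning aid only.

```python
def usage_tier(billing_linked: bool, total_paid_usd: float, days_elapsed: int) -> str:
    """Return the usage tier implied by the qualification table above."""
    if not billing_linked:
        return "Free"
    # Check the highest tier first; both conditions must hold.
    if total_paid_usd >= 1000 and days_elapsed >= 30:
        return "Tier 3"
    if total_paid_usd >= 100 and days_elapsed >= 3:
        return "Tier 2"
    return "Tier 1"


if __name__ == "__main__":
    print(usage_tier(billing_linked=True, total_paid_usd=250, days_elapsed=10))
```

Under these assumptions, an account that has paid $1,000 but is only 10 days old still sits in Tier 2 until the 30-day condition is met.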

Search grounding pricing differs by generation

Worth its own line because it’s billed differently from tokens, and the rate varies by model generation:

Gemini 3 models:

  • 5,000 grounded prompts per month free (shared across Gemini 3 models)
  • After that: $14 per 1,000 search queries

Gemini 2.5 models (paid tier):

  • 1,500 requests/day free (shared between Flash and Flash-Lite RPD)
  • After that: $35 per 1,000 grounded prompts (different rate to Gemini 3)

groundingMetadata on every response shows the queries used and the URLs cited.
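The two schemes can be compared with a little arithmetic. The sketch below is a rough estimate with loud assumptions: for Gemini 3 the free allowance is counted in grounded prompts while billing is per search query, so the code assumes a fixed number of queries per prompt (default 1) and bills only prompts beyond the free 5,000; for Gemini 2.5 it bills each grounded prompt beyond 1,500 per day. Both function names are hypothetical.

```python
def gemini3_grounding_cost_usd(monthly_prompts: int, queries_per_prompt: int = 1) -> float:
    """Monthly grounding cost for Gemini 3: 5,000 free prompts, then $14 per 1,000 queries."""
    billable_prompts = max(0, monthly_prompts - 5_000)
    billable_queries = billable_prompts * queries_per_prompt  # assumption, see lead-in
    return billable_queries * 14 / 1_000


def gemini25_grounding_cost_usd(daily_prompts: int) -> float:
    """Daily grounding cost for Gemini 2.5: 1,500 free per day, then $35 per 1,000 prompts."""
    billable_prompts = max(0, daily_prompts - 1_500)
    return billable_prompts * 35 / 1_000


if __name__ == "__main__":
    print(gemini3_grounding_cost_usd(6_000))   # 1,000 billable queries
    print(gemini25_grounding_cost_usd(2_500))  # 1,000 billable prompts
```

The real number of queries per prompt can be read from groundingMetadata on each response, which makes it possible to replace the fixed default with measured values.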

Mixed-modality capabilities at a glance

| Capability | gemini-2.5-pro | gemini-2.5-flash | gemini-3-flash-preview | gemini-3.1-pro-preview |
| --- | --- | --- | --- | --- |
| Text | | | | |
| Image input | | | | |
| Video input | | | | |
| Audio input | | | | |
| PDF input | | | | |
| Image output | (separate model) | (separate model) | (separate model) | (separate model) |
| Audio output (TTS) | (separate model) | (separate model) | (separate model) | (separate model) |

Image / audio / video generation uses dedicated models — gemini-2.5-flash-image-preview for images, *-tts-preview for speech, Veo for video, Lyria for music.
