Claude models — Sonnet, Opus, Haiku

The three-tier mental model#

Claude ships in three tiers — same family, different cost/quality balance:

Tier	What it’s for	Vibe
Haiku	Cheap, fast tasks. Classification, simple Q&A, batch jobs, lightweight tool calls.	”I need quick answers at scale.”
Sonnet	The default for production. Most coding, agentic work, complex Q&A.	”Pick this unless you have a reason not to.”
Opus	Hardest reasoning. Long planning, multi-step analysis, complex code refactors.	”Worth the cost when correctness matters.”

You pick the tier in your API request (model: "claude-sonnet-4-6" etc.). Each tier ships in multiple generations (3.5, 4, 4.5, 4.6, 4.7…). Newer generations are usually better at everything but cost more — for a while.

What changes between generations#

When Anthropic ships a new Sonnet (say, Sonnet 4.5 → Sonnet 5), you get some mix of:

Higher quality on hard tasks
Better instruction following (less prompting needed for the same output)
Wider context window — Opus 4.7 and Sonnet 4.6 ship with 1M-token context as standard; Sonnet 4.5 (and earlier Sonnet 4) ship at 200K standard with a 1M-token beta context available only on AWS Bedrock and Google Vertex AI (not on the direct Anthropic API or Microsoft Foundry). The 1M-as-standard story begins with Sonnet 4.6. Haiku 4.5 remains at 200K. Opus 4.5/4.6 are listed at 1M in the docs (verify before relying)
New capabilities (vision support, computer-use APIs, new tool-use modes)
Different pricing (often a small bump on the latest, with the previous one becoming a bargain)

The old generation usually stays available for at least 6–12 months. Plan upgrades; don’t be forced into them.

The matrix (as of 14 May 2026 — verify against anthropic.com/pricing)#

Model	Strengths	Context	Vision	Computer Use
claude-opus-4-7	Hardest reasoning · long horizon planning	1M	Yes	Yes (beta)
claude-sonnet-4-6	Default — production sweet spot	1M	Yes	Yes (beta)
claude-haiku-4-5	Fast + cheap · batch + classification	200K	Yes	Yes (beta)
claude-opus-4-6	Older Opus · still solid · cheaper	1M	Yes	Yes (beta)
claude-sonnet-4-5	Previous Sonnet · still active · legacy	200K (1M beta: Bedrock/Vertex only)	Yes	Yes (beta)
claude-opus-4-5	Earlier Opus generation · still active	1M	Yes	Yes (beta)

Recently retired (do not use on the direct Claude API):

claude-haiku-3-5 — retired on the API in February 2026. Still available on AWS Bedrock and Google Vertex AI.
claude-sonnet-4 — deprecated April 2026, retiring June 2026. Migrate to claude-sonnet-4-6 before retirement.

(Generation numbers drift; check docs.anthropic.com/en/docs/about-claude/models for the current list.)

How to pick#

Default = Sonnet#

If you don’t have a strong reason otherwise, start on Sonnet (currently claude-sonnet-4-6). It’s the safe default for most production work — fast enough, smart enough, mid-priced. Most apps live here unless cost, latency, or reasoning depth pushes you elsewhere.

Reach for Opus when…#

The task is inherently hard reasoning — proof generation, complex multi-step planning, deep code refactors across many files
You’re seeing bad output on Sonnet that isn’t fixable with a better prompt
You’re doing agentic work where decisions cascade — early-step errors compound across a long session
Correctness > cost for this particular call

Reach for Haiku when…#

The task is shallow — labelling tickets, extracting names, picking from N choices
You’re doing bulk work — thousands of small classifications a night
Latency matters more than reasoning depth (Haiku is meaningfully faster)
You can filter to Sonnet for the cases Haiku flags as uncertain

The pattern that emerges in production: Haiku for triage, Sonnet for work, Opus when stakes are high. Many apps use two or three tiers based on the call.

What “better at X” actually means#

Vendors publish benchmarks. They’re a starting signal, not the truth. Real-world quality differences show up in:

How often you have to retry or correct the output
How verbose the model is (more output = more cost + more parsing)
How well it follows format constraints (does it actually return valid JSON when you asked?)
How it handles ambiguity (does it ask a clarifying question, or guess?)

Build a small representative prompt suite for your real workload and compare on that. Leaderboards rank for general tasks; you care about your task.

Context windows#

claude-opus-4-7 and claude-sonnet-4-6 both ship with 1M-token context windows by default. claude-haiku-4-5 is 200K tokens. Rule of thumb: 1M tokens is roughly 2,500 pages of plain text; 200K is roughly 500 pages.

⚠️ Long-context performance ≠ short-context performance. A model can technically read 1M tokens but reasoning over the very late content tends to be weaker than over the front. If quality drops near the end of long prompts, try splitting the task — chunk the input, summarise, then ask the question over the summary.

Vision#

Current Sonnet, Opus, and Haiku tiers all accept images. Use cases:

OCR (read text from screenshots)
Chart analysis (“what does this graph say?”)
UI debugging (“why does this layout look broken?”)
Document understanding (PDFs as image arrays)

Image limits (on the API): up to 100 images per request on 200K-context models, up to 600 per request on the 1M-context models. Maximum pixel dimensions are 8000×8000 (or 2000×2000 if you’re sending more than 20 in one request). Formats: PNG, JPEG, GIF, WebP. Charged at a per-image token cost on top of the text tokens. (Note: the claude.ai web UI limits image input to 20 per message — that’s a UI cap, not the API cap.)

Computer Use#

Computer Use is supported across several Claude 4.x models via two beta header versions:

computer-use-2025-11-24 (newer): claude-opus-4-7, claude-opus-4-6, claude-sonnet-4-6, claude-opus-4-5
computer-use-2025-01-24 (older): claude-sonnet-4-5, claude-haiku-4-5, plus older Claude 4 + 3.7 models

Pick the beta header that matches the model you’re using. See §CU.1 Computer Use overview for what works today.

Model deprecation#

Anthropic deprecates older models on a published schedule (check the deprecations page for current dates). When a model is deprecated:

Announce — usually 6+ months out
Sunset — calls start returning errors or get auto-routed to the next tier

Your code should read the model name from a config, not hardcode it inline. When deprecation lands, you update one place.

What to do next#

§API.4 Common patterns — how to actually use these models well
§API.2 Getting started — if you haven’t made a call yet
§API.1 Claude API overview — the wider picture

`⌘` + `K` · `/`	open search
`j`	next entry (within section)
`k`	previous entry
`g` `h`	go to home
`g` `m`	go to methodology
`?`	show this help
`esc`	close any modal