Claude models — Sonnet, Opus, Haiku
What each Claude tier is good for, what they cost in relative terms, how to pick between them, and the always-changing context window + vision support matrix. Pricing and model names drift fast — verify before committing.
The three-tier mental model#
Claude ships in three tiers — same family, different cost/quality balance:
| Tier | What it’s for | Vibe |
|---|---|---|
| Haiku | Cheap, fast tasks. Classification, simple Q&A, batch jobs, lightweight tool calls. | ”I need quick answers at scale.” |
| Sonnet | The default for production. Most coding, agentic work, complex Q&A. | ”Pick this unless you have a reason not to.” |
| Opus | Hardest reasoning. Long planning, multi-step analysis, complex code refactors. | ”Worth the cost when correctness matters.” |
You pick the tier in your API request (model: "claude-sonnet-4-6" etc.). Each tier ships in multiple generations (3.5, 4, 4.5, 4.6, 4.7…). Newer generations are usually better at everything but cost more — for a while.
What changes between generations#
When Anthropic ships a new Sonnet (say, Sonnet 4.5 → Sonnet 5), you get some mix of:
- Higher quality on hard tasks
- Better instruction following (less prompting needed for the same output)
- Wider context window — Opus 4.7 and Sonnet 4.6 ship with 1M-token context as standard; Sonnet 4.5 (and earlier Sonnet 4) ship at 200K standard with a 1M-token beta context available only on AWS Bedrock and Google Vertex AI (not on the direct Anthropic API or Microsoft Foundry). The 1M-as-standard story begins with Sonnet 4.6. Haiku 4.5 remains at 200K. Opus 4.5/4.6 are listed at 1M in the docs (verify before relying)
- New capabilities (vision support, computer-use APIs, new tool-use modes)
- Different pricing (often a small bump on the latest, with the previous one becoming a bargain)
The old generation usually stays available for at least 6–12 months. Plan upgrades; don’t be forced into them.
The matrix (as of 14 May 2026 — verify against anthropic.com/pricing)#
| Model | Strengths | Context | Vision | Computer Use |
|---|---|---|---|---|
| claude-opus-4-7 | Hardest reasoning · long horizon planning | 1M | Yes | Yes (beta) |
| claude-sonnet-4-6 | Default — production sweet spot | 1M | Yes | Yes (beta) |
| claude-haiku-4-5 | Fast + cheap · batch + classification | 200K | Yes | Yes (beta) |
| claude-opus-4-6 | Older Opus · still solid · cheaper | 1M | Yes | Yes (beta) |
| claude-sonnet-4-5 | Previous Sonnet · still active · legacy | 200K (1M beta: Bedrock/Vertex only) | Yes | Yes (beta) |
| claude-opus-4-5 | Earlier Opus generation · still active | 1M | Yes | Yes (beta) |
Recently retired (do not use on the direct Claude API):
- claude-haiku-3-5 — retired on the API in February 2026. Still available on AWS Bedrock and Google Vertex AI.
- claude-sonnet-4 — deprecated April 2026, retiring June 2026. Migrate to
claude-sonnet-4-6before retirement.
(Generation numbers drift; check docs.anthropic.com/en/docs/about-claude/models for the current list.)
How to pick#
Default = Sonnet#
If you don’t have a strong reason otherwise, start on Sonnet (currently claude-sonnet-4-6). It’s the safe default for most production work — fast enough, smart enough, mid-priced. Most apps live here unless cost, latency, or reasoning depth pushes you elsewhere.
Reach for Opus when…#
- The task is inherently hard reasoning — proof generation, complex multi-step planning, deep code refactors across many files
- You’re seeing bad output on Sonnet that isn’t fixable with a better prompt
- You’re doing agentic work where decisions cascade — early-step errors compound across a long session
- Correctness > cost for this particular call
Reach for Haiku when…#
- The task is shallow — labelling tickets, extracting names, picking from N choices
- You’re doing bulk work — thousands of small classifications a night
- Latency matters more than reasoning depth (Haiku is meaningfully faster)
- You can filter to Sonnet for the cases Haiku flags as uncertain
The pattern that emerges in production: Haiku for triage, Sonnet for work, Opus when stakes are high. Many apps use two or three tiers based on the call.
What “better at X” actually means#
Vendors publish benchmarks. They’re a starting signal, not the truth. Real-world quality differences show up in:
- How often you have to retry or correct the output
- How verbose the model is (more output = more cost + more parsing)
- How well it follows format constraints (does it actually return valid JSON when you asked?)
- How it handles ambiguity (does it ask a clarifying question, or guess?)
Build a small representative prompt suite for your real workload and compare on that. Leaderboards rank for general tasks; you care about your task.
Context windows#
claude-opus-4-7 and claude-sonnet-4-6 both ship with 1M-token context windows by default. claude-haiku-4-5 is 200K tokens. Rule of thumb: 1M tokens is roughly 2,500 pages of plain text; 200K is roughly 500 pages.
⚠️ Long-context performance ≠ short-context performance. A model can technically read 1M tokens but reasoning over the very late content tends to be weaker than over the front. If quality drops near the end of long prompts, try splitting the task — chunk the input, summarise, then ask the question over the summary.
Vision#
Current Sonnet, Opus, and Haiku tiers all accept images. Use cases:
- OCR (read text from screenshots)
- Chart analysis (“what does this graph say?”)
- UI debugging (“why does this layout look broken?”)
- Document understanding (PDFs as image arrays)
Image limits (on the API): up to 100 images per request on 200K-context models, up to 600 per request on the 1M-context models. Maximum pixel dimensions are 8000×8000 (or 2000×2000 if you’re sending more than 20 in one request). Formats: PNG, JPEG, GIF, WebP. Charged at a per-image token cost on top of the text tokens. (Note: the claude.ai web UI limits image input to 20 per message — that’s a UI cap, not the API cap.)
Computer Use#
Computer Use is supported across several Claude 4.x models via two beta header versions:
computer-use-2025-11-24(newer):claude-opus-4-7,claude-opus-4-6,claude-sonnet-4-6,claude-opus-4-5computer-use-2025-01-24(older):claude-sonnet-4-5,claude-haiku-4-5, plus older Claude 4 + 3.7 models
Pick the beta header that matches the model you’re using. See §CU.1 Computer Use overview for what works today.
Model deprecation#
Anthropic deprecates older models on a published schedule (check the deprecations page for current dates). When a model is deprecated:
- Announce — usually 6+ months out
- Sunset — calls start returning errors or get auto-routed to the next tier
Your code should read the model name from a config, not hardcode it inline. When deprecation lands, you update one place.
What to do next#
- §API.4 Common patterns — how to actually use these models well
- §API.2 Getting started — if you haven’t made a call yet
- §API.1 Claude API overview — the wider picture