Home Concept Explainers Claude Platform Claude Opus vs Sonnet vs Haiku: Which Model When

Claude Platform Crawler graph 3 Slider

Claude Opus vs Sonnet vs Haiku: Which Model When

Three sizes, three roles. Opus for hard reasoning, Sonnet for daily production work, Haiku for cheap and fast. Pick by task, not by hype.

Apr 29, 2026 · 3 Min. Lesezeit

Zum Lab springen Keine Anmeldung · Für immer kostenlos

▸ Selbst ausprobieren

Zieh einen Slider — das Diagramm reagiert in Echtzeit.

Leertaste für Play · ←/→ zum Scrubben

Crawler graph

FR /100 SN-514

SPACE · ◄ ►

¶ Die Analogie

The senior–mid–junior analogy

A team has three engineers: a senior architect who sees the whole system but is expensive and not always around, a mid-level developer who handles 90% of daily work brilliantly, and a junior who's lightning fast on small, well-defined tickets.

Claude's lineup maps cleanly: Opus is the architect, Sonnet is the daily driver, Haiku is the junior. Most projects need all three at different points; the trick is matching task to engineer.

The lineup at a glance

Model	Speed	Cost	Best at	Watch out for
Opus	Slowest	Most expensive	Deep reasoning, hard refactors, novel problems	Overkill on simple tasks; slow at scale
Sonnet	Mid	Mid	Production agents, coding, content, RAG	Sometimes needs Opus assist on the trickiest 5%
Haiku	Fastest	Cheapest	Classification, extraction, routing, micro-agents	Stumbles on multi-step reasoning; weaker tool use

Generations matter — "Sonnet 4.5" beats older "Opus 3" on many tasks. Always benchmark against the current version, not a historical reputation.

Picking by task

Open-ended planning, novel architecture, hard debugging → Opus. Pay the latency tax for the reasoning.
Daily agent work — Claude Code, RAG bots, coding assistants → Sonnet. The sweet spot.
High-volume classification, routing, summarisation → Haiku. 10× cheaper, plenty for narrow tasks.
Real-time UX (autocomplete, live chat first reply) → Haiku for first response, optionally upgrade if the question is hard.

The router pattern

Production stacks rarely use one model. A common pattern:

Triage with Haiku — "is this trivial, normal, or hard?"
Route accordingly — trivial answers from Haiku, normal from Sonnet, hard escalate to Opus.
Cap the budget — never escalate to Opus for free; require an explicit signal.

Routing alone often cuts spend 60–80% with negligible quality loss.

Cost mental model

For long-context, output-heavy workloads, output tokens dominate. Picking a smaller model that produces shorter, sharper answers can save more than a tier-down on the input price. Measure on your traffic, not on synthetic benchmarks.

Common mistakes

Defaulting to the biggest model "just to be safe." It is rarely safer; it's just more expensive.
Defaulting to the smallest model to save money. If quality drops 5%, support load goes up 30%.
Comparing across generations. "Haiku 4.5" can outperform "Opus 3.5" on many tasks. Use names and versions.
Ignoring evals. Benchmarks rarely match your domain. Build a 50-prompt eval set; run it against each tier; pick the cheapest one that passes.

When to bring in Opus deliberately

New problem class your team has never solved.
Critical PR review where a single missed bug is expensive.
Long-form planning that downstream agents will execute (errors compound).
Reasoning over a large structured artifact (architecture docs, ledger, schema).

For the rest — and "the rest" is usually 90% of traffic — Sonnet is the answer.

In one line

Use Opus to think, Sonnet to ship, Haiku to scale. Benchmark your task; the cheapest model that passes wins.

From the field

My default is the mid tier (Sonnet-class) for almost everything, and I escalate to Opus only when an eval shows the mid model actually failing — not "to be safe," which is how budgets quietly triple. The other direction matters just as much: a surprising amount of production traffic (classification, extraction, routing, simple formatting) runs perfectly on the cheapest model, and moving it down is free money. I size the model to the task, measure, and adjust. Reaching for the biggest model by default is the most common and most expensive mistake I see teams make with the Claude lineup.

→ Wollen Sie das in Ihrem Stack?

Custom Claude Code AI Agents & Workflows

Stop doing the repetitive, multi-step work that eats your team's day. This service delivers a working AI agent system that handles tasks like lead processing, data enrichment, content pipelines, and r...

So kann ich helfen