Prompt Chaining: Breaking Complex Tasks Into Steps

Instead of one mega-prompt, chain N small prompts where each step's output feeds the next. Easier to debug, easier to evaluate, easier to evolve.

Apr 29, 2026 · 4 min read

The kanban-board analogy

A project that lives in one giant ticket — "build the new dashboard" — is impossible to manage. Split it into a board of small cards — spec, mock, build, test, ship — and each card has a clear input, output, and reviewer. Progress becomes visible. Debugging becomes possible.

Prompt chaining is the same move applied to LLM tasks. One prompt that says "do everything" is brittle. A chain of prompts — each focused, each with a clean input/output contract — is robust, debuggable, and incrementally improvable.

The pattern

input → [Prompt 1: extract] → structured X
        → [Prompt 2: enrich] → enriched X'
        → [Prompt 3: format] → final output

Each box is a separate LLM call with a small, focused prompt. The output of one is the input of the next.
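
The three boxes can be sketched in plain Python. This is a minimal illustration, not a reference implementation: `fake_llm` is a hypothetical deterministic stub standing in for a real model client, and the `EXTRACT` / `ENRICH` / `FORMAT` prompt prefixes and the record fields are invented for the example.

```python
import json
from typing import Callable

# Assumption: a model client is any function that maps a prompt to text.
LLM = Callable[[str], str]


def fake_llm(prompt: str) -> str:
    """Deterministic stub so the sketch runs without a network call."""
    kind, _, payload = prompt.partition("\n")
    if kind == "EXTRACT":
        return json.dumps({"customer": "Ada", "item": "keyboard"})
    if kind == "ENRICH":
        record = json.loads(payload)
        record["priority"] = "high"
        return json.dumps(record)
    if kind == "FORMAT":
        record = json.loads(payload)
        return f"{record['customer']} ordered a {record['item']} ({record['priority']} priority)"
    raise ValueError(f"unknown prompt kind: {kind}")


def extract(llm: LLM, text: str) -> dict:
    # Prompt 1: free text in, structured record out.
    return json.loads(llm(f"EXTRACT\n{text}"))


def enrich(llm: LLM, record: dict) -> dict:
    # Prompt 2: structured record in, enriched record out.
    return json.loads(llm(f"ENRICH\n{json.dumps(record)}"))


def format_output(llm: LLM, record: dict) -> str:
    # Prompt 3: enriched record in, final text out.
    return llm(f"FORMAT\n{json.dumps(record)}")


def run_chain(llm: LLM, text: str) -> str:
    # Each intermediate value is a plain variable you can log or assert on.
    record = extract(llm, text)
    enriched = enrich(llm, record)
    return format_output(llm, enriched)
```

Because every step takes the model client as an argument, swapping the stub for a real client (or a different model per step) does not change the chain itself.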

Why it beats one mega-prompt

Each step is testable. You can eval prompt 2 in isolation with synthetic inputs.
Each step is replaceable. Swap a model per step (Haiku for extract, Sonnet for synthesis).
Failures are localised. When the output is wrong, you can see exactly which step broke.
Smaller context per step. Each call sees only what it needs — cheaper, less distraction.
Independent retries. Retry the failing step, not the whole chain.

A mega-prompt is one ball of mud where every change risks regressing every other behaviour.
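
To make "testable in isolation" concrete, here is a small sketch of evaluating a single step against synthetic cases. The step (`enrich_step`), the stub model (`stub_llm`), and the cases are all hypothetical; a real eval would call your actual model and use a larger case set.

```python
from typing import Callable


def enrich_step(record: dict, llm: Callable[[str], str]) -> dict:
    """Step 2 of a chain: add a 'sentiment' field from the model's one-word answer."""
    label = llm(f"Classify the sentiment of: {record['text']}").strip().lower()
    return {**record, "sentiment": label}


def evaluate_step(step, llm, cases) -> float:
    """Run one step over (input, expected) pairs and return the pass rate."""
    passed = sum(1 for given, expected in cases if step(given, llm) == expected)
    return passed / len(cases)


def stub_llm(prompt: str) -> str:
    """Stand-in model with slightly messy output, as real models often have."""
    return "Positive\n" if "love" in prompt else "negative"


# Synthetic inputs: no upstream step has to run to produce them.
SYNTHETIC_CASES = [
    ({"text": "I love it"}, {"text": "I love it", "sentiment": "positive"}),
    ({"text": "It broke"}, {"text": "It broke", "sentiment": "negative"}),
]

pass_rate = evaluate_step(enrich_step, stub_llm, SYNTHETIC_CASES)
```

The same harness works for any step with the `(input, llm) -> output` shape, so a prompt change in one step can be checked without rerunning the others.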

Common chain shapes

Linear (most common)

extract → analyse → respond

Simple, easy to reason about, easy to debug.

Conditional / branching

classify → if X then path A; if Y then path B

Different downstream chains for different input types. Routing is itself an LLM call (or a classifier).
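
A routing step can be sketched as a lookup from label to downstream chain. In this illustration the router is a keyword rule standing in for an LLM call or a trained classifier, and the labels and handler names are hypothetical.

```python
from typing import Callable, Dict


def classify(message: str) -> str:
    """Router. In practice this would be a model call or a classifier."""
    text = message.lower()
    if "refund" in text:
        return "billing"
    if "error" in text or "crash" in text:
        return "technical"
    return "general"


# Each handler stands in for a full downstream chain.
def billing_chain(message: str) -> str:
    return f"[billing] {message}"


def technical_chain(message: str) -> str:
    return f"[technical] {message}"


def general_chain(message: str) -> str:
    return f"[general] {message}"


ROUTES: Dict[str, Callable[[str], str]] = {
    "billing": billing_chain,
    "technical": technical_chain,
    "general": general_chain,
}


def route(message: str) -> str:
    label = classify(message)
    # Fall back to the general chain if the router returns an unexpected label.
    handler = ROUTES.get(label, general_chain)
    return handler(message)
```

The fallback matters more when the router is a model: an unexpected label should land somewhere safe rather than raise.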

Map-reduce

split into chunks → run prompt on each → merge

Excellent for long inputs (summarise 100 reviews) where a mega-prompt would blow context.
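
A minimal map-reduce sketch, assuming reviews arrive as a list of strings. `counting_llm` is a hypothetical stub that only reports how many input lines it saw, and the chunk size of 25 is an arbitrary choice for the example.

```python
from typing import Callable, List


def split_into_chunks(items: List[str], size: int) -> List[List[str]]:
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def summarise_chunk(reviews: List[str], llm: Callable[[str], str]) -> str:
    # Map: one small call per chunk.
    return llm("Summarise these reviews:\n" + "\n".join(reviews))


def merge(summaries: List[str], llm: Callable[[str], str]) -> str:
    # Reduce: one call over the partial summaries, not the raw reviews.
    return llm("Merge these summaries:\n" + "\n".join(summaries))


def map_reduce(reviews: List[str], llm: Callable[[str], str], chunk_size: int = 25) -> str:
    chunks = split_into_chunks(reviews, chunk_size)
    partials = [summarise_chunk(chunk, llm) for chunk in chunks]
    return merge(partials, llm)


def counting_llm(prompt: str) -> str:
    """Stub: counts the input lines that follow the instruction line."""
    lines = prompt.split("\n")[1:]
    return f"summary of {len(lines)} items"
```

With 100 reviews and a chunk size of 25, the map phase makes four calls and the reduce phase sees four short summaries instead of 100 reviews. The map calls are independent, so they are also natural candidates for parallel execution.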

Iterative refinement

draft → critique → revise → critique → revise

The output gets better with each round. Cap the rounds (3 is usually enough).
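
The draft → critique → revise loop with a round cap can be sketched as follows. The three callables are hypothetical stand-ins for separate LLM calls, and the convention that the critic returns `None` when it has no objections is an assumption of this example.

```python
from typing import Callable, Optional


def refine(
    task: str,
    draft_fn: Callable[[str], str],
    critique_fn: Callable[[str], Optional[str]],
    revise_fn: Callable[[str, str], str],
    max_rounds: int = 3,
) -> str:
    """Draft once, then critique and revise for at most `max_rounds` rounds."""
    draft = draft_fn(task)
    for _ in range(max_rounds):
        feedback = critique_fn(draft)
        if feedback is None:  # critic is satisfied, stop early
            break
        draft = revise_fn(draft, feedback)
    return draft


# Stubs so the sketch runs without a model.
def stub_draft(task: str) -> str:
    return "teh quick brown fox"


def stub_critique(draft: str) -> Optional[str]:
    return "fix the typo 'teh'" if "teh" in draft else None


def stub_revise(draft: str, feedback: str) -> str:
    return draft.replace("teh", "the")


result = refine("describe a fox", stub_draft, stub_critique, stub_revise)
```

The cap guarantees termination even when the critic never stops objecting, which bounds both cost and latency.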

Validate and route

first attempt → validate → if invalid, retry with feedback

A specific case of conditional. Cuts failure rates dramatically when validation is cheap (schema check, lint, test pass).
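
A sketch of validate-then-retry with feedback, using a cheap schema check. The required fields, the feedback wording, and `flaky_llm` are hypothetical; the stub fails on the first attempt and succeeds once the prompt carries the validation error.

```python
import json
from typing import Callable, Optional, Tuple

REQUIRED_FIELDS = ("order_id", "total")  # hypothetical schema for the example


def validate(raw: str) -> Tuple[Optional[dict], Optional[str]]:
    """Return (parsed, None) on success or (None, error message) on failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return None, "expected a JSON object"
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        return None, f"missing fields: {', '.join(missing)}"
    return data, None


def generate_with_retry(llm: Callable[[str], str], prompt: str, max_attempts: int = 3) -> dict:
    feedback = ""
    error = "no attempts made"
    for _ in range(max_attempts):
        raw = llm(prompt + feedback)
        data, error = validate(raw)
        if error is None:
            return data
        # Feed the validator's message back so the next attempt can correct it.
        feedback = f"\nYour previous answer was rejected ({error}). Return valid JSON only."
    raise ValueError(f"no valid output after {max_attempts} attempts: {error}")


def flaky_llm(prompt: str) -> str:
    """Stub: returns prose first, valid JSON once it sees rejection feedback."""
    if "rejected" in prompt:
        return '{"order_id": "A-1", "total": 42.5}'
    return "Sure! Here is the order: A-1"
```

Raising after the last attempt keeps the failure visible at this step instead of letting malformed output flow downstream.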

Designing chain steps

One job per step. "Extract the order details" is a step. "Extract the order details and write a follow-up email" is two steps in disguise.
Structured handoffs. The output between steps should be JSON, not prose. No regex parsing in your runtime.
Tight prompts. Each step's prompt is short, focused, and free of unrelated instructions.
Stateless steps. A step takes input → produces output. Side effects live outside, in your code.
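
One way to enforce a structured handoff is to parse each step's JSON output into a typed record before the next step sees it. The `OrderDetails` fields and the follow-up prompt below are hypothetical, chosen to match the order-details example above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class OrderDetails:
    """Contract between the extract step and the email step."""
    order_id: str
    customer_email: str
    items: Tuple[str, ...]


def parse_handoff(raw: str) -> OrderDetails:
    """Parse step 1's JSON output; a missing key raises KeyError at the boundary."""
    data = json.loads(raw)
    return OrderDetails(
        order_id=str(data["order_id"]),
        customer_email=str(data["customer_email"]),
        items=tuple(data["items"]),
    )


def build_email_prompt(order: OrderDetails) -> str:
    """Step 2's prompt is built only from the typed fields, not step 1's raw text."""
    return "Write a follow-up email for this order:\n" + json.dumps(asdict(order))


order = parse_handoff('{"order_id": 17, "customer_email": "a@example.com", "items": ["mug"]}')
```

Both functions are pure: they take input and return output, and any side effect such as sending the email would live in the calling code.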

Where chains beat agents

For workflows that are deterministic and fixed ("always extract → enrich → format"), a chain is simpler than an agent. Agents are right when the control flow itself depends on the input ("decide if you need to search, then maybe call this tool").

Heuristic: write the workflow as a flowchart. If the flowchart fits on a page with no loops or branches that depend on model judgement, you have a chain. Otherwise you have an agent.

Where chains stumble

Compounding errors. A 10-step chain with 95% per-step accuracy ends at roughly 60% end-to-end (0.95^10 ≈ 0.60), assuming failures are independent. Add validation between steps.
Verbose intermediates. Each step's output is some other step's context budget. Watch token totals.
Latency. Sequential calls add up. Parallelise where the DAG allows.
Snowballing prompts. Engineers pile instructions into one step instead of adding a step. Resist.
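
The compounding-error arithmetic is easy to check. This sketch assumes step failures are independent; the 99% figure is a hypothetical per-step accuracy after adding validation, used only to show how sensitive the end-to-end number is.

```python
def chain_success_rate(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent failures."""
    return per_step_accuracy ** steps


baseline = chain_success_rate(0.95, 10)        # about 0.599
with_validation = chain_success_rate(0.99, 10)  # about 0.904
```

A four-point gain per step moves the ten-step chain from about 60% to about 90%, which is why a cheap validator between steps tends to pay for itself.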

Tooling

LangChain has chain primitives baked in.
LangGraph for DAG-shaped chains with explicit nodes and edges.
Plain code is often the right answer — await step1(); await step2(). Frameworks help when the chain has branching, retries, and tracing needs.
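
As an illustration of the plain-code option, here is a sketch using Python's `asyncio`. The step names are hypothetical and `asyncio.sleep(0)` stands in for an awaited model call; the two middle steps do not depend on each other, so they run concurrently with `asyncio.gather`.

```python
import asyncio


async def extract(text: str) -> dict:
    await asyncio.sleep(0)  # placeholder for an awaited model call
    return {"text": text.strip()}


async def detect_language(record: dict) -> str:
    await asyncio.sleep(0)
    return "en"


async def score_sentiment(record: dict) -> str:
    await asyncio.sleep(0)
    return "positive" if "great" in record["text"] else "neutral"


async def format_report(record: dict, language: str, sentiment: str) -> str:
    await asyncio.sleep(0)
    return f"[{language}/{sentiment}] {record['text']}"


async def run_pipeline(text: str) -> str:
    # Sequential where a step needs the previous step's output.
    record = await extract(text)
    # Concurrent where two steps only need the same upstream record.
    language, sentiment = await asyncio.gather(
        detect_language(record),
        score_sentiment(record),
    )
    return await format_report(record, language, sentiment)


report = asyncio.run(run_pipeline("  great keyboard "))
```

Once the pipeline needs conditional routing, per-step retries, and tracing all at once, this is roughly the point where a framework starts to earn its keep.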

In one line

One mega-prompt looks elegant in the demo and rots in production. A chain of small prompts is the boring, debuggable, shippable version of the same idea.

From the field

My rule for chain-versus-agent: if I know the steps in advance, I hard-code a chain; if the path depends on what's discovered along the way, I use an agent. Most "we need an agent" problems are actually fixed pipelines wearing a costume — extract, then transform, then format — and a deterministic chain is cheaper, faster, and far easier to debug because each step's input and output are inspectable. Agents earn their unpredictability only when the branching is genuinely dynamic. I start every design by asking whether the control flow is known; if it is, no agent, just a chain of well-scoped calls.
