The kitchen-brigade analogy
A small café runs fine with one cook who plates everything. A Michelin kitchen runs a brigade: a chef who plans the menu, a sous-chef who coordinates, a saucier who only handles sauces, a pastry chef who only does dessert. More mouths, more coordination — but each station is excellent at one thing, and the head chef stitches the plate together.
Agentic workflows scale the same way. A single agent works for narrow jobs. Complex jobs benefit from a crew — specialised agents, a supervisor, and a clean way to pass work between them.
Single agent first
Before you reach for a crew, ask: can one agent with good tools do this? The answer is yes more often than the demos suggest.
Single agent wins when:
- The task is mostly linear (search → read → write).
- The tool set is < ~10 well-named tools.
- Latency budget is tight.
Single agent breaks when:
- The task needs distinct skill sets (research + coding + reviewing).
- The tool surface explodes past ~15 and the model hesitates.
- You want parallel work (three searches, one synthesis).
Crew patterns that earn their keep
1. Supervisor + workers
A planning agent decomposes the task and routes subtasks to specialised workers (researcher, writer, reviewer). The supervisor stitches results together. Best for: multi-skill goals like "research, draft, fact-check."
2. Pipeline (sequential)
Agent A's output is Agent B's input. No back-and-forth, just a chain. Best for: ETL-shaped work where each stage transforms the artifact.
3. Parallel fan-out
Spawn N agents on independent sub-problems, then a reducer agent merges. Best for: "summarise these 20 PDFs," "search five sources at once."
4. Debate / critic
One agent proposes, another critiques, a third decides. Best for: high-stakes decisions where you want adversarial review.
What gets harder with multiple agents
- State sharing — what each agent sees vs hides. Pass artifacts by reference (a file, a row id), not by stuffing everything into prompts.
- Cost — every agent burns tokens. A 4-agent crew is roughly 4× the spend of a solo. Make the quality lift worth it.
- Debugging — when the answer is wrong, which agent failed? Add per-agent logging and traces from day one.
- Loops between agents — A asks B, B asks A, infinite. Cap turns and force progress.
Decision rule
Start with one agent and good tools. Add a second agent only when you have a concrete reason — distinct skill, parallelism, or adversarial review — that a single agent measurably cannot deliver.
Crew complexity is a tax. Pay it deliberately.