Home Concept Explainers Reasoning Patterns Reflexion and Self-Critique: AI That Reviews Its Own Work

Reasoning Patterns Agent loop 3 sliders

Reflexion and Self-Critique: AI That Reviews Its Own Work

Reflexion adds a critique-and-revise loop. The model produces output, criticises it, revises. A few cents extra; meaningful quality gain on the right tasks.

Apr 29, 2026 · 4 min de leitura

Ir para o laboratório Sem cadastro · Grátis para sempre

▸ Experimente você mesmo

Arraste um slider — o diagrama reage em tempo real.

Espaço para play · ←/→ para scrubar

Agent loop

FR /100 SN-74A

SPACE · ◄ ►

¶ A analogia

The proofreader analogy

A first draft is rarely the best draft. Writers re-read, spot what's weak, and revise. Some of the best work happens in the second pass, not the first. The skill is not always producing perfect prose first try — it's noticing what's broken and fixing it.

Reflexion gives an LLM that proofreader habit. The model produces an answer, critiques itself ("did I follow the spec? are there bugs? could this be simpler?"), and revises. The first pass is rarely the last.

The basic loop

draft = model.generate(task)
for round in range(N):
    critique = model.critique(draft, task)
    if critique.is_done:
        break
    draft = model.revise(draft, critique)
return draft

Three roles played by the same model (or different models): generator, critic, reviser. The loop runs 1–3 times in practice.

Why it works

Critique is easier than generation. Spotting a bug in code is often easier than writing the bug-free code in one pass. The critic exploits this asymmetry.
The revision sees what the draft missed. New information at revise time = better output.
Self-consistency on hard examples. The model tends to converge on better answers across rounds, especially when the critique surfaces concrete issues.

Variants

Reflexion (canonical) — the model writes an explicit "lesson" after a failed attempt and reuses it on the next try. Good for agents that retry tasks.
Plain self-critique — one critique → one revision. Simpler, often enough.
Critic separate from generator — different model, different prompt, sometimes a smaller and cheaper model.
Debate / multi-critic — two critics with different perspectives; model reconciles. Useful for high-stakes outputs.

Where it shines

Code generation — "does this compile? does it match the spec? are there edge cases I missed?"
Long-form writing — structure, clarity, factual claims.
Complex reasoning — sanity-check the math, the steps, the conclusion.
Multi-constraint outputs — schema validity + tone + length all at once.
Agent retries — after a failed action, write a lesson and try a different approach next time.

Where it doesn't help

Simple tasks — overhead beats the gain.
Latency-critical UX — N× the calls means N× the wait (parallelise where possible, but the loop is inherently sequential).
Cases where the model can't self-critique — narrow domains where the model has no purchase. A reflective wrong answer is just a longer wrong answer.
When you have ground-truth verification — if you can run tests, the test result is a better critic than the model's prose.

Engineering it

Cap the rounds. 1–3 max. Past that you get diminishing returns and runaway cost.
Force structured critique. { issues: [{severity, location, fix}], ready_to_ship: bool }. Trivially parseable; loop exits cleanly.
Different model for critic vs generator — sometimes. A bigger critic checking a smaller generator can be the right cost / quality trade-off.
Diff-based revision. Ask the reviser to apply specific changes from the critique, not regenerate from scratch. Preserves what was already good.
Track ready-to-ship rate — your eval signal. If round 1 ships 60% and round 2 ships 80%, you know the loop is working.

Where reflexion beats simpler tricks

Reflexion's superpower is learning from a failure. After an action fails:

Model writes a lesson ("when the API returns 422, also include the email field").
The lesson is appended to the next attempt's prompt.
The model retries and now avoids the same failure.

Across many runs, those lessons accumulate into a memory the agent uses to perform better on the same task class. This compounds in long-running agent systems in ways one-shot critique does not.

Common pitfalls

Sycophantic critic. "Looks great!" on every output. Force structure: explicit issue list with severity.
Critic and generator share a blind spot. If the same model is wrong about the same thing in both roles, the loop endorses it. Mix in tools or a separate critic model when this matters.
Loop never ends. Always cap rounds; don't trust the model's own "is_done" signal as the only stop.
Critique cost ignored. If your critic call is as long as your generator call, you've doubled latency and cost. Make the critic small and tight.

In one line

Reflexion teaches a model to be its own first reviewer — cheap, often surprisingly effective, especially in agent loops where you want the system to learn from failures across runs.

From the field

Self-critique gives the biggest lift exactly where spotting a problem is easier than avoiding it first time — code that has to compile and pass tests is the perfect case, because the critic has an objective signal to point at. Where it disappoints is subjective prose: a model critiquing its own tone often just reshuffles words. Two things make it work in practice: cap the loop at one or two rounds (returns diminish fast and cost climbs), and where I can, use a separate, cheaper model as the critic so it isn't grading its own homework with the same blind spots. Give the critic something concrete to check, or skip it.

→ Quer isso na sua stack?

Custom Claude Code AI Agents & Workflows

Stop doing the repetitive, multi-step work that eats your team's day. This service delivers a working AI agent system that handles tasks like lead processing, data enrichment, content pipelines, and r...

Ver como posso ajudar