ReAct Pattern: Reasoning + Acting in AI Agents

ReAct interleaves a Thought, an Action, and an Observation at each step. The "talk to yourself, then do, then look" loop powers most modern agents.

Apr 29, 2026 · 4 min read


The detective analogy

A detective at a crime scene does not silently scan the room and announce a verdict. They think aloud — "the mud on the carpet means…" — act ("let me check the back door"), observe the result ("locked from inside"), and feed it into the next thought. The pattern is so natural we don't even name it.

ReAct (Reasoning + Acting) names it. Each agent step is a Thought → Action → Observation triple. Reasoning conditions the action; the action's result conditions the next reasoning. The detective's logic, in code.

The shape of one step

Thought:     I should check the user's recent orders to see why they're asking.
Action:      get_orders(user_id="u_42", limit=10)
Observation: 3 orders, latest "shipped" yesterday, tracking failed to update.
Thought:     The tracking is the issue. Let me look up the carrier next.
Action:      get_tracking(order_id="o_991")
...

Repeat until the model emits a final answer (or hits a stop condition). The agent's transcript is literally readable as a thought process — which makes debugging dramatically easier than opaque function-call sequences.
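
The loop itself is small. Here is a minimal sketch of it, assuming a scripted stand-in (fake_model) in place of a real model call and two hypothetical tools whose names mirror the transcript above; the return values are invented for illustration.

```python
from typing import Callable, Optional

# Hypothetical tools: names mirror the transcript, return values are made up.
def get_orders(user_id: str, limit: int) -> str:
    return '3 orders, latest "shipped" yesterday, tracking failed to update.'

def get_tracking(order_id: str) -> str:
    return "Carrier has not scanned the parcel since pickup."

TOOLS: dict[str, Callable[..., str]] = {
    "get_orders": get_orders,
    "get_tracking": get_tracking,
}

# Stand-in for a model: one scripted step per call (an assumption for the demo).
SCRIPT = [
    {"thought": "I should check the user's recent orders.",
     "action": ("get_orders", {"user_id": "u_42", "limit": 10})},
    {"thought": "The tracking is the issue. Let me look up the carrier next.",
     "action": ("get_tracking", {"order_id": "o_991"})},
    {"thought": "I have enough to answer.",
     "final": "Your order shipped, but the carrier has not updated tracking."},
]

def fake_model(transcript: list[str]) -> dict:
    # The number of thoughts already written tells us which step comes next.
    step = sum(1 for line in transcript if line.startswith("Thought:"))
    return SCRIPT[step]

def react_loop(model, max_steps: int = 8) -> tuple[Optional[str], list[str]]:
    transcript: list[str] = []
    for _ in range(max_steps):
        step = model(transcript)
        transcript.append(f"Thought: {step['thought']}")
        if "final" in step:  # stop condition: the model emitted a final answer
            transcript.append(f"Final Answer: {step['final']}")
            return step["final"], transcript
        name, args = step["action"]
        transcript.append(f"Action: {name}({args})")
        # The tool result becomes the observation that feeds the next thought.
        transcript.append(f"Observation: {TOOLS[name](**args)}")
    return None, transcript  # hard ceiling reached without an answer

answer, transcript = react_loop(fake_model)
print("\n".join(transcript))
```

Swapping fake_model for a real model call is the only change the loop needs; the transcript it builds is the same readable log described above.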

Why ReAct beats both extremes

Pure reasoning (CoT only) — model "thinks" but cannot fetch facts. Hallucinations explode on knowledge-heavy tasks.
Pure acting (tool calls without reasoning) — model fires tools blindly, picks badly, never reflects on results.
ReAct — reasoning grounds the next action; observation grounds the next reasoning. The two halves correct each other.

What to put in each slot

Thought

Short, plan-shaped, in plain language.
One sentence is often enough. Long thoughts are a tax on tokens and latency.
Should reference what the next action will be — "I'll do X to find Y."

Action

A structured tool call. Function name + typed arguments.
One action per step. Multiple parallel actions are a different pattern (parallel ReAct or fan-out).
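
One way to make "function name + typed arguments" concrete is to validate each action against the tool's signature before running it. This is a sketch under assumptions: Action, REGISTRY, validate and run are illustrative names, get_orders is a hypothetical tool, and only plain annotations such as str or int are checked.

```python
import inspect
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class Action:
    """One structured tool call: a function name plus keyword arguments."""
    name: str
    args: dict[str, Any] = field(default_factory=dict)

def get_orders(user_id: str, limit: int) -> str:
    # Hypothetical tool; a real one would query an order service.
    return f"{limit} most recent orders for {user_id}"

REGISTRY: dict[str, Callable[..., str]] = {"get_orders": get_orders}

def validate(action: Action) -> list[str]:
    """Return a list of problems; an empty list means the call can run."""
    tool = REGISTRY.get(action.name)
    if tool is None:
        return [f"unknown tool: {action.name}"]
    params = inspect.signature(tool).parameters
    problems = [f"unexpected argument: {k}" for k in action.args if k not in params]
    for name, param in params.items():
        if name not in action.args:
            if param.default is inspect.Parameter.empty:
                problems.append(f"missing argument: {name}")
            continue
        expected, value = param.annotation, action.args[name]
        # Only plain classes are checked; unannotated or generic types are skipped.
        if (expected is not inspect.Parameter.empty
                and isinstance(expected, type)
                and not isinstance(value, expected)):
            problems.append(
                f"{name} should be {expected.__name__}, got {type(value).__name__}"
            )
    return problems

def run(action: Action) -> str:
    problems = validate(action)
    if problems:
        # A rejected call is returned as text so the model can read it and retry.
        return "Invalid action: " + "; ".join(problems)
    return REGISTRY[action.name](**action.args)

print(run(Action("get_orders", {"user_id": "u_42", "limit": "ten"})))
```

Returning the validation problems as text, rather than raising, lets a malformed call flow back into the loop like any other tool result.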

Observation

The tool's output, possibly truncated to fit context.
Big outputs (a 10 MB log file, say) need summarisation or pagination; never paste them in raw.
Errors are observations too — a 500 from an API gives the model real information.
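
The observation rules above can be sketched as one small wrapper. The function name to_observation, the 500-character default, and the two demo tools are assumptions for illustration; character-based truncation is the simplest option, and a real system might summarise or paginate instead.

```python
def to_observation(tool, args: dict, max_chars: int = 500) -> str:
    """Run a tool and turn whatever happens into text the model can read."""
    try:
        text = str(tool(**args))
    except Exception as exc:
        # Errors are observations too: report them instead of crashing the loop.
        text = f"ERROR {type(exc).__name__}: {exc}"
    if len(text) > max_chars:
        omitted = len(text) - max_chars
        text = text[:max_chars] + f" ... [truncated, {omitted} more characters]"
    return text

# Hypothetical tools used only to exercise the wrapper.
def flaky_api(path: str) -> str:
    raise RuntimeError("500 Internal Server Error")

def big_log() -> str:
    return "x" * 2000

print(to_observation(flaky_api, {"path": "/orders"}))
print(len(to_observation(big_log, {}, max_chars=100)))
```

Saying how much was cut gives the model a reason to ask for the next page rather than assume it has seen everything.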

Stopping conditions

Final answer emitted — the model's prose says "Final Answer: …".
Confidence threshold reached — your code looks at the latest thought and decides "good enough."
Max steps hit — hard ceiling. Always set this. 8–12 is sane for most tasks.
Loop detected — same action twice in a row? Break the loop deliberately.
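
These four checks fit in a single function that the runtime calls after every step. This is a sketch, not a standard API: Step, stop_reason and the good_enough predicate are hypothetical names, and "same action" is taken to mean an identical serialized tool call.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Step:
    thought: str
    action: Optional[str] = None        # serialized call, e.g. 'get_tracking(order_id="o_991")'
    final_answer: Optional[str] = None

def stop_reason(
    history: list[Step],
    max_steps: int = 10,
    good_enough: Optional[Callable[[str], bool]] = None,
) -> Optional[str]:
    """Return why the loop should stop, or None to keep going."""
    if not history:
        return None
    last = history[-1]
    if last.final_answer is not None:
        return "final_answer"
    if good_enough is not None and good_enough(last.thought):
        return "confident"              # caller-supplied check on the latest thought
    if (len(history) >= 2 and last.action is not None
            and last.action == history[-2].action):
        return "loop_detected"          # same action twice in a row
    if len(history) >= max_steps:
        return "max_steps"              # hard ceiling
    return None

call = 'get_tracking(order_id="o_991")'
print(stop_reason([Step("look up carrier", call), Step("try again", call)]))
```

Returning a reason rather than a bare boolean makes it easy to log why an episode ended, or to respond differently to a loop than to a clean answer.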

Engineering ReAct in practice

Templates with clear delimiters. <thought>, <action>, <observation> tags or similar. Makes parsing trivial.
Truncate observations. Long tool outputs eat your context. Summarise once, store the full thing for audit.
Carry minimal state across steps. Don't re-paste the entire history if you can help it; use prompt caching for stable prefixes.
Separate the reasoning model from the tool runner. Two services, two responsibilities; easier to evolve, easier to debug.
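
To show how little parsing the tagged template needs, here is a sketch using a single regular expression. It assumes the model closes every tag and does not nest them; parse_step is an illustrative name, and if a tag repeats, the last occurrence wins.

```python
import re

# Matches <thought>...</thought>, <action>...</action>, <observation>...</observation>.
TAG = re.compile(r"<(thought|action|observation)>(.*?)</\1>", re.DOTALL)

def parse_step(text: str) -> dict[str, str]:
    """Pull the tagged slots out of one model response."""
    return {tag: body.strip() for tag, body in TAG.findall(text)}

reply = (
    "<thought>The tracking is the issue. Let me look up the carrier next.</thought>\n"
    '<action>get_tracking(order_id="o_991")</action>'
)
step = parse_step(reply)
print(step["action"])
```

A response with no recognisable tags parses to an empty dict, which the runtime can treat as a formatting error and feed back to the model.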

Variants and friends

ReAct + reflection — at the end of an episode, the model reviews what it did and writes a "lesson" stored for next time. Compounds quality over runs.
Plan-and-act — produce a multi-step plan first, then execute steps with smaller per-step thinking. Reduces drift on long tasks.
Multi-agent ReAct — different agents run their own ReAct loops; a supervisor stitches the results.

Where ReAct stumbles

Very long horizons. 50+ steps and the context becomes a sea of observations. Use planning + summarisation aggressively.
Ambiguous tools. "I don't know which tool to call" — solved by better tool descriptions, not more reasoning.
Repetitive thoughts. A model talking to itself in circles is a sign of poor tool design or an unclear goal; don't paper over it with a bigger context.
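
For the long-horizon case, one possible form of "summarise aggressively" is to keep only the most recent observations in full and shrink the older ones. The sketch below is illustrative: compact and first_line are hypothetical names, and first_line is a crude stand-in for what would more likely be a model-written summary.

```python
from typing import Callable

Entry = tuple[str, str]  # (kind, text); kind is "thought", "action" or "observation"

def first_line(text: str, limit: int = 80) -> str:
    """Stand-in summariser: keep the first line, capped at `limit` characters."""
    line = text.splitlines()[0] if text else ""
    return line[:limit]

def compact(
    transcript: list[Entry],
    keep_last: int = 2,
    summarise: Callable[[str], str] = first_line,
) -> list[Entry]:
    """Summarise every observation except the `keep_last` most recent ones."""
    positions = [i for i, (kind, _) in enumerate(transcript) if kind == "observation"]
    keep = set(positions[-keep_last:]) if keep_last > 0 else set()
    result: list[Entry] = []
    for i, (kind, text) in enumerate(transcript):
        if kind == "observation" and i not in keep:
            text = summarise(text)
        result.append((kind, text))
    return result

history = [
    ("thought", "look at the logs"),
    ("observation", "line A1\nline A2"),
    ("observation", "line B1\nline B2"),
    ("observation", "line C1\nline C2"),
]
print(compact(history, keep_last=1))
```

Thoughts and actions are left untouched here on the assumption that they are already short; only observations tend to grow without bound.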

In one line

ReAct is the reasoning-agent default. The transcript is your debug log; the loop is your runtime.

From the field

ReAct is the loop most agent frameworks are quietly running under the hood, so understanding it pays off when you debug one. The failure I hit most isn't bad reasoning — it's no progress: the agent reasons, acts, gets a confusing observation, and reasons in a circle. So I always cap the steps and add a nudge that forces forward motion or a graceful give-up, because an agent that spins is worse than one that stops and asks. The other lever is observation quality — a clean, summarised tool result keeps the next step on track; a raw data dump derails it. Good agents are mostly good plumbing around this loop.
