Structured Outputs and JSON Mode: Reliable LLM Responses

Stop parsing prose. Constrain the model's output to a schema and your downstream code stops guessing. The single biggest reliability lever in LLM apps.

Apr 29, 2026 · 4 min read

The customs-form analogy

At customs, you do not write a paragraph: you fill in a form. Name. Date of birth. Goods. Yes/No on agriculture. The agent processes a thousand travellers a day because the form makes their job mechanical.

A free-text LLM response is the paragraph. Structured output is the customs form. Your downstream code stops parsing and starts processing — the schema removes the guesswork on both sides.

What "structured output" actually means

You hand the model a schema (JSON Schema, Pydantic, Zod, etc.) and, in the strict modes, the API guarantees the response will validate against it. Modes you'll see:

JSON mode — output is some valid JSON, no schema enforced. Useful but weak.
Constrained / structured outputs — output validates against a specific schema. Token-level constraints during decoding ensure it. This is what you want.
Function calling output — same idea applied to tool arguments.

The implementation under the hood is constrained decoding: at every step, the decoder masks tokens that would break the schema's grammar. The model literally cannot emit a token that violates the schema, though a response cut off at the token limit can still be incomplete.
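The masking step can be sketched in a few lines. This is a toy illustration only: the vocabulary, the grammar table, and the `chatty_model` scoring function are all made up for the example, and real decoders work on sub-word tokens with a grammar compiled from the schema.

```python
import json
import math

# Toy vocabulary: each "token" is a whole fragment (real tokenizers are finer-grained).
VOCAB = ['{"status": ', '"approved"', '"rejected"', '"pending"', '}', 'Sure! ', '"maybe"']

# Hypothetical grammar for {"status": "approved" | "rejected" | "pending"},
# written as: state -> {allowed token: next state}.
GRAMMAR = {
    "start": {'{"status": ': "value"},
    "value": {'"approved"': "close", '"rejected"': "close", '"pending"': "close"},
    "close": {'}': "done"},
}

def constrained_decode(score_fn):
    """Greedy decoding where tokens the grammar forbids are masked to -inf."""
    state, out = "start", []
    while state != "done":
        allowed = GRAMMAR[state]
        scores = score_fn(out)  # one raw score per vocabulary entry
        masked = [s if tok in allowed else -math.inf for tok, s in zip(VOCAB, scores)]
        best = VOCAB[masked.index(max(masked))]
        out.append(best)
        state = allowed[best]
    return "".join(out)

def chatty_model(prefix):
    # Stand-in for a model that would rather write prose or an off-schema value.
    return [0.1, 0.2, 0.3, 0.1, 0.1, 0.9, 0.8]

result = constrained_decode(chatty_model)
parsed = json.loads(result)  # always parses, because only grammar-legal tokens survive
print(parsed)
```

Even though the stand-in model scores "Sure! " and "maybe" highest, neither can be selected: the mask removes them before the choice is made.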

What you stop doing

Regex parsing model output (a known-bad idea that quietly costs hours of debugging).
Asking "please respond in JSON" and hoping (the model usually obeys, occasionally adds prose, and now and then leaves a trailing comma).
Retrying on parse failures with prompt scolding ("you forgot the closing brace") — slow and costs money.
Trying to fix malformed JSON with jsonrepair (works most of the time, fails when you most need it).
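To make the failure modes above concrete, here is a small sketch using Python's standard `json` module. The three reply strings are invented examples of what an unconstrained prompt might return, not captured model output.

```python
import json

# Hypothetical raw replies of the kind an unconstrained prompt can produce.
replies = [
    '{"label": "spam", "confidence": 0.92}',       # clean
    'Sure! Here is the JSON:\n{"label": "spam"}',  # prose wrapper
    '{"label": "spam", "confidence": 0.92,}',      # trailing comma
]

def try_parse(text):
    """Return the parsed object, or None if the reply is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

parsed = [try_parse(r) for r in replies]
failures = sum(p is None for p in parsed)
print(f"{failures} of {len(replies)} replies failed to parse")
```

Only the first reply survives a strict parser. Every workaround for the other two (regex, repair, retry) is code you maintain instead of a guarantee you get.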

What you start doing

Define a schema close to the data shape your app needs.
Validate on the way out (defense in depth, even if the API guarantees it).
Use discriminated unions for branching outputs ("either an answer or a clarifying_question").
Stream and parse incrementally if latency matters (most APIs support partial-JSON streaming).
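A discriminated union combined with validation on the way out might look like the sketch below. The discriminator field name `type` and the two variant shapes are assumptions for illustration; in practice a schema library such as Pydantic or Zod would typically do this work.

```python
import json
from dataclasses import dataclass
from typing import Union

@dataclass
class Answer:
    text: str

@dataclass
class ClarifyingQuestion:
    question: str

def parse_reply(raw: str) -> Union[Answer, ClarifyingQuestion]:
    """Validate a reply and dispatch on its discriminator field ("type", an assumed name)."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    kind = data.get("type")
    if kind == "answer" and isinstance(data.get("text"), str):
        return Answer(text=data["text"])
    if kind == "clarifying_question" and isinstance(data.get("question"), str):
        return ClarifyingQuestion(question=data["question"])
    raise ValueError(f"reply does not match either variant: {data!r}")

reply = parse_reply('{"type": "clarifying_question", "question": "Which region?"}')
print(reply)
```

Downstream code then branches on the Python type rather than inspecting strings, and anything that matches neither variant fails loudly at the boundary.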

Schema design tips

Be specific. status: "approved" | "rejected" | "pending" beats status: string.
Bound numbers. score: number, minimum: 0, maximum: 1 — saves bug reports.
Keep it shallow. Three levels of nesting beats nine. The model is more reliable when the schema is digestible.
Use descriptions. Each field should have a one-sentence description; the model reads them.
Include enums for closed sets. Free-text where enums would do is a future bug.
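Put together, the tips above could produce a schema like this one, written as a JSON Schema in a Python dict. The review-decision use case and field names are hypothetical, and the `violations` helper is a deliberately tiny checker that handles only the keywords used here; a real project would use a full validator.

```python
# A JSON Schema for a hypothetical review decision: shallow, specific, bounded, described.
REVIEW_SCHEMA = {
    "type": "object",
    "properties": {
        "status": {
            "type": "string",
            "enum": ["approved", "rejected", "pending"],
            "description": "Final decision on the request.",
        },
        "score": {
            "type": "number",
            "minimum": 0,
            "maximum": 1,
            "description": "Confidence in the decision, from 0 to 1.",
        },
    },
    "required": ["status", "score"],
    "additionalProperties": False,
}

def violations(obj, schema=REVIEW_SCHEMA):
    """Tiny checker covering only the keywords used above; not a full validator."""
    props = schema["properties"]
    problems = [f"missing field: {k}" for k in schema["required"] if k not in obj]
    problems += [f"unexpected field: {k}" for k in obj if k not in props]
    for name, rule in props.items():
        if name not in obj:
            continue
        value = obj[name]
        if "enum" in rule and value not in rule["enum"]:
            problems.append(f"{name}: {value!r} not in {rule['enum']}")
        if rule["type"] == "number":
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{name}: not a number")
            elif not rule["minimum"] <= value <= rule["maximum"]:
                problems.append(f"{name}: {value} out of range")
    return problems

print(violations({"status": "maybe", "score": 1.5}))
```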

Common pitfalls

Schemas that are too rigid. Real-world data varies, and a schema with no room for the messy case forces the model down paths it has no graceful answer for. Provide an unknown enum value or an optional notes field for the messy edge.
Free-text inside structured output. A description: string field still hallucinates. Constrain what you can; eval what you can't.
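Confusing function calling with structured outputs. They share the schema mechanism, but the use case is different: function calling shapes the arguments of a tool call, while structured outputs shape the response itself.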
Massive schemas. A 1000-field schema is a context burner and a model confuser. Split the call.
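The escape hatch suggested for rigid schemas can be handled as an ordinary branch. In this sketch the category names, the `notes` field, and the routing labels are all hypothetical.

```python
import json

# "unknown" is the escape hatch for inputs that fit no real category.
CATEGORIES = {"invoice", "receipt", "contract", "unknown"}

def route(raw: str) -> str:
    """Decide what to do with an extraction result; all names are hypothetical."""
    data = json.loads(raw)
    category = data["category"]
    if category not in CATEGORIES:
        raise ValueError(f"off-schema category: {category!r}")
    if category == "unknown":
        # Failure arrives as a value to branch on, not as a parse error.
        return f"human_review: {data.get('notes', 'no notes given')}"
    return f"auto_process: {category}"

print(route('{"category": "unknown", "notes": "handwritten, illegible total"}'))
```

Without the `unknown` value, the model would have to pick one of the real categories for a document that fits none of them, and the error would surface much later.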

Where structured outputs shine

Form filling, data extraction. Resume → JSON. Receipt → JSON. PDF → JSON.
Classification with rationale. {label: "spam", confidence: 0.92, reason: "..." }.
Multi-output decisions. Score, category, suggested action, all returned together in one validated object.
Pipelines. Output of one model is input to the next; structure prevents drift.
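As a rough sketch of the classification and pipeline cases, the snippet below parses a classification result into a typed value and passes that value, not the raw text, to the next stage. The `Classification` type, the 0.9 threshold, and the action names are assumptions for the example.

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class Classification:
    label: str
    confidence: float
    reason: str

def parse_classification(raw: str) -> Classification:
    """Stage 1 boundary: turn the model's JSON into a typed value."""
    data = json.loads(raw)
    result = Classification(str(data["label"]), float(data["confidence"]), str(data["reason"]))
    if not 0.0 <= result.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return result

def suggest_action(c: Classification, threshold: float = 0.9) -> str:
    """Stage 2: consumes the typed value, never the raw text."""
    if c.label == "spam" and c.confidence >= threshold:
        return "quarantine"
    return "deliver"

stage1 = parse_classification('{"label": "spam", "confidence": 0.92, "reason": "bulk sender"}')
action = suggest_action(stage1)
print(action)
```

Because stage 2 only ever sees a `Classification`, a change in how stage 1 phrases its output cannot silently change what stage 2 does.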

When not to use them

Open-ended creative writing. A poem in a text field defeats the point.
Conversational chat where flow matters. Forcing structure kills feel.
When schema brittleness will hurt more than free-text errors will — rare, but real.

In one line

Free-text output is a demo. Structured output is a product. The schema is the API.

From the field

Structured outputs turned LLM responses from something I parsed and prayed over into something I could rely on, but schema design matters more than people expect. Flat beats deeply nested — the model fills a shallow schema far more reliably — and enums beat free-text anywhere you have known options, because they remove a whole class of "creative" answers. I also keep a field for the model to signal it couldn't comply, so failure is a value I branch on, not a parse error at 3am. If schema-constrained decoding is available, use it over plain JSON mode; "usually valid JSON" is still a pager waiting to go off.
