
Hallucinations: Why LLMs Make Stuff Up Confidently

Hallucinations are not bugs — they are the model doing exactly what it was trained to do. Plausibility is the loss; truth is not. Understand the trap, then engineer around it.


The improv-actor analogy

An improv actor on stage is rewarded for making the scene flow. Pause too long, look unsure, break character — that kills the show. So they confidently invent a name, an address, a backstory. It does not have to be true. It has to be believable.

LLMs are trained in a similar way. The pretraining objective is to predict the next token of the training text, so what gets rewarded is plausibility. Truth does not appear in the loss. When the model lacks the relevant knowledge, the training pressure still says "say something that fits." That something is a hallucination.

Two kinds of hallucination

  • Intrinsic: the answer contradicts the prompt or the model's own context (it misstates a detail you literally just gave it).
  • Extrinsic: the answer is unmoored from any reference at all (a citation that does not exist, a person who does not exist, a function that is not in the API).

Intrinsic ones are usually fixable with better prompting. Extrinsic ones often need RAG or tool use.

Why models do not "know they don't know"

Calibrating uncertainty is hard. The model produces token probabilities, which measure how likely a token is to come next, not how likely a claim is to be true. A model can be 99% sure of a token that is part of a confidently wrong sentence. There is no reliable built-in "hold on, I'm guessing" signal, so that signal has to be engineered in.
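The gap between token probability and factual confidence can be made concrete with a little arithmetic. The sketch below is illustrative only: the log-probabilities are invented, not taken from any real model, and the two "sentences" are hypothetical. It shows that a sequence score rewards fluent, predictable wording and carries no information about whether the claim is true.

```python
import math

def sequence_probability(token_logprobs):
    """Joint probability of the sampled tokens: exp of the summed log-probs."""
    return math.exp(sum(token_logprobs))

# Hypothetical values: a fluent but false sentence (say, a made-up citation)
# where every token was an easy, high-probability continuation.
fluent_but_false = [math.log(0.99)] * 5

# Hypothetical values: a true sentence phrased in an unusual way,
# so each token was a less likely continuation.
true_but_awkward = [math.log(0.60)] * 5

print(round(sequence_probability(fluent_but_false), 3))  # 0.951
print(round(sequence_probability(true_but_awkward), 3))  # 0.078
```

Ranking answers by this score alone would prefer the false sentence, which is why a separate factuality signal has to be built around the model.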

The five engineering moves

  1. Ground in retrieval (RAG). If the model can cite a real chunk, it is less likely to invent one.
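  2. Force structured output with verification. Schema-constrained outputs surface many structural hallucinations, such as invented fields or values of the wrong type, as parse or validation errors before users see them.
  3. Self-consistency / multiple samples. Generate N answers; if they disagree wildly, flag it. Simple and often effective, at roughly N times the inference cost.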
  4. Tool use for facts. Calculator for math. Database for lookups. Code execution for code. Hallucinations rarely survive a real tool.
  5. Refusal training. Teach the model to say "I do not know" instead of guessing. Hardest because it must be calibrated — refuse too often and the model becomes useless.
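The self-consistency move (item 3) can be sketched in a few lines. This is a minimal sketch under stated assumptions: the answers are passed in as plain strings (in practice they would come from N sampled calls to a model), and `normalize`, `self_consistency`, and the `min_agreement` threshold of 0.6 are hypothetical names and values chosen for illustration.

```python
from collections import Counter

def normalize(answer: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing period."""
    return " ".join(answer.lower().split()).rstrip(".")

def self_consistency(answers, min_agreement=0.6):
    """Majority-vote over sampled answers and flag low agreement."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(normalize(a) for a in answers)
    top_answer, top_count = counts.most_common(1)[0]
    agreement = top_count / len(answers)
    return {
        "answer": top_answer,
        "agreement": agreement,
        "flagged": agreement < min_agreement,  # True means "do not trust this"
    }

# Samples that mostly agree: 4 of 5 normalize to "paris".
stable = self_consistency(["Paris", "paris.", "Paris", "Lyon", "Paris"])
print(stable)  # {'answer': 'paris', 'agreement': 0.8, 'flagged': False}

# Samples that scatter: the most common answer appears only 2 times out of 5.
shaky = self_consistency(["1912", "1915", "1921", "1912", "1908"])
print(shaky)  # {'answer': '1912', 'agreement': 0.4, 'flagged': True}
```

Agreement is a warning signal, not proof: a model that is consistently wrong will pass this check, so it works best alongside retrieval or tool-based verification.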

What does not fix hallucinations

  • More parameters. Bigger models tend to hallucinate less, but the rate never reaches zero. The qualitative failure mode persists.
  • Lower temperature. Reduces variance, not factuality. A confident wrong answer at temperature 0 is still wrong.
  • Tougher system prompts. "Do not hallucinate" is the AI equivalent of "do not be wrong." It does little.
  • Bigger context. Stuffing more text into a prompt does not change the loss objective. Models can still confabulate against their own context.
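The temperature point can be checked with a small worked example. The sketch below applies a standard temperature-scaled softmax to three invented logits; the candidate tokens and their values are hypothetical. Lowering the temperature sharpens the distribution around whichever token the model already prefers, so if that token is wrong, the model simply becomes more certain of the wrong answer.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Temperature-scaled softmax; temperature must be positive."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token candidates; the model's favourite happens to be wrong.
candidates = ["1912 (wrong)", "1915 (right)", "1921 (wrong)"]
logits = [2.0, 1.0, 0.5]

for t in (1.0, 0.5, 0.1):
    probs = softmax_with_temperature(logits, t)
    best = candidates[probs.index(max(probs))]
    print(f"T={t}: top={best}, p={max(probs):.3f}")
```

At every temperature the top candidate is the same wrong token; only its probability grows as the temperature drops.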

Measurement is the unsung hero

You cannot fix what you do not measure. Build a hallucination eval specific to your domain:

  • Curate a set of factual prompts where you know the correct answer.
  • Score: exact-match for factual claims, faithfulness against the source for RAG, schema-validity for structured outputs.
  • Track per release. Hallucination rate is a metric like latency — only valuable when continuously monitored.
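A domain eval along these lines can start very small. The sketch below is one possible shape, not a definitive harness: `exact_match`, `schema_valid`, `hallucination_rate`, the four test cases, and the stub model are all hypothetical. It covers exact-match and schema-validity scoring; faithfulness scoring for RAG usually needs a judge model or an entailment check and is left out here.

```python
import json

def exact_match(prediction: str, expected: str) -> bool:
    """Case-insensitive, whitespace-trimmed string comparison."""
    return prediction.strip().lower() == expected.strip().lower()

def schema_valid(raw: str, required: dict) -> bool:
    """True if raw is a JSON object whose required fields have the expected types."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    return all(isinstance(data.get(key), typ) for key, typ in required.items())

def hallucination_rate(cases, predict) -> float:
    """Fraction of cases whose answer fails exact match (a crude proxy)."""
    wrong = sum(
        1 for case in cases
        if not exact_match(predict(case["prompt"]), case["expected"])
    )
    return wrong / len(cases)

# Hypothetical curated set with known answers.
cases = [
    {"prompt": "Capital of France?", "expected": "Paris"},
    {"prompt": "Chemical symbol for gold?", "expected": "Au"},
    {"prompt": "Number of bits in a byte?", "expected": "8"},
    {"prompt": "Largest planet in the Solar System?", "expected": "Jupiter"},
]

# Stub standing in for a real model call; it gets the gold question wrong.
stub_answers = {
    "Capital of France?": "Paris",
    "Chemical symbol for gold?": "Ag",
    "Number of bits in a byte?": "8",
    "Largest planet in the Solar System?": "Jupiter",
}
rate = hallucination_rate(cases, lambda prompt: stub_answers[prompt])
print(rate)  # 0.25

schema = {"answer": str, "source_id": int}
print(schema_valid('{"answer": "Paris", "source_id": 3}', schema))  # True
print(schema_valid('{"answer": "Paris"}', schema))                  # False
```

Running the same cases on every release and logging `rate` gives the continuously monitored number described above.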

A useful mental model

Treat the LLM as a confident, fluent intern with no internet access and a fading memory. They will absolutely produce an answer if you ask. Making sure that answer is true is your job, and you engineer for it through retrieval, tools, structured outputs, and evals. The intern is not lying. They are doing the only thing they were trained to do.

Engr Mejba Ahmed
