Hallucinations: why AI lies to you, politely

The model is not lying. It is doing the only thing it knows how to do — guessing — when it should be saying 'I don't know.'

You have probably seen it. You ask an AI a question, it gives you a confident answer, and the answer is completely, perfectly, hilariously wrong. A made-up court case. A book that does not exist. A quote from a famous scientist who never said it.

That is a hallucination. The word is misleading: the model is not having a vision. It is doing exactly what it was trained to do. The problem is that it was trained to generate plausible-sounding text, not to be correct.

Why it happens

Remember from part 1 — the model is autocomplete. When you ask it "what was the verdict in Smith v. Jones, 2008?", it does not have a database of court cases. It has a sense of what court verdicts tend to sound like. So it generates a court verdict that sounds right. Fluent, confident, formatted correctly. It is also entirely fictional.
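To make the autocomplete idea concrete, here is a toy sketch. It is not how a real language model is built (those are vastly larger and work on learned representations, not word-pair lookups), and the four "training" sentences are invented for illustration. The point it shows is the same, though: the generator only knows which word tends to follow which, so everything it produces sounds like a verdict and nothing in it checks whether the verdict happened.

```python
import random
from collections import defaultdict

# Tiny invented corpus. These sentences are illustrative, not real cases.
CORPUS = [
    "the court ruled in favor of the plaintiff",
    "the court ruled in favor of the defendant",
    "the jury found in favor of the plaintiff",
    "the court dismissed the claim against the defendant",
]

def build_bigrams(sentences):
    """Map each word to the list of words that followed it in the corpus."""
    followers = defaultdict(list)
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for current, nxt in zip(words, words[1:]):
            followers[current].append(nxt)
    return followers

def autocomplete(followers, rng, max_words=20):
    """Generate text by repeatedly picking a word that has followed the last one."""
    word, output = "<s>", []
    for _ in range(max_words):
        word = rng.choice(followers[word])
        if word == "</s>":
            break
        output.append(word)
    return " ".join(output)

followers = build_bigrams(CORPUS)
# Every adjacent word pair in the output was seen in training,
# but the sentence as a whole may describe a ruling that never existed.
print(autocomplete(followers, random.Random(0)))
```

Note what is missing: there is no lookup step, no record of cases, and no way for the function to return "no such case."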

The model has no built-in concept of "I do not know this." Nothing in next-word prediction distinguishes recalling a fact from inventing one.

A model that has been trained to always give an answer will always give an answer.
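A small sketch of why "always gives an answer" falls out of the mechanics. Assume, as a simplification, that the model scores a handful of candidate answers and the scores are turned into probabilities with a softmax. The candidate list, the scores, and the 0.6 threshold are all hypothetical. Picking the top option always returns something, even when the scores are nearly tied. Saying "I don't know" has to be added as a separate rule.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pick_answer(candidates, scores):
    """Always returns the top candidate, however weak its lead."""
    probs = softmax(scores)
    best = max(range(len(candidates)), key=lambda i: probs[i])
    return candidates[best], probs[best]

def pick_or_abstain(candidates, scores, threshold=0.6):
    """Hypothetical extra rule: refuse when the top probability is low."""
    answer, p = pick_answer(candidates, scores)
    return answer if p >= threshold else "I don't know"

candidates = ["plaintiff won", "defendant won", "case dismissed"]
confident = [4.0, 0.5, 0.2]    # one option clearly ahead
guessing = [1.01, 1.0, 0.99]   # essentially a three-way tie

print(pick_answer(candidates, confident)[0])
print(pick_answer(candidates, guessing)[0])   # same answer, on almost no evidence
print(pick_or_abstain(candidates, guessing))
```

Both calls to pick_answer return the same text, and the reader of that text cannot see that one of them was a coin flip. The threshold version only helps if the probabilities are trustworthy, which is a separate problem.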

What helps (a little)

Retrieval. Give the model real documents to look at, and tell it to quote from those. This is what search-augmented chatbots do.
Calibration. Some newer models are trained to express uncertainty. They are still not great at it.
Your skepticism. Treat every specific factual claim — names, dates, numbers, quotes — as a hypothesis to verify, not a fact to trust.
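The retrieval and skepticism points above can be combined in a rough sketch. This assumes you already have the source documents as plain strings and will send the prompt to some chatbot yourself; no real API is called here. The function names, the prompt wording, and the sample text are all made up for illustration. The first function builds a prompt that asks for word-for-word quotes, and the second checks whether each quote in the answer really appears in a source.

```python
import re

def build_grounded_prompt(question, documents):
    """Assemble a prompt that asks the model to quote from the given sources."""
    numbered = "\n\n".join(f"[{i}] {doc}" for i, doc in enumerate(documents, start=1))
    return (
        "Answer using only the sources below. "
        "Put every supporting passage in double quotes, copied word for word. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{numbered}\n\nQuestion: {question}"
    )

def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())

def unsupported_quotes(answer, documents):
    """Return quoted passages in the answer that appear in none of the documents."""
    sources = [normalize(d) for d in documents]
    quotes = re.findall(r'"([^"]+)"', answer)
    return [q for q in quotes if not any(normalize(q) in s for s in sources)]

# Invented sample data, standing in for real documents and a real model reply.
documents = ["The appeal was denied. Costs were awarded to the respondent."]
answer = 'The source says "the appeal was denied" and "the judge praised the defense".'

print(build_grounded_prompt("What happened to the appeal?", documents))
print(unsupported_quotes(answer, documents))
```

A check like this only catches quotes that are not in the sources. It does not catch a real quote used to support the wrong conclusion, so it narrows what you need to verify by hand rather than replacing the verification.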

What does not help

Asking the model "are you sure?" It will say yes. Then it will say no if you push, because either reply is just a plausible continuation of the conversation. It is autocomplete.
Trusting it more because the answer was confidently formatted. Confident formatting is the easy part.

The bottom line

Hallucinations are not a bug. They are the default behavior of a system that was trained to always sound right, with no separate training to know when it does not know. Plan your usage around this. The model is a draft generator, not a fact source.

Next, the final part: how to actually use AI day-to-day, without the wheels coming off.