Hallucination

Bottom line
An AI hallucination is text from a large language model that sounds fluent and confident but is factually wrong or unsupported by any source. Ji et al. (2023) split them into intrinsic hallucinations, which contradict the input context, and extrinsic hallucinations, which invent information that was never in any input.

What an AI hallucination actually is

A hallucination is text generated by a large language model that reads as confident and fluent but is factually wrong, fabricated, or unsupported by any verifiable source. The model is not lying in any intentional sense. It is doing the only thing it knows how to do, which is predict the next plausible-sounding token. When plausibility and truth diverge, the model picks plausibility.

The term gets used loosely to mean "the AI got something wrong," but a clean definition matters. A hallucination is specifically a confident-sounding output presented as fact when the model had no grounded reason to produce it. When a retrieval pipeline returns the wrong passage and the model faithfully summarizes it, the model is not really hallucinating; it is being mis-fed. A model that is given the correct passage and then invents a quote or a statistic that is not in that passage is hallucinating in the strict sense.

This distinction matters because the mitigations differ. Bad retrieval is fixed by better retrieval. True hallucination is reduced by controls on the model itself, such as citation grounding, refusal prompts, and verifiers.

The two main types: intrinsic and extrinsic

The Ji et al. (2023) survey on hallucination in natural language generation popularized the taxonomy that most of the field now uses. Both types are worth knowing because they fail in different ways and respond to different defenses.

Intrinsic hallucination is when the generated output contradicts the input context. You hand the model a source passage that says "the refund window is 30 days," and the model produces "the refund window is 60 days." The truth was sitting right there in the prompt. The model overrode it with something more familiar or more fluent. Intrinsic hallucinations are particularly damaging for retrieval-augmented generation systems because they break the core promise: if the retrieved passage is correct but the answer contradicts it, the whole grounding chain has failed.

Extrinsic hallucination is when the generated output introduces information that was not in any input at all. Examples include fabricated citations, invented case law, fake URLs, made-up quotes from real people, and statistics that do not exist anywhere. Extrinsic hallucinations are the ones that make headlines, partly because they are the easiest to spot once a human checks the references.

Most production chatbot failures are a mix. A user asks about a feature. The retrieval layer surfaces a partially relevant doc. The model fills the gap with extrinsic invention while contradicting the small piece that was correctly retrieved.

Why hallucinations happen for AI chatbots

The root cause is the training objective itself. A language model is rewarded for producing text that looks like text from the training distribution. "I do not know" is statistically rare in the training data. Confident, specific, helpful-sounding answers are common. So the model learns to sound confident and specific even when it should not.

There is no fact-checking layer inside the model. The weights encode patterns of co-occurrence, not a truth database. When the model writes "the CEO of Acme Corp is Jane Smith," nothing internal verified that claim. The sentence is produced because, given the preceding tokens, each of its tokens was a high-probability continuation.

A second contributor is the long-tail problem. Common facts repeated thousands of times in pretraining tend to come out reliably. Rare facts seen once or twice get blended with similar-shaped facts. This is why hallucinations cluster around obscure entities, specific dates, exact numbers, and citation strings, the parts of an answer that demand precision the model cannot guarantee.

Decoding settings amplify this. Higher temperature gives more creative phrasing but more variance in factual claims. Long-form generation gives more opportunities to drift. Anything that asks the model to invent structure (a list of five sources, three case studies, four objections) creates pressure to confabulate to meet the requested shape.
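
The effect of temperature can be sketched with a toy next-token distribution. The candidate tokens and logits below are invented for illustration and do not come from any real model; the function is a plain temperature-scaled softmax.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for the token after "The refund window is ... days".
candidates = ["30", "60", "90"]
logits = [3.0, 1.5, 0.5]

for t in (0.2, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, {c: round(p, 3) for c, p in zip(candidates, probs)})
```

At the lowest temperature nearly all of the probability sits on the top candidate. As temperature rises, the lower-scored alternatives take a larger share, so a sampled answer is more likely to land on a wrong number.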

How to reduce hallucinations: RAG, refusal, citations

You cannot eliminate hallucinations from a generative model. You can stack mitigations until the residual rate is tolerable for the use case.

Retrieval-augmented generation. Instead of relying on the model's parametric memory, fetch relevant passages from a trusted knowledge base and ask the model to answer using those passages. RAG addresses extrinsic hallucination directly: if the answer must come from the retrieved chunk, there is less room to invent. RAG does not stop intrinsic hallucination on its own, because the model can still contradict a correct passage, which is why retrieval alone is not enough.
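
A minimal sketch of the retrieve-then-prompt flow, under loud assumptions: the retriever here ranks passages by simple word overlap as a stand-in for a real embedding search, and the knowledge base, function names, and prompt wording are all hypothetical.

```python
import re

def tokenize(text):
    """Lowercase word tokens; a crude stand-in for a real embedding model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, passages, top_k=2):
    """Rank passages by word overlap with the question (toy retriever)."""
    q = tokenize(question)
    scored = [(len(q & tokenize(p)), i) for i, p in enumerate(passages)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in scored[:top_k] if score > 0]

def build_grounded_prompt(question, passages, hit_ids):
    """Number each retrieved passage so the model can refer to it."""
    context = "\n".join(
        f"[{n}] {passages[i]}" for n, i in enumerate(hit_ids, start=1)
    )
    return (
        "Answer using only the numbered passages below.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Hypothetical knowledge base for illustration only.
knowledge_base = [
    "The refund window is 30 days from the date of purchase.",
    "Shipping takes 3 to 5 business days within the country.",
    "Support is available by email on weekdays.",
]
question = "How long is the refund window?"
hits = retrieve(question, knowledge_base)
print(build_grounded_prompt(question, knowledge_base, hits))
```

Because the toy retriever does not filter common words, it also pulls in a weakly related passage. That is the kind of partially relevant context a model can end up filling gaps around.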

Citation grounding. Require the model to attach an inline reference to each substantive claim, then validate that those references actually point to passages that support the claim. Done well, this turns the model into a "quote and link" engine rather than a freestyle writer. See the separate citation grounding entry for the implementation details.
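
The structural half of citation validation can be sketched as follows. This assumes citations appear as bracketed numbers such as [1] before the sentence's closing punctuation, which is a convention chosen for this example. It checks only that every sentence carries a citation and that each number resolves to a retrieved passage; deciding whether the passage actually supports the claim needs a stronger check.

```python
import re

def check_citations(answer, passages):
    """Return a list of problems found in the answer's [n] citations.

    passages maps citation number -> passage text.
    """
    problems = []
    # Naive sentence split: ., ! or ? followed by whitespace.
    sentences = [
        s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()
    ]
    for sentence in sentences:
        cited = [int(n) for n in re.findall(r"\[(\d+)\]", sentence)]
        if not cited:
            problems.append(f"uncited claim: {sentence}")
            continue
        for n in cited:
            if n not in passages:
                problems.append(f"citation [{n}] does not exist: {sentence}")
    return problems

# Hypothetical retrieved passage and model answer.
retrieved = {1: "The refund window is 30 days from the date of purchase."}
draft = (
    "The refund window is 30 days [1]. "
    "Refunds are paid in cash [4]. "
    "Gift cards are excluded."
)
for problem in check_citations(draft, retrieved):
    print(problem)
```

In this draft the second sentence cites a passage that was never retrieved and the third carries no citation at all, so both are reported.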

Refusal prompts. Explicitly instruct the model to say "I do not know" when the retrieved context does not contain a clear answer, rather than reaching into parametric memory. This is the cheapest single intervention with the biggest payoff and is often skipped because builders assume the model will refuse by default. It will not.
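
One way a refusal instruction might be wired in is sketched below. The prompt wording is illustrative and is not ChatRaj's actual prompt; `call_model` is a hypothetical callable standing in for whatever LLM client is in use, and `fake_model` is a stub that exists only so the example runs.

```python
REFUSAL = "I do not know."

SYSTEM_PROMPT = (
    "Answer only from the context provided. "
    f'If the context does not contain a clear answer, reply exactly: "{REFUSAL}" '
    "Do not use outside knowledge and do not guess."
)

def answer_with_refusal(question, context_passages, call_model):
    """Refuse up front when retrieval is empty; otherwise ask the model.

    call_model is a hypothetical callable (system, user) -> str.
    """
    if not context_passages:
        return REFUSAL  # nothing to ground on, so skip the model call
    user = (
        "Context:\n" + "\n".join(context_passages)
        + f"\n\nQuestion: {question}"
    )
    return call_model(SYSTEM_PROMPT, user).strip()

def fake_model(system, user):
    """Stub used only to make this example runnable."""
    if "refund" in user.lower():
        return "The refund window is 30 days."
    return REFUSAL

print(answer_with_refusal("What is the refund window?", [], fake_model))
print(answer_with_refusal(
    "What is the refund window?",
    ["The refund window is 30 days."],
    fake_model,
))
```

Short-circuiting on empty retrieval means the model never gets the chance to answer from parametric memory when there is nothing to ground on.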

Output verifiers. A second model call, sometimes a smaller model, reads the generated answer alongside the retrieved sources and flags claims that are not supported. Verifiers add latency and cost but catch the residual cases the primary model gets wrong. They also pair well with AI guardrails, which gate the output before it ever reaches the user.
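
The verifier described above is a second model call. As a much cheaper illustration of the same idea, the sketch below swaps in a rule-based check that flags any number in the answer that appears in none of the sources. It is an assumption-laden stand-in, not a replacement for a model-based verifier, and it only catches numeric contradictions such as the 30 versus 60 day example.

```python
import re

def unsupported_numbers(answer, sources):
    """Return numbers that appear in the answer but in none of the sources."""
    number = r"\d+(?:\.\d+)?"
    source_numbers = set()
    for text in sources:
        source_numbers.update(re.findall(number, text))
    # Drop citation markers like [1] before scanning the answer.
    answer_text = re.sub(r"\[\d+\]", "", answer)
    return [n for n in re.findall(number, answer_text) if n not in source_numbers]

# Hypothetical source passage and two candidate answers.
sources = ["The refund window is 30 days from the date of purchase."]
print(unsupported_numbers("The refund window is 60 days [1].", sources))  # ['60']
print(unsupported_numbers("The refund window is 30 days [1].", sources))  # []
```

A non-empty result could be used to block the answer, trigger a regeneration, or route it to a model-based verifier for a closer look.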

ChatRaj's refusal pattern explicitly tells the model to say "I do not know" rather than guess, and grounds every answer in retrieved passages with inline citations. Combined with operator-side analytics that surface low-confidence answers, this keeps the hallucination rate low enough for customer-facing deployments without removing the model's ability to actually be helpful.

FAQ

Common Hallucination questions

The training objective rewards plausible-sounding completions, not verified ones, and there is no fact-checking layer inside the model itself. When fluency and truth diverge, the model picks fluency.
