Can I trust what the system generated?
Surfaces when a client's LLM-powered tool is used in a domain where factual accuracy matters (legal, medical, compliance, research) and the team realizes there is no way to tell when the model has confabulated.
A user asks an AI assistant a question. The answer comes back well-formatted, confidently stated, with a plausible citation. Some of it is accurate. Some of it is fabricated. From the interface, there is nothing to distinguish the two. The user acts on the answer.
This is the design problem: LLM-powered systems present accurate and inaccurate content identically. There is no visual or tonal signal that tells the user when the system is on solid ground versus when it is confabulating. Unlike a prediction model that can carry a confidence score, a language model's token probabilities do not translate into a meaningful probability for each factual claim: it generates text that sounds right, not text it has verified.
Grounding & Hallucination Indicators give the interface something it currently lacks: a visible signal of how well a response is anchored to real, retrievable information. This is not the same as Confidence & Uncertainty, which addresses statistical uncertainty in predictive models. That pattern applies when a risk score has an error band. This pattern applies when an AI-written sentence might simply not be true, and the interface needs to make that risk visible.
Give users visible signals about the factual grounding of AI-generated content, distinguishing responses that are well supported by retrievable evidence from those that may contain fabricated, outdated, or unverifiable claims.
A compact label attached to a response or section indicating its grounding state:
- Grounded: claims verified against retrieved sources
- Partially grounded: some claims supported, others not
- Unverified: response generated without supporting source
- Potentially outdated: sources retrieved are older than a threshold
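One way to derive the response-level label from claim-level checks is sketched below. The `ClaimCheck` record, the `badge_for` function, the precedence order of the states, and the 365-day staleness threshold are all illustrative assumptions, not part of any particular library; the `supported` flag is assumed to come from a real verification step upstream.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class GroundingState(Enum):
    GROUNDED = "Grounded"
    PARTIALLY_GROUNDED = "Partially grounded"
    UNVERIFIED = "Unverified"
    POTENTIALLY_OUTDATED = "Potentially outdated"


@dataclass
class ClaimCheck:
    """Hypothetical result of checking one claim against retrieved sources."""
    text: str
    supported: bool                      # verdict from an upstream verifier
    source_date: Optional[date] = None   # date of the supporting source, if known


def badge_for(claims, today, max_age_days=365):
    """Collapse claim-level checks into one response-level grounding state."""
    supported = [c for c in claims if c.supported]
    if not supported:
        # Nothing could be tied to a source (also covers an empty claim list).
        return GroundingState.UNVERIFIED
    if len(supported) < len(claims):
        return GroundingState.PARTIALLY_GROUNDED
    cutoff = today - timedelta(days=max_age_days)
    if any(c.source_date is not None and c.source_date < cutoff for c in supported):
        # Every claim is supported, but at least one source is older than the threshold.
        return GroundingState.POTENTIALLY_OUTDATED
    return GroundingState.GROUNDED
```

The precedence here (unverified, then partial, then outdated) is a design choice; a time-sensitive product might reasonably surface staleness first.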
Inline highlights or markers on specific sentences or claims, not the entire response. Hallucination research distinguishes two types that warrant different visual treatments:
- Intrinsic hallucination: the response contradicts a retrieved source. Higher severity; warrants a strong warning marker.
- Extrinsic hallucination: the response contains claims that cannot be verified against any retrieved source, neither supported nor contradicted. Lower certainty; warrants an "unverified" marker rather than a contradiction warning.
This distinction matters for design: if a contradiction signal and an "unverifiable" signal are rendered identically, users cannot tell them apart, yet the two carry very different implications for what the user should do next.
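The intrinsic/extrinsic split can be kept explicit in the data model so the two cases never collapse into one generic warning. The following is a minimal sketch: the `Verdict` enum, the marker strings, and `annotate` are hypothetical names, and the verdicts themselves are assumed to come from a separate verification step.

```python
from enum import Enum


class Verdict(Enum):
    SUPPORTED = "supported"
    CONTRADICTED = "contradicted"   # intrinsic hallucination
    UNVERIFIABLE = "unverifiable"   # extrinsic hallucination


# Distinct markers so a contradiction never looks like a merely unverified claim.
MARKERS = {
    Verdict.SUPPORTED: "",
    Verdict.CONTRADICTED: "[!] contradicts source",
    Verdict.UNVERIFIABLE: "[?] unverified",
}


def annotate(claims):
    """Turn (sentence, Verdict) pairs into display strings with inline markers."""
    lines = []
    for sentence, verdict in claims:
        marker = MARKERS[verdict]
        lines.append(f"{sentence} {marker}" if marker else sentence)
    return lines
```

In a real interface the marker strings would map to visual treatments (for example a strong warning style versus a muted highlight) rather than text suffixes.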
In systems that have both a retrieval step and a generation step, separate two signals:
- Retrieval confidence: how well the sources matched the query
- Generation faithfulness: how closely the output reflects what was retrieved
These are different things, and conflating them misleads users.
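A sketch of keeping the two signals apart all the way to the interface follows. Faithfulness is computed here as the share of claims supported by the retrieved sources, similar in spirit to the faithfulness metric of Es et al. (2023); the `RagSignals` type, the 0.5 cut-off, and the message wording are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class RagSignals:
    retrieval_confidence: float      # 0..1, how well the sources matched the query
    generation_faithfulness: float   # 0..1, share of claims supported by the sources


def faithfulness(supported_flags):
    """Fraction of claims marked as supported; 0.0 when there are no claims."""
    if not supported_flags:
        return 0.0
    return sum(supported_flags) / len(supported_flags)


def describe(signals, low=0.5):
    """Report the two signals separately instead of blending them into one score."""
    weak_retrieval = signals.retrieval_confidence < low
    weak_faithfulness = signals.generation_faithfulness < low
    if weak_retrieval and weak_faithfulness:
        return "Weak sources, and the answer departs from them"
    if weak_retrieval:
        return "Answer follows its sources, but the sources match the question poorly"
    if weak_faithfulness:
        return "Good sources found, but the answer departs from them"
    return "Good sources found, and the answer follows them"
```

Averaging the two numbers would hide exactly the cases that matter: a faithful summary of irrelevant sources, or an unfaithful answer built on excellent ones.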
An explicit interface state for when the system has low grounding confidence and should say so rather than produce a fluent but potentially wrong response. This is a product design decision, not just a model decision.
A visible warning when the generated response makes a claim that contradicts a retrieved source, or when the response contains specifics (dates, numbers, names) that don't appear in any retrieved document.
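The second trigger (specifics that appear in no retrieved document) can be approximated cheaply for numeric content. The sketch below is a deliberately crude heuristic under stated assumptions: it covers only numbers and numeric dates, uses plain substring matching, and the function name `unsupported_specifics` is hypothetical. Names would need entity recognition, which is out of scope here.

```python
import re

# Matches digit runs that may contain separators, e.g. 12.5, 2021, 1,200, 2021-03-04.
SPECIFIC = re.compile(r"\d[\d,./-]*\d|\d")


def unsupported_specifics(response, sources):
    """Return numeric tokens in the response that appear in none of the sources."""
    corpus = " ".join(sources)
    missing = []
    for token in SPECIFIC.findall(response):
        # Substring matching can produce false "supported" hits (e.g. "21" inside "2021").
        if token not in corpus and token not in missing:
            missing.append(token)
    return missing
```

A non-empty result is a reason to show the warning, not proof of fabrication: the figure may be a legitimate calculation or a reformatted value.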
For time-sensitive domains, a signal showing when the most recent supporting source was last updated, which helps users assess whether the information is still current.
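A freshness signal of this kind could be computed as in the sketch below. The function name, the label wording, and the 180-day staleness threshold are assumptions; the right threshold depends heavily on the domain.

```python
from datetime import date


def freshness_label(source_dates, today, stale_after_days=180):
    """Describe how recent the newest supporting source is."""
    if not source_dates:
        return "No dated source"
    newest = max(source_dates)
    age_days = (today - newest).days
    label = f"Most recent source updated {newest.isoformat()} ({age_days} days ago)"
    if age_days > stale_after_days:
        label += ", may be outdated"
    return label
```

Passing `today` in explicitly, rather than reading the clock inside the function, keeps the label reproducible in tests and in audit logs.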
A 92% confidence score on a hallucinated response is meaningless and actively misleading. Grounding signals must be tied to a real external reference: a retrieved document, a verified database, a structured knowledge source.
Not all responses need to be grounded. But users need to know when they're not. An "unverified" badge is not a failure: it's honest design.
Specific claims carry different hallucination risk than general claims. Numbers, names, dates, citations, and statistics are the highest-risk content types in LLM output. Consider claim-level treatment for these specifically.
A "verified" badge on a response that hasn't actually been verified against a source is worse than no badge at all. Only show this pattern if you have a real verification mechanism behind it.
Show a compact grounding state by default. Let users expand to see which specific claims are and aren't supported.
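As a rough illustration of this two-level disclosure, the same claim-level data can feed both the compact default and the expanded view. The function names and the `[ok]` / `[unverified]` prefixes are hypothetical; the `(text, supported)` pairs are assumed to come from an upstream verification step.

```python
def compact_summary(claims):
    """One-line default view. claims is a list of (text, supported) pairs."""
    supported = sum(1 for _, ok in claims if ok)
    return f"{supported} of {len(claims)} claims supported by sources"


def expanded_view(claims):
    """Per-claim detail shown only when the user expands the summary."""
    return [("[ok] " if ok else "[unverified] ") + text for text, ok in claims]
```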
Transparency vs. paralysis
If every response is covered in warning labels, users stop reading them. Reserve high-visibility grounding signals for genuinely uncertain or high-stakes content. Use progressive disclosure for detail.
Grounding vs. false assurance
A "grounded" badge can make users over-trust a response. Grounding means the output is consistent with a retrieved source: it doesn't mean the source itself is correct or current. Be precise about what grounding means in your specific system.
Flagging vs. refusal
Some systems respond to low grounding confidence by refusing to answer. Others respond by answering with a warning. Both are valid product decisions with different trade-offs for user experience and safety. Design the threshold deliberately.
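Making the threshold an explicit, named setting is one way to keep this decision deliberate rather than accidental. The sketch below assumes a grounding score between 0 and 1 from some upstream check; the function name and the two cut-offs (0.3 and 0.7) are placeholders to be set per product and domain.

```python
def respond_policy(grounding_score, warn_below=0.7, refuse_below=0.3):
    """Map a grounding score in [0, 1] to one of three interface behaviours."""
    if not 0.0 <= grounding_score <= 1.0:
        raise ValueError("grounding_score must be between 0 and 1")
    if grounding_score < refuse_below:
        return "abstain"               # say the system cannot answer reliably
    if grounding_score < warn_below:
        return "answer_with_warning"   # answer, but flag the weak grounding
    return "answer"
```

Setting `refuse_below=0.0` yields a product that always answers with a flag, and setting `warn_below` equal to `refuse_below` yields one that only answers or abstains, so both product stances described above fit the same interface.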
Uncertainty addresses probabilistic prediction confidence. Grounding addresses factual faithfulness in generative systems. Related but distinct failure modes requiring different design treatments.
Attribution shows what was retrieved. Grounding signals how faithfully the output reflects it. These two patterns work together: attribution without grounding tells users the source but not whether it was accurately represented.
Scope defines where the system applies. Grounding signals when a response is outside the system's reliable knowledge even within scope.
"How do you know this?" is a natural grounding question that this pattern enables users to ask.
Ji et al. (2022) — Survey of Hallucination in Natural Language Generation
comprehensive taxonomy of hallucination types in NLP systems; distinguishes intrinsic hallucination (contradicts source) from extrinsic hallucination (unverifiable against source). Foundational for designing grounding signals at the claim level
decomposes generated text into atomic claims and evaluates each claim individually against a knowledge source, rather than scoring the response as a whole. Establishes the technical basis for claim-level grounding indicators rather than response-level confidence scores. Published at EMNLP 2023
Es et al. (2023) — RAGAS: Automated Evaluation of Retrieval Augmented Generation
introduces faithfulness as a measurable dimension separate from retrieval relevance. The faithfulness metric is the technical equivalent of what this pattern surfaces to users
Kroll et al. (2017) — Accountable Algorithms
while predating LLMs, establishes the accountability principle that systems making consequential decisions must be auditable and their reasoning examinable. Grounds the normative case for surfacing hallucination risk in high-stakes contexts