ChatGPT's False Outputs Mirror Human Confabulation, Not Hallucination

It fills gaps with plausible material, much like a person with memory loss might do

ChatGPT's errors resemble human confabulation rather than hallucination, suggesting new approaches to improving AI reliability.

When artificial minds fill gaps with confident falsehoods, they echo something ancient in human cognition — the brain's tendency to construct plausible stories where memory fails. A perspective published in NPP–Digital Psychiatry and Neuroscience argues that AI errors are not random noise but patterned failures that mirror human confabulation and auditory hallucination, each arising when predictive systems must act on incomplete information. The distinction matters not as a philosophical curiosity but as a practical guide: understanding what kind of error is occurring may be the first step toward knowing how to prevent it.

AI systems embedded in healthcare, education, and research are generating confident misinformation at scale, and the word 'hallucination' has been masking two very different failure modes.
ChatGPT fills knowledge gaps with internally consistent but factually wrong responses — a pattern that mirrors human confabulation, where the mind constructs plausible narratives to cover what it cannot recall.
Whisper, the speech recognition system, produces errors that skew toward harmful content — stereotypes, violence, sexual material — with an unsettling resemblance to the threatening character of human auditory hallucinations.
Researchers caution that behavioral parallels do not imply shared consciousness, and that the underlying mechanisms remain fundamentally distinct between biological and computational systems.
Mitigation strategies borrowed from psychiatry — uncertainty estimation, internal consistency checks, multi-pass verification — are being proposed as paths toward more reliable AI, with human oversight remaining essential throughout.

When ChatGPT tells you something confidently wrong, it is not hallucinating in any clinical sense — it is confabulating. A recent perspective article in NPP–Digital Psychiatry and Neuroscience draws this distinction carefully, arguing that applying psychiatric concepts to AI errors could help engineers build more trustworthy systems.

The comparison is grounded in observable behavior. Large language models, when faced with vague prompts or questions that exceed their training, generate responses that are logically coherent but factually detached — much as a person with Korsakoff's syndrome might construct detailed, convincing narratives to fill gaps in memory, with no intent to deceive. Both human and machine confabulations are context-sensitive: leading questions make humans more likely to fabricate plausibly, and ChatGPT similarly drifts toward error when prompts nudge it toward particular assumptions.

Speech recognition errors follow a different pattern. Whisper's transcription failures include repetitive looping and, in more than a third of cases, explicitly harmful content, a distribution that echoes the threatening character of human auditory hallucinations. Both Whisper and human hearing appear most vulnerable when input signals are weak or degraded, though the mechanisms diverge sharply: human hallucinations involve aberrant neural signaling, while Whisper's involve probabilistic pattern completion over acoustic data.

The researchers are careful not to overreach. These parallels are behavioral and provisional — ChatGPT has no self-modeling consciousness, no episodic memory, no inner life. But the functional similarities suggest a practical path forward. Cognitive-behavioral principles could inform uncertainty estimation tools and consistency checks within AI systems. Multi-pass verification, retrieval-augmented generation, and semantic entropy methods all reflect the same underlying logic: slower, more thorough assessment reduces error. The deeper lesson may be that strengthening systemic capacity — rather than patching individual symptoms — is the more durable solution, for artificial and human minds alike.

When ChatGPT confidently tells you something that sounds reasonable but is entirely false, it is not hallucinating in the way a psychiatrist would use that word. It is confabulating—filling gaps with plausible-sounding material, much like a person with memory loss might do without any intention to deceive. This distinction, laid out in a recent perspective article published in NPP–Digital Psychiatry and Neuroscience, matters more than it might seem. It suggests that understanding AI errors through the lens of human psychiatric phenomena could help engineers build more reliable systems.

The comparison begins with a simple observation: large language models and speech recognition tools have become woven into education, business, healthcare, and research, yet they remain prone to producing misinformation. ChatGPT generates text that appears coherent and contextually appropriate while being factually wrong. Whisper, an automatic speech recognition system, produces transcription errors that can range from nonsensical to actively harmful. Both phenomena are often called hallucinations, but the researchers argue this terminology obscures what is actually happening in each case.

Confabulation, in human neurology, describes false memories that fill gaps in recollection. A person with Korsakoff's syndrome or dementia might construct detailed, internally consistent narratives to account for missing information, with no conscious effort to mislead. The fabrications feel real to the person experiencing them. ChatGPT operates in a structurally similar way. When faced with vague prompts, ambiguous questions, or multi-step reasoning tasks that exceed its training data or contextual understanding, the model generates responses that are logically consistent and confident but detached from fact. It fills functional memory gaps—gaps in training data, parameter encoding, or context—the way a human brain fills episodic memory gaps. The parallels extend further: both human and machine confabulations are context-dependent. Emotional states and leading questions make humans more likely to fill gaps with plausible falsehoods. Similarly, ChatGPT generates incorrect responses more readily when prompted with assumptions that nudge it toward particular answers.

But the researchers are careful to note that the mechanisms are fundamentally different. ChatGPT lacks self-modeling consciousness, executive control, and episodic memory. It does not experience the world. Newer models can store limited, user-controlled information across sessions, but they do not fold it into a continuously updating self-model. The parallel is behavioral and provisional, not evidence of a shared inner life.

Speech recognition errors tell a different story. More than one-third of Whisper's hallucinations are explicitly harmful: demographic stereotypes, descriptions of violence, sexual content. Human auditory verbal hallucinations, experienced by people in both clinical and non-clinical groups, similarly tend toward threatening and negative content. In both cases the errors can be repetitive, with phrases looping over and over, and both appear vulnerable to error when perceptual signals are degraded or weak. Yet here too, the underlying mechanisms diverge. Human auditory hallucinations involve aberrant neural signaling, misattributed inner speech, and threat-processing systems gone awry. Whisper's hallucinations reflect probabilistic pattern completion over acoustic features: the system estimates probabilities over mappings from sound to text. Repetitive or harmful content emerges from statistical regularities in training data, not from affective states or misattributed speech.

The practical implication is that understanding these distinctions could lead to better mitigation strategies. In psychiatry, cognitive-behavioral therapy helps people critically assess the validity of their hallucinations. Similar principles could be integrated into AI systems through uncertainty estimation methods and internal consistency checks. Increasing computational resources—allowing models to perform multi-pass verification or additional processing steps—can reduce error rates by enabling slower, more thorough assessment. Other approaches include retrieval-augmented generation, cross-model verification, multi-agent debate, and semantic entropy methods. The underlying principle applies to humans as well: error monitoring and cognitive performance depend on sufficient neurobiological resources, which sleep and rest restore. Strengthening systemic capacity, rather than targeting individual symptoms, may be the more effective path.
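
One way to picture the uncertainty-estimation idea is to ask the same question several times and measure how much the answers disagree. The Python sketch below is a simplified self-consistency check, not the published semantic entropy method: it assumes the sampled answers are already available as strings, and it uses crude text normalization as a stand-in for judging whether two answers mean the same thing. The function names and the 1-bit threshold are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def normalize(answer: str) -> str:
    """Crude stand-in for semantic equivalence: lowercase and drop punctuation."""
    kept = "".join(ch for ch in answer.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def answer_entropy(samples: list[str]) -> float:
    """Shannon entropy, in bits, over groups of matching answers."""
    if not samples:
        raise ValueError("need at least one sampled answer")
    counts = Counter(normalize(s) for s in samples)
    total = len(samples)
    # Each group contributes p * log2(1/p), where p is its share of the samples.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def needs_review(samples: list[str], threshold_bits: float = 1.0) -> bool:
    """Flag a question for human review when the sampled answers scatter."""
    return answer_entropy(samples) > threshold_bits


# Hypothetical samples: the first set agrees, the second does not.
agreeing = ["Paris", "paris.", "Paris"]
scattered = ["1912", "1915", "1921", "1908"]
print(needs_review(agreeing), needs_review(scattered))
```

When the answers agree, entropy is zero and nothing is flagged. When every sample differs, entropy is high, which suggests the model may be filling a gap rather than recalling a fact. A real system would need a better equivalence test than string matching, since two differently worded answers can say the same thing.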

The researchers emphasize that these parallels are provisional and mechanistically limited. Confabulation-like and hallucination-like errors remain well-recognized problems in large language models and the software that depends on them, underscoring the continued need for human oversight. But by treating AI errors as mirrors of human psychiatric phenomena—not as evidence of shared consciousness, but as functional similarities in how predictive systems construct meaning when information is incomplete—researchers may find new ways to make both human and artificial minds more reliable.

Notable Quotes

Confabulation-like and hallucination-like errors are well-recognized problems in large language models, underscoring the need for continued human oversight

— NPP–Digital Psychiatry and Neuroscience perspective article authors

The Hearth Conversation

Another angle on the story

Inventor

Why does it matter whether we call these errors confabulations instead of hallucinations?

Model

Because the name shapes how we try to fix them. If ChatGPT is hallucinating, we might look for ways to suppress false perceptions. If it's confabulating, we're looking at a gap-filling problem—something closer to improving memory or verification.

Inventor

But doesn't ChatGPT seem to be making things up from nothing, the way a hallucinating person does?

Model

It looks that way on the surface. But a hallucination is a false sensory experience—the brain generating perception without external input. ChatGPT isn't perceiving anything. It's predicting the next token based on patterns in training data. When information is missing, it fills the gap with something plausible. That's confabulation.

Inventor

And Whisper—the speech recognition tool—that's different?

Model

Functionally, yes. Whisper is working with degraded acoustic signals, trying to map sound to text. When the signal is weak or ambiguous, it generates words that weren't there. That's closer to how human auditory hallucinations work—both systems are vulnerable when the perceptual input is poor.

Inventor

So the harmful content in Whisper hallucinations—the stereotypes, the violence—where does that come from?

Model

From the training data. The model learned statistical patterns that associate certain acoustic features with certain words and phrases. Harmful content appears because harmful content was present in what it learned from. It's not coming from threat-processing or fear, the way it does in humans.

Inventor

If we understand these differences, what actually changes?

Model

How we build safeguards. For confabulation, you add verification steps, uncertainty checks, maybe retrieval systems that ground the model in actual sources. For hallucination-like errors, you might focus on signal quality or probabilistic thresholding. The diagnosis shapes the treatment.

Inventor

Is there a risk in making these comparisons at all? Does it anthropomorphize the machine?

Model

That's the researchers' concern too. They're explicit that these are behavioral parallels, not evidence of consciousness or shared experience. The mechanisms are fundamentally different. But the comparison is useful precisely because it's limited—it helps us see the structure of the problem without claiming the machine feels or thinks.
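
The "probabilistic thresholding" mentioned in the conversation above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how any particular product works. The `Segment` class is hypothetical, its fields are modeled on the kind of per-segment statistics a speech recognizer may report, and the threshold values are illustrative defaults that would need tuning on real audio.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical transcript segment with per-segment confidence signals."""
    text: str
    avg_logprob: float        # mean token log-probability; closer to 0 is more confident
    no_speech_prob: float     # estimated probability that the audio holds no speech
    compression_ratio: float  # high values suggest repetitive, looping text


def is_suspect(seg: Segment,
               min_logprob: float = -1.0,
               max_no_speech: float = 0.6,
               max_compression: float = 2.4) -> bool:
    """Return True when a segment should go to a human reviewer."""
    low_confidence = seg.avg_logprob < min_logprob
    likely_silence = seg.no_speech_prob > max_no_speech
    looping = seg.compression_ratio > max_compression
    return low_confidence or likely_silence or looping


def split_for_review(segments: list[Segment]) -> tuple[list[Segment], list[Segment]]:
    """Partition segments into (accepted, flagged) without discarding any text."""
    accepted = [s for s in segments if not is_suspect(s)]
    flagged = [s for s in segments if is_suspect(s)]
    return accepted, flagged


# Hypothetical segments: one clean, one that loops, one over near-silence.
segments = [
    Segment("The patient reports mild headaches.", -0.2, 0.05, 1.3),
    Segment("thank you thank you thank you thank you", -0.4, 0.10, 3.5),
    Segment("He took the knife.", -1.6, 0.80, 1.1),
]
accepted, flagged = split_for_review(segments)
print(len(accepted), "accepted;", len(flagged), "flagged for review")
```

The sketch flags questionable segments instead of deleting them, in keeping with the article's point that human oversight remains essential.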

Want the full story? Read the original at News-Medical.
