Study Finds 146,900 AI-Generated Fake Citations Polluting Scientific Research

Fake citations undermine the scholarly record itself
A management professor warns that AI hallucinations are eroding trust in the foundation of scientific knowledge.

Science advances by building on what came before — a chain of citations that links each new claim to the evidence beneath it. A joint study from Cornell and UCLA has found that chain compromised at scale: nearly 147,000 fabricated references, conjured by AI systems that generate the appearance of knowledge without its substance, now inhabit four of the world's major scientific repositories. The rupture is not the work of a few bad actors but a diffuse consequence of a tool adopted widely and verified rarely, arriving at a moment when the boundary between assistance and invention has grown dangerously thin.

  • Nearly 147,000 citations pointing to sources that do not exist have been identified across arXiv, bioRxiv, SSRN, and PubMed Central — the platforms where science first goes public.
  • The surge tracks almost precisely to 2023, when large language models became widely accessible, suggesting the tools themselves are the vector rather than any isolated lapse in judgment.
  • The fabrications are scattered broadly across fields and institutions, meaning the failure is systemic — many researchers trusted AI-generated references without checking whether the sources were real.
  • Scholars inside academia are now voicing doubt about the reliability of papers they read, a corrosion of internal trust that may prove harder to repair than any single retraction.
  • ArXiv has moved to ban submissions containing hallucinated or unverified AI-generated citations, marking the first major institutional attempt to hold the line against the contamination of the scholarly record.

The unspoken contract of scientific research is straightforward: every citation points to something real. A new study from Cornell and UCLA has found that contract breaking quietly and at scale. Across four major scientific repositories (arXiv, bioRxiv, SSRN, and PubMed Central), nearly 147,000 fabricated references now sit embedded in the published record, produced by AI systems that generate plausible-sounding titles, authors, and journal names that correspond to nothing.

The mechanism is familiar to anyone who has used a large language model. A researcher drafting a paper asks an AI assistant to supply citations for a claim. The model responds with something that looks exactly right — structured, specific, confident. The researcher, moving quickly or simply trusting the output, includes it without verification. The paper is submitted. The ghost citation travels.

Analyzing 111 million references across 2.5 million papers, the research team found that citation error rates climbed sharply after 2023, the year AI writing tools became widely available. More troubling than the volume was the distribution: the fake citations were not clustered in a handful of careless papers but scattered across many works and many fields, pointing to a broad pattern of unverified AI use rather than isolated misconduct.

The stakes extend well beyond academic bookkeeping. Scientific papers are the substrate on which technology, medicine, and policy are built. That infrastructure depends on the assumption that citations are real — that any reader could, in principle, follow the chain of evidence back to its source. Usha Haley, a management professor at Wichita State University, described the phenomenon as a serious warning sign, noting that skepticism about research quality is now rising among early-career scholars who are beginning to question whether the literature they rely on rests on solid ground.

ArXiv, one of the largest and most influential repositories, announced this week that it will ban authors who submit work containing hallucinated or unverified AI-generated citations. It is a defensive measure — an acknowledgment that the problem has grown urgent enough to require institutional enforcement rather than individual conscience alone.

The foundation of scientific trust rests on a simple contract: when a researcher cites a source, that source exists. A new study from researchers at Cornell and UCLA has found that contract breaking at scale. Across four major scientific repositories—arXiv, bioRxiv, SSRN, and PubMed Central—nearly 147,000 fabricated citations now appear in the published record, generated by artificial intelligence systems that sound authoritative while inventing facts.

Large language models like ChatGPT and Gemini have a well-documented weakness: they hallucinate. They produce text that reads plausibly, that follows the grammar and structure of real information, but that has no basis in fact. A researcher working quickly, drafting a paper with an AI assistant's help, might ask the system to generate citations for a particular claim. The model obliges. It produces a title, an author, a journal name, a year. Everything looks right. The researcher, trusting the machine or simply moving fast, includes it in the paper without verification. The citation is published. It spreads through the scientific ecosystem.

The researchers analyzed 111 million references across 2.5 million papers, hunting for citations whose titles could not be matched to any actual publication. Some mismatches were simple typos. But many were not. When the team compared citation error rates before and after 2023—the year large language models became widely available—they found a sharp rise in non-existent references. The problem was not concentrated in a handful of careless papers. Instead, the fake citations were scattered across many different works, suggesting that numerous researchers, across different fields and institutions, had relied on AI-generated references without catching the fabrications.
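To make the matching step concrete, here is a minimal sketch of how a cited title might be checked against an index of known publications, with near-misses set aside as probable typos. It is an illustration only, not the study's actual pipeline: the function names, the tiny in-memory index, and the 0.9 similarity threshold are all assumptions chosen for the example.

```python
import re
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase the title, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def classify_reference(cited_title, known_titles, typo_threshold=0.9):
    """Label a cited title as 'exact', 'likely_typo', or 'unmatched'.

    known_titles stands in for a real bibliographic index (hypothetical here).
    typo_threshold is an assumed cutoff, not a value taken from the study.
    """
    target = normalize(cited_title)
    best_ratio = 0.0
    for known in known_titles:
        candidate = normalize(known)
        if candidate == target:
            return "exact"
        # ratio() returns a similarity score between 0.0 and 1.0
        best_ratio = max(best_ratio, SequenceMatcher(None, target, candidate).ratio())
    return "likely_typo" if best_ratio >= typo_threshold else "unmatched"


# Toy index of two real paper titles; a production check would query a
# bibliographic database instead of a Python list.
known = [
    "Attention Is All You Need",
    "Deep Residual Learning for Image Recognition",
]

print(classify_reference("Attention is all you need.", known))
print(classify_reference("Attention Is All You Ned", known))
# The next title is invented for this example and should not match anything.
print(classify_reference("Quantum Gravity Effects in Sourdough Fermentation", known))
```

An "unmatched" result in a sketch like this is only a flag for human review. A real source can be missing from any given index, so a failed match does not by itself prove that a citation was fabricated.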

This matters because scientific papers are not academic curiosities. The research that becomes a paper shapes the world. The internet, lithium-ion batteries, countless medical treatments—all began as ideas tested and documented in papers that other researchers could read, verify, and build upon. That chain of verification depends on trust. When a scientist reads a citation, she assumes it points to a real source she could find if she needed to. When that assumption breaks down, the entire system weakens.

Usha Haley, a management professor at Wichita State University, described the problem as a serious warning sign. Fake citations, she said, undermine the scholarly record itself—the accumulated body of verified knowledge on which peer review and scientific progress depend. What troubles her most is that skepticism about research quality is now coming from within academia itself, from early-career scholars who are beginning to question whether the papers they read are built on solid ground.

The four repositories where these citations were found play a crucial role in how science moves. Before a paper appears in a formal journal, researchers often upload it to one of these platforms, making it immediately available to the global scientific community. This speeds up the sharing of ideas. It also means that flawed work can spread quickly, before traditional peer review has had time to catch errors.

ArXiv, one of the largest repositories, has begun to respond. The organization announced this week that it will ban authors who submit papers containing hallucinated citations or any unchecked AI-generated content. It is a defensive measure, an attempt to hold the line. But it also signals that the problem is real enough, and urgent enough, that even the institutions that host scientific work are now forced to police the integrity of what they publish.

"Fake or AI-generated citations undermine trust in the scholarly record that provides the foundation on which peer review and cumulative knowledge rest."
— Usha Haley, management professor at Wichita State University
The Hearth Conversation: Another angle on the story
Inventor

Why does a fake citation matter so much? Isn't the real test whether the actual science is sound?

Model

A fake citation is a lie embedded in the foundation. Even if the paper's own experiments are solid, it's claiming to build on prior work that doesn't exist. That breaks the chain of knowledge. Someone reading the paper might think a claim is well-established when it's actually new—or might miss a real contradiction because they're looking at a phantom source.

Inventor

So the researchers who generated these citations—are they knowingly committing fraud, or are they just trusting the AI?

Model

That's the unsettling part. We don't know. Some may be deliberately gaming the system. But many are probably just moving fast, asking ChatGPT to help draft a paper, and not realizing the model is confabulating. The AI is so confident, so fluent, that it's easy to miss.

Inventor

How many papers are actually affected? Is this a fringe problem or something that could undermine whole fields?

Model

The study found 146,900 fake citations across 2.5 million papers. That's not tiny. And the fact that they're spread across many papers rather than concentrated in a few suggests it's not a fringe problem—it's systemic. Researchers across different disciplines are doing this.

Inventor

What happens to a field when trust erodes like this?

Model

Everything slows down. Peer reviewers become more skeptical. People spend time chasing phantom sources instead of building on real work. And early-career researchers, who are already vulnerable, start to wonder if the entire foundation they're building on is solid.

Inventor

Can this be fixed?

Model

arXiv's ban is a start—it creates consequences for submitting unchecked AI content. But the real fix is cultural. Researchers need to treat AI-generated citations the way they'd treat any draft: verify everything before it goes into the record. The technology isn't going away. The question is whether we can build habits of verification fast enough.

Want the full story? Read the original at CNET.