Study: AI-generated disinformation harder to detect than human-written falsehoods

Your firepower multiplies by at least a thousand.
Spitale describes how AI-generated disinformation could enable state-level manipulation at unprecedented scale.

At a time when the written word has long been the anchor of public trust, a team of researchers at the University of Zurich has found that machines lie better than human beings, and that readers cannot tell. The study, published in Science Advances, is not simply a technical warning: it is a mirror that shows how deeply we have tied clarity of language to the credibility of whoever produces it. The question left hanging is not whether artificial intelligence can deceive us, but whether we are willing to build the collective mechanisms to protect ourselves from that capacity.

  • GPT-3 not only generated more convincing disinformation than humans did; it also managed to pass as real human writers in tweets on controversial topics, without readers noticing.
  • Clarity itself became a trap: the AI's text was easier to process and quicker to believe, while authentic human messages took more time and effort to evaluate.
  • Neither repeated exposure to artificial examples nor the AI itself could identify the patterns of deception, which undercuts the hope that familiarity with synthetic content builds resistance.
  • The researchers warn of a 'resignation theory': the more artificial text we consume, the less able we are to tell it apart from organic text, opening the door to massive, cheap disinformation campaigns.
  • The proposed path is to regulate training databases under principles of accuracy and transparency, although its feasibility in fragmented democracies remains an unanswered question.

A team at the University of Zurich recruited 697 people to read 220 tweets, half written by humans and half generated by GPT-3. Participants had to identify which were true and which were false, and guess whether each message had been written by a person or a machine. GPT-3 won on both tasks: it lied more effectively than humans and made its fabrications seem written by real people.

Giovanni Spitale, the study's lead researcher, expected the patterns of artificial language to become detectable with repeated exposure. That did not happen. No matter how many examples participants read, the machine's linguistic fingerprints remained invisible. The reason is, in part, paradoxical: the AI's text is clearer, better organized and easier to process than human text. In an unsettling inversion, coherence became a vector of deception.

The team also found that GPT-3 was unable to recognize its own text, which points to a structural problem that will worsen with more advanced models. GPT-4 already shows notable improvements, although without full access to its source code the researchers cannot quantify them precisely.

To explain why exposure to synthetic content produces desensitization rather than immunity, Spitale and his colleagues propose the 'resignation theory': unlike critical inoculation, mass contact with artificial text makes us progressively less able to distinguish it from genuine text. The practical consequences are immediate: disinformation campaigns that once required armies of human trolls can now be deployed at industrial scale at minimal cost.

The study acknowledges its limits (participants mostly over 42, conducted only in English, no consideration of each message's network context) and argues that the only viable way out is to regulate, by criteria of accuracy and transparency, the databases that feed these systems. Whether that regulation will arrive in time is, for now, an open question.

A team of researchers at the University of Zurich set out to answer a straightforward question: can people tell when an artificial intelligence is lying to them? The answer, published this week in Science Advances, is no—and the implications are unsettling.

Giovanni Spitale and his colleagues recruited 697 people to read 220 tweets. Half were written by humans. Half were generated by GPT-3, a predecessor of the models behind ChatGPT. The participants had two tasks: identify which tweets were true and which were false, and guess whether each one came from a person or a machine. GPT-3 won on both counts. It lied more convincingly than humans did. It also fooled readers into thinking its fabrications were written by real people.

Spitale expected something different. His hypothesis was that a single AI-generated tweet might pass as authentic, but patterns would emerge if you read many in sequence. Readers would notice the machine's linguistic fingerprints and begin to suspect something artificial. That did not happen. No matter how many examples people saw, they could not detect the underlying patterns. The machine's deceptions remained invisible.

Part of the reason is simple: AI writes more clearly than people do. GPT-4, a successor to GPT-3, produces prose that is nearly flawless: well organized, easy to follow, grammatically precise. Spitale noted something striking in the data: genuine tweets took longer for readers to evaluate than false ones. The machine's lies were easier to process. They felt more coherent, more trustworthy. In a strange inversion, clarity became a liability for truth.

The researchers tested whether GPT-3 could at least identify its own output. It could not. The machine that generated the deceptions was unable to recognize them. This failure points to a deeper problem. As newer and more capable models emerge, the gap between human and machine deception will only widen. GPT-4 is already showing marked improvement, though Spitale cautioned that without full access to the system's underlying code, the team cannot yet quantify how much better it has become.

Spitale and his colleagues propose a theory to explain why exposure to synthetic text makes people worse at detecting it, not better. They call it the resignation theory. The more artificial content we encounter, the less able we become to distinguish it from genuine material. It is the opposite of what researchers call inoculation theory—the idea that critical exposure to misinformation builds resistance to it. Instead, we are becoming desensitized. We are giving up.

The practical consequence is that disinformation campaigns, once expensive and labor-intensive, have become cheap and scalable. A powerful nation wanting to destabilize another's public health response could deploy AI instead of hiring armies of human trolls. An actor seeking to sow discord before an election could generate thousands of convincing false messages in minutes. Spitale put it bluntly: your firepower multiplies by at least a thousand. This is not a distant dystopian risk. It is immediate.

The study has limitations. Most participants were over 42. The research was conducted only in English. It did not account for contextual information: a user's profile history, previous tweets, the network around a message. GPT-3 also sometimes refused to generate certain falsehoods, particularly about vaccines and autism, homeopathy and cancer, and flat-earth claims. Its refusals depended on how it had been trained. Newer models may have fewer such guardrails.

The researchers offer one path forward: regulate the databases used to train these systems according to principles of accuracy and transparency. Verify the information. Open the sources to independent scrutiny. Whether democratic societies will do this remains an open question. As Spitale wrote, whether the explosion of synthetic text becomes an explosion of disinformation depends entirely on how we choose to govern this technology and its use.

GPT-3 is capable of informing us and misinforming us better
— Giovanni Spitale, University of Zurich
Whether the explosion of synthetic text becomes an explosion of disinformation depends entirely on how democratic societies regulate this technology and its use
— Giovanni Spitale
The Hearth Conversation: another perspective on the story
Inventor

Why does AI-generated text fool people more easily than human lies?

Model

Because it's clearer. It's better organized. When you read something that flows smoothly and makes logical sense, your brain trusts it more. Human writing is messier—it has hesitations, tangents, the texture of actual thought. AI writes like a textbook. That clarity is a kind of camouflage.

Inventor

But couldn't people learn to spot the difference if they saw enough examples?

Model

That's what the researchers expected too. But the opposite happened. The more synthetic text people read, the worse they got at detecting it. They stopped looking for patterns. They stopped questioning. Exposure didn't build immunity—it built resignation.

Inventor

So GPT-3 itself couldn't identify its own lies?

Model

No. The machine that generated the deceptions couldn't recognize them. That's a strange kind of blindness. It suggests the problem isn't just about fooling humans. It's about the technology itself losing track of what it's doing.

Inventor

What happens when GPT-4 gets better at this?

Model

The machines pull further ahead. Right now, the difference between AI-generated falsehoods and human ones is small but measurable. As the models improve, that detectable difference shrinks. Eventually, there may be no difference at all, just indistinguishable noise.

Inventor

Is there any way to stop this?

Model

Regulate how these systems are trained. Verify the data. Open the sources to scrutiny. But that requires political will. And by the time we decide to act, the technology may already be too far ahead.

Inventor

What's the real danger here?

Model

Scale. A disinformation campaign that once required thousands of people can now run on a single machine. A president wanting to destabilize another country doesn't need a troll farm anymore. He needs a GPU and a prompt. That's the shift that matters.

Want the full story? Read the original at El País