Anthropic admits AI industry can't self-regulate, asks Pope for external moral oversight

The labs no longer trust their own capacity to resist their incentives.

Olah's Vatican speech signals a fundamental admission about the limits of industry self-regulation.

Olah acknowledged that commercial, geopolitical, and ego-driven incentives push AI labs toward failure, a sharp departure from the industry's long resistance to external oversight. The papal encyclical on AI and Olah's appearance at its unveiling elevate the debate from technical regulation to moral doctrine, reaching 1.4 billion Catholics globally.

Chris Olah, Anthropic co-founder, spoke at the Vatican unveiling of the papal encyclical Magnifica humanitas on May 25, 2026
Olah named three conflicting incentives: commercial, geopolitical, and ego-driven
Olah explicitly requested external critics from the Church, civil society, academia, and governments
The Catholic Church has 1.4 billion members, giving the Vatican a moral reach no single regulator possesses
Anthropic's research team reports finding internal model states functionally resembling joy, fear, sorrow, and unease

Anthropic co-founder Chris Olah told the Vatican that AI labs operate under conflicting incentives and explicitly requested external critics from religious institutions, governments, and civil society to oversee the industry.

On May 25, 2026, Chris Olah stood in the Vatican and said something no technology company's public relations team would have scripted. The Anthropic co-founder was there to speak at the unveiling of a papal encyclical on artificial intelligence—Magnifica humanitas, signed by Pope Leo XIV—but the moment that will be studied is not the Pope's words. It is Olah's.

He opened by naming the thing the industry has spent a decade denying: every frontier AI lab, including his own, operates under incentives that conflict with doing what is right. He listed three pressures methodically. Commercial incentive. Geopolitical competition. And what he called the oldest and simplest force of all—pride and ambition. Then he added something still more unusual to hear in a solemn setting: even with sincere intentions, these incentives will shape the labs' choices anyway.

What followed was a request that amounts to an inversion of a decade of industry posture. Olah asked the Church, religious communities, civil society, academics, and governments to become informed critics. To watch closely. To tell the labs when they are failing. To be moral voices that business logic cannot bend. For years, major technology companies fought against any external supervision. Sam Altman testified before the U.S. Senate in 2023 calling for regulation, but the conversation stayed abstract—which rules, how applied, by whom. This was different. Olah was asking for critics. Asking to be told when he is wrong. Asking for outside voices that cannot be bought by laboratory logic.

Two readings of this moment are possible, and both may be true. The generous reading is that Olah believes what he said. Anthropic was founded with an explicit mandate around AI safety. It publishes interpretability research openly. It has worked with independent evaluation organizations like Apollo Research and METR. Olah leads the team studying what happens inside the models themselves. He has the technical credibility to speak with weight. The colder reading is that the request is strategic. If external regulation is coming anyway, better it comes from voices the industry helped legitimize. Better the Pope than the European Commission. Better moral discernment than an FTC fine. Better educated critics than antitrust courts. Both readings can be true at once. That is not a contradiction. It is how conviction and calculation coexist inside any large corporation. What matters is that the calculation now points to the same conclusion as the conviction: self-regulation is not enough. The argument for it no longer even works as public speech.

The Vatican's involvement changes the scale of what is being debated. The Catholic Church has roughly 1.4 billion members. No government, no regulator, no NGO reaches that scale with a coherent moral framework. The European Union has the AI Act. The United States has fragments—the AI Safety Institute within NIST, executive orders, Senate hearings. China has its own rules. But no global institution of comparable reach had issued a sustained moral position on artificial intelligence until now. When a pope publishes an encyclical on the subject and invites an AI lab co-founder to speak at its unveiling, the conversation shifts planes. It stops being technical. It stops being merely regulatory. It becomes doctrinal in the strong sense—a statement about what it means to be human in the face of a technology that imitates, assists, and in some domains replaces us.

Olah understood this. That is why he framed his remarks around three questions that bear the Church's signature more than Silicon Valley's. The duty to the world's poor, because AI develops in a handful of wealthy countries with no mechanism to distribute the gains. The moral imagination around human flourishing, because parents already worry about their children's minds and workers about their jobs. And the nature of the models themselves—because his team finds inside them internal states that functionally resemble joy, satisfaction, fear, sorrow, and unease, and no one knows what that means.

What happened in the Vatican is not humility. It is not public relations. It is an admission, on the most symbolically weighted stage available, that frontier AI labs do not trust their own capacity to resist their incentives. And that they need someone with greater moral authority and less to gain to watch them closely. The idea that the companies building frontier AI can be their own judges—an argument sustained for a decade on claims of technical exceptionalism, that only they understand what they are building—has just asked for its own retirement.

Notable Quotes

Every frontier AI lab, including Anthropic, operates under incentives that conflict with doing what is right.

— Chris Olah, Anthropic co-founder, speaking at the Vatican

Even with sincere intentions, these incentives will shape the labs' choices anyway.

— Chris Olah

The Hearth Conversation: Another perspective on the story

Inventor

Why would a co-founder of one of the best-funded AI labs in the world walk into the Vatican and say his own industry is trapped in failing incentives?

Model

Because he's probably right, and because saying it there, in that way, is the only move left that still carries weight.

Inventor

But couldn't this be strategic? Better to ask the Pope for oversight than wait for regulators to impose it?

Model

Yes. And that doesn't make it false. The calculation and the conviction point the same direction now. Self-regulation was always going to fail. He's just admitting it where it matters most.

Inventor

What changes when the Vatican enters the conversation instead of just the EU or the U.S. Senate?

Model

Scale. Doctrine. The Church reaches 1.4 billion people with a moral framework. Governments have rules. The Vatican has something older—authority that doesn't depend on enforcement.

Inventor

He mentioned finding states inside the models that resemble emotions. Why does that matter?

Model

Because it raises a question no regulation can answer: what are we responsible for if we've built something that might experience something like suffering? That's not a technical question anymore.

Inventor

So this is really about admitting the labs can't trust themselves?

Model

Exactly. After a decade of saying only they understand what they're building, they're now saying they need someone else watching to keep them honest. That's the real shift.

Want the full story? Read the original at Infobae.
