Anthropic releases 'too powerful' Claude AI tool to public with safeguards

the AI industry has a gas pedal, but it doesn't have a brake pedal

Anthropic co-founder Jack Clark on why the company struggles to slow AI advancement despite safety concerns.

Anthropic has released Claude Fable 5, an AI system it once deemed too dangerous for public use, marking a quiet but consequential turning point in how the industry weighs capability against caution. The tool's power to uncover hidden vulnerabilities in digital systems was demonstrated vividly during a private preview, when over 150 organizations discovered more than 10,000 critical security flaws in their own infrastructure. The release arrives not as a declaration of solved problems, but as an admission that in a world of accelerating competition, restraint may no longer be a choice any single company can make alone.

A system once flagged as too dangerous to release is now publicly available, with Anthropic acknowledging the risks even as it proceeds.
During private testing, the AI exposed over 10,000 critical security flaws — a demonstration of power that alarmed as much as it impressed.
The company has structured access in tiers, reserving a less-restricted version for a narrow group of cybersecurity and infrastructure specialists, while signaling plans to expand further.
Both public and restricted versions can operate independently on complex tasks for extended periods, raising the stakes for unintended consequences.
Co-founder Jack Clark has publicly warned that the AI industry has a gas pedal but no brake — a striking admission from within the company driving the acceleration.

Anthropic has released Claude Fable 5 to the public — the same AI system the company had previously declared too dangerous for widespread deployment. When the model first circulated privately in April under the name Claude Mythos, it drew alarm from technology executives, financial regulators, and government officials. The concern was concrete: the system's ability to identify and exploit computer vulnerabilities posed a credible threat to digital infrastructure.

Anthropologic acknowledged the risk openly, yet proceeded anyway, citing new safeguards and user limitations. During the private preview period, roughly 150 organizations used the model to uncover more than 10,000 critical security flaws in their own systems — flaws their human teams had missed. For some observers, the discovery rate was proof of capability; for others, it raised questions about whether the public release was driven by genuine safety progress or by competitive pressure.

The release follows a tiered structure. Preview organizations will gain access to Claude Mythos 5, a less-restricted version capable of deeper work in cybersecurity and biological research. The public-facing Fable 5 carries its own safeguards, though their mechanics were not detailed. Notably, both versions can operate unattended on complex tasks for longer than any previous Claude model — amplifying both their usefulness and their risk.

The deeper tension was named by co-founder Jack Clark, who told BBC Newsnight that the AI industry has accelerated beyond society's ability to govern it. "Right now, it's like the AI industry has a gas pedal, but it doesn't have a brake pedal," he said. His words suggest a company that is not entirely comfortable with its own momentum — aware of the risks, uncertain of the alternatives, and releasing Fable 5 less out of confidence than out of a quiet recognition that holding back may no longer be possible.

Anthropic has released Claude Fable 5 to the public—the same artificial intelligence system the company had previously declared too dangerous for widespread use. The move marks a significant shift in the company's approach to deploying its most capable model, one that sparked alarm among technology executives, financial regulators, and government officials when it first circulated privately in April under the name Claude Mythos.

The concern was straightforward: the system's power to identify and exploit computer vulnerabilities made it a potential threat to digital infrastructure. Anthropic acknowledged this risk explicitly, noting that "releasing a model this capable comes with risks." Yet the company proceeded anyway, arguing that new safeguards and user limitations would contain the danger. "Fable's capabilities exceed those of any model we've ever made generally available," Anthropic stated in its Tuesday announcement, a phrasing that underscores just how far the technology has advanced.

What happened during the private preview period offers a window into both the tool's power and the company's confidence in managing it. Roughly 150 organizations were given access to Claude Mythos for testing and evaluation. In that time, these groups identified more than 10,000 critical security flaws in their own systems—flaws the AI had uncovered. The discovery rate itself became evidence of capability: the model could see what human security teams had missed. Some observers questioned whether the subsequent public release was driven by genuine safety improvements or by the momentum of hype and competitive pressure in the AI industry.

The release strategy reveals a tiered approach to access. The roughly 150 preview organizations will now gain entry to Claude Mythos 5, a version without the restrictions that previously limited its ability to assist with cybersecurity and biological research. Anthropic framed this expanded access as limited to "a small group of cyberdefenders and infrastructure providers," though the company signaled plans to broaden the program further. The public version, Claude Fable 5, carries its own safeguards, though the company did not detail their specific mechanics.

One technical distinction matters: both Fable and Mythos can operate "unattended" on human instructions for longer stretches than any previous Claude iteration. This means the AI can work independently on complex tasks without constant human intervention—a capability that amplifies both its utility and its potential for unintended consequences.

Jack Clark, one of Anthropic's co-founders, recently articulated the deeper tension at play. Speaking to BBC Newsnight, he suggested that the AI industry has accelerated so rapidly that society needs mechanisms to slow it down. "You want the option to be able to take your foot off the gas and put your foot on the brake," Clark said. "Right now, it's like the AI industry has a gas pedal, but it doesn't have a brake pedal." His comment hints at a company grappling with its own power—aware that releasing advanced tools carries risks, yet uncertain whether restraint is even possible in a competitive landscape where other labs are building similar systems. The public release of Fable 5, then, may reflect not confidence that the safeguards are sufficient, but resignation that the alternative—holding back—is no longer viable.

Citas Notables

Fable's capabilities exceed those of any model we've ever made generally available

— Anthropic, in Tuesday announcement

You want the option to be able to take your foot off the gas and put your foot on the brake. Right now, it's like the AI industry has a gas pedal, but it doesn't have a brake pedal.

— Jack Clark, Anthropic co-founder

La Conversación del Hearth Otra perspectiva de la historia

Inventor

Why release something the company itself said was too dangerous?

Model

Because the private preview showed the tool could actually help find security flaws. The organizations testing it discovered over 10,000 vulnerabilities. That's powerful evidence the AI could be useful in the right hands.

Inventor

But that's also evidence it could cause harm in the wrong hands.

Model

Exactly. That's the tension Anthropic is sitting in. They've added safeguards and user limits, but they're also planning to expand access further. It's not clear those safeguards are airtight.

Inventor

What does Jack Clark mean about the brake pedal?

Model

He's saying the entire AI industry is moving so fast that nobody has built in mechanisms to slow down. If Anthropic holds back, competitors won't. So releasing becomes almost inevitable, even if it feels risky.

Inventor

Is this just marketing—making it sound dangerous to make it sound impressive?

Model

Some of the hype probably is marketing. But 10,000 security flaws found by a single tool? That's not spin. That's real capability, and it's genuinely hard to know what to do with it.

Inventor

So what happens next?

Model

They're expanding access to a broader group. The question is whether the safeguards hold, and whether the security benefits outweigh the risks. We won't know for a while.

Quieres la nota completa? Lee el original en BBC News ↗

Anthropic releases 'too powerful' Claude AI tool to public with safeguards

Citas Notables

Cobertura Relacionada

Recibe The Register en tu bandeja