In cardiac arrest, good is not good enough.
Each year, hundreds of thousands of Americans suffer cardiac arrest far from hospital walls, and most do not survive, often not for lack of medicine but for lack of a steady voice in the critical first minutes. Researchers from UC San Diego, the University of Pittsburgh, and Johns Hopkins have now shown that an AI system called ChatCPR can guide bystanders through resuscitation more precisely than human 911 dispatchers: measured against real emergency calls, it met 100% of basic and 99% of advanced guideline requirements, where trained dispatchers met 85% and 63%. The finding does not indict the dispatchers; it illuminates the difficult conditions under which they work. The researchers are careful to position the technology not as a replacement for human judgment, but as a floor beneath it.
- With only 9% of out-of-hospital cardiac arrest victims surviving, every second of imprecise instruction is a second the heart cannot afford.
- Human 911 dispatchers — managing chaos, stress, and multiple demands at once — met only 63% of advanced CPR guideline requirements in real emergency calls, a gap the study calls potentially fatal.
- ChatCPR, built from dispatcher training materials and current resuscitation guidelines, scored 100% on basic steps and 99% on advanced techniques when measured against those same real-world calls.
- The AI was released as open-source software, inviting developers and health systems to adapt it, but researchers warn that before deployment it must be tested in real-world chaos, with panicked bystanders, and within existing 911 infrastructure.
- A legal frontier is opening alongside the clinical one: bystander CPR protections are well established, but whether those protections extend to AI-guided resuscitation remains an unanswered regulatory question.
Every year, more than 350,000 Americans suffer cardiac arrest outside a hospital. Only nine percent survive. The margin between those outcomes often comes down to the first few minutes — whether someone nearby can perform CPR, and whether the voice on the other end of a 911 call can guide them precisely enough to matter.
A multi-institutional research team set out to test whether artificial intelligence could close that gap. They began by benchmarking several major AI models against a checklist of CPR best practices. The models averaged about 90% on the basics, but performance dropped to around 70% on advanced techniques, the kind that meaningfully improve survival odds. From there, the team built ChatCPR from the ground up, grounding it in dispatcher training materials and current CPR guidelines and iteratively correcting the failures they observed in other systems.
In simulated scenarios, ChatCPR scored perfectly on both basic and advanced steps. But the more revealing test came when the team compared its outputs against de-identified recordings of real 911 calls in which human dispatchers had already provided CPR guidance. Dispatchers met 85% of basic guideline requirements and 63% of advanced ones; ChatCPR met 100% and 99%, respectively. The AI's advantage was largest in precisely the areas where stressed, multitasking dispatchers most often fell short: chest compression depth, rate, and the critical instruction to allow full chest recoil between compressions.
The researchers, whose study was published in JAMA Internal Medicine, were deliberate in framing this as augmentation rather than replacement. The goal, as one lead researcher described it, is to raise the floor of performance in moments where human judgment remains essential. Real-world testing is still needed: the system must function amid the noise and panic of actual emergencies and integrate with existing 911 infrastructure.
To accelerate that work, the team released ChatCPR as free, open-source software. They also flagged an unresolved legal question: bystanders enjoy strong protections when performing CPR, but whether those protections extend to AI-assisted resuscitation is unclear. Regulatory frameworks, the researchers concluded, must be built before the technology can fulfill its promise — closing the deadly distance between collapse and care.
Every year, more than 350,000 Americans collapse from cardiac arrest outside a hospital. Nine percent survive. The difference between life and death often comes down to those first minutes—whether someone nearby knows CPR, whether a dispatcher can talk them through it clearly, whether the instructions are precise enough to matter.
A team of researchers from UC San Diego, the University of Pittsburgh, Johns Hopkins, and other institutions set out to test whether an artificial intelligence system could do better than the human beings currently on the other end of 911 calls. What they found was striking: an AI agent called ChatCPR scored perfectly on guideline-based CPR instruction in simulation and nearly so when measured against real calls, while the dispatchers on those calls fell short by significant margins.
The study, published in JAMA Internal Medicine, began by benchmarking popular AI models—ChatGPT, Claude, Grok, Gemini, Llama, and Mixtral—against a checklist of CPR best practices. These models performed reasonably well on the fundamentals. On average, they scored 90 percent on essential steps like where to press on the chest and how fast to compress. But when the researchers tested them on more advanced techniques—the kind that actually improve survival odds, like ensuring the chest fully recoils between compressions—performance dropped to 70 percent on average. In cardiac arrest, that gap can be fatal.
The researchers then built ChatCPR from scratch, grounding it in 911 dispatcher training materials and current CPR guidelines, and iteratively fixing the places where other AI systems had stumbled. In simulated scenarios, ChatCPR achieved 100 percent on both basic and advanced steps. But simulation is not reality. The real test came when the team took a separate set of de-identified 911 calls—actual emergencies where dispatchers had already provided CPR instruction—and compared what those dispatchers said to what ChatCPR would have said.
ChatCPR outperformed human dispatchers in every single comparison. On basic CPR steps, dispatchers met 85 percent of guideline requirements; ChatCPR met 100 percent. On advanced techniques, the gap widened dramatically: dispatchers scored 63 percent while ChatCPR scored 99 percent. The AI excelled in areas where stressed, multitasking dispatchers most often faltered—patient assessment, precise instructions on chest compression depth and rate, and guidance on chest recoil. It was not about style or tone. It was about adherence to the steps that save lives.
But the researchers were careful to frame this not as a replacement for human responders but as a complement to them. Christopher Horvat, director of medical emergency response teams at UPMC Children's Hospital and associate director of the Safar Center for Resuscitation Research at the University of Pittsburgh, emphasized that the goal was to raise the floor of performance in high-stakes situations where human judgment remains essential. The team stressed that real-world testing is still needed—the system must work in chaotic settings, be easy for panicked bystanders to follow, and integrate safely with existing 911 infrastructure.
To that end, the researchers made ChatCPR open-source and free. Any developer or organization can use it, adapt it, and study it. The team also flagged a legal question that will need answering as the technology moves from research to deployment: bystanders have strong legal protections when they perform CPR, but how those protections extend to AI-enabled CPR remains unclear. Regulatory frameworks will be essential. The real promise, the researchers concluded, lies in closing the deadly gap between the moment someone collapses and the moment lifesaving care actually begins.
Notable Quotes
"If AI is going to earn its place in medicine, it should start by helping people save the person right in front of them." - John W. Ayers, UC San Diego Qualcomm Institute scientist and study coauthor
"This is about supporting people in high-stakes situations where human judgment is essential. The goal is to raise the floor of performance, not to replace trained professionals." - Christopher M. Horvat, Director of Medical Emergency Response Teams at UPMC Children's Hospital
The Hearth Conversation
Another angle on the story
Why does a 15-percentage-point gap on basic steps matter so much? Dispatchers are already doing pretty well at 85 percent.
Because in cardiac arrest, you don't get a second chance. Missing one instruction, such as the right compression depth, can mean the difference between brain damage and recovery. When you're talking about more than 350,000 people a year, even a modest gap could add up to a great many lives.
So this is really about the advanced steps, then. The 36-point gap on those.
That's where the real difference shows up. Dispatchers under stress miss things like making sure the chest fully recoils. ChatCPR doesn't get stressed. It doesn't have five calls in the queue. It focuses entirely on that one person.
Does that mean we should replace dispatchers with AI?
No. The researchers were explicit about that. Dispatchers do things AI can't—they assess the scene, they coordinate with paramedics, they handle the human side of a crisis. ChatCPR is meant to be a tool they use, or that a bystander uses while waiting for help.
What's the biggest risk here?
That people trust it blindly, or that it gets deployed before we understand how it actually performs in the chaos of a real emergency. A simulated call is clean. A real one is someone screaming, a child crying, a person who doesn't know CPR trying to follow instructions while their hands are shaking.
And legally?
That's unsolved. Right now, if you try CPR and it doesn't work, you're protected. But if an AI tells you to do something and it goes wrong, who's liable? That question has to be answered before this goes mainstream.