University of Kansas develops AI model to shield patient privacy in ECG data

Preserve what matters medically, strip away what shouldn't be there
The core challenge the KU team tackled: keeping clinical utility while removing personal information from shared ECG data.

The heart has long been medicine's most intimate map, but modern AI has learned to read that map in ways its makers never intended — inferring identity, age, and race from the same electrical signals meant only to reveal disease. Researchers at the University of Kansas have answered this quiet crisis with PP-VAE, a model designed to hold two obligations in balance: the clinical need to share cardiac data across institutions, and the ethical imperative to protect the people behind that data. Their work, published in Scientific Reports and supported by the American Heart Association, suggests that privacy and medical progress need not be adversaries.

  • AI systems can now extract a patient's age, sex, race, and even identity from ECG signals — a capability that turns routine cardiac data into an unexpected privacy liability.
  • Hospitals that share heart data for research risk exposing sensitive personal attributes, creating ethical and legal tensions that have quietly slowed medical collaboration.
  • KU doctoral researcher Fairuz Shadmani Shishir led the development of PP-VAE, using independent convolutional neural networks to suppress biographical signals while preserving clinically vital predictions like heart failure risk and early mortality.
  • The model matched state-of-the-art performance on cardiac diagnostics while significantly reducing the personal information that could be inferred, a result suggesting that privacy and clinical utility can be achieved together.
  • The team built in protections against diagnostic bias by balancing male and female representation and including diverse racial groups, addressing a long-standing harm to marginalized patients and women.
  • A planned public release of the model aims to lower the barrier for hospitals worldwide to collaborate safely — though broader validation across global populations remains the next frontier.

An electrocardiogram records the heart's electrical activity — but modern AI has learned to read far more into those signals than any cardiologist intended. Age, sex, race, and even individual identity can be inferred from the same data meant only to reveal cardiac health. For hospitals and research institutions that routinely share ECG data across organizations, this hidden capacity creates a genuine ethical problem.

Researchers at the University of Kansas built PP-VAE to resolve the tension: an AI model that preserves the clinical information doctors need while stripping away the biographical details that shouldn't travel with it. Led by doctoral student Fairuz Shadmani Shishir alongside collaborators from KU Medical Center, the team used independent convolutional neural networks to suppress personal attributes embedded in ECG signals without sacrificing diagnostic accuracy. The model was trained to assess left ventricular ejection fraction and detect left ventricular hypertrophy, two critical markers of heart disease and mortality risk, while learning to obscure demographic and biometric clues at the same time.

Tested against other leading approaches, PP-VAE performed competitively on cardiac prediction tasks while revealing significantly less personal information. The findings appeared in Scientific Reports. Shishir framed the work in plainly practical terms: institutions need to collaborate and advance AI without unnecessarily exposing sensitive patient attributes. The team also designed the model with equity in mind, ensuring balanced representation across sex and race in the training data — a deliberate response to the diagnostic bias that has historically disadvantaged women and marginalized communities.

The researchers acknowledged that their training data came primarily from KU Medical Center, which may limit how well the model generalizes elsewhere. Expanding to datasets from diverse global populations is the next step. To accelerate adoption, the team plans to release PP-VAE publicly, allowing hospitals to use it directly or adapt it to their own data — an open approach the American Heart Association-backed team sees as essential to their broader mission of making medical collaboration both safer and more just.

An electrocardiogram seems like a straightforward thing—a recording of the heart's electrical activity, nothing more. But modern AI systems have learned to read far more into those squiggly lines than cardiologists ever intended. They can infer a patient's age, sex, race, and even identity from the same signals that show whether the heart is healthy. This hidden capacity to extract personal information creates a real problem for hospitals and research institutions that want to share ECG data across organizations without exposing their patients' identities.

Researchers at the University of Kansas set out to solve this problem by building an AI model that could have it both ways: preserve the clinical information doctors need while stripping away the biographical details that shouldn't be there. The result, called PP-VAE, is a deliberate attempt to separate what matters medically from what merely reveals who the patient is.

Fairuz Shadmani Shishir, a doctoral student in electrical engineering and computer science at KU, led the work alongside collaborators from KU Medical Center. Their approach used independent convolutional neural networks to reduce how much personal information could be extracted from ECG signals while keeping the model's ability to predict clinically important outcomes intact. Specifically, they trained the system to accurately assess left ventricular ejection fraction (a key measure of heart function and early mortality risk) and to detect left ventricular hypertrophy, a thickening of the heart muscle that signals serious disease. At the same time, the model learned to obscure the demographic and biometric clues embedded in the data.
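The article does not spell out PP-VAE's training objective. One common way to express "keep the clinical signal, hide the personal attributes" is an adversarial loss, and the sketch below shows that general idea only. It is an assumption for illustration, not the team's published method; the function names, the `privacy_weight` parameter, and the toy numbers are all hypothetical.

```python
import math


def binary_cross_entropy(p, y):
    """Cross-entropy for one predicted probability p and a 0/1 label y."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def mean_squared_error(signal, reconstruction):
    """Average squared difference between a signal and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(signal, reconstruction)) / len(signal)


def privacy_utility_loss(signal, reconstruction, p_clinical, y_clinical,
                         p_private, y_private, privacy_weight=1.0):
    """Hypothetical encoder objective (assumed, not taken from the paper).

    Lower is better for the encoder: it wants a faithful reconstruction,
    an accurate clinical prediction, and an attacker that guesses the
    private attribute badly (hence the minus sign on the attacker's loss).
    """
    reconstruction_term = mean_squared_error(signal, reconstruction)
    utility_term = binary_cross_entropy(p_clinical, y_clinical)
    adversary_term = binary_cross_entropy(p_private, y_private)
    return reconstruction_term + utility_term - privacy_weight * adversary_term


# Toy, made-up values: same reconstruction and clinical prediction,
# but the attacker is confident in one case and at chance in the other.
signal = [0.1, 0.9, 0.2]
reconstruction = [0.1, 0.8, 0.2]
leaky = privacy_utility_loss(signal, reconstruction, 0.9, 1, 0.99, 1)
private = privacy_utility_loss(signal, reconstruction, 0.9, 1, 0.5, 1)
```

In this toy setup `private` comes out lower than `leaky`, because an attacker stuck at a coin-flip guess is rewarded by the objective. Raising `privacy_weight` would push the balance further toward privacy at the possible expense of clinical accuracy.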

The team tested their model against other state-of-the-art approaches and found it performed competitively. It could predict heart disease and mortality risk as well as existing methods while revealing significantly less personal information from the ECG signals themselves. The findings appeared in Scientific Reports, and they matter because hospitals and medical institutions routinely share cardiac data across organizations for research and collaboration. Without a way to protect privacy, that sharing becomes ethically fraught.
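A comparison of this kind can be framed as two numbers: how much clinical performance is lost, and how much an attacker's ability to guess a personal attribute falls. The sketch below uses plain accuracy as a stand-in metric and made-up toy labels. The paper's actual metrics and results are not reproduced here, and every name in the code is hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def privacy_utility_report(clinical_raw, clinical_protected, clinical_labels,
                           attribute_raw, attribute_protected, attribute_labels):
    """Compare predictions made from raw ECGs with those from protected ECGs.

    utility_drop: clinical accuracy lost after protection (smaller is better).
    leakage_drop: attacker accuracy lost after protection (larger is better).
    """
    return {
        "utility_drop": accuracy(clinical_raw, clinical_labels)
        - accuracy(clinical_protected, clinical_labels),
        "leakage_drop": accuracy(attribute_raw, attribute_labels)
        - accuracy(attribute_protected, attribute_labels),
    }


# Toy, made-up labels and predictions for four hypothetical patients.
report = privacy_utility_report(
    clinical_raw=[1, 0, 1, 0],
    clinical_protected=[1, 0, 1, 1],
    clinical_labels=[1, 0, 1, 0],
    attribute_raw=[0, 1, 0, 1],
    attribute_protected=[0, 0, 0, 0],
    attribute_labels=[0, 1, 0, 1],
)
```

A favorable outcome is a small `utility_drop` alongside a large `leakage_drop`. For a two-class attribute, an attacker accuracy near 0.5 after protection would mean the attacker is doing no better than chance.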

Shishir emphasized that the research was driven by a practical need. "Protecting patient privacy is essential when sharing medical data," he said. The goal was to enable institutions to collaborate and advance AI development without unnecessarily exposing sensitive personal attributes. The team also built the model with an eye toward reducing bias in medical care. They ensured balanced representation of male and female patients and included diverse racial groups in the training data, recognizing that bias in diagnosis and treatment has historically harmed marginalized communities and women.
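The article does not say how the balanced representation was achieved. One simple option, shown here purely as an assumed illustration, is to downsample every group to the size of the smallest one. The record layout, the `balance_by_group` helper, and the toy patients are hypothetical.

```python
import random
from collections import defaultdict


def balance_by_group(records, key, seed=0):
    """Downsample so every value of `key` appears equally often.

    Assumed approach for illustration; reweighting or oversampling
    are alternatives that keep more of the data.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)

    smallest = min(len(members) for members in groups.values())
    rng = random.Random(seed)  # fixed seed so the selection is repeatable

    balanced = []
    for members in groups.values():
        balanced.extend(rng.sample(members, smallest))
    return balanced


# Toy, made-up records: four female patients and two male patients.
records = [
    {"id": 1, "sex": "F"}, {"id": 2, "sex": "F"},
    {"id": 3, "sex": "F"}, {"id": 4, "sex": "F"},
    {"id": 5, "sex": "M"}, {"id": 6, "sex": "M"},
]
balanced = balance_by_group(records, key="sex")
```

Here `balanced` holds two records per sex. Downsampling discards data, which is one reason a real pipeline might prefer another balancing strategy.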

But the researchers acknowledged that training data came primarily from KU Medical Center, which limits how well the model might generalize to other populations. Future work will involve training on datasets from different regions around the world to better understand and address bias across diverse patient populations.

Before the technology can be widely adopted, the team said, building trust and accessibility will be crucial. They plan to release the model publicly so that hospitals and research institutions can use it directly or train their own versions on their own data. That open approach follows common practice in AI research and could accelerate adoption. The American Heart Association supported the work, and the researchers see the public release as essential to their mission: enabling safer collaboration on medical data while protecting the people whose hearts those ECGs represent.

Our goal was to develop a method that preserves clinically useful information in ECGs while reducing the exposure of sensitive personal attributes such as age, sex and demographic details.
— Fairuz Shadmani Shishir, doctoral student in electrical engineering & computer science at KU
Protecting patient privacy is essential when sharing medical data. Our goal was to enable secure sharing of clinically useful ECG information without unnecessarily exposing sensitive personal attributes.
— Fairuz Shadmani Shishir
The Hearth Conversation: Another perspective on the story
Inventor

Why does an ECG contain personal information at all? Isn't it just a picture of the heart?

Model

That's the intuitive assumption, but modern AI is much better at pattern recognition than we are. An ECG captures electrical signals from the heart, and those signals are shaped by the body they're running through—age changes the signal, sex changes it, even race can influence the pattern. AI learns to read those subtle differences.

Inventor

So the privacy risk is that someone could identify me from my ECG?

Model

Not just identify you, but infer things about you without your knowledge. A hospital might share your ECG for research thinking they're only sharing heart data, but an AI could extract your approximate age, sex, or demographic information from the same signal.

Inventor

How does PP-VAE actually prevent that?

Model

It uses neural networks to learn which parts of the ECG signal are clinically useful—the things doctors need to know about your heart—and which parts encode personal information. Then it suppresses the personal information while keeping the clinical signal intact.

Inventor

Does that make the model less accurate at detecting heart disease?

Model

That's the key question they tested. They found it performs as well as existing models at predicting mortality risk and detecting serious heart conditions, even while hiding demographic information. You don't have to choose between privacy and accuracy.

Inventor

Why does bias in ECG analysis matter?

Model

If the model is trained mostly on data from one population, it might miss disease in people from other backgrounds. By deliberately including diverse patients in training, they're trying to make sure the model works equally well for everyone.

Inventor

What happens next?

Model

They plan to release the model publicly so hospitals everywhere can use it. That's how you actually change practice—by making the tool available and trustworthy enough that institutions adopt it.

Want the full story? Read the original at News-Medical.