Madrid launches DxGPT, AI assistant that diagnoses rare diseases in minutes

The AI generates the full list; the doctor decides.
How DxGPT works: it expands diagnostic possibilities rather than narrowing them, leaving final judgment to physicians.

In Madrid, medicine is confronting one of its oldest frontiers: rare diseases, which for decades have condemned patients to years of diagnostic uncertainty. The region's public health system has launched DxGPT, an artificial intelligence tool based on GPT-4 and developed with Microsoft and Foundation 29, which generates lists of possible diagnoses from clinical descriptions in a matter of minutes. The aim is not to replace the doctor but to offer a second pair of eyes trained on a body of medical literature that no human being could cover alone.

  • Thousands of patients with rare diseases wait years, sometimes decades, for a correct diagnosis, a delay that can cost lives or cause irreversible loss of quality of life.
  • DxGPT arrives in Madrid's hospitals as what may be the world's first deployment of generative AI for rare disease diagnosis, raising the pressure on health systems elsewhere to respond.
  • Doctors enter clinical descriptions and the GPT-4 model returns a ranked list of possible diseases, shortening the path to the right specialist.
  • Parallel research shows that ChatGPT agreed with experienced emergency physicians on about 60% of diagnoses, which supports expectations for these tools but also tempers them.
  • The pilot began in September 2023 with real patients and doctors, turning Madrid into a clinical laboratory whose findings could reshape the global adoption of diagnostic AI.

Madrid's public health system has begun testing DxGPT, an artificial intelligence tool that promises to cut the time needed to identify rare diseases from months to minutes. Developed in collaboration with Microsoft and Foundation 29, it works as a conversational assistant based on GPT-4: the doctor describes a patient's symptoms, history, and test results, and the system returns a ranked list of possible diagnoses. The goal is not to replace clinical judgment but to amplify it, giving specialists a better-informed starting point for referring patients.

Miguel López Valverde, the region's head of digitalization, stressed that the rollout is being carried out under strict safety standards, and highlighted the particular value of early detection for diseases that often go unnoticed for years. The tool was already available online to the public before the hospital pilot began in September 2023.

The launch coincides with research shedding light on AI's diagnostic capabilities. A study from Jeroen Bosch Hospital in the Netherlands, published in Annals of Emergency Medicine, found that ChatGPT agreed with experienced emergency physicians on approximately 60% of diagnoses when analyzing anonymized cases. The researchers cautioned that the tool is not approved as a medical device, but acknowledged its potential to support doctors, especially less experienced ones, by reducing cognitive load and helping to detect conditions that might otherwise be overlooked.

For rare diseases, where a doctor may see no more than a handful of cases in an entire career, this kind of support could be truly transformative. Madrid has chosen not to wait for perfect technology or regulatory clarity: it is testing the tool with real patients, and the results of that experiment could chart the course for health systems around the world.

Madrid's public health system has begun testing an artificial intelligence tool designed to help doctors identify rare diseases in minutes rather than months. The system, called DxGPT, represents the first deployment of generative AI for rare disease diagnosis in Spain and possibly the world. It was developed through a partnership between Madrid's regional health authority, Microsoft, and Foundation 29, with backing from the regional government's digitalization office.

The tool works as a conversational assistant built on OpenAI's GPT-4 model, running through Microsoft's Azure cloud platform. A doctor enters a brief clinical description of a patient—symptoms, test results, medical history—and the system generates a ranked list of possible diagnoses. This list helps physicians decide which specialists to refer patients to, potentially accelerating treatment for conditions that are often diagnosed only after years of misdiagnosis or uncertainty. The system is not meant to replace medical judgment but to augment it, offering doctors a second set of eyes trained on vast medical literature.

Miguel López Valverde, the regional official overseeing digitalization, emphasized that the technology is being deployed with strict safety and responsibility standards. He noted that early detection of rare diseases is particularly valuable, since these conditions often go unrecognized for long periods. The system began limited testing in Madrid hospitals in late September 2023, though members of the public could already interact with it online to see how it works.

The timing of DxGPT's launch coincides with emerging research on AI's diagnostic capabilities. A study published in the journal Annals of Emergency Medicine examined how ChatGPT performed when given anonymized information about thirty emergency room patients. Researchers from Jeroen Bosch Hospital in the Netherlands found that the AI's suggested diagnoses overlapped with those of experienced emergency physicians about 60 percent of the time. Hidde ten Berg, one of the study's authors, found that ChatGPT was particularly effective at generating lists of probable diagnoses and identifying the most likely condition. The AI reasoned through cases in ways that resembled how human doctors think through differential diagnosis.

However, researchers cautioned that ChatGPT itself is not a medical device cleared for clinical use. Its potential lies in supporting physicians—particularly those early in their careers—by providing a structured list of possibilities to consider. The tool could reduce the cognitive load on doctors and help catch diagnoses that might otherwise be missed. For rare diseases, where individual doctors may see only a handful of cases in their entire career, this kind of support could be genuinely transformative.

The Madrid initiative represents a deliberate effort to move AI from research papers into actual clinical practice. Rather than waiting for perfect technology or regulatory clarity, the health system is testing the tool with real patients and real doctors, gathering data on whether it actually improves outcomes. The results will likely influence how other health systems around the world approach AI-assisted diagnosis, particularly for the diseases that remain most difficult to identify.

The AI is not a substitute for doctors, but rather enhances their capabilities and helps them work more efficiently.
— Microsoft, on DxGPT's role
ChatGPT was effective at generating lists of probable diagnoses and suggesting the most likely condition, reasoning through cases much as a human physician would.
— Hidde ten Berg, Jeroen Bosch Hospital
The Hearth Conversation: Another perspective on the story
Inventor

Why focus on rare diseases specifically? Wouldn't AI be more useful for common conditions that affect millions?

Model

Rare diseases are actually where AI shines most. A doctor might see thousands of common cases but only a handful of rare ones in their career. The AI has access to the entire medical literature instantly. For a disease that affects a few hundred people worldwide, that's invaluable.

Inventor

But the study showed only 60 percent overlap with experienced doctors. That seems like it could miss things.

Model

Sixty percent overlap is actually quite good—it means the AI is thinking like a trained physician, not just pattern-matching. The real value isn't replacing the doctor's judgment but expanding what they consider. A doctor might anchor on one diagnosis and miss others; the AI generates the full list.

Inventor

Are doctors worried this will deskill them? That they'll stop thinking carefully?

Model

That's a real concern, but it cuts both ways. A junior doctor using this tool might actually learn faster, seeing how the system reasons through cases. The danger is if doctors stop thinking altogether and just accept whatever the AI suggests. That's why Madrid is clear: this augments, it doesn't replace.

Inventor

What happens when the AI gets it wrong?

Model

The doctor is still responsible. The AI is a tool, like an X-ray machine. If a doctor relies on it blindly and misses something, that's on them. The system works best when doctors treat it as a collaborator, not an oracle.

Inventor

So this is really about speed—getting patients to the right specialist faster?

Model

Partly that. But for rare diseases, it's also about confidence. A doctor who's unsure might hesitate to refer a patient. The AI gives them permission to trust their instinct and move forward.

Want the full story? Read the original at 20 Minutos.