cross-posted from: https://piefed.world/c/tech/p/1090825/study-finds-ai-outperforms-doctors-in-emergency-diagnoses
In one of the largest studies to compare artificial intelligence and physicians on a wide array of clinical reasoning tasks including real emergency department data, a team of physicians and computer scientists at Harvard Medical School and Beth Israel Deaconess Medical Center evaluated whether an AI system could do what physicians do every day: review a messy patient chart and use that information to determine diagnosis and next steps.
In a new study published April 30, 2026 in Science, co-senior authors Arjun (Raj) Manrai, assistant professor of biomedical informatics at HMS, and Adam Rodman, MD, MPH, a hospitalist and clinical researcher at BIDMC, and their team report that a large language model (LLM) outperformed physicians across many common clinical reasoning tasks, including emergency room decisions, identifying likely diagnoses, and choosing next steps in management.
Competing interests:
A.R. is a Visiting Researcher at Google DeepMind. E.H. is employed by Microsoft. J.C. is cofounder of Reaction Explorer LLC, a paid medical expert witness from Elite Experts, and received one-time honoraria or travel expenses for invited presentations by Insitro, General Reinsurance Corporation, AASCIF, and other industry conferences, academic institutions, and health systems. Z.K. discloses royalties from Oakstone Publishing and Wolters Kluwer. A.O. discloses employment of his spouse by Exact Sciences. R.-E.A. is employed by the Massachusetts Medical Society and has consulted for Lumeris.
Harvard told us in the 70s sugar was not harmful, funded by sugar companies.
some pretty big error bars here…

A Google- and Microsoft-affiliated scientist. Jesus fucking Christ.
I read about this previously, and I believe the most common criticism was that both the humans and the LLM used only the EMR for decision making. I’m not a doctor, but I don’t think they go only by the medical record when making decisions in real life.
I would love to hear from actual ED doctors what their take is.
ER doctors take into account test results, the patient interview, longitudinal changes while in the ER, and of course the physical exam.
Sure, tie one hand behind their back, put on a blindfold, and LLMs are just as good. From the eLetters:
the study demonstrates that LLMs can outperform physicians on structured, text‑based reasoning tasks. It does not show that they exceed clinicians in real‑world diagnostic practice, nor that they can safely operate in the noisy, multimodal, time‑sensitive environment of modern hospitals.
AAAS really publishes garbage sometimes. Of course, they won’t show us the peer review comments. Given NIH is shut down, this is the future of Science Magazine.
And how many times was it catastrophically wrong? How many times were the humans catastrophically wrong? Yeah, I thought so.
How well does AI perform compared to human corporate executives?
That’s where the real salary savings could come from.
4 out of 5 AIs agree!!