Improving Liver Organ Transplantation Allocation with Artificial Intelligence

AM Gomez-Orellana et al. Clin Gastroenterol Hepatol 2025; 23: 2187-2196. Open Access! Gender-Equity Model for Liver Allocation Using Artificial Intelligence (GEMA-AI) for Waiting List Liver Transplant Prioritization

Background: “The current gold standard for ranking patients in the waiting list according to their mortality risk is the Model for End-Stage Liver Disease corrected by serum sodium (MELD-Na), which combines 4 serum analytic and objective parameters, namely bilirubin, international normalized ratio (INR), creatinine, and sodium…”2

“The Model for End-Stage Liver Disease (MELD) 3.0 was developed and internally validated in the United States,4 and the gender-equity model for liver allocation corrected by serum sodium (GEMA-Na) was trained and internally validated in the United Kingdom and externally validated in Australia.5… GEMA-Na was associated with a more pronounced discrimination benefit than MELD 3.0, probably owing to the replacement of serum creatinine with the Royal Free Hospital cirrhosis glomerular filtration rate (RFH-GFR)6 in the formula.”5

Methods:

Key findings:

  • GEMA-AI predicted waiting list outcomes more accurately than the currently available models and could alleviate gender disparities in access to liver transplantation (LT)

Discussion Points:

  • “The components of the current scores available for waiting list prioritization provide objective and reproducible information…which in turn are associated with the probability of mortality or clinical deterioration resulting in transplant unsuitability.18 However, this relationship is nonlinear…at a certain point, for the highest values typically found in the sickest patients, the relationship with the outcome risk becomes exponential.5 …GEMA-AI was the only adequately calibrated model and showed the greatest advantage on discrimination”
  • An “advantage of nonlinear methodologies, and particularly of ANNs [artificial neural networks], is their ability to identify patterns of combinations of values that are associated with an increased risk of death or delisting due to clinical worsening. While linear models give a fixed weight to each variable irrespective of its value or the value of other variables in the model, ANNs could capture specific combinations to modulate the weighting.”19
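The fixed-weight model being contrasted here can be made concrete. Below is a minimal Python sketch of the commonly published MELD and MELD-Na formulas (coefficients and lab-value bounds as widely cited; this is illustrative only, not an allocation-grade implementation). Notably, even MELD-Na already contains one hand-specified interaction term (MELD × sodium), whereas an ANN such as GEMA-AI learns interactions of this kind from the data itself.

```python
import math

def meld(bilirubin, inr, creatinine):
    """Classic MELD score: a log-linear model with fixed weights.
    By convention, lab values below 1.0 are floored at 1.0 and
    creatinine is capped at 4.0 mg/dL."""
    b = max(bilirubin, 1.0)
    i = max(inr, 1.0)
    c = min(max(creatinine, 1.0), 4.0)
    return 3.78 * math.log(b) + 11.2 * math.log(i) + 9.57 * math.log(c) + 6.43

def meld_na(bilirubin, inr, creatinine, sodium):
    """MELD-Na: adds sodium (bounded to 125-137 mmol/L) plus a single
    fixed interaction term between MELD and sodium."""
    m = meld(bilirubin, inr, creatinine)
    na = min(max(sodium, 125.0), 137.0)
    return m + 1.32 * (137.0 - na) - 0.033 * m * (137.0 - na)
```

Because every coefficient is fixed, a given bilirubin value contributes the same risk regardless of the patient’s other labs; the ANN-based approach relaxes exactly that constraint.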

My take: In the movie I, Robot, Detective Spooner instructs the robot: “Sonny, save Calvin.” While things worked out in the movie, it turns out that the robot would usually make the better decision. This study shows that AI has the potential to reduce waiting list mortality by weighting variables non-linearly.

Related blog posts:

Dr. Jennifer Lee: AI for Peds GI

Recently, Dr. Jennifer Lee gave our group an excellent update on artificial intelligence (AI) for pediatric gastroenterology.  My notes below may contain errors in transcription and in omission.

  • AI is ubiquitous - it helps you log in to your phone, helps with traffic apps, filters spam from email, and even edits Bowel Sounds (gets rid of the ‘umms’)
  • AI can help and AI can harm
  • Dr. Lee thinks that AI is not going to replace doctors and may help doctors in their clinical work
  • AI is teaching computers to think and predict problems. This can include analyzing radiology images, endoscopic findings (eg, polyps), interpreting EKGs, helping with voice recognition, and scribing office visits (still in early stages)
  • For EoE, it was hypothesized that PPI-REE was a distinct entity from EoE. However, it turned out that no significant differences were found. Thus, diagnosis of EoE no longer requires a trial of PPIs to exclude PPI-REE. (Related blog posts: Do We Still Need PPI-REE?, Updated Consensus Guidelines for Eosinophilic Esophagitis)
  • For colonoscopy, AI may aid polyp detection but whether this is clinically meaningful is unclear
  • With more complex analysis, AI is less transparent
  • AI algorithms can increase bias
  • Reliance on AI could lead to skill deterioration. MIT did a study showing less brain activity when using ChatGPT
  • Generative AI can create a summary of a patient chart. EHRs are partnering with AI
  • Agentic AI is when AI is set up to act autonomously like reminding patients to get vaccines, reminding to make appointments, or helping schedule appointments
  • AI in the clinic and hospital may help reduce documentation burden, improve satisfaction and improve safety for patients
  • AI does have a problem of hallucination (‘making stuff up’) (my comment: so can people). Case report of a man admitted to the hospital after following ChatGPT advice to substitute sodium bromide for table salt to reduce his sodium intake (Eichenberger et al. Ann Intern Med 2025. A Case of Bromism Influenced by Use of Artificial Intelligence)
  • AI tools are still in early stages; however, ChatGPT uptake has been much quicker than previous internet tools

Related blog posts:


Related article: A Soroush et al. Clin Gastroenterol Hepatol 2025; 23: 1472-1476. Impact of Artificial Intelligence on the Gastroenterology Workforce and Practice

AI for GI

This month’s Gastroenterology issue is devoted solely to the uses, expected uses, and risks of artificial intelligence (AI) in gastroenterology and hepatology.

DL Shung, M Iacucci. Gastroenterol 2025; 169: 391-392. Artificial Intelligence in Gastroenterology and Hepatology: Potential and Perils

An excerpt:

“AI is reshaping the landscape of gastroenterology and hepatology with the promise of better, faster, more objective, and standardized delivery of care. However, behind the algorithms lies a more insidious risk: the erosion of trust in human providers…Information risks…include both error commission (ie, when the models generate false statements, introduce nonsensical concepts, or fabricate sources) and error omission (ie, summaries that omit critical information)…

When AI becomes the center of care, patients may perceive their doctors as intermediaries…diminishing the therapeutic effect of the patient-physician relationship…This arrangement can dilute clinical training, increase physician burnout, and lead to medicolegal implications…Other risks include perpetuating bias from nonrepresentative training data and amplifying uncertainty of AI due to lack of real-world validation…

We hope that AI systems will allow us to spend more, not less, time with patients and empower us to provide personalized care by leveraging high-quality multimodal data.”

Most of the articles in this issue are behind a paywall; five are open access.

My take: These articles provide a good deal of information about the applications and risks of AI. In my view, physicians will be needed more than ever to help interpret/manage the huge amount of information available.

Related blog posts:

The Future of Medicine: AI’s Role vs Human Judgment

Pranav Rajpurkar and Eric J. Topol. NY Times Feb 2, 2025: The Robot Doctor Will See You Now

An excerpt:

“The medical community largely anticipated that combining the abilities of doctors and A.I. would be the best of both worlds, leading to more accurate diagnoses and more efficient care…That assumption might prove to be incorrect. A growing body of research suggests that A.I. is outperforming doctors, even when they use it as a tool…

Simply giving physicians A.I. tools and expecting automatic improvements doesn’t work. Physicians aren’t completely comfortable with A.I. and still doubt its utility, even if it could demonstrably improve patient care…

Medical training will need to adapt to help doctors understand not just how to use A.I., but when to rely on it and when to trust their own judgment…But the promise for patients is obvious: fewer bottlenecks, shorter waits and potentially better outcomes. For doctors, there’s potential for A.I. to alleviate the routine burdens so that health care might become more accurate, efficient and — paradoxically — more human.”

In a related Substack article (open access: Opinion | When A.I. Alone Outperforms the Human-A.I. Partnership), Dr. Topol and Dr. Rajpurkar list several studies showing that AI alone may be better than physicians with AI under some circumstances:

My take (borrowed from the authors): “The future of medicine won’t be shaped by a simple choice between human or artificial intelligence, but by our ability to understand their respective strengths and limitations, and to orchestrate their collaboration in ways that truly benefit patient care.”

Related blog posts:

Dr. Sana Syed: AI Advancements in Pediatric Gastroenterology

Recently, Dr. Sana Syed gave Children’s Healthcare of Atlanta Grand Rounds. She provided an excellent update on the development of artificial intelligence (AI) to select targeted therapies for pediatric gastroenterology diseases.  My notes below may contain errors in transcription and in omission. Along with my notes, I have included many of her slides.

Key points:

  • One of the goals of using AI is to identify the right therapy at the time of diagnosis. Currently, diseases like eosinophilic esophagitis (EoE) and Crohn’s disease have multiple treatment options. However, many patients do not respond to first-line treatments, and many develop complications as a result.
  • Currently we are lacking adequate biomarkers for individualized therapy. AI has the potential to sort through massive amounts of data (histologic, genetic, pharmacokinetics, transcriptome, metabolomics, etc) to allow for precision therapy.
  • For EoE, machine-learning has already identified three subtypes that may affect clinical management. EoE1 is associated with a normal-appearing esophagus. EoE2 is associated with being steroid refractory. EoE3, when compared to the other two endotypes, is associated with adult-onset and narrow-caliber esophagus or stricturing.
  • For Crohn’s disease, research has shown that younger age is associated with an increased risk of not responding to anti-TNF therapy.
Dr. Syed opened with a quote from President Obama’s January 2015 announcement of major funding for precision medicine: the promise of precision medicine is “delivering the right treatments at the right time, every time to the right person.” A figure illustrated some of the kinds of data that Dr. Syed had access to as faculty at UVA, including genomics, epigenomics, transcriptomics, proteomics, and metabolomics.

Shoda and colleagues used a combination of histology data, endoscopic features, histologic and endoscopic scoring indices, and transcripts that make up the eosinophilic esophagitis diagnostic panel, a quantitative PCR assay with 96 EoE-representative genes. The key message is that EoE can be divided into three distinct endotypes after analyzing transcriptomic changes via partition-around-medoids clustering, a machine-learning method.

In one project, the researchers intend to curate a novel metabolic network specific to the ileum, which is relevant to Crohn’s disease; link metabolic processes with Crohn’s disease phenotypes using in silico metabolic network modeling and ‘omics; and characterize and target metabolic pathways in an organoid model generated from patient-derived Crohn’s disease tissue.

In CoMPAS, the researchers aim to leverage artificial intelligence (AI) methods to build predictive models for CD using histology slides and single-cell RNA sequencing, allowing for risk stratification of B1 patients who will respond to anti-TNF therapy.

The goal of her other project is to create a multi-omics reference dataset with scRNA-seq data coupled with contextual data on tissue morphology, ancestry, social determinants of health, and the environment. The cohort for this study is enrolling patients with clinical indications for endoscopy such as foreign body removal, reflux, and abdominal pain.
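For readers curious about the partition-around-medoids clustering used in the EoE endotype work, here is a deliberately simplified sketch: an alternating assign/update loop over a precomputed pairwise distance matrix, not the full PAM swap search used in production implementations.

```python
import numpy as np

def k_medoids(dist, k, n_iter=100, seed=0):
    """Minimal partition-around-medoids clustering.
    dist is an (n, n) pairwise distance matrix; returns the
    medoid indices and a cluster label for each sample."""
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        # Assign each sample to its nearest medoid
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.where(labels == j)[0]
            if members.size:
                # New medoid minimizes total distance within the cluster
                within = dist[np.ix_(members, members)].sum(axis=1)
                new_medoids[j] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids, np.argmin(dist[:, medoids], axis=1)
```

Unlike k-means, the cluster centers are always actual samples (medoids), which is convenient when the “distance” comes from transcriptomic dissimilarity rather than Euclidean coordinates.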

My take: This work is necessary to identify the right treatments for each patient and should lead to better outcomes. We are already seeing the early stages of machine learning’s impact on clinical care. In many other fields, AI work is much further along (especially in oncology). A recent study in Nature identified JAK inhibitors as a potentially life-saving therapy for toxic epidermal necrolysis (TEN).

Reference: Nordmann, T.M., Anderton, H., Hasegawa, A. et al. Spatial proteomics identifies JAKi as treatment for a lethal skin disease. Nature (2024). https://doi.org/10.1038/s41586-024-08061-0

Summary from Eric Topol (Ground Truths) focusing on spatial omics: Thierry Nordmann, Matthias Mann, and their international consortium used deep visual proteomics from 3-μm FFPE sections of skin biopsies in patients affected by TEN…

More than 5,000 proteins were quantified from single cells—keratinocytes and immune cells—using mass spec, for the 4 different skin conditions (proteome cluster in the figure, left panel). This led to the finding that the TEN patients had a marked increase in Type 1 and 2 interferon signaling and activation of phosphorylated STAT1, which invoked the Janus kinase (JAK/STAT) pathway. Subsequent steps were to test JAK inhibitors in cell culture (with live-cell imaging) and in two different mouse models, all showing highly potent, dose-dependent inhibition of the intense inflammatory process and disease severity…

They went on to treat seven patients at Fujian Medical University; the course of one patient, treated with a JAKi on day 4 after diagnosis, manifested a marked response starting within 48 hours. All 7 patients fully resolved, with no side effects…

For spatial medicine, there are multiple analytical challenges that invoke the need for machine learning and A.I., including segmentation of cell types, automated capture of cells of microdissection, extracting useful information from the >5,000 proteins quantified per cell, and ultimately, as we’ll see more in the future, A.I. powering the construction of high-resolution 3D maps.

Related blog posts:

Will Future Pathology Reports Include Likely Therapeutic Recommendations?

X Liu et al. Gastroenterol 2024; 166:921-924. Machine Learning–Based Prediction of Pediatric Ulcerative Colitis Treatment Response Using Diagnostic Histopathology

M Iannucci et al. Gastroenterology 2024: 166: 730-732. Editorial. Open Access! A Baby Step or a Real Giant Stride: Histomic Enabled by Artificial Intelligence to Predict Treatment Response in Pediatric Patients With Ulcerative Colitis

In this article, the machine learning algorithm was trained on 187,571 informative patches from rectal hematoxylin and eosin biopsy samples from 292 treatment-naive pediatric patients with UC in a multicenter inception cohort (the PROTECT study).

Key findings (summarized by editorial):

  • The authors first trained the machine learning models on 250 histomic features at the patch level and achieved an area under the receiver operating characteristic curve (AUROC) of 0.87 (95% confidence interval [CI], 0.73–1.00) and an accuracy of 0.90 (95% CI, 0.80–1.00) at the WSI (whole-slide image) level in predicting treatment response.
  • A subset of 18 histomic features exhibited performance comparable to the model using the full set of 250 features, with an AUROC of 0.89 (95% CI, 0.71–0.96) and accuracy of 0.90 (95% CI, 0.80–1.00), indicating the potential for standardized practical application in clinical settings.
  • The authors confirmed that the set of 18 histomic features demonstrated comparable performance on the real-world independent SickKids cohort7 (University of Toronto) with an AUROC of 0.85 (95% CI, 0.75–0.95) and accuracy of 0.85 (95% CI, 0.75–0.95) at the WSI level.

An important limitation of this study was that the population was 83% white, indicating that it may have less applicability to other cohorts.
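For readers unfamiliar with the metrics quoted above: AUROC is the probability that a randomly chosen responder is scored higher than a randomly chosen non-responder, and the confidence intervals reported in papers like this are often obtained by bootstrapping. A small illustrative sketch (not the authors’ code):

```python
import numpy as np

def auroc(y_true, y_score):
    """AUROC as the probability that a random positive outranks a
    random negative (ties count half)."""
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

def bootstrap_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for AUROC."""
    rng = np.random.default_rng(seed)
    stats = []
    n = len(y_true)
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        if y_true[idx].min() == y_true[idx].max():
            continue  # resample lacked both classes; skip it
        stats.append(auroc(y_true[idx], y_score[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi
```

A perfectly separating model scores 1.0, a coin flip 0.5, so the 0.85–0.89 AUROCs above sit well clear of chance while leaving room for misclassification.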

My take (borrowed from editorial): Although this study focused solely on the therapeutic outcomes of mesalamine on treatment-naïve patients, it is anticipated that a comparable methodology (based on the fusion of machine learning and digital histopathology) could be applied in subsequent research to elucidate the necessity for colectomy, evaluate responses to biological agents, and optimize drug selection from the current armamentarium. In other words, this approach utilizing routine biopsy specimens may offer a pathway toward a precision personalized approach to managing pediatric-onset UC (and possibly IBD more broadly).

Torrey, Utah

Medical Diagnostic Errors

Eric Topol 1/28/24: Toward the eradication of medical diagnostic error

Key points/excerpts:

  • There is little evidence that we are reducing diagnostic errors despite more lab testing and more imaging. “One of the important reasons for these errors is failure to consider the diagnosis when evaluating the patient.” This, in turn, may be related to brief office visits.
  • There are a few ways that artificial intelligence (AI) is emerging to make a difference in diagnostic accuracy…A systematic analysis of 33 randomized trials of colonoscopy, with or without real-time AI machine vision, indicated there was more than a 50% reduction in missed polyps and adenomas, and the inspection time added by AI to achieve this enhanced accuracy averaged only 10 s.
  • AI support to radiologists for a large mammography study “showed improvement in accuracy with a considerable 44% reduction of screen-reading workload.” The cancer detection rate was 6.1 per 1000 compared to 5.1 per 1000 in the control group.
  • In difficult NEJM CPC cases, a large language AI model (LLM) outperformed clinicians: “The LLM was nearly twice as accurate as physicians for accuracy of diagnosis, 59.1 versus 33.6%, respectively.”
  • “Likewise, the cofounder of OpenAI, Ilya Sutskever, was emphatic about AI’s future medical superintelligence: ‘If you have an intelligent computer, an AGI [artificial general intelligence], that is built to be a doctor, it will have complete and exhaustive knowledge of all medical literature, it will have billions of hours of clinical experience.’ “

My take (borrowed from Dr. Topol): “We are certainly not there yet. But in the years ahead, …it will become increasingly likely that AI will play an invaluable role in providing second opinions with automated, System 2 machine-thinking, to help us move toward the unattainable but worthy goal of eradicating diagnostic errors.”

Related blog posts:

ChatGPT for Colonoscopy Questions Plus One

T-C Lee et al. Gastroenterol 2023; 165: 509-511. Open Access! ChatGPT Answers Common Patient Questions About Colonoscopy

In this study, ChatGPT answers to questions about colonoscopy were compared to publicly available webpages of 3 randomly selected hospitals from the top-20 list of the US News & World Report Best Hospitals for Gastroenterology and GI Surgery.

Methods: To objectively interpret the quality of ChatGPT-generated answers, 4 gastroenterologists (2 senior gastroenterologists and 2 fellows) rated 36 pairs of CQs and answers, randomly displayed, for the following quality indicators on a 7-point Likert scale: (1) ease of understanding, (2) scientific adequacy, and (3) satisfaction with the answer (Table 1). Raters were also asked to judge whether the answers were AI generated.

Key findings:

  • ChatGPT answers were comparable to non-AI answers but had higher mean scores for ease of understanding, scientific adequacy, and satisfaction
  • The physician raters demonstrated only 48% accuracy in identifying ChatGPT-generated answers

My take: This is yet another study, this time focused on gastroenterology, showing how physicians and patients may benefit from leveraging chatbots to improve communication.

Related blog posts:

Also this:

Answering Patient Questions: AI Does Better Than Doctors

Eric Topol (4/28/23): When Patient Questions Are Answered With Higher Quality and Empathy by ChatGPT than Physicians

RE: JW Ayers et al. JAMA Intern Med 2023; doi:10.1001/jamainternmed.2023.1838 Open Access! Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum

Thanks to Jeff Lewis for sharing Eric Topol’s article. Here are some excerpts:

In JAMA Internal Medicine, a new report compared the quality and empathy of responses to patient questions for doctors vs ChatGPT, a generative AI model that has already been superseded by GPT4…

Researchers at UCSD used the Reddit social media platform (Reddit’s r/AskDocs) to randomly select 195 patient questions that had been answered by verified, volunteer physicians and also posed them to ChatGPT in an identical fashion. The answers were reviewed by a panel of 3 health care professionals blinded to whether the response was from a doctor or the chatbot…

The results were pretty striking. For quality, the evaluators preferred the ChatGPT response 79% of the time; the proportion of responses rated good or very good was 79% for the chatbot vs 22% for physicians, a near 4-fold difference. The gap was even greater for empathy: the proportion of empathetic or very empathetic responses was nearly 10-fold higher for the chatbot (45.1 vs 4.6%). As would be expected, the responses by doctors were significantly shorter (an average of 52 vs 211 words, respectively). Of course, doctors have less time available, and machines can generate such long-form output in seconds…

Several examples were presented; one representative case (an eye injury) actually happened to my mother-in-law several years ago, and it was quite a frightening incident with extensive corneal injury. There were definite limitations of the study, since the questions were derived from a social media platform, not from physicians who were actually involved in the care of the patient asking questions. And, of course, this was not comparing the additivity of a chatbot plus a physician…

My take: This study indicates that physicians and patients may benefit from leveraging chatbots to improve communication. Artificial intelligence assistants can aid in drafting responses to patient questions, though physicians, at this time, are more capable of providing patient-specific information. For more general questions, chatbots appear to do quite well. My personal office-related experience with ChatGPT includes drafting an appeal letter, providing educational material on encopresis, helping with a condolence letter, and researching a rare condition. Just two days ago (at the time of drafting this post), I asked ChatGPT to explain ulcerative colitis and its treatments, including mesalamine and steroids, at an 8th-grade reading level in Spanish. I found this was a good way to start a conversation with a family.

This study was covered in many news outlets, including WSJ, CNN, Yahoo, and the Daily Mail…

Related blog posts:

Have you tried out ChatGPT?

There has been a lot written about ChatGPT (see some NY Times links below). Earlier this week, one of my sons placed the following prompt:

“Write me a funny poem about a doctor from Atlanta who deals with diarrhea all day long”

This is the ChatGPT response:

There once was a doctor from Atlanta
Whose patients all suffered from diarrhea
They came in by the dozens
Leaving behind a stench that was quite humbling
But this doctor was quite unflappable

He had a cure for every case
And a smile on his face
As he worked tirelessly
To help each patient feel less queasy

His office was filled with the sounds of relief
As patients left feeling much less grief
They praised his skill and his bedside manner
And promised to spread the word about this Atlanta diarrhea-banner

So if you’re in Atlanta and feeling quite ill
And your stomach is acting quite shrill
Don’t hesitate, don’t be shy
Just make your way to this doctor’s side
And say goodbye to that diarrhea for good!

NY Times Articles:

“It can extrapolate and take ideas from different contexts and merge them together.” The new chatbots do this with what seems like complete confidence. But they do not always tell the truth. Sometimes, they even fail at simple arithmetic. They blend fact with fiction. And as they continue to improve, people could use them to generate and spread untruths.

An example cited in this article:

My take: ChatGPT (& similar programs) is a huge technological advance. In educational settings, it could help explain complicated topics. Of course, there is a concern that it could undermine education if students do not go through the process of doing their own work. In healthcare, it is possible that these programs could advance patient education and help in the diagnostic process.

Another example (on 12/16/22) -this took less than 30 seconds: