Health Advice From AI Chatbots Frequently Wrong

T Rosenbluth, NY Times 2/9/26: Health Advice From A.I. Chatbots Is Frequently Wrong, Study Shows

An excerpt:

A new study published Monday provided a sobering look at whether A.I. chatbots, which have fast become a major source of health information…

The experiment found that the chatbots were no better than Google — already a flawed source of health information — at guiding users toward the correct diagnoses or helping them determine what they should do next. And the technology posed unique risks, sometimes presenting false information or dramatically changing its advice depending on slight changes in the wording of the questions…

The models have passed medical licensing exams and have outperformed doctors on challenging diagnostic problems.

But Adam Mahdi, a professor at the Oxford Internet Institute and senior author of the new Nature Medicine study, suspected that these clean, straightforward medical questions were not a good proxy for how well they worked for real patients…

So he and his colleagues set up an experiment. More than 1,200 British participants, most of whom had no medical training, were given a detailed medical scenario, complete with symptoms, general lifestyle details and medical history. The researchers told the participants to chat with the bot to figure out the appropriate next steps, like whether to call an ambulance or self-treat at home. They tested commercially available chatbots like OpenAI’s ChatGPT and Meta’s Llama.

The researchers found that participants chose the “right” course of action — predetermined by a panel of doctors — less than half of the time…They were no better than the control group, who were told to perform the same task using any research method they would normally use at home, mainly Googling…

Participants didn’t enter enough information or the most relevant symptoms, and the chatbots were left to give advice with an incomplete picture of the problem…By contrast, when researchers entered the full medical scenario directly into the chatbots, they correctly diagnosed the problem 94 percent of the time…

Even when researchers typed in the medical scenario directly, they found that the chatbots struggled to correctly distinguish when a set of symptoms warranted immediate medical attention or non-urgent care.

My take: AI and chatbots can be quite helpful and continue to improve. This study and the summary by NY Times show some of the limitations. Even small changes in wording/prompts can alter the advice from chatbots considerably.


Taking Away the Keys from Older Clinicians

DB Kramer et al. NEJM 2026; 394: 402-407. Promoting Fairness in Screening Programs for Late-Career Practitioners

This is an interesting article on screening late-career physicians (LCPs) to ensure competency.

An excerpt:

Late-career physicians (LCPs) are an integral part of the U.S. medical workforce. Nearly a quarter of practicing physicians in the United States are over 65 years of age, and they are serving at a time of overall physician scarcity.1,2 Older physicians bring valuable wisdom and expertise to patient care, but many will experience cognitive and physical decline that may affect their clinical skills.3,4 Interest has grown among hospitals in mandatory screening programs that could proactively identify physicians whose ability to deliver safe care may be compromised, before patient harm occurs.5

Yet physicians also have interests related to being screened that deserve respect, and LCP programs can and should protect these interests by ensuring procedural fairness…

Evidence that LCPs can pose risk to patients has motivated health care institutional leaders to develop mandatory screening programs.5,12 LCP policies may require testing of the aspects of physicians’ cognitive and physical functioning that are relevant to clinical activities; such testing is usually triggered when a physician reaches an age threshold (commonly 70 years) and is tied to renewal of privileges….Among physicians’ objections are that tests have imperfect predictive accuracy and that erroneous results could threaten their reputation and livelihood…

Fair assessment requires that the screening tests and processes employed provide an accurate, impartial assessment of relevant skills. An appeals process should be developed that gives physicians a meaningful opportunity to contest any restrictions on their privileges based on test results. Physicians who participate in LCP programs, like employees in other industries, retain recourse to the Equal Employment Opportunity Commission and the courts to contest wrongful termination, age discrimination, and disability discrimination…

The other key component considered by courts evaluating individual burden is known as least infringement. In the context of LCP programs, such an inquiry would center on whether an adverse action taken on the basis of test results is the least-restrictive option that is commensurate with the goal of protecting patient safety…

Care should be taken not to mistakenly hold LCPs to a higher standard than younger physicians simply by applying greater scrutiny to their practice.26 

My take: In theory, screening of late-career physicians makes a lot of sense to protect patient welfare. In practice, it may be difficult to design tests that have adequate sensitivity and specificity with regard to physician capability. This is true for both older and younger physicians.

Related blog post: “You Still Going to be Doing This?”


Dr. Ajay Kaul: Intestinal Pseudo-Obstruction

Dr. Ajay Kaul gave our group a terrific update on chronic intestinal pseudo-obstruction (CIPO). My notes below may contain errors of transcription and omission. Along with my notes, I have included many of his slides.

Key points:

CIPO = Chronic Intestinal Pseudo-Obstruction; PIPO = Pediatric Intestinal Pseudo-Obstruction
  • Several subtypes of intestinal pseudo-obstruction: myopathy, mesenchymopathy, neuropathy
  • Also, pseudo-obstruction can be inflammatory or non-inflammatory. In those with active inflammation, immunosuppressive medications may be helpful. However, routine intestinal biopsy is not recommended
  • Gene panel can help with diagnosis
ACTG2 mutations are associated with Megacystis-Microcolon-Intestinal Hypoperistalsis Syndrome (MMIHS)
  • Anesthesia is associated with delayed recovery of bowel function
  • Malrotation is associated with myopathic CIPO
  • Myopathic CIPO affects all bowel regions along with bladder/uterus.  Myopathic CIPO patients are often good candidates for intestinal transplantation.  Neuropathic CIPO can be isolated to one region of the bowel
  • Flares of CIPO are well-recognized but poorly described.  Often, these last for a few days and can be managed with supportive care
  • Myopathic CIPO is characterized by low amplitude phase III MMC on manometry
  • Monitoring of nutrient deficiencies with CIPO is similar to monitoring for other causes of short bowel syndrome
  • Ileostomy prolapse and diversion colitis are frequent complications.  Diversion colitis can be managed with refeeding into mucus fistula
MIDs = Mitochondrial diseases
  • Prokinetics are not very effective. Prucalopride may help some.  Dr. Kaul often will recommend a 4-week trial and continue if helping.  However, prucalopride may contribute to suicidal ideation and families need to be aware of this
  • Intestinal transplantation is being used much less often due to better management of intestinal failure.  CCHMC only had one child undergo ITx last year.  ITx in U.S. now has an estimated 5-year survival of 60%
  • GT placement and ileostomy are frequently needed, especially if the patient has trouble tolerating a full oral diet
  • Several emerging treatments including the use of intestinal organoids are being studied


Upadacitinib vs Risankizumab for Crohn’s Disease

RS Dalal et al. Clin Gastroenterol Hepatol 2026; 24: 255-257. One-Year Comparative Effectiveness and Safety of Upadacitinib vs Risankizumab for Crohn’s Disease

This was a retrospective single-center study (n=219) assessing upadacitinib (n=67) or risankizumab (n=152) for active Crohn’s disease (CD). Patients who started treatment for post-operative prevention or for a non-CD indication were excluded.

The patients receiving upadacitinib were generally younger and had more anti-TNF/ustekinumab failures, higher CRP levels, and higher Harvey-Bradshaw scores than risankizumab-treated patients.

Key findings:

  • After inverse probability of treatment-weighted (IPTW) analysis, most outcomes were similar between groups. However, upadacitinib-treated patients had more surgeries, adverse events, and treatment discontinuation.
Fractions include nonintegers due to weighting, and denominators vary due to missing data.
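
To illustrate why a weighted analysis reports non-integer fractions, here is a minimal Python sketch of inverse probability of treatment weighting. The patients, propensity scores, and outcomes are made up for illustration and are not from the study. The function names `iptw_weight` and `weighted_fraction` are hypothetical, and the authors' actual propensity model and weighting scheme may differ.

```python
# Minimal IPTW sketch with made-up data (not from the study).
# Each record: (received upadacitinib?, propensity score, had the outcome?)
# The propensity score is the estimated probability of receiving upadacitinib
# given baseline covariates; here the values are simply assumed.
patients = [
    (True, 0.60, True),
    (True, 0.25, False),
    (True, 0.40, True),
    (False, 0.20, False),
    (False, 0.50, True),
    (False, 0.10, False),
]


def iptw_weight(treated: bool, propensity: float) -> float:
    """Unstabilized weight: 1/p for treated, 1/(1 - p) for untreated."""
    return 1.0 / propensity if treated else 1.0 / (1.0 - propensity)


def weighted_fraction(records, treated_group: bool):
    """Return (weighted events, weighted denominator) for one treatment group."""
    events = 0.0
    total = 0.0
    for treated, propensity, outcome in records:
        if treated != treated_group:
            continue
        weight = iptw_weight(treated, propensity)
        total += weight
        if outcome:
            events += weight
    return events, total


for label, flag in (("upadacitinib", True), ("risankizumab", False)):
    events, total = weighted_fraction(patients, flag)
    print(f"{label}: {events:.1f}/{total:.1f} = {events / total:.1%}")
```

The weighted numerators and denominators are sums of weights rather than head counts, which is why they need not be whole numbers.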

My take: While this study favors risankizumab over upadacitinib, most of the outcomes were fairly similar. Risankizumab may have better long-term durability. However, the observational design limits the conclusions, particularly as the upadacitinib-treated patients appeared to be more refractory at baseline. A prospective head-to-head study would be more definitive.


Case Report: CAR-T for Refractory Ulcerative Colitis

F Muller et al. NEJM 2025;393:1239-1241. CD19 CAR T-Cell Therapy in Multidrug-Resistant Ulcerative Colitis

This case study involved the use of “autologous chimeric antigen receptor (CAR) T cells targeting CD19 in a 21-year-old woman with severe multidrug-resistant ulcerative colitis, who had declined colectomy. Previous treatments with prednisolone, mesalamine, infliximab, ustekinumab, ozanimod, filgotinib, vedolizumab, upadacitinib, and cyclosporine combined with mirikizumab had not induced clinical remission.”

“Clinical and biochemical remission occurred and were maintained over the 14-week follow-up period… without the use of concomitant therapy. Endoscopic, histologic, and ultrasonographic assessments showed signs of mucosal healing over time….These data suggest the possibility that CD19 CAR T-cell therapy can induce rapid drug-free remission in refractory ulcerative colitis, a disease that was previously thought to be largely B-cell–independent, given that rituximab treatment showed no efficacy.”

My take: This is only a single case report. However, it shows that modulation of the immune system could potentially cure ulcerative colitis. At the same time, long-term adverse effects of CAR-T therapy will need to be monitored.

Steroids After Kasai Procedure for Biliary Atresia

MA Colak et al. J Pediatr Gastroenterol Nutr. 2026;82:358–365. Improvement in bile drainage after Kasai portoenterostomy with a tailored steroid protocol

In this retrospective study, 28 infants underwent Kasai portoenterostomy (KPE) between 2015 and 2025. Group A had 16 infants managed without steroids between 2015 and 2021, while Group B included 12 infants managed under the new tailored steroid protocol between 2021 and 2025.

Determination of bile drainage: Postoperative stool color is monitored closely and collaboratively by hepatologists and surgeons according to the Japanese Tochigi Prefecture 3rd Edition stool card to assess bile drainage over the first five postoperative days.23 Patients with ≥50% of stools at color ≤3 are considered to have poor bile drainage, while those with >50% of stools at color ≥4 are considered to have good bile drainage.

Tailored steroid protocol: “If patients have poor bile drainage, further management depends on age at time of operation. Patients ≤45 days old at operation are started on a combined steroid and antibiotic treatment immediately after bile leak is ruled out using abdominal ultrasound. Patients >45 days old at operation are started on the steroid and antibiotic treatment only if the liver biopsy obtained during operation demonstrated acute inflammation on histology.”
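
The stool-card rule and the age-based branching described above can be summarized in a short Python sketch. It is only one reading of the quoted text, not the authors' implementation and not clinical guidance. The function and parameter names are hypothetical. The quoted passage does not describe management when bile drainage is good, so that branch simply reports that the steroid pathway is not triggered, and the bile-leak check is applied only to the younger group because that is where the text mentions it.

```python
def has_poor_bile_drainage(stool_colors: list[int]) -> bool:
    """Poor drainage: at least 50% of stools at stool-card color 3 or lower."""
    if not stool_colors:
        raise ValueError("at least one stool observation is required")
    pale = sum(1 for color in stool_colors if color <= 3)
    # Integer comparison avoids floating-point issues at exactly 50%.
    return pale * 2 >= len(stool_colors)


def start_steroids_and_antibiotics(
    stool_colors: list[int],
    age_days_at_operation: int,
    bile_leak_ruled_out: bool,
    biopsy_acute_inflammation: bool,
) -> bool:
    """Return True if the quoted protocol would start the combined treatment."""
    if not has_poor_bile_drainage(stool_colors):
        return False  # pathway not triggered by the quoted text
    if age_days_at_operation <= 45:
        # Started once bile leak is ruled out on abdominal ultrasound.
        return bile_leak_ruled_out
    # Older than 45 days: only if the intraoperative biopsy showed
    # acute inflammation on histology.
    return biopsy_acute_inflammation


# Hypothetical infant: 3 of 5 stools pale, operated at 40 days, no bile leak.
print(start_steroids_and_antibiotics([2, 3, 3, 4, 5], 40, True, False))
```

Note that a patient with exactly half of the stools at color ≤3 falls into the poor-drainage group under the ≥50% rule.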

Key findings:

  • The 3-month post-KPE TB levels were significantly lower in Group B compared to Group A (0.9 [0.3, 1.9] mg/dL vs. 6.5 [0.6, 10.4] mg/dL, p = 0.036)
  • The 2-year native liver survival (NLS) was also significantly higher in Group B (72.9% vs. 37.5%, p = 0.046)
  • LOS, readmissions, reoperations, and complications in the 90-day postoperative period did not differ between the two groups
Kaplan–Meier curve of native liver survival at 2 years of age following Kasai portoenterostomy

In their discussion, the authors note that the “multicenter, placebo-controlled, double-blinded steroids in biliary atresia randomized trial (START) included 140 patients from the United States and assessed the effect of high-dose steroids (4 mg/kg/day).16 There was no significant difference in jaundice clearance at 6 months after operation (58.6% vs. 48.6%), nor significant difference in NLS at 2 years of age (58.7% vs. 59.4%) between the steroid and placebo groups.”

Subsequently, “similar to our study, Pandurangi et al. also reported a significant increase in the ratio of patients who had a TB level of <2 mg/dL at 3 months after operation in the customized steroid protocol cohort. However, although the steroid protocol cohort had greater 2-year NLS (68.8% vs. 50%), the difference did not reach statistical significance in their study.”

My take: The START study (n=140), which was powered to detect a 25% absolute treatment difference in TB levels, cannot exclude modest benefits from steroids. This current study, despite its limitations, showed that a tailored protocol for use of steroids may improve outcomes.


Disclaimer: This blog, gutsandgrowth, assumes no responsibility for any use or operation of any method, product, instruction, concept or idea contained in the material herein or for any injury or damage to persons or property (whether products liability, negligence or otherwise) resulting from such use or operation. These blog posts are for educational purposes only. Specific dosing of medications (along with potential adverse effects) should be confirmed by prescribing physician. Because of rapid advances in the medical sciences, the gutsandgrowth blog cautions that independent verification should be made of diagnosis and drug dosages. The reader is solely responsible for the conduct of any suggested test or procedure. This content is not a substitute for medical advice, diagnosis or treatment provided by a qualified healthcare provider. Always seek the advice of your physician or other qualified health provider with any questions you may have regarding a condition.

Does Dolichocolon (Colonic Redundancy) Matter?

  • D Simon et al. J Pediatr Gastroenterol Nutr. 2026;82:407–414. Dolichocolon is common in pediatric gastroenterology patients with constipation and associated complaints
  • L Dorfman, A Kaul. J Pediatr Gastroenterol Nutr. 2026;82:320–322 Commentary. Dolichocolon in pediatric patients with constipation—The chicken or the egg?

Methods: In this retrospective study, a total of 155 contrast enemas were administered and then assessed for features of colonic redundancy consistent with dolichocolon (DC), based on a priori imaging (adult) criteria.

“DC was defined as: any portion of the sigmoid colon reaching above the iliac crest line (Type 1), and/or any portion of the transverse colon reaching below the iliac crest line with or without redundant flexures (Type 2)…We decided not to study Type 3 DC (i.e., redundant loops at the hepatic or splenic flexure, example shown in Figure 1A*) separately because that category was deemed to be arbitrary/imprecise.”

Key findings:

  • Consensus‐based identification (i.e., independent agreement among all three reviewers) of dolichocolon (DC) was observed in 74.1% of children under 2 years old and 88.6% of those aged 2–4 years presenting with constipation
  • The prevalence subsequently significantly decreased with age, with 68.8% in children aged 5–10 years and 47.6% in adolescents aged 11–17 years. “The pattern of decreasing prevalence of DC with age after 5 years is in contrast to findings in adult patients over 40 years with constipation, where DC frequency was found to increase significantly with age”
  • The vast majority (95.6%) of DC was Type 1; 3.5% was Type 2, and 0.9% was both Type 1 and Type 2
The dashed line marks the iliac crest line [IC]; the gray arrow highlights the sigmoid colon reaching above the IC. The blue arrow highlights the transverse colon falling below the IC.

The commentary by Dorfman and Kaul notes that “dolichocolon has a long history in medical literature, but its exact role remains uncertain, presenting a classic ‘chicken or the egg’ dilemma…Until more stringent pediatric-specific definitions and longitudinal evidence are acquired, clinicians should exercise caution in solely attributing symptoms to dolichocolon…While dolichocolon may play a role, it is unlikely to be the sole cause.”

My take: I had to read the article because I was not familiar with the term “dolichocolon.” The authors, though, summarize the key point: “the clinical relevance of this radiologic finding is not completely understood.” As a separate matter, a pediatric study on how a dolichocolon affects colonoscopy would be interesting; presumably, a redundant colon would make the procedure more difficult, with longer duration and lower rates of terminal ileum (TI) intubation.


Statin Use Associated with Lower Risk of Inflammatory Bowel Disease

Gastroenterology & Endoscopy News, January 2026: Statin Use for Primary CVD Prevention Linked to IBD Risk Reduction

An excerpt:

Danish residents with elevated lipids and CVD risk factors who were taking statins for CVD prevention saw a 16% lower risk per unit time of incident IBD, the researchers found (AS Faye et al. J Intern Med 2025;298[6]:686-696. Statin use for primary prevention of cardiovascular disease reduces the risk of incident IBD: A population-based cohort study)…

The study was a population-based, prospective cohort design drawing on the Danish National Registries. Participants were over 40 years of age and had undergone low-density lipoprotein (LDL) measurement between 2008 and 2022…Each of 110,961 people who picked up statin prescriptions within six months of LDL measurement was matched to five others (n=554,805) not prescribed statins by age, sex, calendar year, and CVD risk factors…

The aHR of developing IBD for statin users versus nonusers was 0.84 (95% CI 0.72-0.97)…The five-year number needed to treat (NNT) with statins was 2,881 to prevent one additional IBD case…

In addition to lipid-lowering properties, statins have anti-inflammatory and immunomodulating actions.

My take: This study suggests that statins have an “off target” beneficial effect in reducing the risk of inflammatory bowel disease. However, it is possible that statin use is not directly beneficial but an epiphenomenon. For example, individuals taking statins may have modified their diet to lower their risk as well.
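
The figures in the excerpt can be related with simple arithmetic. The Python sketch below uses only the reported adjusted hazard ratio (0.84) and five-year NNT (2,881). Treating 1 minus the hazard ratio as the relative reduction and 1/NNT as the absolute risk reduction are standard approximations; the implied baseline risk is a rough back-calculation that assumes the hazard ratio approximates the risk ratio for a rare outcome, and it is not a number reported by the study.

```python
# Values taken from the excerpt above.
adjusted_hazard_ratio = 0.84
five_year_nnt = 2881

# Relative reduction in the hazard of incident IBD among statin users.
relative_reduction = 1 - adjusted_hazard_ratio

# Absolute risk reduction over five years is the reciprocal of the NNT.
absolute_risk_reduction = 1 / five_year_nnt

# Assumption: for a rare outcome the hazard ratio approximates the risk ratio,
# so baseline risk is roughly ARR divided by the relative reduction.
implied_baseline_risk = absolute_risk_reduction / relative_reduction

print(f"Relative reduction: {relative_reduction:.0%}")
print(f"Absolute risk reduction over 5 years: {absolute_risk_reduction:.4%}")
print(f"Implied 5-year IBD risk without statins: {implied_baseline_risk:.2%}")
```

In other words, a 16% relative reduction corresponds to an absolute reduction of roughly 0.035 percentage points over five years, which is why the NNT is so large: incident IBD is uncommon in this age group.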


Longer Adalimumab Dosing Intervals Associated with Worse Outcomes for Crohn’s Patients in Remission (LADI Trial)

LMA Van Lierop et al. Gastroenterol 2026; 170: 404-407. Open Access! Long-Term Outcomes of Increased Versus Conventional Adalimumab Dose Interval for Patients With Crohn’s Disease in Stable Remission: 3-Year Follow-Up of the Randomized Controlled LADI Trial

Methods: “The LADI trial enrolled adults with luminal CD in corticosteroid-free clinical (CFCR) and biochemical remission, on adalimumab, 40 mg every 2 weeks. After randomization in a 2:1 ratio, the intervention group started on a 3-week interval and increased to 4 weeks, if in clinical and biochemical remission at week 24. The control group remained on adalimumab biweekly…The primary end point in this long-term follow-up (LTFU) study was the proportion of patients in CFCR (Harvey Bradshaw Index [HBI] <5 or remission per Physician Global Assessment [PGA] without systemic corticosteroids) without complications at year 3, on the assigned adalimumab interval.”

Key findings:

  • The proportion of patients achieving the primary end point was 34 of 95 (35.8%, intervention) vs 41 of 48 (85.4%, control; P < 0.001).
  • At year 3, 39 of 95 (41.1%) in the intervention group remained on the randomized or further de-escalated adalimumab regimen
  • Kaplan-Meier analyses of secondary end points showed the following probabilities at year 3 (intervention vs control) (Figure 1): remaining on the assigned adalimumab dose, 41.4% vs 91.4% (P < .0001); remaining on adalimumab, 83.7% vs 95.8% (P = .026); corticosteroid-free survival, 87.4% vs 95.7% (P = .062); and complication-free survival, 83.2% vs 97.9% (P = .015)
Kaplan-Meier curves visualizing maintenance of assigned dosing at baseline, continued adalimumab therapy, and corticosteroid-free survival in both groups. (A) Probability of maintaining assigned adalimumab dosing interval of 3–4 weeks (intervention) vs 2 weeks (control).
(B) Probability of remaining on adalimumab therapy over time.
(C) Probability of remaining in corticosteroid-free remission.

My take: About 60% of patients were unable to de-escalate their adalimumab dosing interval. Extending the dosing interval increased the risk of complications and of adalimumab therapy becoming ineffective.
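
The primary end point counts can be checked with a few lines of Python. The sketch recomputes the proportions from the reported counts and adds an absolute difference with a simple Wald 95% confidence interval. That interval is an illustrative calculation only, not an analysis reported by the authors, and the function name is hypothetical.

```python
import math


def difference_with_wald_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Return (difference b - a, lower bound, upper bound) for two proportions."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_b - p_a
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


# Reported counts: primary end point at year 3.
intervention = (34, 95)   # extended-interval group
control = (41, 48)        # every-2-weeks group

print(f"Intervention: {intervention[0] / intervention[1]:.1%}")
print(f"Control: {control[0] / control[1]:.1%}")

diff, low, high = difference_with_wald_ci(*intervention, *control)
print(f"Absolute difference: {diff:.1%} (95% CI {low:.1%} to {high:.1%})")

# Share of the intervention group no longer on the assigned or a further
# de-escalated regimen at year 3 (39 of 95 remained on it).
print(f"Unable to stay de-escalated: {1 - 39 / 95:.1%}")
```

The last line is the source of the "about 60%" figure above.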


The Esophagus Works Better After Responding to Treatment for Eosinophilic Esophagitis

KV Kennedy et al. Gastroenterol 2026; 170: 287-297. Histologic Response Is Associated With Improved Esophageal Distensibility and Symptom Burden in Pediatric Eosinophilic Esophagitis

Methods: This was a prospective study with 300 endoscopies involving 112 patients with eosinophilic esophagitis (EoE).

Key findings:

  • “Participants exhibiting a histologic response to treatment showed the most significant improvement in distensibility over time (1.41 vs 0.16–0.53 mm/y; P = .003).”
  • “After adjusting for Eosinophilic Esophagitis Endoscopic Reference Score and age at symptom onset, lower esophageal distensibility was independently associated with increased odds of patient-reported dysphagia” (odds ratio, 0.85; P = .008).
  • “Baseline distensibility predicted the need for future stricture dilation (area under the curve, 0.757; P = .0003).”
  • At baseline, fibrostenotic features were noted in 26 (23%) and strictures in 16 (14%).

Discussion Points:

  • “Our results support the potential plasticity of esophageal remodeling based on the observed improvement in distensibility among patients with adequately controlled inflammation.”
  • “A recent cohort study of 105 adult patients with EoE with more than 10 years of pediatric followup…found that patients with longer periods of histologic control were less likely to develop esophageal strictures.”

My take: The esophagus works better when eosinophilic inflammation is treated.
