ChatGPT for Colonoscopy Questions Plus One

T-C Lee et al. Gastroenterol 2023; 165: 509-511. Open Access! ChatGPT Answers Common Patient Questions About Colonoscopy

In this study, ChatGPT answers to questions about colonoscopy were compared to publicly available webpages of 3 randomly selected hospitals from the top-20 list of the US News & World Report Best Hospitals for Gastroenterology and GI Surgery.

Methods: To objectively assess the quality of ChatGPT-generated answers, 4 gastroenterologists (2 senior gastroenterologists and 2 fellows) rated 36 pairs of common questions (CQs) and answers, randomly displayed, for the following quality indicators on a 7-point Likert scale: (1) ease of understanding, (2) scientific adequacy, and (3) satisfaction with the answer (Table 1). Raters were also asked to judge whether the answers were AI-generated or not.

Key findings:

  • ChatGPT answers were similar to non-AI answers, but had higher mean scores with regard to ease of understanding, scientific adequacy, and satisfaction.
  • The physician raters demonstrated only 48% accuracy in identifying ChatGPT-generated answers.

My take: This is yet another study, this time focused on gastroenterology, that shows how physicians and patients may benefit from leveraging chatbots to improve communication.

Also this:

Prenatal Testing, Statistics, and Life-Altering Decisions

Much of my day is spent interpreting lab work.  Sometimes it is very easy but not always. Many families and health care professionals do not understand the concepts of sensitivity, specificity, positive predictive value and negative predictive value.  These values are affected greatly by the prevalence of the condition (or disease) that is being tested for in a specific population.

For many conditions, doctors prefer a highly sensitive test. Tests that are highly sensitive will detect almost all of the individuals with the condition (or disease) being tested for and miss very few people with the condition (false negatives). However, tests that are very sensitive often flag individuals who do not have the condition (false positives). Therefore, a positive result on a highly sensitive test is usually followed by a more specific follow-up test that can determine conclusively whether the condition (or disease) is present.
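To make the effect of prevalence concrete, here is a minimal Python sketch that computes positive and negative predictive value from sensitivity, specificity, and prevalence using Bayes' theorem. The function name `predictive_values` and the 99%/99% test characteristics are illustrative assumptions, not figures from any specific test discussed here.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test; all inputs are fractions between 0 and 1.

    Illustrative helper (hypothetical name). Assumes the inputs do not make
    either denominator zero (e.g. prevalence strictly between 0 and 1).
    """
    true_pos = sensitivity * prevalence                # affected, test positive
    false_pos = (1 - specificity) * (1 - prevalence)   # unaffected, test positive
    true_neg = specificity * (1 - prevalence)          # unaffected, test negative
    false_neg = (1 - sensitivity) * prevalence         # affected, test negative
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test that is 99% sensitive and 99% specific,
# applied to populations with different prevalence of the condition.
for prevalence in (0.0001, 0.001, 0.01, 0.1):
    ppv, npv = predictive_values(0.99, 0.99, prevalence)
    print(f"prevalence {prevalence:.2%}: PPV {ppv:.1%}, NPV {npv:.3%}")
```

With these assumed numbers, a condition affecting 1 in 100 people gives a positive predictive value of only 50%: half of all positive results are false positives, even though the test is "99 percent" sensitive and specific. As the condition becomes rarer, the positive predictive value falls further.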

A report from NBC News highlights how tests that are billed as “99 percent” accurate can be quite difficult to interpret and could lead to abortions of healthy fetuses. Here’s the link: Sensitivity, Positive Predictive Value, and Prenatal Testing

Here’s an excerpt:

Positive results can be wrong 50 percent or more of the time…Noninvasive prenatal tests, or the “cell free DNA test,” are merely screening tests of placental DNA found in the mother’s blood…

The true likelihood that a positive test is positive depends on another calculation — the positive predictive value or PPV, which factors in other variables, such as a woman’s age and the prevalence of the disease in that population…

A woman over 35 where genetic disorders are more common — the likelihood of Trisomy 18 given a positive screening result is about 64 percent. For a younger woman, the PPV would be under 50 percent, according to the investigation.

Another example of understanding tests and statistics involves mammograms. The relatively small number of cancer deaths averted by mammograms has been discussed previously on this blog (see links below). A good infographic and description is also available at NPR. Here’s the link: What happens after your mammogram

Related blog posts:

Blue-footed Booby

The Bigger Picture - Mammography as an Example

This week, a commentary makes a strong case for eliminating mammography (N Engl J Med 2014; 370:1965-1967):  “Abolishing Mammography Screening Programs? A View from the Swiss Medical Board”

Here’s a link from the NEJM: nej.md/1hV8q0L

What is fascinating is how ingrained mammography has become in our medical culture and how most individuals believe that mammography is so beneficial.  Take a look at the figure in the link to get a better perspective.  While women think that mammography may save 80 lives out of a thousand screened, according to the commentary, the data suggest that 1 woman may be saved.  The main problem: “for every breast-cancer death prevented in U.S. women over a 10-year course of annual screening beginning at 50 years of age, 490 to 670 women are likely to have a false positive mammogram with repeat examination; 70 to 100, an unnecessary biopsy; and 3 to 14, an overdiagnosed breast cancer that would never have become clinically apparent.”

If a well-established screening measure like mammography is not so beneficial, what else could be on the chopping block? As noted in a previous blog post (Do you know about the “Choosing Wisely” | gutsandgrowth), even the annual physical exam has been deemed a low-value service.

Another related blog post:

“There is More to Life Than Death” | gutsandgrowth

“There is More to Life Than Death”

This commentary helps explain some of the reasons for recent recommendations to drop PSA screening for prostate cancer and to stop mammograms for women ages 40 to 49 while at the same time showing how these decisions are not in fact ‘no-brainers.’ (NEJM 2012; 987-89).

With both decisions, the U.S. Preventive Services Task Force (USPSTF) focused on mortality data. For prostate cancer, the pivotal trial was the U.S. Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO), which showed no difference in mortality between a PSA-screened group and a control group. Besides detailing a few limitations of the study, the authors note that separate epidemiologic data show a 75% decrease in men presenting with advanced prostate cancer since the introduction of PSA screening. Furthermore, a European study showed advanced cancers were 40% more likely in the control group as well.

Patients with more advanced prostate cancer are prone to bone pain and urinary obstruction, whereas patients who undergo unnecessary surgery (because prostate cancer was not going to kill them) may develop incontinence and impotence.

For breast cancer, similarly, identifying smaller breast cancers may allow more conservative therapy. This has to be weighed against increased anxiety, discomfort, and biopsies for those with false-positive mammograms.

Conclusions:

“Basing decisions on the outcome of death ignores vital dimensions of life that are not easily quantified…It is neither ignorant nor irrational to question the wisdom of expert recommendations that are sweeping and generic.  There is more to life than death.”

On a side note, one of the authors (Jerome Groopman) has written several books.  My favorite of his: “The Measure of Our Days.”