Health Disinformation Risks from AI Chatbots

ND Modi et al. Assessing the System-Instruction Vulnerabilities of Large Language Models to Malicious Conversion Into Health Disinformation Chatbots. Annals of Internal Medicine; 2025. https://doi.org/10.7326/ANNALS-24-0393

Methods: This study assessed the effectiveness of safeguards in foundational LLMs against system-level instructions that convert them into health disinformation chatbots. Five foundational LLMs—OpenAI’s GPT-4o, Google’s Gemini 1.5 Pro, Anthropic’s Claude 3.5 Sonnet, Meta’s Llama 3.2-90B Vision, and xAI’s Grok Beta—were evaluated via their application programming interfaces (APIs). Each API received system-level instructions directing it to produce incorrect responses to health queries in a formal, authoritative, convincing, and scientific tone.
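
For context on what a "system-level instruction" is mechanically, here is a minimal sketch of how a system message is passed to a chat model through an API, using OpenAI's Python client with a deliberately benign placeholder instruction; the model name, prompt text, and query are illustrative assumptions, not the prompts used in the study.

```python
# Minimal sketch: supplying a system-level instruction to a chat model via an API.
# Assumes the openai Python package (v1+) and an OPENAI_API_KEY environment variable.
# The instruction below is a benign placeholder; the study's prompts are not reproduced.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[
        # The system message sets the chatbot's behavior for every subsequent user query.
        {"role": "system",
         "content": "You are a health information assistant. Answer accurately, "
                    "cite reputable sources, and say so when you are unsure."},
        # A sample user health query.
        {"role": "user", "content": "Does sunscreen cause cancer?"},
    ],
)

print(response.choices[0].message.content)
```

The study's concern is that this same system-message channel, which is under the control of whoever builds on the API, can just as easily carry instructions that steer a model toward confident-sounding misinformation.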

Key findings:

  • Of the 100 health queries posed across the 5 customized LLM API chatbots, 88 (88%) of the responses were health disinformation.

Examples of how AI systems can be used to create disinformation are shown in the article.

My take: This study shows how easy it is to get AI systems to provide misleading information in a convincing fashion. It might be interesting to use one of these systems to provide answers for the board game Balderdash.

Related blog posts: