In a recent “Fast Facts” article published in the journal BMJ, researchers discuss recent advances in generative artificial intelligence (AI), the technology’s importance in today’s world, and the potential dangers that must be addressed before large language models (LLMs) such as ChatGPT can become the reliable sources of factual information they are widely assumed to be.
Fast Facts: Quality and safety of health information generated by artificial intelligence. Image credit: The Panda/Shutterstock
What is Generative AI?
“Generative artificial intelligence (AI)” refers to a subset of AI models that create context-dependent content (text, images, audio, and video). These models form the basis of the natural language systems that power AI assistants (Google Assistant, Amazon Alexa, and Siri) and productivity applications such as ChatGPT and Grammarly AI. The technology represents one of the fastest-growing sectors of digital computing and has the potential to significantly advance many aspects of society, including healthcare and medical research.
Unfortunately, advances in generative AI, particularly large language models (LLMs) such as ChatGPT, have far outpaced ethical and safety controls, introducing the risk of serious consequences, both accidental and deliberate (malicious). Research estimates that more than 70% of people use the Internet as their primary source of health and medical information, and more individuals every day turn to LLMs such as Gemini, ChatGPT, and Copilot with their queries. This article focuses on three vulnerable aspects of AI, namely AI errors, health misinformation, and privacy concerns, and highlights the efforts of emerging disciplines, including AI safety and ethical AI, to address these vulnerabilities.
AI errors
Errors in data processing are a challenge common to all AI technologies. As input data sets become larger and model outputs (text, audio, images, or video) become more sophisticated, false or misleading information becomes increasingly difficult to detect.
“The phenomenon of ‘AI hallucinations’ has gained prominence with the widespread use of AI chatbots (e.g., ChatGPT) powered by LLMs. In the context of health information, AI hallucinations are of particular concern because individuals may receive incorrect or misleading health information from LLMs that is presented as fact.”
For laypeople unable to distinguish factual from inaccurate information, these errors can quickly become very costly, particularly in the case of incorrect medical information. Even trained healthcare professionals can be affected by these errors, given the growing amount of research that uses LLMs and generative AI for data analysis.
Fortunately, many technological strategies to mitigate AI errors are currently in development. The most promising involves building generative AI models that “ground” their responses in information from credible, authoritative sources. Another approach is to incorporate “uncertainty” into the model’s output: when presenting a result, the model also reports its degree of confidence in the validity of the information, allowing the user to consult credible information repositories in cases of high uncertainty. Some generative AI models already incorporate citations into their results, encouraging the user to investigate further before accepting the model’s output at face value.
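To make the grounding-and-uncertainty idea concrete, here is a minimal Python sketch. Everything in it is illustrative: the trusted-source list, the word-overlap confidence heuristic, and all names are assumptions invented for this example, not methods described in the BMJ article.

```python
from dataclasses import dataclass

@dataclass
class GroundedAnswer:
    text: str
    confidence: float  # 0.0-1.0: the model's self-reported certainty
    citations: list    # sources the answer was grounded in

# Hypothetical curated corpus standing in for a credible repository.
TRUSTED_SOURCES = {
    "who.int/influenza": "annual influenza vaccination is recommended for high risk groups",
    "cdc.gov/handwashing": "washing hands with soap reduces transmission of infection",
}

def answer_with_uncertainty(question):
    """Toy grounding step: answer only from trusted passages that share
    words with the question; otherwise report low confidence."""
    q_words = set(question.lower().replace("?", "").split())
    best_url, best_text, best_overlap = None, None, 0.0
    for url, text in TRUSTED_SOURCES.items():
        overlap = len(q_words & set(text.split())) / len(q_words)
        if overlap > best_overlap:
            best_url, best_text, best_overlap = url, text, overlap
    if best_url is None:
        return GroundedAnswer("No grounded answer found.", 0.1, [])
    # Confidence grows with lexical overlap -- a crude stand-in for the
    # calibrated uncertainty estimates described above.
    return GroundedAnswer(best_text, min(0.9, 0.3 + best_overlap), [best_url])

answer = answer_with_uncertainty("Is influenza vaccination recommended?")
print(answer.confidence, answer.citations)
```

A production system would replace the overlap heuristic with calibrated uncertainty estimates from the model itself, but the interface is the essential point: an answer bundled with a confidence score and citations the user can check.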
Health misinformation
Disinformation is distinct from AI hallucinations: hallucinations are accidental and unintentional, whereas disinformation is deliberate and malicious. And while disinformation is as old as human society itself, generative AI offers an unprecedented platform for producing “diverse, high-quality, targeted disinformation at scale” at virtually no financial cost to the malicious actor.
“One option for preventing AI-generated health misinformation involves refining models to align with human values and preferences, including avoiding the generation of known harmful responses or misinformation. An alternative is to create a specialized model (separate from the generative AI model) to detect inappropriate or harmful requests and responses.”
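As a rough illustration of that second idea, the Python sketch below wraps a text generator with a separate screening step. The keyword list stands in for what would, in practice, be a trained classifier; all names and patterns here are hypothetical.

```python
# Illustrative keyword list; a deployed screen would be a trained classifier.
HARMFUL_PATTERNS = ["miracle cure", "vaccines cause", "stop your medication"]

def looks_harmful(text):
    """Flag text that matches a known-harmful pattern."""
    lowered = text.lower()
    return any(pattern in lowered for pattern in HARMFUL_PATTERNS)

def guarded_generate(prompt, generate):
    """Wrap any text generator with pre- and post-screening."""
    if looks_harmful(prompt):
        return "Request declined: potentially harmful topic."
    response = generate(prompt)
    if looks_harmful(response):
        return "Response withheld: failed the safety screen."
    return response

# Usage with a stand-in generator:
print(guarded_generate("Tell me about this miracle cure",
                       lambda p: "Placeholder model output."))
```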
While both of the above techniques are viable weapons in the war against disinformation, they are experimental and model-based. To keep inaccurate data from ever reaching a model for processing, initiatives such as digital watermarking, designed to identify AI-generated content and help validate trustworthy data, are currently in the works. Equally important, the creation of AI watchdog agencies will be necessary before AI can be fully trusted as a robust information dissemination system.
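The toy Python sketch below suggests how a statistical text watermark can work. It is loosely modeled on the “green list” watermarking schemes discussed in the research literature, not on any method from the BMJ article, and the hashing scheme and word-level granularity are simplifications for illustration.

```python
import hashlib

def is_green(prev_word, word):
    """A keyed hash pseudo-randomly assigns roughly half of all words
    to a 'green list' that depends on the preceding word."""
    digest = hashlib.sha256(f"{prev_word}|{word}".encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(text):
    """Fraction of word pairs landing on the green list. Ordinary text
    scores near 0.5; a generator that preferentially samples green
    words produces text that scores noticeably higher."""
    words = text.lower().split()
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    return sum(is_green(a, b) for a, b in pairs) / len(pairs)

suspect = "example passage whose provenance we want to check"
print(f"green fraction: {green_fraction(suspect):.2f}")
```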
Privacy and bias
Data used to train generative AI models, particularly medical data, must be reviewed to ensure that no identifiable information is included, thereby respecting the privacy of users and of the patients whose records the models have been trained on. For crowdsourced data, AI models typically include privacy terms and conditions; study participants must ensure that they comply with these conditions and do not provide information that can be traced back to the volunteer in question.
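One first line of defense is scrubbing identifiable details from records before they enter a training corpus. The Python sketch below shows the idea with a few regular expressions; the patterns are illustrative only, and real de-identification pipelines are far more thorough.

```python
import re

# Illustrative patterns only; real de-identification pipelines also handle
# names, addresses, dates, medical record numbers, and much more.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def redact(record):
    """Replace recognizable identifiers with placeholder tags before
    the record enters a training corpus."""
    for label, pattern in PII_PATTERNS.items():
        record = pattern.sub(f"[{label}]", record)
    return record

print(redact("Patient reachable at jane.doe@example.com or 555-867-5309."))
```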
Bias is the inherent risk that an AI model will skew its outputs to reflect its training source material. Most AI models are trained on enormous datasets, usually obtained from the Internet.
“Despite developers’ efforts to mitigate bias, it remains difficult to fully identify and understand the bias in accessible LLMs due to a lack of transparency about the data and training process. Ultimately, strategies to minimize these risks include exercising greater discretion in the selection of training data, thoroughly auditing generative AI outputs, and taking corrective action to minimize identified biases.”
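An output audit of the kind described above can start very simply: prompt the model many times and count how its responses distribute some attribute across demographic groups. The Python sketch below is a hypothetical, bare-bones version of such a count; the group terms and example outputs are invented for illustration.

```python
from collections import Counter

# Hypothetical audit: how often do the model's outputs associate an
# attribute (here, simply any mention) with each demographic group?
GROUP_TERMS = {"male": {"he", "him", "man"}, "female": {"she", "her", "woman"}}

def audit(outputs):
    counts = Counter()
    for text in outputs:
        words = text.lower().replace(".", "").split()
        for group, terms in GROUP_TERMS.items():
            counts[group] += sum(1 for w in words if w in terms)
    return counts

# Outputs would come from prompting the model many times, e.g.
# "Describe a typical surgeon." A persistent skew in the counts flags bias.
samples = ["He is a skilled surgeon.", "She assists in the ward.",
           "He leads the department."]
print(audit(samples))
```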
Conclusions
Generative AI models, the most popular of which include ChatGPT, Microsoft Copilot, Gemini, and Sora, represent some of the greatest improvements to human productivity in the modern era. Unfortunately, progress in these areas has far outpaced checks on credibility, creating risks of error, misinformation, and bias that could lead to serious consequences, particularly in healthcare. This article summarizes some of the dangers of generative AI in its current form and highlights the techniques now in development to mitigate them.
Journal reference:
- Sorich, M. J., Menz, B. D., & Hopkins, A. M. (2024). Quality and safety of artificial intelligence generated health information. BMJ, 384, q596. DOI: 10.1136/bmj.q596, https://www.bmj.com/content/384/bmj.q596