The Viability of ChatGPT in Addressing Patient Questions About Medical Imaging
Can ChatGPT adequately address common questions from patients about medical imaging? This question prompted researchers to examine the viability of the generative pre-trained language model (GPLM) in providing accurate and relevant answers to various imaging questions. The study, recently published in the Journal of the American College of Radiology, aimed to assess the consistency, accuracy, and readability of ChatGPT responses.
Research Methodology and Findings
The researchers asked each imaging question three times to evaluate the consistency of ChatGPT responses. They compared responses from unprompted ChatGPT (version 3.5) with responses generated after a modifying prompt that emphasized accuracy and readability for the average person. The study found no significant difference in accuracy between unprompted responses (82.6%) and prompted responses (86.7%). Consistency, however, rose from 71.6% for unprompted responses to 86.4% with modifying prompts.
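The article does not say how the study scored consistency, so the following is only a rough sketch under one possible assumption: a question counts as consistent when all of its repeated responses received the same reviewer rating. The function name, the rating labels, and the sample data are hypothetical and are not taken from the study.

```python
def consistency_rate(ratings_by_question):
    """Return the share of questions whose repeated runs all got the same rating.

    ratings_by_question maps a question to the list of ratings given to its
    repeated responses (three per question in the setup described above).
    Assumption: "consistent" means every run received an identical rating.
    """
    if not ratings_by_question:
        raise ValueError("at least one question is required")
    consistent = sum(
        1 for runs in ratings_by_question.values() if len(set(runs)) == 1
    )
    return consistent / len(ratings_by_question)


# Hypothetical ratings for three questions, each asked three times.
example = {
    "Is an MRI safe with a pacemaker?": ["accurate", "accurate", "inaccurate"],
    "How long does a CT scan take?": ["accurate", "accurate", "accurate"],
    "Do I need to fast before an ultrasound?": ["accurate", "accurate", "accurate"],
}
rate = consistency_rate(example)
```

Under this definition, two of the three hypothetical questions are consistent. A study could reasonably use a looser or stricter criterion, which would change the resulting percentage.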
Importance of Automating Health Education
Alessandro Furlan, M.D., co-author of the study and associate professor of radiology at the University of Pittsburgh Medical Center (UPMC), highlighted the potential of automating the development of patient health education materials and providing on-demand answers to medical questions. This approach holds promise for improving patient access to health information.
Relevance of ChatGPT Responses
The study revealed that only 66.7% of unprompted ChatGPT responses and 79.6% of prompted responses were considered fully relevant to the posed questions. Safety-related questions posed a challenge, as only 50% of unprompted responses and 64.6% of prompted responses regarding safety were deemed fully relevant. This suggests that while the accuracy and consistency of ChatGPT responses are noteworthy, there is room for improvement, particularly in ensuring relevance.
Key Findings
- ChatGPT (version 3.5) demonstrates over 80% accuracy in responding to medical imaging questions.
- The consistency of responses is 71.6% for unprompted responses and 86.4% with modifying prompts.
- Complete relevance to questions is lower, with only 66.7% of unprompted responses and 79.6% of prompted responses considered fully relevant.
- Safety-related questions have the lowest full relevance, at 50% for unprompted responses and 64.6% for prompted responses.
- ChatGPT responses lack readability, as none of them were at or below an eighth-grade reading level.
Concerns About Readability and Patient Access
The readability of ChatGPT responses raised concerns, as none of them were at or below an eighth-grade reading level. Understanding health information is crucial for making informed medical decisions, so responses written at this level of complexity currently limit how much patients can benefit from them.
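The article does not name the readability measure used in the study. As an illustration only, the sketch below assumes the widely used Flesch-Kincaid grade level formula and a crude vowel-group heuristic for counting syllables, so its scores are approximate. All function names are hypothetical.

```python
import re


def count_syllables(word):
    """Rough heuristic: count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text):
    """Approximate U.S. school grade level via the Flesch-Kincaid formula."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


def meets_eighth_grade(text):
    """True when the estimated grade level is at or below eighth grade."""
    return flesch_kincaid_grade(text) <= 8.0


plain = "The scan is safe. It does not hurt."
dense = (
    "Computed tomography utilizes ionizing radiation to generate "
    "cross-sectional anatomical visualizations."
)
plain_ok = meets_eighth_grade(plain)
dense_ok = meets_eighth_grade(dense)
```

Short sentences with short words, as in `plain`, score well below the eighth-grade threshold, while the long technical sentence in `dense` scores far above it. A production check would use a tested readability library rather than this syllable heuristic.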
Limitations and Future Implications
It is important to acknowledge the limitations of this study. The rapidly evolving nature of ChatGPT technology may impact its effectiveness in addressing patient questions. Additionally, the questions used in the study may not fully represent the varied ways patients ask similar questions. Moreover, the study did not explore how ChatGPT would handle questions posed in languages other than English.
Opinion: The Potential of ChatGPT in Improving Patient Access to Health Information
ChatGPT’s ability to respond to medical imaging questions with a high level of accuracy and consistency is impressive. While there is still room for improvement in relevance and readability, automating answers to patients' medical questions clearly holds great promise. As language models like ChatGPT advance, the potential to provide patients with reliable health information on demand will continue to grow.