This summer, MedicalExpo e-magazine is highlighting ten of its most popular articles published in 2023—an opportunity to review the cutting-edge innovations and digital technology that have made an impact in several healthcare sectors this year. Here is article #10. It was first published on April 28, 2023.
There has been much discussion about OpenAI’s ChatGPT (short for chat-based Generative Pre-trained Transformer) enhancing healthcare and improving working conditions for those in medicine. But does the artificial intelligence bot present dangers as well?
Having looked into the pros and cons of ChatGPT, Jan Homolak, MD, Assistant at the Department of Pharmacology and the Croatian Institute for Brain Research, University of Zagreb School of Medicine, is in no doubt that the bot can be used to enhance healthcare services. He explained:
“ChatGPT and similar tools have a tremendous potential to revolutionize the healthcare system and make it more efficient and accessible.
For example, we can easily imagine that AI-based tools might soon be implemented in clinical practice to improve diagnostics, detect medical errors and reduce the burden of paperwork.”
But, according to him, there is also reason to be wary:
“An adage popularized by Spider-Man, ‘With great power comes great responsibility,’ can be applied here. ChatGPT is an extremely powerful tool; however, it should not be implemented hastily.
Instead, we should advocate its mindful introduction and an open debate about the risks and benefits.
Medicine is a particularly sensitive area when it comes to the implementation of novel technologies because human lives are at stake.”
ChatGPT Has Limited Medical Knowledge
Firstly, ChatGPT is still just a large language model with many limitations:
“It’s simply still not very good when it comes to medical knowledge.
For example, it achieved 66% and 72% on Basic Life Support and Advanced Cardiovascular Life Support tests, respectively, and was near the passing threshold on the United States Medical Licensing Exam.
This is not what you’d be super thrilled to hear as a patient if it were the results of a doctor taking care of you.
“Bear in mind that such models usually achieve good results in knowledge-based tests as they are trained on huge datasets containing relevant information.
In contrast, they are notoriously bad at context and nuance, two things critical for safe and effective patient care, which require the implementation of medical knowledge, concepts and principles in real-world settings.
To illustrate this, while ChatGPT might be able to pass some medical tests, it fails miserably if you describe a clinical scenario and evaluate whether it is able to use the exact same knowledge in a clinical context.”
Responsibility and Ethics Are Also a Challenge
Responsibility is also a challenge: the much-discussed question of who is to blame when an AI algorithm makes a mistake has yet to be resolved.
Then there are the ethical issues to consider, which are also the subject of great debate and recently hit the headlines again. After signing an open letter calling for a pause in the development of advanced AI systems, Elon Musk, the billionaire owner of Twitter, voiced his concerns about OpenAI’s ChatGPT in an interview on Fox News and announced plans to launch an alternative chatbot named “TruthGPT.”
According to Mr. Homolak, responsibility and ethics are especially pertinent when it comes to the use of AI in medicine:
“Training a good model requires a huge amount of high-quality (unbiased) data.
In medicine, this is often synonymous with high-quality data from prospective randomized controlled trials designed to reduce bias as much as possible and allow us to analyze a certain predetermined effect of interest.
There is a caveat, though: experimental design often allows us to reliably estimate only one primary effect the study was designed around—usually referred to as the primary outcome.
Regardless of the quality of data collection, the same dataset might be biased when it comes to other (secondary) outcomes.
AI algorithms are very good at analyzing data; however, they cannot overcome the problem of experimental design and data collection.”
“When it comes to ChatGPT, it has already been described in the literature that it can provide biased outputs.
In medicine, biased models are simply too dangerous to be implemented in clinical decision-making as there should be no room for error.”
Danger of Misinformation
But is there room for ChatGPT in medical research? Mr. Homolak suggested:
“It can be useful in that it can improve the readability of manuscripts or help researchers write better. However, it still cannot produce meaningful research text of sufficient quality without human intervention.
This is not necessarily bad—it can still be a useful tool to help you write a manuscript faster, but it is far away from being able to do it on its own.
There is also a problem of ‘hallucinations’—ChatGPT often makes up references and writes plausible-sounding but incorrect answers.
This is dangerous as it can promote the spread of misinformation.”
A Future for ChatGPT in Medicine
In his view, clinical tools must undergo rigorous quality-control testing before being implemented, to ensure that they neither cause medical errors nor propagate them when they occur. But even if this becomes possible, Mr. Homolak is skeptical that AI tools such as ChatGPT will ever take over completely from medical and healthcare professionals:
“I highly doubt it will ever be able to fully replace medical doctors as their jobs are simply too complex and contextual for an algorithm to comprehend.
On the other hand, I can easily see how ChatGPT or similar tools might be able to completely take over some tasks that are currently overburdening our medical staff.
I think that ChatGPT might evolve into a set of tools that will make a doctor’s life easier, increasing physicians’ efficacy by redistributing workload and optimizing performance.”