Dr. Adam Rodman, an expert in internal medicine at Beth Israel Deaconess Medical Center in Boston, confidently expected that chatbots built to use artificial intelligence would help doctors diagnose illnesses.
He was wrong.
Instead, in a study Dr. Rodman helped design, doctors who were given ChatGPT-4 along with conventional resources did only slightly better than doctors who did not have access to the bot. And, to the researchers’ surprise, ChatGPT alone outperformed the doctors.
“I was shocked,” Dr. Rodman said.
The chatbot, from the company OpenAI, scored an average of 90 percent when diagnosing a medical condition from a case report and explaining its reasoning. Doctors randomly assigned to use the chatbot got an average score of 76 percent. Those randomly assigned not to use it had an average score of 74 percent.
Interesting small-scale study on accuracy of diagnosing illness:
– Human doctors: 74%
– Human doctors using ChatGPT: 76%
– ChatGPT alone: 90%
Takeaway seems like vast potential for AI to help with diagnosis, but need better human <> AI teamwork: https://t.co/evE1jurR3U
— Greg Brockman (@gdb) November 18, 2024