Applications of ChatGPT in Ophthalmology

From EyeWiki

All content on Eyewiki is protected by copyright law and the Terms of Service. This content may not be reproduced, copied, or put into any artificial intelligence program, including large language and generative AI models, without permission from the Academy.


Introduction

AI models can be useful in ophthalmology by addressing individualized patient concerns, offering rapid explanations, and encouraging dialogue about eye ailments, procedures, and therapies.[1] Chat Generative Pre-trained Transformer (ChatGPT) is a large language model (LLM) chatbot created by OpenAI, a company based in San Francisco. ChatGPT can interpret human language and respond to questions with a variety of answers. Its functions include answering questions, translating languages, writing stories, and summarizing text. ChatGPT may play several roles in future medical education, such as answering research questions, generating case studies, and assessing student essays. In addition, ChatGPT may be used in clinical settings to generate notes, assist with treatment decisions, and communicate with patients.[2]

However, ChatGPT has limitations. Mainly, it may ignore the context of specific questions, which may cause it to generate irrelevant or even erroneous answers. With the increasing speed and accuracy of AI advancements, large language models like ChatGPT may play a larger role in medical education and clinical settings in ophthalmology.[2]

ChatGPT in Ophthalmology

ChatGPT has proven successful in the general fields of law, business, and medicine, but it has shown inconsistent accuracy when responding to questions within specific medical specialties.[1] Thus, it is important to examine the potential of applying ChatGPT to ophthalmological contexts as the technology rapidly develops, which may ultimately benefit both practitioners and patients. Searching the keywords “ChatGPT” and “ophthalmology” in PubMed generated a total of 206 papers, a handful of which are summarized in the table below.

Paper: ChatGPT and Ophthalmology: Exploring Its Potential with Discharge Summaries and Operative Notes[3]
Author: Singh et al.
Published: July 2023
Applications of ChatGPT:
● Encouraging performance by ChatGPT in constructing ophthalmic discharge summaries and operative notes
Pros of ChatGPT:
● Generated detailed responses in seconds
● Could include follow-up instructions, medication, consultation times, and location
● Corrected itself when confronted about factual inaccuracies
Cons of ChatGPT:
● Operative notes required significant tuning

Paper: Eyes on AI: ChatGPT's Transformative Potential Impact on Ophthalmology[4]
Author: Dossantos et al.
Published: June 2023
Applications of ChatGPT:
● Can be used to conduct literature reviews and perform data analysis
● Aid in making diagnoses and treatment recommendations
● Assist with patient education
● Serves as study aid for trainees and practitioners
Pros of ChatGPT:
● Has potential to improve efficacy of patient triage by categorizing patient complaints by severity
● Streamlines research process
Cons of ChatGPT:
● Potential issues with accuracy, bias, and ethics
● Questions regarding data security and compliance with HIPAA
● ChatGPT is limited to knowledge until September 2021

Paper: The Use of ChatGPT to Assist in Diagnosing Glaucoma Based on Clinical Case Reports[5]
Author: Delsoz et al.
Published: September 2023
Applications of ChatGPT:
● Used for clinical diagnosis
Pros of ChatGPT:
● Accuracy of ChatGPT in diagnosing patients with primary and secondary glaucoma, using specific case examples, was similar to that of senior ophthalmology residents
● ChatGPT was correct on eight out of eight common glaucoma cases
● Useful in settings with limited access to ophthalmology (emergency medicine/primary care)
Cons of ChatGPT:
● Primarily incorrect on uncommon and atypical glaucoma cases
● Results are specific to case examples where information is organized; real-world information may not be
● Residents were able to review and interpret visual field (VF) printouts, fundus images, and OCT reports in the case descriptions; ChatGPT cannot

Paper: Appropriateness and Readability of ChatGPT-4-Generated Responses for Surgical Treatment of Retinal Diseases[6]
Author: Momenaei et al.
Published: October 2023
Applications of ChatGPT:
● ChatGPT-4 provides appropriate responses regarding definition, prevalence, risk factors, treatments, surgical success rates, postoperative information, and vision recovery after surgery for retinal detachment (RD), macular hole (MH), and epiretinal membrane (ERM)
Pros of ChatGPT:
● Can be used as accessible supplemental resource for patient education about retinal procedures
Cons of ChatGPT:
● ChatGPT recommendations had low readability scores
● Inappropriate answers were generated in terms of prevention and diagnostic methods of macular holes and epiretinal membranes, adverse effects of laser, and visual improvement after silicone oil removal

Paper: Performance of ChatGPT in Diagnosis of Corneal Eye Diseases[7]
Author: Delsoz et al.
Published: August 2023
Applications of ChatGPT:
● Enhances diagnostic speed and efficacy
● Consistent identification of underlying conditions
Pros of ChatGPT:
● ChatGPT 4.0 had an 85% success rate for diagnosing corneal eye diseases, compared to a 60% success rate for ChatGPT version 3.5
● Faster diagnosis compared to clinician
Cons of ChatGPT:
● Inaccuracies in providing diagnoses for rare disease

Paper: Appropriateness and Readability of ChatGPT Responses to Ophthalmic Symptoms[8][9]
Author: Tsui et al.; Waisberg et al.
Published: April 2023; May 2023
Applications of ChatGPT:
● ChatGPT provides appropriate responses to patient ophthalmic symptoms
Pros of ChatGPT:
● Can be used as real-time supplemental resource for patient symptom triage and message responding
Cons of ChatGPT:
● May not be considered appropriate according to individual physician standards

ChatGPT has the potential to generate a detailed and accurate ophthalmic note in a short amount of time. In their study, Waisberg et al. asked ChatGPT to generate a cataract surgery operative note.[10] ChatGPT correctly included the key steps involved in cataract surgery, including the incision, curvilinear capsulorhexis, phacoemulsification, lens implantation, and wound closure, in addition to the essential components of a well-written ophthalmic note.[10] Additionally, ChatGPT can be used in underserved areas to aid in diagnoses of trachoma and other infectious ocular diseases, which are treatable but vision-threatening if not addressed.[11]

Online health information searches are becoming more common: a survey conducted in the United States found that two-thirds of adults look up health information online, and one-third of adults use search engines to self-diagnose.[12] Given ChatGPT’s quick responses in a conversational style, it is possible that many people will start to use ChatGPT to aid in self-diagnosis. A survey conducted less than a year after ChatGPT became publicly accessible found that 78% of respondents were inclined to use it to self-diagnose.[13]

Improved knowledge of their disease can also empower patients; ChatGPT is an accessible and user-friendly resource that would allow patients to further their understanding and make informed decisions about their health.[11] ChatGPT can also be used by ophthalmologists to ensure that the essential information about a diagnosis, management, and prognosis is well understood by their patients.[11][14]

In addition to simplifying medical jargon, ChatGPT could be used to automate multilingual translation of patient education materials, eventually lowering barriers to access of information for multiethnic communities.[14] Patients may also choose to ask ChatGPT their ophthalmic questions because it may provide a diagnosis faster than a specialist.[15]

ChatGPT in Neuro-ophthalmology

Few studies have so far been published on the potential use of ChatGPT in neuro-ophthalmic clinical settings. One study by Madadi et al. compared the diagnostic accuracy of ChatGPT with that of neuro-ophthalmologists when each was given identical text from case reports.[16] The study reported an accuracy of 82 percent for ChatGPT and 86 percent for the neuro-ophthalmologists. With such promising results, ChatGPT demonstrates the potential to assist clinicians in diagnosing neuro-ophthalmic diseases effectively.[16] The study also suggests that ChatGPT can be used as a supplemental resource by clinicians in areas where there is a shortage of neuro-ophthalmologists.[16]

Another study completed by Tao et al. examined the use of ChatGPT in creating neuro-ophthalmic handouts and educational materials.[17] After assessing the handouts for accuracy/comprehensiveness, bias, currency, tone, and readability, fellowship-trained ophthalmologists reported a moderate level of satisfaction with the quality of the materials.[17] The findings of the study demonstrated that ChatGPT can be used to create an initial draft of educational materials for patients, but they must be further refined and edited by a neuro-ophthalmologist before dissemination.[17]
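Readability, one of the criteria assessed above, is commonly quantified with a formula. As a minimal sketch, the following Python function computes the Flesch Reading Ease score, 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word), where higher scores indicate easier text. Using this particular formula is an assumption made for illustration; the readability instrument used in the study is not stated here. The syllable counter is a rough vowel-group heuristic, and the function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per group of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier-to-read text."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Illustrative sentences written for this sketch, not taken from any study.
plain = "The eye doctor will check your eyes. You may need drops."
jargon = ("Phacoemulsification necessitates curvilinear capsulorhexis "
          "preceding intraocular lens implantation.")
print(flesch_reading_ease(plain) > flesch_reading_ease(jargon))
```

Under this sketch, the plain-language text scores higher than the jargon-heavy sentence, which is the kind of gap a clinician would look for when revising a chatbot-drafted handout.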

A study conducted by Waisberg et al. reported that ChatGPT can be used by clinicians to write complete and easy-to-read discharge summaries.[18] ChatGPT can also be used to stay up-to-date with the latest clinical trials and advancements in medicine.[18] The study reported limitations in the image analysis feature of ChatGPT, which incorrectly interpreted fundus photographs of anterior ischemic optic neuropathy (AION) and non-arteritic anterior ischemic optic neuropathy.[18] Another article explored the artificial intelligence-powered chatbot Bard (now Gemini), developed by Google, and its use in ophthalmologic settings. Unlike ChatGPT, Bard was able to identify the optic nerve when presented with a fundus image of AION, but it incorrectly provided information about glaucoma.[19]

ChatGPT has been shown to outperform glaucoma specialists and to perform comparably with retina specialists.[20] In contrast, ChatGPT exhibited reasonable but lower diagnostic accuracy than specialists in cornea[15], uveitis[21], and neuro-ophthalmology[16] cases. This trend aligns with the hypothesis that ChatGPT performs worse on rarer diseases: Hu et al. demonstrated that GPT-4 performed worse than family medicine physicians and junior ophthalmologists in detecting rare eye diseases, which included several neuro-ophthalmic cases.[22] ChatGPT is also more reliable when addressing general disease and condition information, perhaps due to the availability of structured and well-researched data in those areas.[23] The higher rates of inappropriate and unreliable content in surgery-related topics within ophthalmology indicate that the complexity and variability of surgical procedures remain challenging for the chatbot.

Additional studies must be conducted to examine further how ChatGPT can be used in neuro-ophthalmic clinical settings. Given the findings from studies exploring how ChatGPT can be used in other fields of ophthalmology, neuro-ophthalmologists and their patients have the potential to greatly benefit from ChatGPT. More specifically, neuro-ophthalmologists can use ChatGPT to provide therapeutic and diagnostic suggestions based on patient data; answer medical inquiries using up-to-date literature; and formulate research questions. The program could also potentially be used to analyze visual field deficits and interpret findings from optical coherence tomography images, as well as magnetic resonance imaging and computed tomography scans.

ChatGPT can also assist further in the research process by helping to conduct literature reviews, data analysis, and peer review. Patients in neuro-ophthalmic clinical settings can use ChatGPT as an educational source to improve their understanding of their disease as well as their prognosis. Neuro-ophthalmology fellows and ophthalmology residents can also use ChatGPT to generate practice questions as a study aid.

Ethical Considerations and Limitations

While ChatGPT has significant potential as a resource in neuro-ophthalmology, it has many limitations. Lack of recency is a key limitation, as ChatGPT was trained using data only until September 2021. Google’s AI chatbot, Bard, does not share this limitation because it is linked to Google and can provide more real-time data.[19] Additionally, ChatGPT was trained using various unverified Internet sources, so its output may contain inaccuracies that could cause patient harm. ChatGPT’s sources also lack transparency, as its responses are not substantiated by references. Other ethical concerns include data privacy and security, as neuro-ophthalmologists may upload facial photos or images with identifying features, which may breach biometric security and lead to identity theft.

ChatGPT is also vulnerable to a phenomenon known as "hallucination", where the model generates plausible-sounding but false or unsupported information. This presents a challenge in clinical settings, where precise diagnoses are essential to making major patient care decisions. ChatGPT excels in answering simple, single-step questions. However, it falters in answering more complex, multi-step questions. This is concerning in real-world settings, as medical diagnoses involve navigating an extremely complex labyrinth of information. Subjectively, ophthalmologists found that responses written by ChatGPT tended to be more generic, to contain irrelevant information, to include hallucinations more frequently, and to have distinct syntactic patterns.[24]

LLM chatbots like ChatGPT occasionally suffer from a phenomenon called “falsehood mimicry”, in which they generate responses that align with a user’s misleading assumptions, stemming from user input that is not clear or accurate.[14] Even when it disseminates potentially false information, ChatGPT may produce writing that looks believable because the generated response is grammatically polished. “Sycophancy bias” describes the tendency of LLM chatbots to generate responses that align with user expectations, leading to increased diagnostic error.[25] Thus, it is important to carefully analyze all output, as users may have confirmation bias.

ChatGPT may also give inappropriate responses by omitting critical information.[23] For example, in Tailor et al., the lack of neuroimaging recommendations for a case describing a cranial nerve III palsy could have had potentially fatal ramifications.

ChatGPT has been shown to overgeneralize some ophthalmic treatments and procedures. Notably, in Tailor et al., the chatbot assumed all eyelid surgeries were blepharoplasties, which is not always true. Sometimes, when asked about the most common causes or treatments for certain scenarios, the chatbot would appropriately state multiple common causes but would also list exceptionally rare causes rather than more appropriate ones.

Chatbots like ChatGPT have not been developed as medical diagnostic tools, which is sometimes made clear to the user prior to responding to a medical question.[26] Additionally, some studies have shown issues with response repeatability. This variability is to be expected from a general-purpose chatbot, whereas software intentionally created for medicine would require high repeatability. Nevertheless, this has not stopped patients from using ChatGPT prior to speaking to a human healthcare professional.[23] Thus, the responses should be examined thoroughly for potential inaccuracies and errors, and cross-checked with human experts when appropriate.[26]
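Repeatability can be made concrete by asking a chatbot the same question several times and measuring how often the answers agree. The sketch below is one simple way to do this, assuming the repeated responses have already been collected and reduced to a short diagnosis label; the function name `repeatability` and the example labels are hypothetical, and the cited studies may have measured repeatability differently.

```python
from collections import Counter


def repeatability(responses: list[str]) -> float:
    """Fraction of repeated runs that agree with the most common answer."""
    if not responses:
        raise ValueError("at least one response is required")
    # Normalize case and surrounding whitespace so trivial differences
    # are not counted as disagreement.
    normalized = [r.strip().lower() for r in responses]
    _, modal_count = Counter(normalized).most_common(1)[0]
    return modal_count / len(normalized)


# Made-up labels from four hypothetical runs of the same prompt.
runs = ["AION", "aion ", "Optic neuritis", "AION"]
print(repeatability(runs))  # 0.75
```

A score of 1.0 would mean every run gave the same answer; free-text responses would first need to be mapped to comparable labels, a step this sketch leaves out.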

Finally, eye care is a medical field that heavily relies on large quantities of nontext-based data.[27] Although multimodal LLM chatbots capable of processing images and text are in development, a major limitation of ChatGPT is a “lack of image interpretation that limits their applicability in more complex cases with information from various imaging modalities.”[16] Neuro-ophthalmology’s heavy reliance on imaging presents a major barrier to using ChatGPT to diagnose patients. Mihalache et al. performed a cross-sectional study using 136 publicly available ophthalmic cases with 448 accompanying multimodal images and 429 multiple-choice questions.[28] ChatGPT 4 answered 70% of the questions correctly, and performed significantly better on retina questions (77%) than neuro-ophthalmology questions (58%). Large language models like ChatGPT continue to evolve and may play an increased role in clinical care and decision-making in the future.
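A subspecialty comparison like the one reported above comes down to tallying correct answers per category. The following is a minimal sketch of that tally, using made-up graded records rather than data from the study; the function name `accuracy_by_category` and the record layout are assumptions for illustration.

```python
from collections import defaultdict


def accuracy_by_category(records: list[tuple[str, bool]]) -> dict[str, float]:
    """Return the fraction of correct answers for each category."""
    totals: dict[str, int] = defaultdict(int)
    correct: dict[str, int] = defaultdict(int)
    for category, is_correct in records:
        totals[category] += 1
        correct[category] += int(is_correct)
    return {category: correct[category] / totals[category] for category in totals}


# Hypothetical graded answers: (subspecialty, answered correctly?).
graded = [
    ("retina", True), ("retina", True), ("retina", False), ("retina", True),
    ("neuro-ophthalmology", True), ("neuro-ophthalmology", False),
]
print(accuracy_by_category(graded))
```

With small per-category counts, differences between subspecialties can arise by chance, so a real analysis would pair these proportions with a statistical test.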

On the Horizon: ChatGPT's Competitors

ChatGPT is not the only LLM chatbot available to patients now.[26] Its competitors have similar capabilities to ChatGPT and have been evaluated for their performance and consistency with regard to ophthalmic case scenarios. Mandalos et al. evaluated the efficiency of three AI chatbots, ChatGPT-3.5, Bing Co-Pilot, and Google Gemini, in the diagnostic approach and management of 11 challenging ophthalmic cases of various ophthalmic subspecialties and compared their performance with that of a practicing ophthalmologist. The findings revealed that the individual performance of the chatbots was inferior to that of the practicing ophthalmologist. Chatbots may be a useful complementary resource for the diagnostic thinking process, but cannot and should not be a substitute for a clinician at this time. Google Gemini makes this clear, as it occasionally refuses to provide medical insight, citing restrictions on providing medical advice.

The use of different LLM chatbots has been examined in neuro-ophthalmology as well.[29] Shukla et al. asked three LLM chatbots what the most likely diagnosis was for ten randomly selected neuro-ophthalmology cases. ChatGPT outperformed Microsoft Bing and Google Gemini in terms of appropriateness and applicability, but all three LLM chatbots demonstrated comparable accuracy when asked questions about neuro-ophthalmological case difficulties.

ChatGPT has multiple versions that are available to the public.[30] ChatGPT 3.5 is free for users, while ChatGPT 4 requires payment. Pushpanathan et al. assessed the performance of ChatGPT 3.5, ChatGPT 4, and Google Gemini in responding to 37 common inquiries regarding ocular symptoms. All three LLM chatbots showed optimal mean comprehensiveness scores, but exhibited varied self-awareness capabilities, meaning a chatbot’s ability to self-check and self-correct its responses. ChatGPT 4’s responses significantly outperformed those of ChatGPT 3.5 and Google Gemini.

References

  1. Suchman K, Garg S, Trindade AJ. Chat Generative Pretrained Transformer Fails the Multiple-Choice American College of Gastroenterology Self-Assessment Test. Am J Gastroenterol. 2023;118(12):2280-2282. doi:10.14309/ajg.0000000000002320
  2. Khan RA, Jawaid M, Khan AR, Sajjad M. ChatGPT - Reshaping medical education and clinical management. Pak J Med Sci. 2023;39(2):605-607. doi:10.12669/pjms.39.2.7653
  3. Singh S, Djalilian A, Ali MJ. ChatGPT and Ophthalmology: Exploring Its Potential with Discharge Summaries and Operative Notes. Semin Ophthalmol. 2023;38(5):503-507. doi:10.1080/08820538.2023.2209166
  4. Dossantos J, An J, Javan R. Eyes on AI: ChatGPT's Transformative Potential Impact on Ophthalmology. Cureus. 2023;15(6):e40765. Published 2023 Jun 21. doi:10.7759/cureus.40765
  5. Delsoz M, Raja H, Madadi Y, et al. The Use of ChatGPT to Assist in Diagnosing Glaucoma Based on Clinical Case Reports. Ophthalmol Ther. 2023;12(6):3121-3132. doi:10.1007/s40123-023-00805-x
  6. Momenaei B, Wakabayashi T, Shahlaee A, et al. Appropriateness and Readability of ChatGPT-4-Generated Responses for Surgical Treatment of Retinal Diseases. Ophthalmol Retina. 2023;7(10):862-868. doi:10.1016/j.oret.2023.05.022
  7. Delsoz M, Madadi Y, Munir WM, et al. Performance of ChatGPT in Diagnosis of Corneal Eye Diseases. Preprint. medRxiv. 2023;2023.08.25.23294635. Published 2023 Aug 28. doi:10.1101/2023.08.25.23294635
  8. Tsui JC, Wong MB, Kim BJ, Maguire AM, Scoles D, VanderBeek BL, Brucker AJ. Appropriateness of ophthalmic symptoms triage by a popular online artificial intelligence chatbot. Eye (Lond). 2023 Dec;37(17):3692-3693. doi: 10.1038/s41433-023-02556-2. Epub 2023 Apr 29. PMID: 37120656; PMCID: PMC10686397.
  9. Waisberg E, Ong J, Zaman N, Kamran SA, Sarker P, Tavakkoli A, Lee AG. GPT-4 for triaging ophthalmic symptoms. Eye (Lond). 2023 Dec;37(18):3874-3875. doi: 10.1038/s41433-023-02595-9. Epub 2023 May 25. PMID: 37231187; PMCID: PMC10698159.
  10. Waisberg E, Ong J, Masalkhi M, et al. GPT-4 and Ophthalmology Operative Notes. Ann Biomed Eng. 2023;51(11):2353-2355. doi:10.1007/s10439-023-03263-5
  11. Masalkhi M, Ong J, Waisberg E, et al. ChatGPT to document ocular infectious diseases. Eye (Lond). Published online November 15, 2023. doi:10.1038/s41433-023-02823-2
  12. Kuehn BM. More than one-third of US individuals use the Internet to self-diagnose. JAMA. 2013;309(8):756-757. doi:10.1001/jama.2013.629
  13. Shahsavar Y, Choudhury A. User Intentions to Use ChatGPT for Self-Diagnosis and Health-Related Purposes: Cross-sectional Survey Study. JMIR Hum Factors. 2023;10:e47564. Published 2023 May 17. doi:10.2196/47564
  14. Tan TF, Thirunavukarasu AJ, Campbell JP, et al. Generative Artificial Intelligence Through ChatGPT and Other Large Language Models in Ophthalmology: Clinical Applications and Challenges. Ophthalmol Sci. 2023;3(4):100394. Published 2023 Sep 9. doi:10.1016/j.xops.2023.100394
  15. Delsoz M, Raja H, Madadi Y, et al. The Use of ChatGPT to Assist in Diagnosing Glaucoma Based on Clinical Case Reports. Ophthalmol Ther. 2023;12(6):3121-3132. doi:10.1007/s40123-023-00805-x
  16. Madadi Y, Delsoz M, Lao PA, et al. ChatGPT Assisting Diagnosis of Neuro-ophthalmology Diseases Based on Case Reports. Preprint. medRxiv. 2023;2023.09.13.23295508. Published 2023 Sep 14. doi:10.1101/2023.09.13.23295508
  17. Tao BK, Handzic A, Hua NJ, Vosoughi AR, Margolin EA, Micieli JA. Utility of ChatGPT for Automated Creation of Patient Education Handouts: An Application in Neuro-Ophthalmology. J Neuroophthalmol. Published online January 4, 2024. doi:10.1097/WNO.0000000000002074
  18. Waisberg E, Ong J, Masalkhi M, et al. GPT-4: a new era of artificial intelligence in medicine. Ir J Med Sci. 2023;192(6):3197-3200. doi:10.1007/s11845-023-03377-8
  19. Waisberg E, Ong J, Masalkhi M, et al. Google's AI chatbot "Bard": a side-by-side comparison with ChatGPT and its utilization in ophthalmology. Eye (Lond). Published online September 28, 2023. doi:10.1038/s41433-023-02760-0
  20. Huang AS, Hirabayashi K, Barna L, Parikh D, Pasquale LR. Assessment of a Large Language Model's Responses to Questions and Cases About Glaucoma and Retina Management [published correction appears in JAMA Ophthalmol. 2024 Apr 1;142(4):393. doi: 10.1001/jamaophthalmol.2024.1158]. JAMA Ophthalmol. 2024;142(4):371-375. doi:10.1001/jamaophthalmol.2023.6917
  21. Rojas-Carabali W, Sen A, Agarwal A, et al. Chatbots Vs. Human Experts: Evaluating Diagnostic Performance of Chatbots in Uveitis and the Perspectives on AI Adoption in Ophthalmology. Ocul Immunol Inflamm. 2024;32(8):1591-1598. doi:10.1080/09273948.2023.2266730
  22. Hu X, Ran AR, Nguyen TX, et al. What can GPT-4 do for Diagnosing Rare Eye Diseases? A Pilot Study. Ophthalmol Ther. 2023;12(6):3395-3402. doi:10.1007/s40123-023-00789-8
  23. Tailor PD, Xu TT, Fortes BH, et al. Appropriateness of Ophthalmology Recommendations From an Online Chat-Based Artificial Intelligence Model. Mayo Clin Proc Digit Health. 2024;2(1):119-128. doi:10.1016/j.mcpdig.2024.01.003
  24. Chen JS, Reddy AJ, Al-Sharif E, et al. Analysis of ChatGPT Responses to Ophthalmic Cases: Can ChatGPT Think like an Ophthalmologist?. Ophthalmol Sci. 2024;5(1):100600. Published 2024 Aug 23. doi:10.1016/j.xops.2024.100600
  25. Goodman KE, Yi PH, Morgan DJ. AI-Generated Clinical Summaries Require More Than Accuracy. JAMA. 2024;331(8):637-638. doi:10.1001/jama.2024.0555
  26. Mandalos A, Tsouris D. Artificial Versus Human Intelligence in the Diagnostic Approach of Ophthalmic Case Scenarios: A Qualitative Evaluation of Performance and Consistency. Cureus. 2024;16(6):e62471. Published 2024 Jun 16. doi:10.7759/cureus.62471
  27. Acosta JN, Falcone GJ, Rajpurkar P, Topol EJ. Multimodal biomedical AI. Nat Med. 2022;28(9):1773-1784. doi:10.1038/s41591-022-01981-2
  28. Mihalache A, Huang RS, Popovic MM, et al. Accuracy of an Artificial Intelligence Chatbot's Interpretation of Clinical Ophthalmic Images. JAMA Ophthalmol. 2024;142(4):321-326. doi:10.1001/jamaophthalmol.2024.0017
  29. Shukla R, Mishra AK, Banerjee N, Verma A. The Comparison of ChatGPT 3.5, Microsoft Bing, and Google Gemini for Diagnosing Cases of Neuro-Ophthalmology. Cureus. 2024;16(4):e58232. Published 2024 Apr 14. doi:10.7759/cureus.58232
  30. Pushpanathan K, Lim ZW, Er Yew SM, et al. Popular large language model chatbots' accuracy, comprehensiveness, and self-awareness in answering ocular symptom queries. iScience. 2023;26(11):108163. Published 2023 Oct 10. doi:10.1016/j.isci.2023.108163