Introduction to Artificial Intelligence in Ophthalmology
This page was enrolled in the Residents and Fellows contest.
- 1 Definition
- 2 Types of Artificial Intelligence
- 3 Limitations of Artificial Intelligence
- 4 Method to Approaching Artificial Intelligence Studies
- 5 Potential Applications of Artificial Intelligence in Ophthalmology
- 6 Future directions
- 7 References
Artificial Intelligence (AI), a term introduced in the 1950s, refers to software that can mimic cognitive functions such as learning and problem solving. It makes it possible for machines to learn from experience and adjust to new inputs. AI machines can be trained to accomplish such tasks by processing and recognizing patterns in large amounts of data. It has numerous applications in several fields of ophthalmology.
Types of Artificial Intelligence
Simple Automated Detectors
A simple automated detector is a system that is fed an algorithm that identifies the presence or absence of features (e.g. location, dimensions, or contour of a lesion) based on objective criteria. The input into the system is a set of stepwise rules generated by the individuals who are engineering a model that predicts outcomes. This rule-based algorithm assesses features and ultimately yields an outcome (e.g. diagnosis) based on the patterns identified.
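A rule-based detector of this kind can be sketched in a few lines. This is a minimal illustration only: the `Lesion` fields, the threshold value, and the output labels are hypothetical and are not clinical criteria.

```python
from dataclasses import dataclass


@dataclass
class Lesion:
    diameter_mm: float       # hypothetical measured feature
    irregular_contour: bool  # hypothetical feature flag


# Hypothetical threshold, chosen only for illustration.
DIAMETER_THRESHOLD_MM = 2.0


def rule_based_detector(lesion: Lesion) -> str:
    """Apply fixed, hand-written rules in order and return an outcome."""
    if lesion.diameter_mm >= DIAMETER_THRESHOLD_MM and lesion.irregular_contour:
        return "refer"
    if lesion.diameter_mm >= DIAMETER_THRESHOLD_MM:
        return "monitor"
    return "no finding"


print(rule_based_detector(Lesion(diameter_mm=2.5, irregular_contour=True)))
```

Every decision here comes from rules written by the designer; nothing is learned from data.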
A more advanced form of AI is machine learning. Unlike simple automated detectors, the input into machine learning is a training dataset (e.g. a set of images) rather than predefined algorithms (i.e. rules written by the program designers). Machine learning can be further classified into supervised, semi-supervised, and unsupervised learning. In unsupervised learning, the machine “learns” on its own by building its own algorithms from the unlabeled dataset presented. The learning process can instead be guided by providing the machine with a labeled dataset; this is the basis for supervised learning. A labeled dataset is, for example, a collection of images that have been preassigned a “ground truth” diagnosis by experts who utilized standard diagnostic methods. A consensus amongst experts on a reference standard diagnosis for each item (e.g. image) in the set may strengthen the quality of the dataset that is presented to the machine during the initial learning phase when the algorithms are built. Presenting a machine learning system with a hybrid of labeled and unlabeled data is termed semi-supervised learning. See Figure 1.
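One simple way to turn several expert grades into a single “ground truth” label is a strict majority vote. The article does not specify how consensus is reached, so the rule below and the name `consensus_label` are assumptions made for illustration.

```python
from collections import Counter


def consensus_label(grades):
    """Return the majority grade, or None when no grade has a strict majority.

    Majority voting is an assumed consensus rule, used here only as an example.
    """
    label, count = Counter(grades).most_common(1)[0]
    return label if count > len(grades) / 2 else None


# Hypothetical grades from three experts for one image.
print(consensus_label(["disease", "disease", "no disease"]))
```

Items without a majority return `None` and could, for example, be sent for adjudication rather than used for training.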
Machine Learning Techniques
Machine learning models can employ various techniques to predict an output. Classification is one technique that is built upon supervised learning. This type of system allows for concrete categorization of outputs (e.g. presence versus absence of disease; mild versus severe category of disease). Likewise, clustering groups similar items into categories; however, the categories are not predefined, and the technique is appropriate for unsupervised machine learning. The regression technique is supervised and is applied when the intention of machine learning is to determine a continuous score (e.g. numerical value) based on an input (e.g. image).
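The three techniques can be contrasted on toy one-dimensional data. The sketch below uses deliberately simple stand-ins (a nearest-centroid classifier, a two-cluster k-means, and a least-squares line); these particular algorithms and all data values are assumptions chosen for brevity, not methods named in this article.

```python
from statistics import mean


def fit_centroids(values, labels):
    """Classification (supervised): average feature value per labeled class."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {label: mean(vals) for label, vals in groups.items()}


def classify(value, centroids):
    """Assign the class whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def two_means(values, iterations=10):
    """Clustering (unsupervised): split unlabeled values into two groups."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        if low == high:  # all values identical; nothing to separate
            break
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(near_low), mean(near_high)
    return low, high


def fit_line(xs, ys):
    """Regression (supervised): least-squares slope and intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


# Hypothetical feature scores and labels.
scores = [0.2, 0.3, 0.25, 0.7, 0.8, 0.75]
labels = ["absent"] * 3 + ["present"] * 3

centroids = fit_centroids(scores, labels)
print(classify(0.6, centroids))                 # categorical output
print(two_means(scores))                        # cluster centers found without labels
print(fit_line([1, 2, 3, 4], [2, 4, 6, 8]))     # continuous output
```

Classification and regression both need labeled examples; clustering receives only the raw values.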
Utilizing Neural Networks in Machine Learning
An AI system designed using artificial neural networks (ANNs) is considered a subset of machine learning and can be used in both supervised and unsupervised learning. The architecture of an ANN is dependent upon layers. An ANN consists of an input layer that receives (e.g. diagnostic) features determined by programmers in advance and an output layer that receives the analysis from the input layer and determines the outcome (e.g. diagnosis). A layer refers to the level at which analysis of one distinct (e.g. diagnostic) feature is conducted. Each layer is composed of multiple neurons, or nodes, that perform the higher level analytical processing, emulating the cortical functions of the brain. Each node can be considered a basic calculation on the input: multiplication by an assigned “weight” followed by a mathematical activation function, which enables more complex analysis. The result is then transmitted to multiple nodes in the next layer. Given the number of nodes and layers, each item (e.g. image) in the dataset is analyzed multiple times by the machine, fine-tuning the predictive ANN algorithm. See Figure 2 for visualization of this process and comparison to Deep Learning (described below).
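The calculation performed by a single node, and the passing of results from one layer to the next, can be sketched as follows. The sigmoid activation, the bias term, the layer sizes, and every numeric value are assumptions chosen for illustration; the intermediate layer is included only to show results being passed forward.

```python
import math


def sigmoid(z):
    """A common activation function that maps any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def node(inputs, weights, bias):
    """One node: weighted sum of the inputs, then the activation function."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)


def layer(inputs, weight_rows, biases):
    """A layer is several nodes that all read the same inputs."""
    return [node(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Hypothetical input features and arbitrary, untrained weights.
features = [0.5, 0.8]
intermediate = layer(features, [[0.4, -0.6], [0.3, 0.9]], [0.0, -0.1])
output = layer(intermediate, [[1.2, -0.7]], [0.05])[0]

print(intermediate, output)
```

With untrained weights the output is meaningless; training consists of adjusting the weights so that the output matches the labeled outcome.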
In a simple neural network, there are two layers: an input and output layer. However, deep learning algorithms are neural networks with an expanded number of communicating layers between the input and output layers. A deep learning architecture commonly used for image analysis is the convolutional neural network (CNN). This intricate type of machine learning typically involves supervised learning with labeled datasets (e.g. images with preassigned expert diagnoses). See Figure 3 for a flow diagram of this relationship.
The CNN begins with an input into the first layer, which then results in an output that serves as the input for the next layer in the series. The analysis from each layer is transmitted throughout the network until the final layer produces the outcome. CNNs designed for image-based diagnosis, for example, analyze the pixels in correlation with the features seen in the disease state depicted. If the predicted outcome does not match the expert outcome determined using standard methods, then the CNN adjusts its weights to reduce the error, repeating the process until its predictions agree closely with the expert labels. Then, the machine is presented with new data (e.g. unlabeled images) and provides the outcome (e.g. diagnosis) based on what the machine “learned” under supervision. See Figure 2 for visualization of this process in comparison to Machine Learning as described above.
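The weight-adjustment loop can be illustrated with a single node trained by gradient descent. This is a sketch under stated assumptions, not a CNN: the one-feature model, the learning rate, the number of passes, and all data values are hypothetical.

```python
import math


def predict(x, weight, bias):
    """Single-node model: weighted input passed through a sigmoid."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))


def train(samples, labels, learning_rate=0.5, epochs=2000):
    """Repeatedly nudge the weight and bias to shrink the prediction error."""
    weight, bias = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = predict(x, weight, bias) - y  # predicted minus expert label
            weight -= learning_rate * error * x
            bias -= learning_rate * error
    return weight, bias


# Hypothetical labeled training data: one feature score per image.
samples = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 1, 1]
weight, bias = train(samples, labels)

# New, unlabeled inputs are scored with the learned weights.
new_inputs = [0.15, 0.85]
predictions = [round(predict(x, weight, bias)) for x in new_inputs]
print(predictions)
```

A real CNN applies the same idea to many weights across many layers, using backpropagation to compute each adjustment.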
While the training dataset can provide an initial analysis, it is important to ensure that the algorithm is generalizable to new datasets. This is assessed with validation datasets, which help detect and limit “overfitting” of the algorithm to the training dataset. Techniques to minimize overfitting include dataset expansion, augmentation, dropout, and regularization. Augmentation can be utilized when the original dataset cannot be expanded: the images are altered (resizing, cropping, rotating, etc.) in such a way that they still represent a realistic example. Dropout trains the algorithm while temporarily ignoring a random subset of nodes, so that the remaining nodes learn to perform the work of the others. Regularization involves penalizing large weights so that the total of all weights in the system is kept small. The final step is to test the algorithm’s performance on a new dataset that it has not been exposed to.
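The dataset split and three of the overfitting countermeasures can be sketched with the standard library alone. The 70/15/15 split, the horizontal flip as the augmentation, the "inverted dropout" rescaling, and the L2 form of the penalty are all assumptions chosen for illustration.

```python
import random


def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle once, then cut into training, validation, and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def flip_horizontal(image):
    """Augmentation: mirror each pixel row of an image stored as nested lists."""
    return [row[::-1] for row in image]


def dropout(activations, rate, rng):
    """Dropout: zero a random subset of node outputs and rescale the rest."""
    keep = 1.0 - rate  # rate must be below 1
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def l2_penalty(weights, strength):
    """Regularization: a cost that grows with the squared size of the weights."""
    return strength * sum(w * w for w in weights)


train_set, val_set, test_set = split_dataset(range(20))
print(len(train_set), len(val_set), len(test_set))
print(flip_horizontal([[1, 2], [3, 4]]))
print(dropout([1.0] * 8, rate=0.5, rng=random.Random(0)))
print(l2_penalty([1.0, -2.0], strength=0.1))
```

The penalty would be added to the training error, and dropout would be applied only during training, not when the finished model is tested.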
Limitations of Artificial Intelligence
As a quickly evolving field, AI comes with some challenges. For one, the accuracy of outcomes is heavily dependent on the quality of inputs. This has been described as the “garbage in, garbage out” phenomenon; if the initial dataset presented to the machine is inadequate, then the predictions generated by the AI tool will be inaccurate. In some situations, output recommendations by AI tools may be simply incorrect. For example, IBM Watson Health’s AI algorithm, which predicts treatments for patients with cancer, recommended the use of bevacizumab in patients with severe bleeding. However, hemorrhage is stated as a black box warning for bevacizumab. This example highlights the importance of training and validating AI algorithms.
Erroneous predictions by AI algorithms can bring up the issue of liability for physicians. Current law protects physicians from liability, as long as they follow the standard of care. Thus, physicians are incentivized to use AI predictions only if they confirm existing decision-making processes, instead of as a resource to improve and build upon patient care. In the future, further medicolegal implications must be considered if AI becomes integrated into the standard of care.
Additionally, CNN algorithms are not entirely designed by programmers; they rely in part on self-generated rules. They arrive at conclusions opaquely, as programmers are unaware of the reasoning behind these self-generated rules. This is called the black box dilemma: people may be hesitant to trust predictions that stem from a process that, by definition, lacks transparency.
There is a fear that AI reduces the need for physicians, as numerous studies have shown that certain algorithms diagnose diseases with higher success rates than clinicians. This has been a concern in image-based fields such as radiology and pathology, where AI may limit physicians by narrowing the scope of their clinical judgment, reducing reliance on broad differential diagnoses, and automating the process of patient care, which may affect the patient-physician relationship. By predicting diagnoses in a purely algorithmic and objective manner, AI may dismiss subjective facets of a disease that are unique to a patient, potentially overlooking crucial information.
However, it can be argued that AI merely augments the work of physicians by serving as a diagnostic tool generating predictions that can positively affect patient management. For example, an AI-integrated telemedicine platform designed to screen and refer patients with cataracts exhibited diagnostic performance of over 90%. More importantly, the platform improved physician efficiency by allowing them to evaluate ten times as many patients a year. By complementing the role of physicians, AI has the potential to significantly improve patient care by increasing efficiency and outcomes as it becomes incorporated into clinical practice in the near future.
Method to Approaching Artificial Intelligence Studies
AI is rapidly evolving in the field of ophthalmology, and new experimental algorithms are emerging in the literature along with descriptions of the underlying AI systems and methodology. Articles regarding AI studies can be challenging to understand, especially when there is no standardized way of presenting data, statistics, and clinical value. Nevertheless, there are fundamental characteristics that readers can look for in order to critically appraise such studies. In the introduction, articles will typically emphasize the clinical gap that AI may fill and the research question it seeks to answer. Additionally, the introduction usually summarizes a thorough literature search that explores existing technologies pertaining to the disease and discusses the potential for AI to build on these technologies to provide further insight.
In the methods section, articles will elaborate on the basic framework of an AI system, which consists of two phases: (1) training and validation and (2) testing. The training and validation phase can be further broken down into two parts: (1) a training dataset comprising data and/or images; and (2) selection of a CNN. Datasets may vary in sample size and may be limited in diversity or generalizability. The algorithm responsible for analyzing datasets is rooted in a preset reference standard. This so-called “brain” for AI algorithms stems from human diagnosis and assessment, highlighting the need to specify the credibility of the human assessors. A high-quality reference standard helps mitigate a limitation of AI mentioned earlier, the garbage-in, garbage-out situation.
Moreover, AI studies should clearly describe the workflow of the AI system. For example, the workflow may consist of an input, such as an image, which is then analyzed by the AI system to detect specific features and ultimately produce an outcome, a diagnosis. The resulting diagnosis may complement the work of physicians, demonstrating the potential for reported AI systems to be ultimately incorporated into clinical practice. Nevertheless, as described by Ting et al., AI articles should generally include limitations of the AI systems discussed. This gives readers a clearer understanding of the potential for integration of AI into clinical practice and of the associated shortcomings of such systems.
Potential Applications of Artificial Intelligence in Ophthalmology
The incorporation of AI systems into clinical practice can potentially enhance productivity in the workplace, as well as aid in the clinical decision-making process. Applying AI to medical diagnostic assessments allows for the automatic analysis of imaging, for example, and the subsequent generation of a diagnosis or prediction of a disease course. In ophthalmology, many AI platforms are being explored for potential use in the detection, surveillance, and treatment of various ocular diseases. However, many are in the experimental phase, and further evaluation must be done to assess whether these algorithms are appropriate for clinical practice. AI algorithms have been described in the literature for several conditions in ophthalmology, such as diabetic retinopathy, glaucoma, age-related macular degeneration, retinopathy of prematurity, retinal vascular occlusions, keratoconus, cataract, refractive errors, retinal detachment, squint, and ocular cancers. AI is also useful for intraocular lens power calculation, planning squint surgeries, and planning intravitreal anti-vascular endothelial growth factor injections. In addition, AI can detect signs of cognitive impairment, dementia, Alzheimer's disease, and stroke risk from fundus photographs and optical coherence tomography.
Multiple deep learning programs have demonstrated high sensitivity and specificity in recognizing glaucomatous optic nerve changes. These AI readings are based on diagnostic features otherwise typically assessed by a human expert, including OCT and color fundus photography findings, visual field testing results, and intraocular pressure and corneal thickness measurements. In addition to these AI systems that screen for the presence or absence of glaucoma, Muhammad et al. developed a deep learning algorithm that accurately identifies glaucoma suspects, allowing for more timely management. An expanded discussion of the applications of AI in the field of glaucoma can be found in the article “Artificial Intelligence in Retina.”
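Sensitivity and specificity, the two measures mentioned above, are computed by comparing the algorithm's outputs against the reference standard. The sketch below assumes binary labels (1 for disease present, 0 for absent); the example data are hypothetical and do not come from any cited study.

```python
def sensitivity_specificity(predicted, reference):
    """Compare binary predictions with the reference standard."""
    pairs = list(zip(predicted, reference))
    true_pos = sum(1 for p, r in pairs if p == 1 and r == 1)
    false_neg = sum(1 for p, r in pairs if p == 0 and r == 1)
    true_neg = sum(1 for p, r in pairs if p == 0 and r == 0)
    false_pos = sum(1 for p, r in pairs if p == 1 and r == 0)
    sensitivity = true_pos / (true_pos + false_neg)  # diseased eyes correctly flagged
    specificity = true_neg / (true_neg + false_pos)  # healthy eyes correctly cleared
    return sensitivity, specificity


# Hypothetical reference diagnoses and algorithm outputs for ten eyes.
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(sensitivity_specificity(predicted, reference))
```

The function assumes that the reference set contains at least one diseased and one healthy case; otherwise one of the divisions is undefined.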
Emulating a decision tree model, a machine learning algorithm was developed to anticipate the course of periocular reconstruction during surgical treatment of basal cell carcinoma. Moreover, several machine learning systems, through the use of artificial neural networks, have been designed to predict disease outcomes for choroidal melanoma by analyzing demographic data and oncologic history. Habibalahi et al. implemented machine learning techniques in the development of a multispectral imaging system for the detection of ocular surface squamous neoplasia on biopsy. The system demarcates the region of neoplastic changes, providing a visual representation of disease margins to the clinician or surgeon in minimal time.
Machine learning programs have been developed to detect and grade cataracts. Wu et al. recently validated a model using an AI algorithm called ResNet to identify referable cataracts. Deep learning algorithms for the assessment of congenital cataracts in particular have also been reported. A validated system by Liu et al., known as the CC-Cruiser, demonstrated high accuracy for identifying the region, density, and degree of congenital cataract formation based on slit-lamp photographs. Machine learning systems for cataracts have also been shown to adequately guide plans for surgical intervention, as well as anticipate the likelihood of posterior capsular opacification developing post-operatively. Intraocular lens power can also be calculated through machine learning methods. One notable example is the Hill-RBF formula, which analyzes the following input data: axial length, central corneal thickness, anterior chamber depth, lens thickness, corneal diameter, and keratometry measurements.
In the pediatric population, timely ocular management is critical for the preservation of vision. Incorporation of AI into screening and treatment practices may aid in achieving optimal ophthalmic care. A deep learning algorithm for the assessment of strabismus from external images has been developed with the potential for implementation in tele-ophthalmology. Other systems to detect strabismus are based on eye tracking deviations or retinal birefringence scanning. Other machine learning systems have the potential to facilitate screening for high myopia among other refractive errors, as well as to classify children susceptible to reading disabilities. Van Eenwyk et al. describe a system using machine learning that incorporates Brückner pupil red reflex imaging and eccentric photorefraction to detect amblyogenic features of strabismus or high refractive errors. AI systems for congenital cataracts are discussed above in the “Cataract” section.
AI has been utilized in diabetic retinopathy (DR), retinopathy of prematurity (ROP), and age-related macular degeneration (AMD). The IDx-DR system was recently approved by the FDA for the detection of DR. Other applications exist that can identify the severity of DR and clinically significant macular edema. In ROP, AI tools such as the i-ROP DL system can identify features such as plus disease with performance comparable to or better than expert diagnosis. In AMD, AI can be used to distinguish between non-exudative and exudative AMD. An expanded discussion of the applications of AI in the field of retina can be found in the article “Artificial Intelligence in Retina.”
The advent of AI may reshape the field of medicine. There is unprecedented potential for AI to expand scientific inquiry by using neural networks to generate hypotheses and make new discoveries. With rapid analysis of vast amounts of data, AI can explore associations between disease features that may not be readily apparent to humans. There is potential for AI to enhance physicians’ abilities to diagnose conditions earlier and more accurately. Ultimately, AI has the potential to assist physicians by individualizing medical management, exposing patients to therapy only when the clinical judgement of the physician is supported by the results of deep learning.
Looking beyond solely utilizing AI in clinical practice, machine learning methods may play a role in guiding research investigations that aim to identify disease features newly discovered through automated techniques. On a global level, the application of AI to existing tele-ophthalmology programs may facilitate outreach to underserved regions, addressing the shortage of specialists available to provide their expertise.
- Kapoor R, Walters SP, Al-Aswad LA. The current state of artificial intelligence in ophthalmology. Surv Ophthalmol. 2019;64(2):233-240. doi:10.1016/j.survophthal.2018.09.002
- Akkara JD, Kuriakose A. Role of artificial intelligence and machine learning in ophthalmology. Kerala J Ophthalmol 2019;31:150-60. doi: 10.4103/kjo.kjo_54_19
- Roach, L. “Artificial Intelligence.” EyeNet Magazine, Nov. 2017, www.aao.org/eyenet/article/artificial-intelligence.
- The ultimate guide to AI in radiology. Quantib. https://www.quantib.com/the-ultimate-guide-to-ai-in-radiology. Accessed November 13, 2019.
- Ting DSW, Lee AY, Wong TY. An Ophthalmologist’s Guide to Deciphering Studies in Artificial Intelligence. Ophthalmology. 2019;126(11):1475-1479. doi:10.1016/j.ophtha.2019.09.014
- LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539
- Price WN, Gerke S, Cohen IG. Potential Liability for Physicians Using Artificial Intelligence. JAMA. October 2019. doi:10.1001/jama.2019.15064
- Froomkin AM, Kerr IR, Pineau J. When AIs Outperform Doctors: Confronting the Challenges of a Tort-Induced Over-Reliance on Machine Learning. SSRN. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3114347. Accessed December 1, 2019.
- Castelvecchi D. Can we open the black box of AI? Nature International Weekly Journal of Science. https://www.nature.com/articles/doi:10.1038/538020a. Published October 5, 2016. Accessed November 12, 2019.
- Ardila D, Kiraly AP, Bharadwaj S, et al. End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nat Med. 2019;25(6):954-961. doi:10.1038/s41591-019-0447-x
- Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118. doi:10.1038/nature21056
- Johnston SC. Anticipating and Training the Physician of the Future: The Importance of Caring in an Age of Artificial Intelligence. Acad Med. 2018;93(8):1105-1106. doi:10.1097/ACM.0000000000002175
- Ting DSJ, Ang M, Mehta JS, Ting DSW. Artificial intelligence-assisted telemedicine platform for cataract screening and management: a potential model of care for global eye health. Br J Ophthalmol. 2019;103(11):1537-1538. doi:10.1136/bjophthalmol-2019-315025
- Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44-56. doi:10.1038/s41591-018-0300-7
- Ting DSW, Peng L, Varadarajan AV, et al. Deep learning in ophthalmology: The technical and clinical considerations. Prog Retin Eye Res. April 2019. doi:10.1016/j.preteyeres.2019.04.003
- Akkara. Role of artificial intelligence and machine learning in ophthalmology. http://www.kjophthal.com/article.asp?issn=0976-6677;year=2019;volume=31;issue=2;spage=150;epage=160;aulast=Akkara. Accessed December 1, 2019.
- Muhammad H, Fuchs TJ, De Cuir N, et al. Hybrid Deep Learning on Single Wide-field Optical Coherence tomography Scans Accurately Classifies Glaucoma Suspects. J Glaucoma. 2017;26(12):1086-1094. doi:10.1097/IJG.0000000000000765
- Habibalahi A, Bala C, Allende A, Anwer AG, Goldys EM. Novel automated non invasive detection of ocular surface squamous neoplasia using multispectral autofluorescence imaging. Ocul Surf. 2019;17(3):540-550. doi:10.1016/j.jtos.2019.03.003
- Yang J-J, Li J, Shen R, et al. Exploiting ensemble learning for automatic cataract detection and grading. Comput Methods Programs Biomed. 2016;124:45-57. doi:10.1016/j.cmpb.2015.10.007
- Zhang L, Li J, Zhang I, Han H, Liu B, Yang J, et al. Automatic cataract detection and grading using Deep Convolutional Neural Network. In: 2017 Presented at: IEEE 14th International Conference on Networking, Sensing and Control (ICNSC); 2017. p. 60‐5.
- Wu X, Huang Y, Liu Z, et al. Universal artificial intelligence platform for collaborative management of cataracts. Br J Ophthalmol. 2019;103(11):1553-1560. doi:10.1136/bjophthalmol-2019-314729
- Liu X, Jiang J, Zhang K, et al. Localization and diagnosis framework for pediatric cataracts based on slit-lamp images using deep features of a convolutional neural network. PLoS ONE. 2017;12(3):e0168606. doi:10.1371/journal.pone.0168606
- Mohammadi S-F, Sabbaghi M, Z-Mehrjardi H, et al. Using artificial intelligence to predict the risk for posterior capsule opacification after phacoemulsification. J Cataract Refract Surg. 2012;38(3):403-408. doi:10.1016/j.jcrs.2011.09.036
- Chen Z, Fu H, Lo W-L, Chi Z. Strabismus Recognition Using Eye-Tracking Data and Convolutional Neural Networks. J Healthc Eng. 2018;2018:7692198. doi:10.1155/2018/7692198
- Gramatikov BI. Detecting central fixation by means of artificial neural networks in a pediatric vision screener using retinal birefringence scanning. Biomed Eng Online. 2017;16(1):52. doi:10.1186/s12938-017-0339-6
- Reid JE, Eaton E. Artificial intelligence for pediatric ophthalmology. Curr Opin Ophthalmol. 2019;30(5):337-346. doi:10.1097/ICU.0000000000000593
- Van Eenwyk J, Agah A, Giangiacomo J, Cibis G. Artificial intelligence techniques for automatic screening of amblyogenic factors. Trans Am Ophthalmol Soc. 2008;106:64-73.
- Abràmoff MD, Lavin PT, Birch M, Shah N, Folk JC. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. NPJ Digit Med. 2018;1:39. doi:10.1038/s41746-018-0040-6
- Brown JM, Campbell JP, Beers A, et al. Automated Diagnosis of Plus Disease in Retinopathy of Prematurity Using Deep Convolutional Neural Networks. JAMA Ophthalmol. 2018;136(7):803-810. doi:10.1001/jamaophthalmol.2018.1934
- Russakoff DB, Lamin A, Oakley JD, Dubis AM, Sivaprasad S. Deep Learning for Prediction of AMD Progression: A Pilot Study. Invest Ophthalmol Vis Sci. 2019;60(2):712-722. doi:10.1167/iovs.18-25325