2. Data-Driven AI in High Myopia and Pathologic Myopia
PM is associated with elongation of the axial length of the eye, which is usually accompanied by morphological changes in the sclera, choroid, Bruch’s membrane, retinal pigment epithelium, and neural retina. In addition, because of this progressive and excessive axial elongation, highly myopic eyes also have high refractive errors and related ophthalmic changes. These changes may be further amplified when the eye undergoes refractive or cataract surgery, because of the excessive length of the eye. Thus, high myopia is expected to generate a considerable amount of data over a long-term follow-up period, which requires an efficient method to analyze and interpret the findings.
In the past, redundant and inconsistent data were collected because data management procedures were fragmented and not integrated. The resulting information quality problems hampered accurate diagnosis and led to poor management of myopic eyes.
With the recent development and widespread adoption of digital hospital information systems, it has become possible to study the onset and progression of PM using much larger data sets. This has advanced the understanding of PM, giving it a more comprehensive perspective and a more solid theoretical basis.
Data-driven AI studies are usually performed using ML techniques because they can detect different categories, extract information buried in a large amount of data, and optimize the model that best fits the data. Models fitted to the training data can then be evaluated for their capacity to categorize data. ML techniques involve supervised learning, semi-supervised learning, and unsupervised learning. They include many methods such as kernel ridge regression, support vector machines (SVM), nearest neighbors, Gaussian processes, naive Bayes, random forests, neural networks, and others. More recent methods such as extreme gradient boosting (XGBoost) and light gradient boosting machine (LightGBM) offer further options for regression models and can reveal potential relationships that help explain the occurrence and progression of high and pathologic myopia. With these powerful methods, representative patterns can be statistically calculated and extracted for ensemble predictive models.
Earlier studies reported that the incidence of myopia had reached 84.6% in elementary school children and 95.5% in university students in China
[47][48][49], and such levels are unlikely to be unique to China. Thus, there is an urgent need to monitor eyes with high myopia at an earlier stage, which in turn calls for AI-assisted screening techniques. In areas with high levels of myopia, several data-driven studies on high myopia have reported that ML models can provide sensible solutions to real problems (
Table 1). ML models have shown that refractive errors and the risk of developing high myopia (myopia ≤ −6.0 diopters) within ten years are predictable in school-aged children
[50]. In this approach, a random forest model, a generalized estimating equation model, and a mixed-effects model were fitted and evaluated by the coefficient of determination (R²), the root mean square error (RMSE), the mean absolute error (MAE), and the area under the receiver operating characteristic curve (AUC). The models were tested on both internal and external datasets. The random forest model had the best performance, with an AUC ranging from 0.802 to 0.976. This approach provided evidence for transforming clinical practice, health policy-making, and precise individualized interventions for the practical control of myopia in school-aged children by employing big data and ML. However, in circumstances where clinical data are not available, it may be difficult for physicians to manage patients with high myopia. To address this, ML models have also been designed and trained to analyze eyes with high myopia. Trained on wavefront aberrometry values with the XGBoost algorithm, ML models have been used to predict subjective refractive errors, with a mean absolute error between the true and predicted values ranging from 0.094 to 0.301 diopters. The combination of machine learning and aberrometry based on a wavefront decomposition basis is expected to aid in the development of refined algorithms
[51]. Furthermore, highly myopic eyes often have hyperopic refractive errors after cataract surgery, despite the use of partial coherence interferometry, which could eliminate biometric errors. Through XGBoost regression, AI models trained on medical records of patients with myopia could improve the accuracy of intraocular lens (IOL) power selection in highly myopic eyes with cataracts
[52].
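The four evaluation measures named above (R², RMSE, MAE, and AUC) can be illustrated with a minimal sketch in plain Python. The function names and the refraction values below are made up for illustration and are not taken from the cited studies; the AUC is computed through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case), which is one common way to obtain it.

```python
import math

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical spherical equivalents in diopters (illustrative values only).
true_se = [-1.0, -2.5, -4.0, -6.5, -8.0]
pred_se = [-1.5, -2.0, -4.5, -6.0, -8.5]

# High myopia defined as <= -6.0 diopters, as in the text.
high_myopia = [1 if se <= -6.0 else 0 for se in true_se]
# Assumption: a more negative predicted refraction is used as a higher risk score.
risk_score = [-p for p in pred_se]

print("R2  :", round(r_squared(true_se, pred_se), 3))
print("RMSE:", rmse(true_se, pred_se))
print("MAE :", mae(true_se, pred_se))
print("AUC :", auc(high_myopia, risk_score))
```

The regression measures (R², RMSE, MAE) assess how close the predicted refraction is to the measured one, whereas the AUC assesses how well the model separates eyes that do and do not reach the high myopia threshold.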
Table 1. Data-driven artificial intelligence (AI) models in high myopia and pathologic myopia.
In addition, in situations where only limited information can be accessed, electrooculographic (EOG) data can also be used to train ML models to classify myopic refractive disorders. It has been reported that when a logistic regression model, a naive Bayes model, and a random forest model were trained on EOG data, the random forest model performed best, with a sensitivity of 95.5% and a specificity of 96%. The total classification accuracy reached 90.91%, and these models could inspire novel approaches to the clinical screening of myopia when general data are not available
[53]. Furthermore, because axial length is a key indicator of high myopia, simply assessing the change in axial length can be used to evaluate myopia progression. More specifically, these methods can be used by practitioners to judge the true extent of myopia progression before performing a cycloplegic refraction examination. Linear regression, SVM, and bagged trees have been used to predict increases in axial length in adolescents. When the models were evaluated by five-fold cross-validation, the linear model achieved a high level of precision, with an R² value of 0.87
[55].
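Five-fold cross-validation of a linear model, as mentioned above, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the cited study: the single predictor (age), the synthetic relation between age and annual axial elongation, and the names `fit_line` and `five_fold_r2` are all hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(y_true, y_pred):
    """Coefficient of determination on a set of held-out points."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def five_fold_r2(xs, ys, k=5, seed=0):
    """Shuffle the sample, split it into k folds, and score each held-out fold."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [slope * xs[i] + intercept for i in held_out]
        scores.append(r_squared([ys[i] for i in held_out], preds))
    return scores

# Synthetic data (assumption): annual axial elongation in mm decreasing with age.
rng = random.Random(42)
ages = [rng.uniform(8, 16) for _ in range(50)]
growth = [0.6 - 0.03 * a + rng.gauss(0, 0.02) for a in ages]

scores = five_fold_r2(ages, growth)
print("R2 per fold:", [round(s, 2) for s in scores])
print("mean R2    :", round(sum(scores) / len(scores), 2))
```

Each sample is used for testing exactly once and for training in the remaining four folds, so the averaged R² reflects performance on data the model did not see during fitting.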
In addition to these methods of predicting actual outputs, there are other ways to use AI algorithms. It is generally accepted that clinical data tend to be imperfect and may have missing parts in clinical research, because each examination is performed as required to test an evidence-based hypothesis. However, such imperfect data are a high barrier to research and to the understanding of these disease processes. One of the benefits of ML algorithms is that they can fill in missing values using a principled method, so that the results are closer to the true values. This will lead to a better understanding of the occurrence and progression of the disease processes. Furthermore, even when abundant data or features can be assessed, physicians still need to determine how to filter out the important values to test a hypothesis. In addition to traditional methods such as principal component analysis (PCA), ML algorithms offer multiple choices for data dimension reduction, such as randomized singular value decomposition-based PCA, spectral embedding, Isomap embedding, and others. These algorithms offer opportunities for clinicians to analyze abundant data and to determine ways to test their hypotheses.
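As an illustration of how missing values can be filled in, the following minimal sketch uses k-nearest-neighbor imputation: a missing entry is replaced by the mean of that variable among the most similar complete records. This is only one of many possible imputation methods and is not necessarily the one used in the studies discussed here; the function name `knn_impute`, the column layout, and the records are hypothetical.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the column mean over the k nearest complete rows.

    Distance is Euclidean over the columns observed in the incomplete row.
    In practice the columns should be standardized first so that no single
    unit (years, mm, diopters) dominates the distance.
    """
    complete = [r for r in rows if None not in r]
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]

        def dist(other):
            return math.sqrt(sum((row[j] - other[j]) ** 2 for j in observed))

        neighbours = sorted(complete, key=dist)[:k]
        filled.append([
            v if v is not None else sum(n[j] for n in neighbours) / len(neighbours)
            for j, v in enumerate(row)
        ])
    return filled

# Hypothetical records: [age (years), axial length (mm), spherical equivalent (D)].
records = [
    [10, 24.0, -1.0],
    [11, 24.5, -2.0],
    [15, 26.5, -6.0],
    [16, 27.0, -7.0],
    [10, None, -1.5],  # axial length was not measured
]

filled = knn_impute(records, k=2)
print(filled[-1])
```

Here the incomplete record is closest to the two youngest, least myopic patients, so its axial length is estimated from their values rather than from the mean of the whole sample.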
For myopia control, it is widely known that the environment, especially luminance and ultraviolet light, plays an important role in the progression of myopia. Because collecting and monitoring environmental data is complex, it is difficult to implement such monitoring widely among the public. Using luminance, ultraviolet light level, and step count data, AI models could be trained across different indoor and outdoor locations. These methods can be useful monitoring tools for community- or school-based public health interventions or individual health management
[54].
ML models have typically been used to fill in missing clinical data and to select features that are highly correlated with myopia in adolescents
[56]. Features selected by ML algorithms have been used to explore the potential risk factors for severe axial length elongation in highly myopic eyes. These approaches are particularly important because they provide reference data for physicians faced with complex situations. In rural areas where myopia specialists or essential instruments are not available, these predictive values would be important indicators for high myopia screening and for monitoring the progression of myopia.
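Selecting features that are highly correlated with an outcome can be illustrated with a minimal sketch that ranks candidate features by the absolute value of their Pearson correlation with the outcome. This simple filter is an assumption chosen for illustration; the cited studies may have used other selection methods, and the feature names and values below are invented.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rank_features(features, target):
    """Return (name, r) pairs sorted by |r|, strongest association first."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical annual axial elongation (mm) for five adolescents.
axial_growth = [0.10, 0.20, 0.30, 0.40, 0.50]

# Hypothetical candidate features for the same five adolescents.
features = {
    "near_work_hours": [1, 2, 3, 4, 5],
    "outdoor_hours": [3.0, 2.5, 2.5, 1.0, 1.0],
    "height_cm": [150, 140, 155, 145, 150],
}

ranked = rank_features(features, axial_growth)
for name, r in ranked:
    print(f"{name:16s} r = {r:+.2f}")
```

Ranking by the absolute value keeps protective factors (negative correlation) alongside risk factors (positive correlation). A correlation filter of this kind only detects linear, one-at-a-time associations, which is one reason tree-based importance measures are often used instead.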