Artificial intelligence (AI) refers to any machine that mimics human cognitive functions such as problem solving or learning [6]. AI has already been tested in several fields of endoscopy, such as the detection of Barrett’s esophagus [7] and the evaluation of the adenoma detection rate during colonoscopy.
Type of learning | Description |
---|---|
Supervised | The algorithm is trained on labeled data, with each example tagged with the correct answer |
Semisupervised | The algorithm is trained on a large amount of unlabeled data combined with a small amount of labeled data |
Unsupervised | The algorithm is trained on data without any labels |
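
The three training regimes in the table above can be illustrated with a toy example. The sketch below is not taken from any of the cited studies: it assumes a hypothetical one-dimensional "redness score" per image, and the function names (`nearest`, `fit_supervised`, `fit_unsupervised`, `fit_semisupervised`), the nearest-centroid classifier, and the sample values are all chosen purely for illustration.

```python
from statistics import mean


def nearest(x, centroids):
    """Return the key of the centroid closest to the value x."""
    return min(centroids, key=lambda k: abs(x - centroids[k]))


def fit_supervised(labeled):
    """Supervised: every training value comes with its correct label."""
    groups = {}
    for x, y in labeled:
        groups.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in groups.items()}


def fit_unsupervised(values, rounds=10):
    """Unsupervised: no labels at all; 2-means finds two anonymous clusters."""
    centroids = {0: min(values), 1: max(values)}
    for _ in range(rounds):
        groups = {0: [], 1: []}
        for x in values:
            groups[nearest(x, centroids)].append(x)
        # Keep the old centroid if a cluster happens to be empty.
        centroids = {k: mean(xs) if xs else centroids[k]
                     for k, xs in groups.items()}
    return centroids


def fit_semisupervised(labeled, unlabeled, rounds=10):
    """Semi-supervised: a few labels seed the model, unlabeled data refine it
    through self-training (pseudo-labels from the current model)."""
    centroids = fit_supervised(labeled)
    for _ in range(rounds):
        pseudo = [(x, nearest(x, centroids)) for x in unlabeled]
        centroids = fit_supervised(labeled + pseudo)
    return centroids


# Hypothetical redness scores (illustrative values only).
labeled = [(0.1, "healed"), (0.2, "healed"), (0.8, "active"), (0.9, "active")]
unlabeled = [0.15, 0.25, 0.3, 0.7, 0.75, 0.85]

supervised = fit_supervised(labeled)
clusters = fit_unsupervised(unlabeled)
semi = fit_semisupervised([(0.1, "healed"), (0.9, "active")], unlabeled)

print(nearest(0.3, supervised), nearest(0.3, clusters), nearest(0.3, semi))
```

Note that the unsupervised model returns anonymous cluster indices rather than clinical labels; attaching a meaning to each cluster still requires human interpretation.
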
Author (Year) | Study Design | Population | Aim | Results |
---|---|---|---|---|
Mossotto et al. (2017) | Prospective cohort study | 287 paediatric IBD patients | To develop an ML model to classify disease subtypes | Classification accuracy with supervised ML models of 71.0%, 76.9%, and 82.7% utilizing endoscopic data only, histological data only, and combined endoscopic/histological data, respectively |
Quénéhervé et al. (2019) | Retrospective cohort study | 23 CD patients, 27 UC patients, and 9 control patients | To test computer-based analysis of CLE images and discriminate healthy subjects vs. IBD, and UC vs. CD | Sensitivity of 100% and specificity of 100% in IBD diagnosis; sensitivity of 92% and specificity of 91% in IBD differential diagnosis |
Ozawa et al. (2019) | Retrospective cohort study | 26,304 colonoscopy images from a cumulative total of 841 UC patients | To test a CNN-based CAD system in identification of endoscopic inflammation severity | AUROCs of 0.86 and 0.98 to identify MES 0 and 0–1, respectively |
Stidham et al. (2019) | Retrospective cohort study | 16,514 images from 3082 UC patients | To test DL models in grading endoscopic severity of UC | AUROC of 0.96, PPV of 0.87, sensitivity of 83.0%, specificity of 96.0%, and NPV of 0.94 in distinguishing endoscopic remission from MES 2–3 |
Gottlieb et al. (2021) | Phase II randomized controlled study | 249 UC patients | To test a recurrent neural network model in predicting MES and UCEIS from individual full-length endoscopy videos | Excellent agreement, with a QWK of 0.84 for MES and 0.85 for UCEIS |
Yao et al. (2021) | Phase II randomized controlled study | 315 videos from 157 UC patients | To test a fully automated video analysis system for grading endoscopic disease | Excellent performance with a sensitivity of 0.90 and specificity of 0.87; correct prediction of MES in 78% of videos (κ = 0.84) |
Bhambhani et al. (2021) | Retrospective cohort study | 777 endoscopic images from 777 UC patients | To test DL models in the automated grading of each individual MES | AUC of 0.89, 0.8, and 0.96 for classification of MES 1, 2, and 3, respectively; overall accuracy of 77.2% |
Becker et al. (2021) | Prospective cohort study | 1672 videos from 1105 UC patients | To test a DL-based system on raw endoscopic videos | AUC of 0.84 for MES ≥ 1, 0.85 for MES ≥ 2, and 0.85 for MES ≥ 3 |
Maeda et al. (2021) | Prospective cohort study | 145 UC patients | To test AI in stratifying the relapse risk of patients in clinical remission | Relapse rate significantly higher in the AI-active group than in the AI-healing group (28.4% vs. 4.9%, p < 0.001) |
Takenaka et al. (2020) | Prospective cohort study | 40,758 images of colonoscopies and 6885 biopsy results from 2012 UC patients | To test a DNN system based on endoscopic images of UC for predicting endoscopic and histological remission | Accuracy of 90.1% and κ coefficient of 0.798 for endoscopic remission; accuracy of 92.9% and κ coefficient of 0.85 for histological remission |
Maeda et al. (2019) | Retrospective cohort study | 187 UC patients | To test a CAD system in predicting persistent histologic inflammation using EC | Sensitivity, specificity, and accuracy of 74%, 97%, and 91%, respectively; κ = 1 |
Honzawa et al. (2019) | Retrospective cohort study | 52 UC patients in clinical remission | To test a new endoscopic imaging system using the iscan TE-c (MAGIC score) to quantify mucosal inflammation in patients with quiescent UC | MAGIC score significantly higher in the MES 1 group than in the MES 0 group (p = 0.0034); MAGIC score significantly correlated with the Geboes score (p = 0.015) |
Bossuyt et al. (2020) | Prospective cohort study | 29 UC patients and 6 controls | To test an RD algorithm based on the red channel of the red-green-blue pixel values and pattern recognition from endoscopic images | Good correlation between RD and RHI (r = 0.74, p < 0.0001), MES (r = 0.76, p < 0.0001), and UCEIS (r = 0.74, p < 0.0001) |
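
The performance figures reported in the table (sensitivity, specificity, PPV, NPV, accuracy, κ, and QWK) all derive from comparing model output with a reference reading. The following is a minimal sketch of how such metrics are conventionally computed. The function names `binary_metrics` and `weighted_kappa` and all counts and grades are hypothetical and do not reproduce data from any of the cited studies.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from the four cells of a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def weighted_kappa(rater_a, rater_b, n_grades, quadratic=True):
    """Cohen's kappa for ordinal grades coded 0..n_grades-1.

    quadratic=True gives the quadratic weighted kappa (QWK), which
    penalizes a disagreement by the squared distance between grades;
    quadratic=False gives the unweighted kappa.
    """
    n = len(rater_a)
    observed = [[0] * n_grades for _ in range(n_grades)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(n_grades))
           for j in range(n_grades)]
    num = den = 0.0
    for i in range(n_grades):
        for j in range(n_grades):
            d = abs(i - j)
            w = d * d if quadratic else (1 if d else 0)
            num += w * observed[i][j]          # observed disagreement
            den += w * row[i] * col[j] / n     # disagreement expected by chance
    if den == 0:
        raise ValueError("kappa is undefined when only one grade is used")
    return 1.0 - num / den


# Hypothetical counts: 100 inflamed and 100 non-inflamed images.
print(binary_metrics(tp=83, fp=4, tn=96, fn=17))

# Hypothetical MES grades (0-3) from a model and a central reader.
model = [0, 1, 2, 3, 3, 1, 0, 2]
reader = [0, 1, 2, 3, 2, 1, 1, 2]
print(weighted_kappa(model, reader, n_grades=4))
```

Because QWK weights each disagreement by the squared distance between grades, confusing MES 0 with MES 3 costs far more than confusing adjacent grades, which suits ordinal endoscopic scores.
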