Artificial intelligence (AI) is defined as any machine that has cognitive functions mimicking humans for problem solving or learning. AI has already been tested in several fields of endoscopy, such as in the detection of Barrett’s esophagus or the evaluation of adenoma detection rate during colonoscopy.
1. Introduction
Crohn’s disease (CD) and ulcerative colitis (UC) are chronic inflammatory bowel diseases (IBD), with increasing incidence worldwide and a substantial impact on general well-being, social functioning, and the utilization of healthcare resources
[1][2]. The diagnosis of IBD is a daily challenge for physicians, as it relies on the integration of different elements such as clinical data, biochemical values, radiology, endoscopy, and histology
[3]. Among them, endoscopy represents a cornerstone in the diagnosis and follow-up of CD and UC
[4][5].
In the last five years, the concept of endoscopy has evolved from a traditional one to a new idea based on artificial intelligence (AI). AI is defined as any machine that has cognitive functions mimicking humans for problem solving or learning
[6]. AI has already been tested in several fields of endoscopy, such as in the detection of Barrett’s esophagus
[7] or the evaluation of adenoma detection rate during colonoscopy
[8][9].
Attention has shifted to the potential role of AI in the field of IBD, where endoscopic activity is graded with several scores, such as the Mayo endoscopic subscore (MES), the Ulcerative Colitis Endoscopic Index of Severity (UCEIS), the Crohn’s Disease Endoscopic Index of Severity (CDEIS), the Lewis score, and the Capsule Endoscopy Crohn’s Disease Activity Index (CECDAI)
[10][11][12][13][14]. The reason for this large number of scores lies in the need to establish a strict definition of disease activity, thus reducing interobserver variability and allowing solid comparative analyses of different patients or studies
[15].
2. What Is Artificial Intelligence and What Are Its Current Applications in Endoscopy?
AI-assisted endoscopy is based on computer algorithms that perform as human brains do
[16]. They produce an output in response to the information they receive (input) and to what they learned when they were built. The fundamental principle of this technology is “machine learning” (ML)
[17].
There are many different ML methods (
Table 1) and one of the most popular is the use of artificial neural networks (ANN)
[18]. An ANN is based on multiple interconnected layers of algorithms, which process and forward data in a specific pattern so that the system can be trained to carry out a specific task
[19]. Another widely used ML method is the support vector machine (SVM), which classifies data sets by creating a line or plane that separates the data into distinct classes
[20]. An evolution of ML is deep learning (DL): a complex, multilayer neural network architecture automatically learns representations of data by transforming the input information into multiple levels of abstraction
[21][22]. An evolution of the simpler ANN is the convolutional neural network (CNN), inspired by the response of human visual cortex neurons to a specific stimulus, which convolves the input and passes the result to the next layer
[19][23].
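As a toy illustration of the separating line or plane that an SVM learns, the sketch below fits a linear SVM to two synthetic feature clusters with scikit-learn; the data, cluster labels, and parameters are assumptions chosen purely for illustration and do not come from any of the studies discussed here.

```python
# Toy illustration of a support vector machine (SVM) separating two classes
# with a line/plane (hyperplane). Data are synthetic and purely illustrative.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)
class_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))   # e.g., "normal mucosa" features
class_b = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(50, 2))   # e.g., "inflamed mucosa" features
X = np.vstack([class_a, class_b])
y = np.array([0] * 50 + [1] * 50)

svm = SVC(kernel="linear").fit(X, y)
print("Separating hyperplane coefficients:", svm.coef_, "intercept:", svm.intercept_)
print("Prediction for a new sample [1.0, 1.0]:", svm.predict([[1.0, 1.0]]))
```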
Table 1. Algorithms involved in the machine learning process.

| Learning approach | Description |
| --- | --- |
| Supervised | The algorithm is trained on labeled data tagged with the correct answer |
| Unsupervised | The algorithm is trained without labeling the training data |
| Semisupervised | The algorithm is trained on a large amount of unlabeled data combined with a small amount of labeled data |
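To make the CNN concept concrete, the minimal sketch below maps an endoscopic frame to one of four MES classes in PyTorch; the architecture, layer sizes, and input resolution are illustrative assumptions and do not reproduce any of the published models discussed in this entry.

```python
# Minimal sketch of a convolutional neural network (CNN) that maps an
# endoscopic frame to one of four Mayo endoscopic subscore (MES) classes.
# Architecture and input size are illustrative assumptions only.
import torch
import torch.nn as nn

class TinyMESNet(nn.Module):
    def __init__(self, n_classes: int = 4):
        super().__init__()
        # Stacked convolution + pooling layers extract visual features
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # A fully connected layer turns the pooled features into class scores
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.features(x).flatten(1)
        return self.classifier(feats)

if __name__ == "__main__":
    model = TinyMESNet()
    frame = torch.randn(1, 3, 224, 224)   # one dummy RGB frame
    logits = model(frame)
    print(logits.softmax(dim=1))          # class probabilities for MES 0-3
```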
Based on this technology, three kinds of tools have been generated to support endoscopy in each part of its activity
[24][25][26]:
- Computer-aided detection (CADe), which detects gastrointestinal lesions;
- Computer-aided diagnosis (CADx), which characterizes gastrointestinal lesions;
- Computer-aided monitoring (CADm), which evaluates the procedure and the endoscopist, thus improving the quality of endoscopy.
In particular, CADe and CADx are the most developed systems, with many experiences around the world demonstrating performance superior to that of the human eye
[9][27][28][29]; for example, the GI-Genius system (Medtronic) reached a sensitivity of 99.7% in polyp detection, as shown by Hassan et al.
[27]. The application fields of AI are expanding rapidly and IBD is the next target of this innovative technology.
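As a rough illustration of how a CADe module sits in the endoscopy workflow, the sketch below runs a placeholder lesion detector over video frames and flags those exceeding a confidence threshold; the `detect_lesion_probability` function, the `cade_review` helper, and the threshold are hypothetical stand-ins for illustration, not the interface of GI-Genius or any other commercial system.

```python
# Illustrative CADe-style loop: score each colonoscopy frame with a detector
# and flag frames whose lesion probability exceeds a threshold.
# `detect_lesion_probability` is a hypothetical placeholder, not a real API.
from typing import Iterable
import random

def detect_lesion_probability(frame) -> float:
    """Placeholder detector: a real system would run a trained CNN here."""
    return random.random()

def cade_review(frames: Iterable, threshold: float = 0.9) -> list[int]:
    """Return indices of frames flagged as containing a possible lesion."""
    flagged = []
    for idx, frame in enumerate(frames):
        if detect_lesion_probability(frame) >= threshold:
            flagged.append(idx)   # in practice this would trigger an on-screen alert
    return flagged

if __name__ == "__main__":
    dummy_video = [object() for _ in range(100)]   # stand-in for decoded frames
    print("Flagged frames:", cade_review(dummy_video))
```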
3. Artificial Intelligence (AI) in the Diagnosis of Inflammatory Bowel Disease (IBD)
One of the first applications of AI has been the attempt to facilitate the diagnosis of IBD and the differential diagnosis between CD and UC. In the model of Mossotto
[30], three supervised ML models were developed utilizing endoscopic data only, histological data only, and combined endoscopic/histological data, reaching accuracies of 71.0%, 76.9%, and 82.7%, respectively
[30]. The model combining endoscopic and histological data was then tested on a statistically independent cohort of 48 pediatric patients from the same clinic, classifying patients with an accuracy of about 83.3%.
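The general idea of training a supervised classifier on combined endoscopic and histological features can be sketched as follows with scikit-learn; the synthetic feature matrices, labels, and the random forest model are assumptions for illustration only and do not reflect the actual features or algorithms used by Mossotto et al.

```python
# Sketch of supervised disease-subtype classification (CD vs. UC) from
# combined endoscopic and histological features. Data are synthetic and the
# model choice is illustrative; it does not reproduce Mossotto et al.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_patients = 200
endoscopic = rng.normal(size=(n_patients, 10))     # e.g., per-segment inflammation scores
histological = rng.normal(size=(n_patients, 6))    # e.g., biopsy-level features
X = np.hstack([endoscopic, histological])
y = rng.integers(0, 2, size=n_patients)            # 0 = UC, 1 = CD (synthetic labels)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("Hold-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```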
Quénéhervé and colleagues
[31] designed a model to diagnose IBD and to establish the differential diagnosis between CD and UC. They based their research on confocal laser endomicroscopy (CLE), an adaptation of light microscopy in which focal laser illumination is combined with pinhole-limited detection to geometrically reject out-of-focus light
[32]. The authors built a score based on 14 functional and morphological parameters to perform a quantitative analysis of the mucosa, called cryptometry, and to diagnose IBD with a sensitivity and specificity near 100%. Moreover, this score reached a sensitivity of 92.3% and a specificity of 91.3% in the differential diagnosis between CD and UC.
Diagnosis of IBD can be a complex and challenging procedure due to its heterogeneous presentation. It is generally believed that making a correct diagnosis requires information on the endoscopic and histological features, together with clinical and biochemical data. AI support may be helpful in the diagnostic process by combining all suggestive features intelligently.
4. AI in UC: State of the Art
As previously underlined, endoscopy plays a fundamental role in the diagnosis and assessment of IBD activity
[5]. According to this concept, endoscopy should guarantee exact staging of the disease and a high level of concordance between different operators. Indeed, the definition of recurrence and the assessment of remission are cornerstones of disease management, guiding subsequent clinical or surgical decisions
[33][34].
In the study by Ozawa et al., the authors designed a CAD system using a CNN and evaluated its performance in the identification of normal versus inflamed mucosa, using a large dataset of endoscopic images from patients with UC
[35]. The performance of this new tool was remarkable, with areas under the receiver operating characteristic curve (AUROCs) of 0.86 and 0.98 in the identification of MES 0 (completely normal mucosa) and MES 0–1 (mucosal healing state), respectively
[35]. In a similar experience from Stidham et al.
[36] a CNN showed an AUROC of 0.96 in distinguishing endoscopic remission (MES = 0 or 1) from moderate to severe disease (MES = 2 or 3), with a good weighted κ agreement between the CNN and the adjudicated reference score for identifying exact MES (κ = 0.84; 95% CI, 0.83–0.86). The application of this CNN to the entirety of the colonoscopy videos had high accuracy in identifying moderate to severe disease with an AUROC of 0.97
[36].
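The two evaluation metrics recurring in these studies, AUROC for distinguishing remission from active disease and weighted κ for exact MES agreement, can be computed as in the sketch below; the predictions are random placeholders, so the printed numbers carry no clinical meaning.

```python
# Sketch of the two metrics used above: AUROC for binary remission detection
# and quadratic weighted kappa for exact MES agreement. Predictions are random
# placeholders, so the printed values carry no clinical meaning.
import numpy as np
from sklearn.metrics import roc_auc_score, cohen_kappa_score

rng = np.random.default_rng(1)
true_mes = rng.integers(0, 4, size=500)                  # reference MES 0-3
pred_scores = rng.random(size=500)                       # model probability of active disease
pred_mes = rng.integers(0, 4, size=500)                  # model's exact MES prediction

remission_truth = (true_mes <= 1).astype(int)            # MES 0-1 = endoscopic remission
auroc = roc_auc_score(1 - remission_truth, pred_scores)  # AUROC for detecting active disease
qwk = cohen_kappa_score(true_mes, pred_mes, weights="quadratic")
print(f"AUROC: {auroc:.2f}  quadratic weighted kappa: {qwk:.2f}")
```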
Moreover, Gottlieb and colleagues
[37] developed a recurrent neural network able to predict the MES and UCEIS from entire endoscopy videos rather than from still images alone. The system automatically selected the frames to be analyzed and calculated scores for each colon section, showing high agreement with the human central reader score
[37]. Similarly, a fully automated video analysis system was developed to assess the grade of UC activity; it correctly predicted the MES in 78% of videos (κ = 0.84). In external clinical trial videos, reviewers agreed on the MES in 82.8% of videos (κ = 0.78)
[38].
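One simple way to move from frame-level predictions to a single video-level grade, in the spirit of these fully automated pipelines, is to pool per-frame scores; the sketch below uses quality filtering and quantile pooling as illustrative assumptions, not the actual frame-selection or aggregation methods of the studies cited above, and the `video_level_mes` helper is hypothetical.

```python
# Illustrative pooling of per-frame predictions into a single video-level MES.
# Frame filtering and quantile pooling are assumptions for illustration; the
# published pipelines use their own (different) frame selection and models.
import numpy as np

def video_level_mes(frame_scores: np.ndarray,
                    frame_quality: np.ndarray,
                    quality_cutoff: float = 0.5,
                    quantile: float = 0.9) -> int:
    """Pool per-frame severity scores (0-3, continuous) into one MES grade."""
    usable = frame_scores[frame_quality >= quality_cutoff]   # drop blurry/uninformative frames
    if usable.size == 0:
        raise ValueError("no readable frames in this video")
    pooled = np.quantile(usable, quantile)                   # emphasize the most severe segments
    return int(round(min(max(pooled, 0.0), 3.0)))

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    scores = rng.uniform(0, 3, size=1200)      # per-frame severity from some model
    quality = rng.random(size=1200)            # per-frame informativeness estimate
    print("Video-level MES:", video_level_mes(scores, quality))
```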
Not only were automated systems able to assess endoscopic activity from still images
[39], but they were also able to predict a binary version of the MES by directly analyzing raw colonoscopy videos, with a high level of accuracy (AUC of 0.94 for MES ≥ 1 and 0.85 for both MES ≥ 2 and MES ≥ 3)
[40]. Looking forward, it seems that AI can also guide real-time therapy decisions in patients with UC in clinical remission by helping to stratify the relapse risk one year after AI-assisted colonoscopy
[41].
Other experiences pushed forward the application of AI in the prediction of histology. Indeed, Takenaka and colleagues
[42] designed a deep neural network algorithm, named DNUC, based on more than 40,000 colonoscopy images and 6000 biopsies from 875 patients, prospectively collected. The AI system’s evaluations were matched with the UCEIS score assigned to each image by three expert endoscopists and with the Geboes score determined by pathologists
[43]. The DNUC reached an accuracy of 90.1% and 92.9% in the detection of endoscopic and histological remission, respectively. In addition, Maeda et al.
[44] developed a CADx system to predict persistent histological inflammation using endocytoscopy in 187 retrospectively enrolled patients. Endocytoscopy is one of the most valuable technologies in this setting, although it is not yet widely available in endoscopy departments. By providing ultra-magnified white-light images (×520), endocytoscopy allows so-called virtual histology, or optical biopsy
[45].
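The idea of predicting both an endoscopic and a histological target from the same endoscopic image can be sketched as a shared feature extractor with two output heads; the layer sizes, the `DualRemissionHead` name, and the whole architecture below are assumptions for illustration and do not reproduce the actual DNUC or endocytoscopy-based systems.

```python
# Sketch of a dual-head network predicting endoscopic and histological
# remission from the same image features. Layer sizes are assumptions; this
# does not reproduce the DNUC architecture.
import torch
import torch.nn as nn

class DualRemissionHead(nn.Module):
    def __init__(self, feature_dim: int = 64):
        super().__init__()
        # Shared backbone (kept tiny here); a real system would use a deep CNN
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(), nn.Linear(16, feature_dim), nn.ReLU(),
        )
        self.endoscopic_head = nn.Linear(feature_dim, 1)    # P(endoscopic remission)
        self.histologic_head = nn.Linear(feature_dim, 1)    # P(histological remission)

    def forward(self, x: torch.Tensor):
        feats = self.backbone(x)
        return (torch.sigmoid(self.endoscopic_head(feats)),
                torch.sigmoid(self.histologic_head(feats)))

if __name__ == "__main__":
    model = DualRemissionHead()
    p_endo, p_histo = model(torch.randn(1, 3, 224, 224))
    print(float(p_endo), float(p_histo))
```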
Finally, a multicenter study in patients with inactive UC (PRognOstiC valuE of rEd Density in Ulcerative Colitis: PROCEED-UC; NCT04408703) is planned to assess the predictive value of the red density (RD) score, an operator-independent measure derived from the red channel of endoscopic images, for sustained clinical remission. It is plausible that the RD score might be used in the future as the first objective, operator-independent endoscopic target in a treat-to-target strategy in UC. The main characteristics of the studies on endoscopic AI applications in IBD are summarized in Table 2.
Table 2. Most relevant studies on endoscopic AI application in IBD.
| Author (Year) | Study Design | Population | Aim | Results |
| --- | --- | --- | --- | --- |
| Mossotto et al. (2017) | Prospective cohort study | 287 pediatric IBD patients | To develop a ML model to classify disease subtypes | Classification accuracy with supervised ML models of 71.0%, 76.9%, and 82.7% utilizing endoscopic data only, histological data only, and combined endoscopic/histological data, respectively |
| Quénéhervé et al. (2019) | Retrospective cohort study | 23 CD patients, 27 UC patients, and 9 control patients | To test computer-based analysis of CLE images and discriminate healthy subjects vs. IBD, and UC vs. CD | Sensitivity of 100% and specificity of 100% in IBD diagnosis; sensitivity of 92% and specificity of 91% in IBD differential diagnosis |
| Ozawa et al. (2019) | Retrospective cohort study | 26,304 colonoscopy images from a cumulative total of 841 UC patients | To test a CNN-based CAD system in the identification of endoscopic inflammation severity | AUROCs of 0.86 and 0.98 to identify MES 0 and MES 0–1, respectively |
| Stidham et al. (2019) | Retrospective cohort study | 16,514 images from 3082 UC patients | To test DL models in grading endoscopic severity of UC | AUROC of 0.96, PPV of 0.87, sensitivity of 83.0%, specificity of 96.0%, and NPV of 0.94 in distinguishing endoscopic remission from MES 2–3 |
| Gottlieb et al. (2021) | Phase II randomized controlled study | 249 UC patients | To test a recurrent neural network model in predicting MES and UCEIS from individual full-length endoscopy videos | Excellent agreement, with a QWK of 0.84 for MES and 0.85 for UCEIS |
| Yao et al. (2021) | Phase II randomized controlled study | 315 videos from 157 UC patients | To test a fully automated video analysis system for grading endoscopic disease | Excellent performance, with a sensitivity of 0.90 and specificity of 0.87; correct prediction of MES in 78% of videos (κ = 0.84) |
| Bhambhani et al. (2021) | Retrospective cohort study | 777 endoscopic images from 777 UC patients | To test DL models in the automated grading of each individual MES | AUC of 0.89, 0.80, and 0.96 for classification of MES 1, 2, and 3, respectively; overall accuracy of 77.2% |
| Becker et al. (2021) | Prospective cohort study | 1672 videos from 1105 UC patients | To test a DL-based system on raw endoscopic videos | AUC of 0.84 for MES ≥ 1, 0.85 for MES ≥ 2, and 0.85 for MES ≥ 3 |
| Maeda et al. (2021) | Prospective cohort study | 145 UC patients | To test AI in stratifying the relapse risk of patients in clinical remission | Relapse rate significantly higher in the AI-active group than in the AI-healing group (28.4% vs. 4.9%, p < 0.001) |
| Takenaka et al. (2020) | Prospective cohort study | 40,758 images of colonoscopies and 6885 biopsy results from 2012 UC patients | To test a DNN system based on endoscopic images of UC for predicting endoscopic and histological remission | Accuracy of 90.1% and κ coefficient of 0.798 for endoscopic remission; accuracy of 92.9% and κ coefficient of 0.85 for histological remission |
| Maeda et al. (2019) | Retrospective cohort study | 187 UC patients | To test a CAD system in predicting persistent histologic inflammation using endocytoscopy | Sensitivity, specificity, and accuracy of 74%, 97%, and 91%, respectively; κ = 1 |
| Honzawa et al. (2019) | Retrospective cohort study | 52 UC patients in clinical remission | To test a new endoscopic imaging system using i-scan TE-c (MAGIC score) to quantify mucosal inflammation in patients with quiescent UC | MAGIC score significantly higher in the MES 1 group than in the MES 0 group (p = 0.0034); MAGIC score significantly correlated with the Geboes score (p = 0.015) |
| Bossuyt et al. (2020) | Prospective cohort study | 29 UC patients and 6 controls | To test an RD algorithm based on the red channel of the red-green-blue pixel values and pattern recognition from endoscopic images | Good correlation between RD and RHI (r = 0.74, p < 0.0001), MES (r = 0.76, p < 0.0001), and UCEIS (r = 0.74, p < 0.0001) |
Abbreviations: AUC: area under the curve; AUROC: area under the receiver operating characteristic curve; CAD: computer-assisted diagnosis; CD: Crohn’s disease; CLE: confocal laser endomicroscopy; CNN: convolutional neural network; DL: deep learning; DNN: deep neural network; IBD: inflammatory bowel disease; MAGIC: Mucosal Analysis of Inflammatory Gravity by i-scan TE-c Image; MES: Mayo endoscopic subscore; ML: machine learning; NPV: negative predictive value; PPV: positive predictive value; QWK: quadratic weighted kappa; RD: red density; RHI: Robarts Histopathology Index; UC: ulcerative colitis; UCEIS: Ulcerative Colitis Endoscopic Index of Severity.
This entry is adapted from the peer-reviewed paper 10.3390/jcm11030569