Peptic Ulcer Bleeding and AI

Peptic ulcer bleeding (PUB) is a common gastrointestinal (GI) emergency requiring prompt assessment, with a mortality rate of 2–10%.

  • peptic ulcer
  • bleeding
  • artificial intelligence
  • endoscopy

1. Introduction

Peptic ulcer bleeding (PUB) is a common gastrointestinal (GI) emergency requiring prompt assessment, with a mortality rate of 2–10% [1,2,3]. Recently, with the reduced incidence of peptic ulcer disease and the advancement of endoscopic therapy, the bleeding-related hospitalization and mortality rates of PUB have decreased [4,5]. International guidelines continue to update the optimal management approach for patients with PUB [6,7]. Management can be divided into three stages: pre-endoscopy, endoscopic, and post-endoscopy. Pre-endoscopy management includes assessing the patient’s risk for hospitalization, providing adequate fluid and blood component resuscitation, prescribing medication such as a proton pump inhibitor (PPI), and determining the timing of endoscopy (Figure 1). Endoscopic management includes assessing the nature of the bleeding source (e.g., peptic ulcer disease, malignancy, or variceal hemorrhage) and providing endoscopic therapy as appropriate. For post-endoscopy management, intravenous PPI infusion therapy is prescribed to reduce PUB recurrence. Furthermore, eradication of Helicobacter pylori infection decreases the recurrence of peptic ulcer disease, and long-term secondary PPI prophylaxis is required for patients who are at risk for recurrent bleeding.
Figure 1. Cascade management of peptic ulcer bleeding. This diagram illustrates the potential role of artificial intelligence (AI) in the future management of peptic ulcer bleeding (PUB) based on text, data, and imaging. Blue: studies with multicenter clinical data validation; Green: studies with single-center clinical data validation; Red: no relevant research found in this field. AI, artificial intelligence; DL, deep learning; ER, emergency room; NLP, natural language processing; PUB, peptic ulcer bleeding.
Developed since the 1950s, artificial intelligence (AI) refers to computer programs that can simulate the human cognitive process in problem-solving and learning [8]. Through the machine learning (ML) approach, the computer can process large datasets to build various prediction models. Meanwhile, since 2010, deep learning (DL) has further simulated human neuronal networks with improved performance, especially in image processing. A UK survey study demonstrated that the gastroenterology trainee experience for PUB management declined from 76% in 1996 to 15% in 2011 [2], owing to the decreased incidence of peptic ulcer disease. The use of AI technology for PUB could enhance the accuracy of patient triage, help achieve accurate therapeutic decisions, and prevent human errors caused by inexperience, especially in an emergency. In this review, we highlight the literature published in the last 5 years, retrieved from a PubMed search with the keywords “artificial intelligence”, “peptic ulcer bleeding”, “nonvariceal bleeding”, “deep learning”, or “machine learning”, to determine the current status and gain insight into the role of AI in PUB management.

2. Application of AI in the Pre-Endoscopy Period for Patient Risk Assessment

Upon presentation at the hospital, stratification of patients in terms of gastrointestinal bleeding (GIB) risk is recommended [6,7,9]. Accurately identifying (“phenotyping”) patients with GIB during initial assessment is the first step toward patient management, especially during the COVID-19 pandemic. Shung et al. [10] used multiple natural language processing (NLP)-based approaches for automated phenotyping of patients in the emergency department. They found that a syntax-based NLP algorithm applied to patient triage information performed better than systematized nomenclature of medicine code information in identifying the patient’s condition, which allows triage information to be used early to guide subsequent patient management.
In the past two decades, three widely validated scoring systems, namely, the Glasgow–Blatchford score (GBS) for outpatient management [11], the Rockall score for mortality [12], and the AIMS65 score [13,14,15], have been utilized for identifying low-risk patients. However, compared with these conventional scores [16], ML can potentially improve risk assessment for the need for transfusion, endoscopic evaluation, or hospital admission for observation. ML is also more practical than conventional scores for busy clinicians, because ML models can be deployed automatically on the electronic health records already available in many healthcare systems. Between 2003 and 2008, nine small studies investigated ML’s potential for PUB risk assessment in comparison with the conventional scores [16]. The median area under the curve (AUC) was higher for artificial neural networks (0.93; range, 0.78–0.98) than for other ML models (0.81; range, 0.40–0.92) when predicting patient mortality, intervention requirement, or rebleeding. Moreover, ML generally provided a better prognostic performance in patients with GIB than conventional scores, and artificial neural networks tended to outperform other ML models.
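The AUC values quoted above can be read as the probability that a randomly chosen patient with the outcome receives a higher score than a randomly chosen patient without it. The following is a minimal sketch of that rank-based definition; the function name and the toy data are illustrative assumptions and do not come from the cited studies.

```python
from typing import Sequence


def auc_rank(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the chance that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = adverse outcome, 0 = no adverse outcome.
toy_outcomes = [0, 0, 1, 0, 1, 1]
toy_risk = [0.10, 0.30, 0.35, 0.40, 0.80, 0.90]
print(auc_rank(toy_risk, toy_outcomes))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why the same statistic can compare an ML model with a point-based score on the same patients.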
In 2020, Seo et al. [17] prospectively analyzed 1439 PUB cases to compare the accuracy of ML and conventional scores in predicting patient instability, including hypotension, rebleeding, and mortality. Four ML algorithms, namely, logistic regression with regularization, random forest classifier (RF), gradient boosting classifier (GB), and voting classifier (VC), were compared with the GBS and Rockall scores. The RF model was the most accurate in predicting mortality (AUROC: RF 0.917 vs. GBS 0.710), while the VC model was the most accurate for hypotension (VC 0.757 vs. GBS 0.668) and rebleeding within 7 days (VC 0.733 vs. GBS 0.694). The global feature importance analysis identified clinically significant variables, including blood urea nitrogen, albumin, hemoglobin, platelet count, prothrombin time, age, and lactate. Thus, ML models may be helpful for the early identification of high-risk patients among those with initially stable upper GIB upon admission to the emergency department. However, ML performance relies on the quality of data, and these studies usually had a small sample size (<1000 cases) with no external validation data for their performance.
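A voting classifier combines the outputs of several base models. The sketch below shows soft voting, one common variant in which predicted probabilities are averaged; the source does not state which voting scheme was used. The three base models are hand-written stand-ins with arbitrary coefficients and thresholds, chosen only so the example runs, and they are not the fitted models from the study.

```python
import math
from typing import Callable, Dict, Optional, Sequence

Patient = Dict[str, float]
Model = Callable[[Patient], float]  # each model returns P(adverse outcome)


def soft_vote(models: Sequence[Model], patient: Patient,
              weights: Optional[Sequence[float]] = None) -> float:
    """Weighted average of the probabilities predicted by each model."""
    if not models:
        raise ValueError("at least one model is required")
    if weights is None:
        weights = [1.0] * len(models)
    if len(weights) != len(models):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    return sum(w * m(patient) for w, m in zip(weights, models)) / total


# Hypothetical stand-in models; coefficients and cutoffs are arbitrary.
def toy_logistic(p: Patient) -> float:
    z = 0.05 * p["bun"] - 0.30 * p["hemoglobin"] + 0.40 * p["lactate"]
    return 1.0 / (1.0 + math.exp(-z))


def toy_rule_lactate(p: Patient) -> float:
    return 0.8 if p["lactate"] > 4.0 else 0.2


def toy_rule_hemoglobin(p: Patient) -> float:
    return 0.7 if p["hemoglobin"] < 8.0 else 0.1


toy_patient = {"bun": 40.0, "hemoglobin": 7.0, "lactate": 5.0}
toy_models = [toy_logistic, toy_rule_lactate, toy_rule_hemoglobin]
print(round(soft_vote(toy_models, toy_patient), 3))
```

Averaging tends to smooth out the errors of individual models, which may explain why an ensemble can outperform its members on some outcomes but not on others.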
In 2020, Shung et al. [18] were the first to conduct a large prospective international study to build an ML model for patients with PUB and to compare its performance with that of the conventional scoring systems. They collected patient data from medical centers in four countries (US, Scotland, England, and Denmark; n = 1958) to build a model that can predict the need for hospital-based intervention (transfusion or hemostatic intervention) or 1-month mortality. Data from two Asia-Pacific sites (Singapore and New Zealand; n = 399) were used for external validation. Only nonendoscopic features such as age, sex, clinical symptoms, and laboratory variables (hemoglobin, albumin, international normalized ratio, urea, and creatinine) were selected to build the model. The ML model showed a higher AUC (0.91) than the GBS (0.88, p = 0.001), Rockall score (0.73, p < 0.001), and AIMS65 score (0.78, p < 0.001). In the external validation cohort, the ML model still achieved a higher AUC (0.90) than the GBS (0.87, p = 0.004), Rockall score (0.66, p < 0.001), and AIMS65 score (0.64, p < 0.001). The proposed ML model improved the identification of low-risk patients who can be safely discharged early from the emergency department. Importantly, this ML model identified more than twice as many very-low-risk patients as the best-performing available clinical risk tool.
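One way to make “very low risk” concrete is to fix a high target sensitivity and count how many patients fall below the resulting cutoff. This is an assumption made for illustration, since the text above does not describe how the threshold was chosen; the function name and toy data are likewise hypothetical.

```python
from typing import Sequence, Tuple


def low_risk_cutoff(scores: Sequence[float], outcomes: Sequence[int],
                    min_sensitivity: float = 0.99) -> Tuple[float, int]:
    """Return the highest cutoff that keeps sensitivity at or above the
    target, and the number of patients scoring below that cutoff.

    Patients with score >= cutoff are flagged as needing care; patients
    with score < cutoff form the very-low-risk group.
    """
    n_pos = sum(outcomes)
    if n_pos == 0:
        raise ValueError("need at least one patient with the outcome")
    best = min(scores)  # flagging everyone always gives sensitivity 1.0
    for cutoff in sorted(set(scores)):
        caught = sum(1 for s, y in zip(scores, outcomes)
                     if y == 1 and s >= cutoff)
        if caught / n_pos >= min_sensitivity:
            best = cutoff
        else:
            break  # sensitivity can only fall as the cutoff rises
    n_low = sum(1 for s in scores if s < best)
    return best, n_low


# Hypothetical toy cohort: 1 = needed intervention or died, 0 = neither.
toy_scores = [0.02, 0.05, 0.08, 0.20, 0.35, 0.60, 0.90]
toy_events = [0, 0, 0, 1, 0, 1, 1]
print(low_risk_cutoff(toy_scores, toy_events, min_sensitivity=1.0))
```

Under this framing, two tools can be compared by the size of the group each one places below its cutoff at the same sensitivity: the larger that group, the more patients could be considered for early discharge.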
After presentation at the hospital, initially stable patients who are at risk for hemodynamic instability requiring blood transfusion must be identified during the dynamic monitoring of the patient status. Levi et al. [19] developed an ML model using publicly available intensive care unit (ICU) databases of 14,620 records with input variables including several laboratory analyses and demographic information. Their model, which was based on changes in the patient’s vital signs and laboratory tests in the first 5 h of ICU admission, showed a high level of accuracy (overall AUC, 0.80) in predicting the need for transfusion in the subsequent 24 h.
Therefore, such an algorithm is essential to provide improved risk assessment through the automatic retrieval of information from electronic health records, thereby allowing timely decision support in an already crowded clinical scenario.

3. Application of AI during Endoscopy

Forrest [20] described the endoscopic classification of PUB in 1974 (Figure 2). The classification requires the endoscopist to judge the risk for rebleeding and the need for endoscopic intervention. Current guidelines [3,6,7] suggest that patients with high-risk ulcers, such as those with active spurting, active oozing, or a nonbleeding visible vessel, should receive endoscopic therapy because of the high risk for persistent bleeding or rebleeding, especially when relying on drug therapy alone. However, the ability to make a correct classification varies with the endoscopist’s experience, and an experienced endoscopist [21,22] can reportedly make better clinical judgments than clinical risk scores [23]. In the study of Laine et al. [24], before a training course, the rate of correct identification of the endoscopic characteristics of hemorrhage increased with endoscopic experience (performing five cases per month), from 59% to 73%. After the training course, the improvement depended on the training level: fellows, 15% increase; physicians with 0–20 years of experience since training, 8% increase; physicians with 20 years or more of experience since training, 3% increase. In an Italian study, Forrest Ia/b lesions showed a high interobserver agreement, whereas Forrest II/III lesions exhibited a low agreement [25].
Figure 2. Forrest classification of bleeding peptic ulcers.
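The decision rule described above can be written as a small lookup. The sketch below encodes the standard Forrest classes and flags only the three findings that the text names as high risk (active spurting, active oozing, nonbleeding visible vessel). The handling of the remaining classes, such as an adherent clot, is not addressed in the text, so the sketch simply leaves them unflagged. The names are illustrative assumptions, not a clinical tool.

```python
from enum import Enum


class Forrest(Enum):
    IA = "active spurting hemorrhage"
    IB = "active oozing hemorrhage"
    IIA = "nonbleeding visible vessel"
    IIB = "adherent clot"
    IIC = "flat pigmented spot"
    III = "clean ulcer base"


# Only the findings named as high risk in the text are listed here.
HIGH_RISK = frozenset({Forrest.IA, Forrest.IB, Forrest.IIA})


def needs_endoscopic_therapy(finding: Forrest) -> bool:
    """True if the finding is one of the high-risk stigmata named above."""
    return finding in HIGH_RISK


for finding in Forrest:
    print(finding.name, finding.value, needs_endoscopic_therapy(finding))
```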
To explore whether AI is useful for identifying the endoscopic characteristics of hemorrhage during endoscopy, our study [26] proposed a DL model that classifies endoscopic images by bleeding risk according to the Forrest classification, using 2378 still endoscopic images from 1694 patients with PUB (Figure 3). On the testing dataset, the agreement between the model and the senior endoscopist was moderate to substantial. The accuracy of the DL model was higher than that of a novice endoscopist. Therefore, the DL model has potential use, particularly in aiding young endoscopists in decision making during emergent endoscopy.
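The phrase “moderate to substantial” matches the wording conventionally used for Cohen’s kappa, which corrects raw agreement for the agreement expected by chance. The sketch below assumes that kappa is the statistic in question, which the text does not state, and uses the Landis and Koch bands for the verbal labels. The image labels are invented toy data.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable],
                rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must label the same non-empty set")
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one label is used")
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa: float) -> str:
    """Verbal label for kappa following the Landis and Koch convention."""
    if kappa < 0.0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical Forrest labels for eight images.
toy_model = ["Ia", "Ib", "IIa", "IIa", "IIb", "IIc", "III", "III"]
toy_expert = ["Ia", "Ib", "IIa", "IIb", "IIb", "III", "III", "III"]
k = cohen_kappa(toy_model, toy_expert)
print(round(k, 3), landis_koch(k))
```

In this toy case the two raters agree on six of eight images (75%), but kappa is lower than the raw agreement because some of that agreement would occur by chance alone.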
Figure 3. Illustration of the DL approach for analyzing endoscopy images in peptic ulcer disease: (a) heatmap image showing an active bleeder in the endoscopy image (upper); (b) segmentation of the ulcer area (left) from the original endoscopy image (right).

This entry is adapted from the peer-reviewed paper 10.3390/jcm10163527
