Shung et al. [18] were the first to conduct a large prospective international study to build an ML model for patients with PUB, testing its performance against conventional scoring systems in 2020. They collected patient data from medical centers in four countries (US, Scotland, England, and Denmark; n = 1958) to build a model that predicts the need for hospital-based intervention (transfusion or hemostatic intervention) or 1-month mortality. The model was externally validated using data from two Asia-Pacific sites (Singapore and New Zealand; n = 399). Only nonendoscopic features, such as age, sex, clinical symptoms, and laboratory variables (hemoglobin, albumin, international normalized ratio, urea, and creatinine), were selected to build the model. The ML model showed a higher AUC (0.91) than the GBS (0.88, p = 0.001), Rockall score (0.73, p < 0.001), and AIMS65 score (0.78, p < 0.001). In the external validation cohort, the ML model still achieved a higher AUC (0.90) than the GBS (0.87, p = 0.004), Rockall score (0.66, p < 0.001), and AIMS65 score (0.64, p < 0.001). The proposed ML model improved the identification of low-risk patients who can be safely discharged early from the emergency department. Importantly, it identified more than twice as many very-low-risk patients as the best-performing available clinical risk tool.
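To make the AUC comparison above concrete, the following is a minimal sketch of how an AUC can be computed for a model's predicted risk and for a point-based clinical score on the same patients. It uses the rank-based (Mann-Whitney) definition of AUC. The function name `roc_auc` and all data values are hypothetical illustrations; they are not taken from the study of Shung et al. [18] and do not reproduce its method or results.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy cohort (assumption, for illustration only):
# 1 = needed hospital-based intervention or died, 0 = did not.
outcome = [1, 1, 1, 0, 0, 0, 1, 0]
model_risk = [0.9, 0.8, 0.4, 0.3, 0.2, 0.5, 0.7, 0.1]  # predicted probability
score_points = [7, 3, 2, 2, 1, 4, 6, 0]                # point-based risk score

auc_model = roc_auc(outcome, model_risk)
auc_score = roc_auc(outcome, score_points)
print(f"model AUC = {auc_model:.3f}, score AUC = {auc_score:.3f}")
```

A higher AUC means that the tool ranks patients who go on to have the outcome above those who do not more consistently. Testing whether two AUCs differ significantly, as reported above, requires an additional paired test that this sketch does not include.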
After presentation to the hospital, initially stable patients who are at risk for hemodynamic instability requiring blood transfusion must be identified during dynamic monitoring of the patient’s status. Levi et al. [19] developed an ML model using publicly available intensive care unit (ICU) databases comprising 14,620 records, with input variables that included several laboratory analyses and demographic information. Their model, which was based on changes in the patient’s vital signs and laboratory tests during the first 5 h of ICU admission, predicted the need for transfusion in the following 24 h with a high level of accuracy (overall AUC, 0.80).
Such algorithms could therefore improve risk assessment by automatically retrieving information from electronic health records, thereby allowing timely decision support in an already crowded clinical setting.
3. Application of AI during Endoscopy
Forrest [20] described the endoscopic classification of PUB in 1974 (Figure 2). The classification requires the endoscopist to judge the risk for rebleeding and the need for endoscopic intervention. Current guidelines [3,6,7] suggest that patients with high-risk ulcers, such as those with active spurting, active oozing, or a nonbleeding visible vessel, should receive endoscopic therapy because of the high risk for persistent bleeding or rebleeding, especially when relying on drug therapy alone. However, the ability to classify lesions correctly varies with the endoscopist’s experience, and an experienced endoscopist [21,22] can reportedly make better clinical judgments than clinical risk scores [23]. In the study of Laine et al. [24], before a training course, the rate of correct identification of the endoscopic characteristics of hemorrhage increased with endoscopic experience (performing five cases per month), from 59% to 73%. After the training course, the improvement depended on the training level: fellows, 15% increase; physicians with 0–20 years of experience since training, 8% increase; physicians with 20 or more years of experience since training, 3% increase. In an Italian study, Forrest Ia/b lesions showed high interobserver agreement, whereas Forrest II/III lesions showed low agreement [25].
Figure 2. Forrest classification of bleeding peptic ulcers.
To explore whether AI is useful for identifying the endoscopic characteristics of hemorrhage during endoscopy, our study [26] proposed a DL model that classifies endoscopic images by bleeding risk according to the Forrest classification, using 2378 still endoscopic images from 1694 patients with PUB (Figure 3). On the testing dataset, the agreement between the model and the senior endoscopist was moderate to substantial. The accuracy of the DL model was higher than that of a novice endoscopist. Therefore, the DL model has potential use, particularly in aiding young endoscopists in decision making during emergent endoscopy.
Figure 3. Illustration of the DL approach for analyzing endoscopy images in peptic ulcer disease: (a) heatmap image showing an active bleeder in the endoscopy image (upper); (b) segmentation of the ulcer area (left) from the original endoscopy image (right).
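As an illustration of how agreement between a model and an endoscopist, such as the "moderate to substantial" agreement described above, can be quantified, the following sketch computes Cohen's kappa on Forrest labels and maps it to the commonly used Landis and Koch descriptive bands. This is an assumption made for illustration: the text above does not state which agreement statistic or interpretation scale was used, and the labels below are hypothetical toy data, not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of images")
    n = len(rater_a)
    # Observed proportion of images on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single class")
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Descriptive label for kappa using the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "almost perfect")]:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Hypothetical Forrest labels for ten images (assumption, illustration only).
model = ["Ia", "Ib", "IIa", "IIb", "IIc", "III", "III", "IIa", "Ib", "III"]
senior = ["Ia", "Ib", "IIa", "IIa", "IIc", "III", "IIc", "IIa", "Ib", "III"]

kappa = cohen_kappa(model, senior)
print(f"kappa = {kappa:.2f} ({landis_koch(kappa)})")
```

Unlike raw accuracy, kappa discounts agreement that would occur by chance, which matters when some Forrest classes are much more frequent than others.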