Computer-Aided Detection False Positives: History

Classifying FPs by their duration is an objective approach. However, the duration threshold for reporting FPs is unsettled. One report suggested that only FPs > 2 s be reported, and another reported only FPs > 1 s, while the majority of FPs (i.e., more than 90%) lasted <0.5 s. It is unknown whether ignoring transient FPs (i.e., those lasting <1 or 2 s) would increase the risk of missing a real polyp.

  • artificial intelligence
  • computer-aided detection
  • colonoscopy
  • false positive
  • water exchange

1. Introduction

Missed lesions account for 57.8% of interval colorectal cancers (i.e., cancers that occur within 3–5 years after a negative colonoscopy) [1]. To reduce incidences of missed lesions and interval cancers, measures were proposed to improve the quality of colonoscopies. One of the most important quality metrics is the adenoma detection rate (ADR), defined as the proportion of patients with at least one adenoma [2].
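The ADR definition above can be written out as a short calculation. The following is a minimal sketch only: the function name and the cohort values are hypothetical and are not taken from any cited study.

```python
def adenoma_detection_rate(adenomas_per_patient):
    """Return the proportion of patients with at least one adenoma."""
    if not adenomas_per_patient:
        raise ValueError("at least one patient is required")
    patients_with_adenoma = sum(1 for n in adenomas_per_patient if n >= 1)
    return patients_with_adenoma / len(adenomas_per_patient)


# Hypothetical cohort: number of adenomas found in each of 8 colonoscopies.
cohort = [0, 2, 0, 1, 0, 0, 3, 0]
adr = adenoma_detection_rate(cohort)
print(f"ADR = {adr:.1%}")
```

Note that the ADR counts patients, not lesions: a patient with three adenomas contributes the same as a patient with one.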

Artificial intelligence (AI) is being used in the computer-aided detection (CADe) and diagnosis (CADx) of polyps [3]. Randomized controlled trials (RCTs) showed CADe-assisted colonoscopy significantly increased the ADR [4,5,6,7,8]. A meta-analysis confirmed that the ADR was significantly higher in the CADe group than in the conventional group (36.6% vs. 25.2%; RR, 1.44; 95% confidence interval, 1.27–1.62; p < 0.01; I² = 42%) [9].

An accompanying limitation of the CADe is false positives (FPs), which occur when the algorithm identifies a “polyp” that the endoscopist would disagree with. FPs were ranked 3rd in importance among 59 future research questions related to CADe [10]. We assessed CADe-overlaid video analyses, RCTs using real-time CADe to enhance polyp detection during colonoscopies, and studies that used FPs as the primary outcome. We test the hypothesis that the systematic review of the literature on FPs will yield insight into methods of managing and limiting the adverse effects of this drawback of CADe.

2. Adverse Effects of FPs

The time expended to differentiate an FP from a true lesion can potentially increase the withdrawal time. Although most RCTs on the real-time application of CADe found a longer withdrawal time in the CADe group compared to the control group [4,6,7,8], the withdrawal time without biopsy was not significantly different. In a post hoc analysis of a small fraction (40/342 or <11.7%) of the original CADe groups in the RCT studies, Hassan et al. found that 94% of FPs were discarded by the endoscopist immediately without further exploration, and the time wasted on the remaining FPs only contributed to about 1% of the withdrawal time. In a real-life situation, where the bowel preparation is usually less than optimal and endoscopists are less experienced, the impacts of bowel preparation on FPs and withdrawal time require more objective studies.

The presence of FPs might lead to unnecessary biopsies of non-neoplastic tissues. Apart from 1 RCT that did not report this outcome [6] and 1 that showed no difference [5], the RCTs listed in Table 3 showed a significant increase in the biopsy of non-neoplastic polyps in the CADe group, which was typically double the number reported for the control group. The removal of hyperplastic polyps (other than the diminutive ones at the distal rectosigmoid colon) is justified, as these polyps contribute to the serrated pathway of colorectal carcinogenesis [27]. If these biopsies were, in fact, unwarranted, then they represent an avoidable, non-indicated use of medical resources.

The application of the CADx to characterize the polyps following their detection with the CADe might help reduce the number of unnecessary polypectomies of non-neoplastic polyps. Preliminary results showed promise for simultaneously classifying polyps with endocytoscopic images [28], or even with white light images [29] after using the CADe to detect the polyps in white light.

The recurrent appearance of FPs on the screen may lead to increased fatigue and decreased vigilance on the part of the endoscopist [30]. Inundating the endoscopist with such a large number of on-screen prompts, even if each prompt demands only very transient attention, risks fatiguing the endoscopist. However, a study showed that a real-time CADe system, integrated on one primary endoscopy monitor instead of the two monitors used in most RCTs (Table 3), improved the ADR without an increase in the subjective fatigue level reported by the endoscopists during the colonoscopy [14]. This report was unblinded and was developed by proponents of the CADe algorithm under study, which raises questions regarding the objectivity of the results.

False positives cause distractions and the need for refocusing, potentially resulting in adverse effects during the search for real polyps. To illustrate how difficult it is to refocus after distraction, a study on mobile phone use while driving showed that the risk of a rear-end accident increased by 2.34–3.56 times, even though drivers increased their time headway by 0.41–0.59 s to offset the distraction of texting while driving [32].

Too many FPs may hamper the enthusiasm of the endoscopist to apply the CADe in clinical practice. One recent survey on the views of gastroenterologists regarding the potential use of artificial intelligence found that 33.9% of respondents worried about high numbers of FPs [33]. Reports that emphasize the lack of importance of FPs based on subjective assessment need to be re-evaluated by studies with more objective and unbiased designs.

3. How to Address the Occurrence of FPs

There is considerable variability in false-positive rates (FPRs) in the literature (Table 1). This variability suggests that definitions of FPs differ and that various conditions inside the bowel lumen affect the occurrence of FPs. It also indicates an opportunity to minimize FPs by standardizing the definition of FPs and optimizing the condition of the bowel lumen.
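To make the effect of differing definitions concrete, the sketch below counts the FPs in one recording under several duration-based reporting thresholds. The 1 s and 2 s cutoffs mirror the reporting thresholds mentioned in this entry, and 0.5 s is the duration below which most FPs were said to fall. The function name and the duration values are hypothetical illustrations, not data from any study.

```python
def fp_counts_by_threshold(durations_s, thresholds_s=(0.5, 1.0, 2.0)):
    """Count FP activations lasting longer than each threshold (in seconds)."""
    return {t: sum(1 for d in durations_s if d > t) for t in thresholds_s}


# Hypothetical FP durations (seconds) from a single withdrawal video.
durations = [0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.4, 0.6, 1.5, 2.4, 0.1]
counts = fp_counts_by_threshold(durations)
total = len(durations)
for threshold, count in counts.items():
    print(f"FPs > {threshold} s: {count} of {total}")
```

The same recording yields a different FP count under each threshold, which is one reason FP figures from studies using different definitions are hard to compare directly.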

An example of a simple method that could be used to reduce FPs is re-training the CADe algorithms with scenarios that currently lead to FPs. Another approach could be the adoption of recurrent neural networks, which have memory and can process temporal sequences of frames in a way that is similar to the learning process of human brains [10]. With YOLOv3 (You Only Look Once, Version 3), a state-of-the-art, real-time object detection algorithm, better specificity was achieved (increasing from 90.9% to 93.7%). To filter out most short flashes, Podlasek et al. suggested setting a minimum persistence time before an FP is displayed; however, this method might introduce a minor detection lag, depending on the desired sensitivity [22].
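The persistence-time idea and its trade-off can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used by Podlasek et al.: the function name, the per-frame boolean input, and the 3-frame threshold are all hypothetical.

```python
def persistence_filter(frame_hits, min_frames):
    """Display an alert only after a detection persists for min_frames frames.

    frame_hits: per-frame booleans from a (hypothetical) detector.
    Returns per-frame booleans saying whether the alert is displayed.
    """
    if min_frames < 1:
        raise ValueError("min_frames must be at least 1")
    shown = []
    run_length = 0  # number of consecutive frames with a detection so far
    for hit in frame_hits:
        run_length = run_length + 1 if hit else 0
        shown.append(run_length >= min_frames)
    return shown


# Hypothetical stream: a 2-frame flash, then a detection lasting 6 frames.
hits = [False, True, True, False, False,
        True, True, True, True, True, True, False]
shown = persistence_filter(hits, min_frames=3)

first_hit = 5  # index where the sustained detection starts
lag_frames = shown.index(True) - first_hit
print(shown)
print(f"display lag: {lag_frames} frames")
```

In this sketch the 2-frame flash is never displayed, while the sustained detection is displayed `min_frames - 1` frames after it first appears. A higher threshold suppresses more flashes but lengthens that lag, which is the trade-off noted above.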

Optimal bowel preparation is the prerequisite for a high-quality CADe-assisted colonoscopy and is associated with fewer FPs [13]. As wrinkled walls are the major source of CADe FP alerts, such alerts can be reduced by ensuring adequate luminal insufflation. The use of an anti-spasmodic agent, such as hyoscine-N-butylbromide, might be helpful in reducing the contraction of the colon wall [34]. Adding simethicone or rinse water to the bowel preparation regimen helps eliminate bubble-induced FPs [35,36].

Before the FPs can be effectively reduced, proper training of the endoscopist to recognize and ignore FPs is needed to enable the widespread adoption of the CADe for the detection of colon neoplasms [12].

The optimization of the condition of the bowel lumen can be controlled by the colonoscopist using water exchange colonoscopy, which will be discussed in detail below.

4. Water Exchange and Its Potential Beneficial Effect on Reducing FPs

Among the Gastrointestinal (GI) Endoscopy Editorial Board’s top 10 topics in endoscopy in 2019, water exchange (WE) and artificial intelligence (i.e., CADe) were both considered important advances in GI endoscopy [37]. The coincidence brought both to the forefront of the discussion on the improvement of ADR.

Compared with traditional gas (i.e., air or CO2) insufflation for colonoscopes, WE is an effective insertion method that minimizes insertion pain and enhances ADR [38,39,40]. It features infusing water to guide the scope advancement in an airless lumen, while suctioning the infused water at the same time during insertion, thus aiming at the almost complete removal of the infused water when cecal intubation is achieved. A network meta-analysis concluded that WE produced the highest ADR when compared with water immersion and gas insufflation [41]. A modified Delphi review also endorsed WE as having better bowel cleanliness, as well as less insertion pain and higher ADRs, than gas insufflation [42].

WE can effectively salvage-clean bubbles and fecal debris during insertion, resulting in better bowel cleanliness during withdrawal. WE consistently showed better Boston Bowel Preparation Scale (BBPS) scores than air insufflation, both in the whole colon and the right colon, the latter of which was usually the dirtiest colon segment [38,39,40]. WE might also help reduce FPs associated with crumpled folds, as there is less need for suction cleaning, and thus the related spasms, during withdrawal [43]. In an analysis of the CADe-overlaid withdrawal phase videos of colonoscopies from an RCT comparing right colon ADR inserted with WE or air insufflation, Tang et al.

WE and CADe both increase ADR but through different mechanisms. WE increases ADR mainly through salvage cleaning during insertion, thus revealing otherwise unexposed polyps (Table 5). On the other hand, CADe works as a second observer and points out polyps that are exposed but not recognized due to human error [17]. In other words, the strengths of WE and CADe compensate for each other's weaknesses.

This entry is adapted from the peer-reviewed paper 10.3390/diagnostics11061113
