Gastrointestinal Tract Polyp Anomaly Segmentation

Gastrointestinal Tract Polyp Anomaly Segmentation: Comparison

Please note this is a comparison between Version 1 by Muhammad Ramzan and Version 2 by Camila Xu.

Computer-aided polyp segmentation is a crucial task that supports gastroenterologists in examining and resecting anomalous tissue in the gastrointestinal tract. Polyps are protrusions of abnormal tissue in the mucous membrane; they grow mainly in the colorectal area of the gastrointestinal tract and increase the risk of incurable diseases such as cancer. A deep learning method, Graft-U-Net, is proposed to segment polyps in colonoscopy frames. Graft-U-Net is a modified version of UNet and comprises three stages: preprocessing, encoder, and decoder.

segmentation
convolutional neural network
deep learning
gastrointestinal tract

1. Introduction

The gastrointestinal tract (GI tract) consists of the stomach, the small intestine, and the large intestine (which includes the colon, rectum, and anus) [1,2]. The GI tract is the core part of the human digestive system, and its mucosal findings range from mild conditions to extremely lethal diseases [3,4]. Protrusions of abnormal tissue from the mucous membrane are referred to as polyps. Polyps can grow anywhere in the GI tract, but most are found in the colorectal area. Colorectal polyps fall into two categories: non-neoplastic and neoplastic [5]. Non-neoplastic polyps are divided into hyperplastic, hamartomatous, and inflammatory subcategories, all of which are recognized as non-cancerous. Neoplastic polyps, on the other hand, can become cancerous depending upon their size. Polyps mostly grow in the inner tissue lining of the colorectal area; they are initially non-cancerous but can lead to colorectal cancer (CRC), which is a very dangerous and lethal disease. CRC accounts for nearly 10% of all cancer-related deaths worldwide [6]. Colorectal polyps are analyzed and removed after the colon has been examined using a standardized colonoscopy procedure. Several endoscopy methods exist for examining the GI tract; among them, confocal laser endomicroscopy (CLE) is a cutting-edge, microscopic-level technique that allows subcellular imaging and optical biopsies to be performed while the patient is being examined. The colonoscopy expert can use CLE to observe real-time histology images and to examine the GI tract, connective tissue, and mucosal cell structure [7]. Histopathology examination (HE) is manually performed by a gastroenterologist for polyp or tumor removal. Neoplastic lesions (adenomatous polyps) are resected to reduce CRC [8]. Similarly, the survival rate is increased by diagnosing colon cancer at its early stage.
Endocytoscopy is used in narrow-band imaging (NBI) mode, which allows the endoscopist to acquire real-time microvascular images at a magnification of 520× [9]. Colonoscopy is operator-dependent, and operator mistakes increase the miss rate of adenomatous polyps. The size of the polyp in the colon tissue can also hinder manual disease detection. Additionally, manual screening is a time-consuming task that requires an experienced and skilled doctor [10]. Colonoscopy is a time-demanding, expensive, and invasive procedure that requires air insufflation and high-quality bowel preparation [11]. Clinical centers collect colonoscopy data in the form of videos. Endoscopists capture these data during demanding routines, yet the data are not used efficiently in clinical diagnosis [12]. The large number of frames captured in a colonoscopy video cannot all be inspected properly in real time, which increases the miss rate. Computer-aided diagnostics (CADx) systems are employed for disease detection and the delineation of polyps. Computer vision and system design have been applied successfully in medical fields to develop accurate and efficient systems, which depend mainly on well-organized data [13,14,15,16,17,18,19]. However, the limited availability of public data is a major bottleneck for accelerating the development of robust algorithms in this area [20]. Automatic polyp segmentation is a challenging task because polyps vary in shape, position, size, color, and appearance, and because they can be masked by mucosa, stool, and other materials that hinder correct diagnosis [21]. In previous studies, different feature extraction methods (feature maps, patterns, color, etc.) were employed for polyp detection, semantic segmentation, localization, and classification [22,23].
Previous studies have reported a high rate of missed detections. Recently, deep learning methods based on convolutional neural networks (CNNs) have offered a way to overcome the challenges described above and to improve polyp detection accuracy during colonoscopy. Automatic polyp segmentation is crucial in the medical field. The computer-based identification and localization of polyps in colonoscopy frames can save time for clinicians and helps them to concentrate on more severe cases. A recent study revealed that deep learning-based automatic polyp segmentation has become a crucial research area that has achieved high accuracy on colonoscopy images and videos [24]. Ideally, a consistent, reliable, and robust computer-aided diagnostics (CADx) system is needed for polyp detection and segmentation.

2. Gastrointestinal Tract Polyp Anomaly Segmentation

Automatic disease detection and segmentation have become active research areas in the past decade ^{[25][26][27][28][29][30][31][32]}[28,29,30,31,32,33,34,35]. Several algorithms and efficient methods have been developed for polyp detection. One study focused on the texture and color of polyps, applying handcrafted descriptors for feature learning ^[33][36]. CNNs have become a widely used method for tackling public challenges in the computer vision field ^[34][37]. CNN-based software modules and algorithms have been designed for edge and polyp detection in frames ^[35][38]. Polyps have been detected in colonoscopy images and videos via region-based CNN methods combined with transfer learning (Inception and ResNet) and post-processing techniques ^[36][39]. A framework based on the Generative Adversarial Network (GAN) model has been applied to disease detection and segmentation problems ^[37][40]. Real-time, high-sensitivity algorithms, including the YOLO algorithm, have been developed for polyp segmentation ^[38][41]. Transfer learning for polyp segmentation has been evaluated in terms of specificity and sensitivity ^[39][42]. Computer vision approaches to polyp segmentation have improved with the inclusion of data-driven methods ^[40][43]. Object segmentation has been performed using down- and up-sampling techniques for the pixel-wise classification of polyps ^[41][44]. The fully convolutional network (FCN) was proposed by Long et al. for segmentation ^[42][45]. UNet is a modified and extended architecture of the FCN ^[43][46]. UNet comprises an analysis path and a synthesis path, known as the encoder and the decoder, respectively. The analysis path extracts deep features, while the synthesis path produces the segmentation from the learned features.
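
The down- and up-sampling idea behind the analysis and synthesis paths can be illustrated with a toy sketch. The code below is not taken from any of the cited works: it assumes a single-channel feature map stored as a list of rows, uses 2 × 2 max pooling for down-sampling and nearest-neighbour repetition for up-sampling, and omits the convolutional layers a real network would learn. All function names are hypothetical.

```python
# Toy illustration of an encoder (down-sampling) and decoder (up-sampling)
# path on one feature map. No learned convolutions are involved; the names
# below are illustrative assumptions, not part of any cited model.

def max_pool_2x2(fmap):
    """Halve height and width by taking the maximum of each 2x2 window."""
    rows, cols = len(fmap), len(fmap[0])
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]

def upsample_nearest_2x(fmap):
    """Double height and width by repeating every value in a 2x2 block."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def threshold(fmap, t=0.5):
    """Pixel-wise classification: 1 marks polyp, 0 marks background."""
    return [[1 if v >= t else 0 for v in row] for row in fmap]

features = [
    [0.1, 0.2, 0.0, 0.0],
    [0.3, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.2],
    [0.0, 0.0, 0.1, 0.4],
]
encoded = max_pool_2x2(features)        # 4x4 -> 2x2
decoded = upsample_nearest_2x(encoded)  # 2x2 -> 4x4
mask = threshold(decoded)
# mask -> [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```

In this toy case a single strong response at one pixel becomes a 2 × 2 block in the output mask, which shows the loss of spatial detail that comes from down-sampling followed by plain up-sampling.
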
The encoder–decoder network is a core component of semantic segmentation in both UNet and the FCN ^[27][30]. Multiple variants of UNet for biomedical segmentation are found in the literature. The encoder–decoder in UNet applies convolution layers, with the encoder extracting essential semantic features from low to high level. Table 1 summarizes existing models that have been evaluated for polyp segmentation on the Kvasir-SEG dataset.

Table 1.

A review of the existing models employed for polyp segmentation on the Kvasir-SEG dataset.

Refs.	Year	Type of CNN	Dataset	Results (mDice)
^[44][47]	2022	AMNet	Kvasir-SEG	91.20%
^[45][48]	2022	BSCA-Net	Kvasir-SEG	91.00%
^[46][49]	2022	SwinE-Net	Kvasir-SEG	93.80%
^[47][50]	2021	MSNet	Kvasir-SEG	90.70%
^[48][51]	2021	SANet	Kvasir-SEG	90.40%
^[49][52]	2021	UACANet	Kvasir-SEG	90.50%
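
The mDice column reports the mean Dice coefficient, which measures the overlap between a predicted mask and the ground-truth mask as 2|P ∩ G| / (|P| + |G|). The following is a minimal sketch of that computation for binary masks stored as lists of rows. The function names and the handling of two empty masks are assumptions made for illustration; the exact averaging protocol used by each paper in Table 1 may differ.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P ∩ G| / (|P| + |G|) for two binary masks of equal size."""
    flat_p = [v for row in pred for v in row]
    flat_t = [v for row in target for v in row]
    intersection = sum(p * t for p, t in zip(flat_p, flat_t))
    total = sum(flat_p) + sum(flat_t)
    if total == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0
    return 2.0 * intersection / total

def mean_dice(preds, targets):
    """Average the per-image Dice coefficients over a set of images."""
    scores = [dice_coefficient(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)

# Two toy image pairs: one partial overlap, one exact match.
pred_a, true_a = [[1, 1], [0, 0]], [[1, 0], [0, 0]]
pred_b, true_b = [[0, 1], [1, 1]], [[0, 1], [1, 1]]

dice_a = dice_coefficient(pred_a, true_a)                    # 2 * 1 / (2 + 1)
m_dice = mean_dice([pred_a, pred_b], [true_a, true_b])
print(f"Dice (pair a): {dice_a:.4f}, mDice: {m_dice:.4f}")
```

Here the first pair shares one foreground pixel out of three in total, giving a Dice of 2/3, and the second pair matches exactly, so the mean over both is 5/6.
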

The decoder generates the required segmentation mask from the features extracted by the encoder. The up-sampled (decoder) features are concatenated with the down-sampled (encoder) features through skip connections. The final binary output masks are produced by convolutional layers. For polyp segmentation tasks, the encoder stage of the UNet model is replaced by a pre-trained network such as VGG16 or VGG19 ^[50][53]. Residual networks such as ResNet50 are very successful in transfer learning for disease detection and localization ^[51][54]. The residual network uses identity mapping and 3 × 3 convolutional layers ^[52][55]. Identity mapping mitigates vanishing and exploding gradients in deeper neural networks ^[53][56]. Several clinical endoscopy and colonoscopy image datasets are publicly available to researchers. In the proposed work, two datasets, CVC-ClinicDB and Kvasir-SEG, are employed for model evaluation.
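
The two kinds of shortcut mentioned above differ in how they merge features: a UNet skip connection concatenates encoder and decoder channels, while a residual block adds its input back to its output (identity mapping). The sketch below contrasts the two on tiny feature maps represented as lists of channels, each channel being a list of rows. It is an illustrative assumption rather than the Graft-U-Net implementation; the function names are hypothetical and a simple scaling function stands in for the learned 3 × 3 convolutions.

```python
# Feature maps are lists of channels; each channel is a list of rows.

def concat_skip(decoder_feats, encoder_feats):
    """UNet-style skip: stack encoder channels after the decoder channels."""
    dec_shape = (len(decoder_feats[0]), len(decoder_feats[0][0]))
    enc_shape = (len(encoder_feats[0]), len(encoder_feats[0][0]))
    if dec_shape != enc_shape:
        raise ValueError("spatial sizes must match before concatenation")
    return decoder_feats + encoder_feats

def residual_block(x, transform):
    """ResNet-style skip: output = transform(x) + x (identity mapping)."""
    fx = transform(x)
    return [
        [[a + b for a, b in zip(row_f, row_x)] for row_f, row_x in zip(ch_f, ch_x)]
        for ch_f, ch_x in zip(fx, x)
    ]

def scale(factor):
    """Stand-in for learned layers: multiply every value by `factor`."""
    return lambda feats: [[[factor * v for v in row] for row in ch] for ch in feats]

encoder_feats = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.5, 0.5]]]  # 2 channels
decoder_feats = [[[0.2, 0.4], [0.6, 0.8]]]                            # 1 channel
merged = concat_skip(decoder_feats, encoder_feats)                    # 3 channels

x = [[[1.0, 2.0], [3.0, 4.0]]]
unchanged = residual_block(x, scale(0.0))  # transform contributes nothing
boosted = residual_block(x, scale(0.5))    # x + 0.5 * x
```

When the transform outputs zeros, the residual block returns its input unchanged, which is the property that lets signals and gradients pass through deep stacks of such blocks.
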