EfficientNetB0 cum FPN-based Semantic Segmentation of Gastrointestinal Tract

Please note this is a comparison between Version 1 by Neha Sharma and Version 3 by Camila Xu.

Semantic segmentation of gastrointestinal (GI) organs is crucial in radiation therapy for treating GI cancer. It allows for developing a targeted radiation therapy plan while minimizing radiation exposure to healthy tissue, improving treatment success, and decreasing side effects. Medical diagnostics in GI tract organ segmentation is essential for accurate disease detection, precise differential diagnosis, optimal treatment planning, and efficient disease monitoring. This research proposes a hybrid encoder-decoder-based model for semantic segmentation of the GI tract. EfficientNet B0 is used as a bottom-up encoder architecture for downsampling to capture contextual information by extracting meaningful and discriminative features from input images. The performance of the EfficientNet B0 encoder is compared with three encoders: ResNet 50, MobileNet V2, and Timm Gernet. Feature Pyramid Network (FPN) is used as a top-down decoder architecture for upsampling to recover spatial information. The performance of the FPN decoder is compared with three decoders: PAN, Linknet, and MAnet. Furthermore, the proposed hybrid model is analyzed using Adam, Adadelta, SGD, and RMSprop optimizers.

semantic segmentation
gastrointestinal tract
FPN
PAN
MAnet
LinkNet

1. Introduction

The gastrointestinal (GI) tract aids digestion by breaking down and absorbing food. However, gastrointestinal cancer is a significant public health concern affecting millions globally [1]. Tumors of the esophagus, stomach, large intestine, and small intestine are all examples of GI cancer [2]. The choice of diagnostic method or combination of methods is based on the patient’s symptoms, the suspected condition, and the healthcare provider’s clinical judgment. The accuracy of a diagnosis is essential for the effective treatment and management of diseases. Despite the availability of options such as surgery, chemotherapy, and targeted therapy, radiation therapy has proved to be an effective treatment for GI cancer [3].

Radiation therapy, which employs high-intensity radiation to kill cancer cells, is typically used alongside other treatments. However, because the GI tract organs are convoluted and irregular in shape, accurate and precise targeting of cancer cells is essential to the success of radiation treatment [4]. Medical diagnostics in GI tract organ segmentation is critical for accurate disease detection, differential diagnosis, appropriate therapy planning, and effective disease monitoring. Diagnostic tests assist in localizing and diagnosing illnesses or anomalies in the GI system by segmenting the organs, allowing for focused and personalized treatment options. Accurate segmentation helps differentiate distinct GI illnesses with similar symptoms, leading to appropriate diagnosis and care. It is critical for detecting the extent and location of conditions, supporting surgical decisions and targeted treatments, and monitoring disease progression or treatment response, all of which contribute to better patient outcomes [5].

Deep learning models have demonstrated significant promise in medical image analysis, notably in organ and structural segmentation [6,7]. This research proposes a hybrid encoder-decoder-based model for semantic segmentation of the GI tract. In the proposed hybrid model, EfficientNet B0 is used as a bottom-up encoder architecture for downsampling to capture contextual information by extracting meaningful and discriminative features from input images. The performance of the EfficientNet B0 encoder is compared with that of three encoders: ResNet 50, MobileNet V2, and Timm Gernet. The Feature Pyramid Network (FPN) is used as a top-down decoder architecture for upsampling to recover spatial information. The performance of the FPN decoder is compared with that of three decoders: PAN, Linknet, and MAnet. Furthermore, the proposed hybrid model is analyzed using Adam, Adadelta, SGD, and RMSprop optimizers. The experiment is carried out using the UW Madison GI tract dataset, which contains 38,496 MRI images of cancer patients.

2. Literature Review

A significant amount of research has been conducted on gastrointestinal tract segmentation and classification [8,9,10]. In 2016, Yu et al. developed an architecture for polyp identification in the gastrointestinal tract [11]. They combined offline and online knowledge to reduce the false acceptances produced by the offline design and to further improve recognition results. Extensive testing on a polyp segmentation dataset indicated that their solution outperformed others. In 2017, Yuan Y et al. proposed an automated computer-aided approach for detecting polyps in colonoscopy footage. They used an unsupervised sparse autoencoder (SAE) to learn discriminative features and then presented a unified bottom-up and top-down strategy to identify polyps [12]. In 2019, Kang J et al. used the Mask R-CNN architecture to detect polyps in colonoscopy images. They developed a fusion technique that improves results by combining Mask R-CNN models with different backbones, and they used three open intestinal polyp datasets to assess the proposed model [13]. In 2019, Cogan T et al. published approaches for improving results on a collection of images using full-image pre-processing with a deep learning technique. Three architectures based on transfer learning were trained on the Kvasir dataset, and their performance was assessed on the validation dataset. In each case, 80% of the images from the Kvasir dataset were used to train the model, leaving 20% to validate it [14]. In 2020, Öztürk et al. developed a classification approach for a gastrointestinal tract classification problem in which the CNN output is enhanced using an efficient LSTM structure. To assess the contribution of the proposed technique to the classification performance, experiments were carried out using the GoogLeNet, ResNet, and AlexNet architectures. 
To compare the results of their framework, the same experiments were repeated with CNNs fused with ANN and SVM classifiers [15]. In 2021, Öztürk et al. presented an artificial intelligence strategy for efficiently classifying GI datasets with a limited number of labeled images. The proposed technique uses a CNN model as its backbone and combines it with LSTM layers to produce the classification. To analyze the proposed residual LSTM architecture, all tests were conducted using AlexNet, GoogLeNet, and ResNet, and the technique outperformed previous state-of-the-art techniques [16]. In 2022, Ye R et al. proposed SIA-Unet, an upgraded Unet design that utilizes MRI data. It also contains an attention module that filters the spatial information of the feature map to extract relevant data. Several experiments on the dataset were carried out to assess SIA-Unet’s performance [17]. In 2022, Nemani P et al. proposed a hybrid CNN-transformer architecture for segmenting distinct organs from images. With Dice and Jaccard coefficients of 0.79 and 0.72, the proposed approach is reported to be robust, scalable, and computationally economical, and it illustrates the potential of deep learning to increase treatment efficacy [18]. In 2022, Chou, A. et al. used U-Net and Mask R-CNN approaches to segment organ regions. Their best U-Net model had a Dice score of 0.51 on the validation set, and the Mask R-CNN design achieved a Dice score of 0.73 [19]. In 2022, Niu H et al. introduced a technique for GI tract segmentation. Their experiments used the Jaccard index as the evaluation metric, where a higher Jaccard index indicates a better model. The results show that their model improves the Jaccard index compared with other methods [20]. In 2022, Li, H, and colleagues developed an improved 2.5D approach for GI tract image segmentation. They investigated and fused multiple 2.5D data generation strategies to exploit the correlation between adjacent images. 
They also proposed a technique for combining 2.5D and 3D results [21]. In 2022, Chia B et al. introduced two baseline methods: a UNet with a ResNet50 backbone and a more economical, streamlined UNet. Building on the better-performing streamlined UNet, they examined multi-task learning using supervised (regression) and self-supervised (contrastive learning) approaches. They found that the contrastive learning approach has certain advantages when the test distribution differs significantly from the training distribution. Finally, they studied Feature-wise Linear Modulation (FiLM), a way of improving the UNet model by adding image metadata such as the position of the MRI scan cross-section and the pixel height and width [22]. In 2022, Georgescu M. et al. proposed a technique for generating ensembles of diverse architectures for medical image segmentation based on the diversity (decorrelation) of the models constituting the ensemble. They used the Dice score between model pairs to measure the correlation between the outputs of the two models in each pair, and they chose models with low pairwise Dice scores to promote diversity. They conducted gastrointestinal tract image segmentation experiments to compare their diversity-promoting ensemble (DiPE) with another technique for creating ensembles that relies on picking the highest-scoring U-Net models [23].

3. Input Dataset

This research employs magnetic resonance imaging (MRI) data collected from patients who underwent MRI-guided radiotherapy at the University of Wisconsin-Madison Carbone Cancer Centre. The dataset comprises 85 patients and 38,496 scans of various GI parts. The scans are stored as 16-bit grayscale Portable Network Graphics (PNG) images, while the annotations are given in a comma-separated values (CSV) file as run-length-encoded (RLE) masks that describe the segmented areas. The ground truth masks are generated by decoding these RLE annotations. There are 14,085 masks for the large bowel, 11,201 masks for the small bowel, and 8,627 masks for the stomach, for a total of 33,913 non-empty masks. Scans in which no GI tract organ is visible have blank masks. The dataset is available on the Kaggle website [24]. The dimensions of the slices vary, ranging from 234x234 to 384x384 pixels. Figure 1 shows an image from the dataset with its ground truth masks. Figure 1(a) shows the input image of case32_day19_slice_0089. Figure 1(b) shows the mask for the large bowel, Figure 1(c) shows the mask for the small bowel, Figure 1(d) shows the mask for the stomach, and Figure 1(e) shows an image with the three masks concatenated.
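The mask generation step described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: it assumes each annotation is a string of space-separated "start length" pairs, that starts are 1-indexed, and that pixels are numbered in row-major order. The function name `rle_decode` is hypothetical, and the layout assumptions should be checked against the dataset description before use.

```python
def rle_decode(rle, height, width):
    """Decode a 'start length start length ...' string into a 2D 0/1 mask.

    Assumptions (not confirmed by the article): starts are 1-indexed and
    pixels are numbered row by row (row-major order).
    """
    flat = [0] * (height * width)
    tokens = rle.split()
    if len(tokens) % 2 != 0:
        raise ValueError("RLE string must contain start/length pairs")
    for start_text, length_text in zip(tokens[0::2], tokens[1::2]):
        start = int(start_text) - 1  # convert the 1-indexed start to 0-indexed
        length = int(length_text)
        if start < 0 or start + length > len(flat):
            raise ValueError("run falls outside the image")
        for i in range(start, start + length):
            flat[i] = 1
    # Cut the flat pixel list into rows of `width` pixels each.
    return [flat[r * width:(r + 1) * width] for r in range(height)]


# Toy 3x4 example: one run of 3 pixels and one run of 2 pixels.
mask = rle_decode("2 3 9 2", height=3, width=4)
for row in mask:
    print(row)
```

An empty annotation string decodes to an all-zero mask, which corresponds to the blank masks of slices that contain no GI organ.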


Figure 1. UW Madison GI Tract Dataset: (a) Input Image, (b) Large Bowel Mask, (c) Small Bowel Mask, (d) Stomach Mask, and (e) Concatenated Mask [24]

Proposed Methodology

This research presents a segmentation model for segmenting GI tract parts such as the stomach, small bowel, and large bowel. Figure 2 depicts the proposed technique. The first block is the input dataset, the UW Madison GI tract dataset, and the second block is a downsampling encoder. In semantic segmentation, encoders are used for downsampling to derive meaningful and hierarchical representations from the input data. To find the best encoder for the segmentation task, four different encoders were implemented: ResNet 50 [25], EfficientNet B0 [26], MobileNet V2 [27], and Timm Gernet [28]. These encoders are pre-trained transfer learning models that performed well on the ImageNet dataset. They play a vital role in downsampling the input data, allowing the decoder network to construct accurate and complete semantic segmentation maps of the gastrointestinal system. Different performance measures were used to assess these encoders. The best encoder is then selected based on the results and used as the encoder component of the final optimized model.

In semantic segmentation, decoders are used for upsampling to regain spatial resolution and construct high-resolution segmentation maps. Upsampling is required because it restores the fine-grained details lost during downsampling. Dilated convolution-based decoders maintain spatial resolution while increasing the receptive field. By varying the dilation rates, these decoders capture fine features and contextual information at several scales. The type of decoder employed depends on the application's specific requirements and the nature of the target objects: some decoders are better at capturing fine details, while others are better at preserving spatial context. Four decoders were evaluated to determine the best decoder for GI tract segmentation: Feature Pyramid Network (FPN) [29], Pyramid Attention Network (PAN) [30], Linknet [31], and MAnet [32]. These segmentation models were chosen for their strong performance in earlier medical imaging research and their versatility in dealing with features of various sizes. The best decoder is selected based on the results of these four models.

Figure 2. Proposed Methodology

Optimizers for hyperparameter tuning are the next component of the proposed technique. Semantic segmentation employs optimizers to improve training efficiency and the resulting model performance. Several factors influence the choice of optimizer, including the dataset, the model design, the available computational resources, and the demands of the segmentation task. In this case, four different optimizers were evaluated: Adam [33], Adadelta [34], RMSprop [35], and SGD [36]. The best optimizer was chosen by comparing the results obtained with each of them. After all the encoder, decoder, and optimizer selection experiments, the most optimized model is finalized. The final model partitions the input image into three classes: small bowel, large bowel, and stomach. In both the mask and the segmented image, yellow represents the large bowel, green represents the small bowel, and red represents the stomach.
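The selection procedure described in this section, choosing the encoder first, then the decoder, then the optimizer, can be sketched as a staged search. The sketch below is illustrative only. The function `staged_selection`, the callback `evaluate`, and the choice to hold the first decoder and optimizer in each list fixed during the earlier stages are all assumptions, since the text does not state which components were held fixed. The toy scores are placeholders and are not results from this study.

```python
def staged_selection(encoders, decoders, optimizers, evaluate):
    """Pick the encoder, then the decoder, then the optimizer, one stage at a time.

    `evaluate(encoder, decoder, optimizer)` is a caller-supplied function that
    returns a validation score where higher is better (for example, Dice).
    """
    # Assumption: the first decoder and optimizer act as defaults in early stages.
    default_decoder, default_optimizer = decoders[0], optimizers[0]
    best_encoder = max(
        encoders, key=lambda enc: evaluate(enc, default_decoder, default_optimizer)
    )
    best_decoder = max(
        decoders, key=lambda dec: evaluate(best_encoder, dec, default_optimizer)
    )
    best_optimizer = max(
        optimizers, key=lambda opt: evaluate(best_encoder, best_decoder, opt)
    )
    return best_encoder, best_decoder, best_optimizer


# Placeholder preference scores, NOT measurements from the article.
TOY_PREFERENCE = {
    "ResNet 50": 2, "EfficientNet B0": 4, "MobileNet V2": 3, "Timm Gernet": 1,
    "FPN": 4, "PAN": 3, "Linknet": 2, "MAnet": 1,
    "Adam": 4, "Adadelta": 1, "RMSprop": 3, "SGD": 2,
}
calls = []  # records every configuration that would have to be trained


def toy_evaluate(encoder, decoder, optimizer):
    """Stand-in for training and validating one configuration."""
    calls.append((encoder, decoder, optimizer))
    return TOY_PREFERENCE[encoder] + TOY_PREFERENCE[decoder] + TOY_PREFERENCE[optimizer]


choice = staged_selection(
    ["ResNet 50", "EfficientNet B0", "MobileNet V2", "Timm Gernet"],
    ["FPN", "PAN", "Linknet", "MAnet"],
    ["Adam", "Adadelta", "RMSprop", "SGD"],
    toy_evaluate,
)
print(choice, len(calls))
```

With four candidates per stage, a staged search of this kind needs 12 evaluations, whereas an exhaustive grid over all combinations would need 64. The trade-off is that interactions between components are not explored.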

Results & Discussions

This section displays the results of the different encoder, decoder, and optimizer evaluations. The implementation used the Google Colab platform, Keras and TensorFlow environments, and the Python programming language.

Encoder Evaluation for Downsampling

Figure 3 evaluates four encoders for segmenting GI tract organs using the Dice coefficient, Jaccard coefficient, and loss. The four encoders are EfficientNet B0, MobileNet V2, Timm_Gernet_S, and ResNet 50. Figure 4 compares the encoders in terms of the processing time required by each encoder model. The findings reveal that EfficientNet B0 had the highest Dice coefficient of 0.8975 and Jaccard coefficient of 0.8832, with a loss of 0.1251 and the shortest processing time of 2 hours and 25 minutes. MobileNet V2 likewise performed well, with a Dice coefficient of 0.8968, a Jaccard coefficient of 0.866, and a loss of 0.1378, but needed slightly more processing time than EfficientNet B0. Timm_Gernet_S obtained a Dice coefficient of 0.8917, a Jaccard coefficient of 0.8610, and a loss of 0.1351 in 2 hours and 26 minutes. ResNet 50 got the same Dice and Jaccard coefficients as Adam, with a loss of 0.1301 and a processing time of 2 hours and 39 minutes. In conclusion, the results indicate that EfficientNet B0 is the most effective encoder model for segmenting GI tract organs.
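For reference, the two overlap metrics used throughout the evaluation can be computed as in the sketch below. It uses the standard definitions, Dice = 2|A∩B| / (|A| + |B|) and Jaccard = |A∩B| / |A∪B|, on flattened binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions. The article does not state how empty masks or per-image averaging were handled.

```python
def dice_and_jaccard(pred, truth):
    """Return (Dice, Jaccard) for two equally sized flat binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_area = sum(1 for p in pred if p)
    truth_area = sum(1 for t in truth if t)
    union = pred_area + truth_area - intersection
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + truth_area)
    jaccard = intersection / union
    return dice, jaccard


# Toy example: the prediction and the ground truth overlap on 2 of 3 pixels each.
pred = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]
dice, jaccard = dice_and_jaccard(pred, truth)
print(round(dice, 4), round(jaccard, 4))
```

For a single pair of masks the two metrics are related by Jaccard = Dice / (2 − Dice). The relation does not generally hold for values averaged over many images.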

Figure 3. Dice, Jaccard, and Loss Comparison of Different Encoders

Figure 4. Processing Time Comparison for Different Encoders

Best Encoder- EfficientNet B0

The EfficientNet-B0 architecture is a well-known convolutional neural network (CNN) architecture suitable for use as an encoder in semantic segmentation tasks. In the proposed design, EfficientNet-B0 is used as the backbone network to extract features from the input image through downsampling. EfficientNet introduced a network design based on a compound scaling strategy [26], which balances the network's depth, width, and resolution to produce an accurate and efficient model.
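The compound scaling idea can be made concrete with a short sketch. It follows the rule from the EfficientNet paper [26], in which depth, width, and resolution are scaled as α^φ, β^φ, and γ^φ with α = 1.2, β = 1.1, and γ = 1.15, chosen so that α·β²·γ² is approximately 2. The function names are hypothetical, and the released larger variants use rounded multipliers, so this is an illustration of the principle rather than a reproduction of any specific model configuration.

```python
# Coefficients reported in the EfficientNet paper for the baseline network.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scaling(phi):
    """Return the depth, width, and resolution multipliers for coefficient phi."""
    return {
        "depth": ALPHA ** phi,
        "width": BETA ** phi,
        "resolution": GAMMA ** phi,
    }


def flops_factor(phi):
    """Approximate growth in FLOPs, which scales with depth * width^2 * resolution^2."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


# phi = 0 corresponds to the unscaled baseline, EfficientNet-B0.
print(compound_scaling(0))
print(compound_scaling(1), round(flops_factor(1), 2))
```

Because B0 is the baseline with all multipliers equal to 1, it is the smallest member of the family, which is consistent with the short processing time reported for it in the encoder comparison.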


Figure 5. Results with Best Encoder- EfficientNet B0 (a) Validation Dice Coefficient, (b) Validation Jaccard Coefficient, and (c) Validation loss

EfficientNet-B0 is a convolutional neural network architecture, widely used for image classification tasks, that is composed of multiple blocks, each combining convolutional layers, activation functions, and pooling operations. In semantic segmentation, the output of EfficientNet-B0 is commonly used as input to a decoder network. Using EfficientNet-B0 as an encoder for semantic segmentation has resulted in high accuracy and efficiency across a range of applications, including medical image segmentation [26]. Figure 5 shows the plots of the encoder model. Figure 5(a) shows the validation Dice coefficient, Figure 5(b) shows the validation Jaccard coefficient, and Figure 5(c) shows the model loss plot. EfficientNet B0 outperformed the other encoders, namely ResNet 50, MobileNet V2, and Timm Gernet.

Decoder Evaluation for Upsampling

Figure 6 evaluates four decoders for segmenting GI tract organs using the Dice coefficient, Jaccard coefficient, and loss. The four decoders are FPN, PAN, Linknet, and MAnet. Figure 7 compares the decoders in terms of the processing time required by each decoder model. FPN had the highest Dice coefficient of 0.8975 and Jaccard coefficient of 0.8832, with a loss of 0.1251 and a processing time of 2 hours and 39 minutes. PAN fared similarly to FPN, with a Dice coefficient of 0.8936, a Jaccard coefficient of 0.8638, and a loss of 0.1278, but it took significantly longer to process. Linknet produced a Dice coefficient of 0.8865, a Jaccard coefficient of 0.8567, and a loss of 0.1319 in 2 hours and 36 minutes of processing time. MAnet, on the other hand, had the lowest Dice and Jaccard coefficients and the highest loss, with a Dice and Jaccard coefficient of 0.7141 and a loss of 0.3685. MAnet also needed the most processing time (3 hours and 7 minutes). In conclusion, the results indicate that FPN is the most successful decoder for segmenting GI tract organs.

Figure 6. Dice, Jaccard, and Loss Comparison of Different Decoders

Figure 7. Processing Time Comparison for Different Decoders

Best Decoder- FPN

The FPN segmentation model is a well-known deep learning architecture for medical image segmentation and other semantic segmentation problems. Its structure consists of a backbone network, a top-down pathway, lateral connections, and a segmentation head. Through several upsampling and convolutional layers, the top-down pathway produces feature maps with varying spatial resolutions. The feature maps from the top-down pathway are merged with the feature maps from the backbone network through lateral connections, which allows the model to represent details accurately across several scales. The segmentation head then uses the fused feature maps to predict the segmentation masks for the various object classes in the input image. As a result of this design, the FPN segmentation model is used in a wide variety of image segmentation tasks [29]. Figure 8 shows the plots of the FPN segmentation model. Figure 8(a) shows the validation Dice coefficient, Figure 8(b) shows the validation Jaccard coefficient, and Figure 8(c) shows the model loss plot. FPN outperformed the other decoders, namely PAN, Linknet, and MAnet.
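The interplay of the top-down pathway and the lateral connections can be illustrated with a deliberately simplified sketch. It works on single-channel feature maps stored as nested lists, upsamples with nearest-neighbour interpolation, and merges by element-wise addition. The 1x1 lateral convolutions, the 3x3 smoothing convolutions, and the segmentation head of a full FPN are omitted, and all function names are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [value for value in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))  # repeat each row
    return out


def add_maps(a, b):
    """Element-wise sum of two feature maps of identical shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature maps must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_merge(backbone_maps):
    """Merge backbone maps ordered from finest to coarsest resolution.

    Each map is assumed to be half the size of the previous one. The coarsest
    map starts the top-down pathway; every finer level adds its lateral
    (backbone) map to the upsampled result of the level above it.
    """
    merged = [backbone_maps[-1]]
    for lateral in reversed(backbone_maps[:-1]):
        merged.append(add_maps(lateral, upsample2x(merged[-1])))
    merged.reverse()  # return in the same finest-to-coarsest order
    return merged


# Toy pyramid: 4x4, 2x2, and 1x1 single-channel maps.
fine = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
mid = [[1, 2], [3, 4]]
coarse = [[10]]
pyramid = top_down_merge([fine, mid, coarse])
for level in pyramid:
    print(level)
```

In the toy pyramid, the value 10 from the coarsest level propagates into every finer level, which is how contextual information from deep layers reaches the high-resolution maps.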


Figure 8. Results with Best Decoder- FPN (a) Validation Dice Coefficient, (b) Validation Jaccard Coefficient, and (c) Validation loss

Optimizer Evaluation for Hyperparameter Tuning

Figure 9 evaluates the performance of the proposed model with four optimizers for segmenting GI tract organs using the Dice coefficient, Jaccard coefficient, and loss. Figure 10 compares the optimizers in terms of the processing time required by the proposed model. The findings reveal that the Adam optimizer obtained the highest Dice coefficient of 0.8975 and Jaccard coefficient of 0.8832, with the lowest loss of 0.1251. Adam needed 2 hours and 28 minutes to complete its processing. RMSprop also performed well, with a Dice coefficient of 0.8905, a Jaccard coefficient of 0.8605, and a loss of 0.1377. However, it took a little longer to process than Adam. SGD and Adadelta, on the other hand, achieved lower Dice and Jaccard coefficients and higher loss than the other optimizers. SGD had a Dice coefficient of 0.7531 and a Jaccard coefficient of 0.7253, with a loss of 0.3571, whereas Adadelta had a Dice coefficient of 0.7472, a Jaccard coefficient of 0.7204, and a loss of 0.3692. In conclusion, the results indicate that Adam is the most effective optimizer for segmenting GI tract organs.

Figure 9. Dice, Jaccard, and Loss Comparison of Different Optimizers

Figure 10. Processing Time Comparison for Different Optimizers

Best Optimizer- Adam

The Adam optimizer is a common choice for training deep neural networks for semantic segmentation problems. Adam stands for "Adaptive Moment Estimation," an adaptation of the stochastic gradient descent (SGD) optimizer that employs adaptive learning rates for each weight parameter in the network [33]. Adam adjusts the learning rate for each weight parameter based on estimates of the first and second moments of its gradient. This adaptive adjustment often leads to faster convergence and better optimization performance than classic gradient descent-based optimizers. Adam can also handle sparse gradients, which is helpful for segmentation tasks in which many pixels have no labels. The optimizer's hyperparameters, such as the learning rate and the momentum terms, may be tuned to optimize segmentation performance on a given dataset. Adam is a popular choice for semantic segmentation problems because of its quick convergence, adaptive learning rate adjustment, and capacity to handle sparse gradients. Figure 11 shows the plots of the Adam optimizer. Figure 11(a) shows the validation Dice coefficient, Figure 11(b) shows the validation Jaccard coefficient, and Figure 11(c) shows the model loss plot. The Adam optimizer outperformed the other optimizers, namely Adadelta, RMSprop, and SGD.
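The moment-based update described above can be written out for a single scalar parameter. The sketch below follows the standard Adam update rule with its commonly used default hyperparameters. The function name, the toy objective, and the learning rate of 0.1 in the usage example are illustrative assumptions and do not reflect the training settings of this study, which are not reported here.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a scalar parameter.

    m and v are the running first and second moment estimates of the gradient,
    and t is the 1-based step counter used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad          # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad * grad   # second moment (mean of squares)
    m_hat = m / (1 - beta1 ** t)                # correct the bias toward zero
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Toy usage: minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for step in range(1, 501):
    gradient = 2 * (w - 3.0)
    w, m, v = adam_step(w, gradient, m, v, step, lr=0.1)
print(w)
```

Dividing by the square root of the second moment makes the size of the first step roughly equal to the learning rate regardless of the gradient's scale, which is the per-parameter adaptation the text refers to.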


Figure 11. Results with Best Optimizer- Adam (a) Validation Dice Coefficient, (b) Validation Jaccard Coefficient, and (c) Validation loss

Visualization of Results for the Best Optimized Model

Figure 12 depicts the results of the model in the form of images, including the input image, the ground truth mask, and the predicted segmented image. Here yellow represents the large bowel, green the small bowel, and red the stomach. The similarity between the ground truth mask and the segmented image indicates how accurately the proposed method segments the input image, and the segmented images closely match the ground truth masks. The proposed model can therefore segment MRI scans of the gastrointestinal tract to assist radiation therapy planning and speed up treatment.

Input Image	Ground Truth Mask	Segmented Image

Figure 12. Visualization of Results

State-of-the-Art Comparison for UW Madison GI Tract Dataset

Table 1 summarizes several approaches and their associated outcomes for segmentation of GI tract organs using the UW Madison GI tract dataset, together with the references, years of publication, and techniques used. In 2022, the SIA UNet method achieved a Dice score of 0.78. The CNN Transformer earned a slightly higher Dice score of 0.79 and an IoU score of 0.72. The combination of UNet and Mask RCNN yielded a Dice score of 0.51. UNet on 2.5D data produced a Dice score of 0.36 and an IoU score of 0.12. An ensemble of multiple architectures performed well, with a Dice score of 0.88, and a UNet model [37] achieved a Dice score of 0.8854 and an IoU score of 0.8819. Finally, the proposed model, a hybrid of EfficientNet B0 and FPN, achieved the highest Dice score of 0.8975 and an IoU score of 0.8832. The table shows that the proposed model outperforms the state-of-the-art results for segmenting GI tract organs on the UW Madison GI tract dataset.

Table 1. State-of-the-Art Comparison

Ref/ Year	Techniques	Dice	IoU/ Jaccard
[17]/ 2022	SIA UNet	0.78	---
[18]/ 2022	CNN Transformer	0.79	0.72
[19]/ 2022	UNet and Mask RCNN	0.51	---
[20]/ 2022	UNet on 2.5D	0.36	0.12
[21]/ 2022	Ensemble of Different Architecture	0.88	---
[37]/ 2022	UNet	0.8854	0.8819
Proposed Model	EfficientNetB0 and FPN	0.8975	0.8832

Conclusion

The gastrointestinal (GI) tract is a critical system in the human body that aids nutrition, digestion, and absorption. It breaks down food into smaller molecules that the body can absorb and utilize. There has been a significant increase in GI malignancies among men and women in recent years. Radiation therapy is one of the most common treatments for GI cancer. The therapy employs high-energy X-rays to target malignant cells while sparing healthy organs in the GI system. Therefore, it is essential to develop an automated method for accurately segmenting GI tract organs to speed up medical therapy. Medical diagnosis based on GI tract organ segmentation has various advantages. Accurate segmentation of GI organs enables accurate disease detection and localization, assisting in early diagnosis and tailored therapy planning. This research proposes a hybrid encoder-decoder-based model for semantic segmentation of the GI tract. In the proposed hybrid model, EfficientNet B0 is used as a bottom-up encoder architecture for downsampling to capture contextual information by extracting meaningful and discriminative features from input images.

The Feature Pyramid Network (FPN), in turn, is used as a top-down decoder architecture for upsampling to recover spatial information. The proposed model achieved Dice coefficient and Jaccard index values of 0.8975 and 0.8832, respectively. This research aimed to find the most feasible combination of these components for segmentation optimization. In this study, the best-performing model used EfficientNet B0 as the encoder, FPN as the decoder, and Adam as the optimizer. This strategy is likely to improve cancer therapy efficacy and timeliness.

References

1. Li, B.; Meng, M.Q.-H. Tumor Recognition in Wireless Capsule Endoscopy Images Using Textural Features and SVM-Based Feature Selection. IEEE Trans. Inf. Technol. Biomed. 2012, 16, 323–329, doi:10.1109/TITB.2012.2185807.
2. Bernal, J.; Sánchez, J.; Vilariño, F. Towards Automatic Polyp Detection with a Polyp Appearance Model. Pattern Recognit. 2012, 45, 3166–3182, doi:10.1016/j.patcog.2012.03.002.
3. Zhou, M.; Bao, G.; Geng, Y.; Alkandari, B.; Li, X. Polyp Detection and Radius Measurement in Small Intestine Using Video Capsule Endoscopy. In Proceedings of the 2014 7th International Conference on Biomedical Engineering and Informatics; IEEE, 2014.
4. Wang, Y.; Tavanapong, W.; Wong, J.; Oh, J.H.; de Groen, P.C. Polyp-Alert: Near Real-Time Feedback during Colonoscopy. Comput. Methods Programs Biomed. 2015, 120, 164–179, doi:10.1016/j.cmpb.2015.04.002.
5. Li, Q.; Yang, G.; Chen, Z.; Huang, B.; Chen, L.; Xu, D.; Zhou, X.; Zhong, S.; Zhang, H.; Wang, T. Colorectal Polyp Segmentation Using a Fully Convolutional Neural Network. In Proceedings of the 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI); IEEE, 2017.
6. Dijkstra, W.; Sobiecki, A.; Bernal, J.; Telea, A. Towards a Single Solution for Polyp Detection, Localization and Segmentation in Colonoscopy Images. In Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications; SCITEPRESS - Science and Technology Publications, 2019.
7. Lafraxo, S.; El Ansari, M. GastroNet: Abnormalities Recognition in Gastrointestinal Tract through Endoscopic Imagery Using Deep Learning Techniques. In Proceedings of the 2020 8th International Conference on Wireless Networks and Mobile Communications (WINCOM); IEEE, 2020.
8. Du, B.; Zhao, Z.; Hu, X.; Wu, G.; Han, L.; Sun, L.; Gao, Q. Landslide Susceptibility Prediction Based on Image Semantic Segmentation. Comput. Geosci. 2021, 155, 104860, doi:10.1016/j.cageo.2021.104860.
9. Gonçalves, J.P.; Pinto, F.A.C.; Queiroz, D.M.; Villar, F.M.M.; Barbedo, J.G.A.; Del Ponte, E.M. Deep Learning Architectures for Semantic Segmentation and Automatic Estimation of Severity of Foliar Symptoms Caused by Diseases or Pests. Biosyst. Eng. 2021, 210, 129–142, doi:10.1016/j.biosystemseng.2021.08.011.
10. Scepanovic, S.; Antropov, O.; Laurila, P.; Rauste, Y.; Ignatenko, V.; Praks, J. Wide-Area Land Cover Mapping with Sentinel-1 Imagery Using Deep Learning Semantic Segmentation Models. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 10357–10374, doi:10.1109/jstars.2021.3116094.
11. Yuan, Y.; Li, D.; Meng, M.Q.-H. Automatic Polyp Detection via a Novel Unified Bottom-Up and Top-Down Saliency Approach. IEEE J. Biomed. Health Inform. 2017, 22, 1250–1260.
12. Yuan, Y.; Li, D.; Meng, M.Q.-H. Automatic Polyp Detection via a Novel Unified Bottom-Up and Top-Down Saliency Approach. IEEE J. Biomed. Health Inform. 2017, 22, 1250–1260.
13. Kang, J.; Gwak, J. Ensemble of Instance Segmentation Models for Polyp Segmentation in Colonoscopy Images. IEEE Access 2019, 7, 26440–26447.
14. Cogan, T.; Cogan, M.; Tamil, L. MAPGI: Accurate Identification of Anatomical Landmarks and Diseased Tissue in Gastrointestinal Tract Using Deep Learning. Comput. Biol. Med. 2019, 111, 103351.
15. Öztürk, Ş.; Özkaya, U. Gastrointestinal Tract Classification Using Improved LSTM Based CNN. Multimed. Tools Appl. 2020, 79, 28825–28840, doi:10.1007/s11042-020-09468-3.
16. Öztürk, Ş.; Özkaya, U. Residual LSTM Layered CNN for Classification of Gastrointestinal Tract Diseases. J. Biomed. Inform. 2021, 113, 103638, doi:10.1016/j.jbi.2020.103638.
17. Ye, R.; Wang, R.; Guo, Y.; Chen, L. SIA-Unet: A Unet with Sequence Information for Gastrointestinal Tract Segmentation. In Pacific Rim International Conference on Artificial Intelligence; Springer: Cham, 2022; pp. 316–326.
18. Nemani, P.; Vollala, S. Medical Image Segmentation Using LeViT-UNet++: A Case Study on GI Tract Data. arXiv [cs.NE] 2022.
19. Chou, A.; Li, W.; Roman, E. GI Tract Image Segmentation with U-Net and Mask R-CNN.
20. Niu, H.; Lin, Y. SER-UNet: A Network for Gastrointestinal Image Segmentation. In Proceedings of the 2022 2nd International Conference on Control and Intelligent Robotics; ACM: New York, NY, USA, 2022.
21. Li, H.; Liu, J. Multi-View Unet for Automated GI Tract Segmentation. In Proceedings of the 2022 5th International Conference on Pattern Recognition and Artificial Intelligence (PRAI); IEEE, 2022.
22. Chia, B.; Gu, H.; Lui, N. Gastrointestinal Tract Segmentation Using Multi-Task Learning; CS231n: Deep Learning for Computer Vision, Stanford, Spring 2022.
23. Georgescu, M.-I.; Ionescu, R.T.; Miron, A.-I. Diversity-Promoting Ensemble for Medical Image Segmentation. arXiv [eess.IV] 2022.
24. https://www.kaggle.com/competitions/uw-madison-gi-tract-image-segmentation/data
25. Rezende, E.; Ruppert, G.; Carvalho, T.; Ramos, F.; de Geus, P. Malicious Software Classification Using Transfer Learning of ResNet-50 Deep Neural Network. In Proceedings of the 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA); IEEE, 2017.
26. Tan, M.; Le, Q.V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv [cs.LG] 2019.
27. Srinivasu, P.N.; SivaSai, J.G.; Ijaz, M.F.; Bhoi, A.K.; Kim, W.; Kang, J.J. Classification of Skin Disease Using Deep Learning Neural Networks with MobileNet V2 and LSTM. Sensors (Basel) 2021, 21, 2852, doi:10.3390/s21082852.
28. Zhang, H.; Dana, K.; Shi, J.; Zhang, Z.; Wang, X.; Tyagi, A.; Agrawal, A. Context Encoding for Semantic Segmentation. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition; IEEE, 2018.
29. Pu, B.; Lu, Y.; Chen, J.; Li, S.; Zhu, N.; Wei, W.; Li, K. MobileUNet-FPN: A Semantic Segmentation Model for Fetal Ultrasound Four-Chamber Segmentation in Edge Computing Environments. IEEE J. Biomed. Health Inform. 2022, 26, 5540–5550, doi:10.1109/JBHI.2022.3182722.
30. Ou, X.; Wang, H.; Zhang, G.; Li, W.; Yu, S. Semantic Segmentation Based on Double Pyramid Network with Improved Global Attention Mechanism. Appl. Intell. 2023, doi:10.1007/s10489-023-04463-1.
31. Chaurasia, A.; Culurciello, E. LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation. arXiv [cs.CV] 2017.
32. Chen, B.; Xia, M.; Qian, M.; Huang, J. MANet: A Multi-Level Aggregation Network for Semantic Segmentation of High-Resolution Remote Sensing Images. Int. J. Remote Sens. 2022, 43, 5874–5894, doi:10.1080/01431161.2022.2073795.
33. Gill, K.S.; Sharma, A.; Anand, V.; Gupta, R.; Deshmukh, P. Influence of Adam Optimizer with Sequential Convolutional Model for Detection of Tuberculosis. In 2022 International Conference on Computational Modelling, Simulation and Optimization (ICCMSO); IEEE, 2022; pp. 340–344.
34. Gill, K.S.; Sharma, A.; Anand, V.; Gupta, R. Brain Tumor Detection Using VGG19 Model on Adadelta and SGD Optimizer. In Proceedings of the 2022 6th International Conference on Electronics, Communication and Aerospace Technology; IEEE, 2022.
35. Zou, F.; Shen, L.; Jie, Z.; Zhang, W.; Liu, W. A Sufficient Condition for Convergences of Adam and RMSProp. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE, 2019.
36. Gower, R.M.; Loizou, N.; Qian, X.; Sailanbayev, A.; Shulgin, E.; Richtarik, P. SGD: General Analysis and Improved Rates. arXiv [cs.LG] 2019.
37.
Sharma, Neha, Sheifali Gupta, Deepika Koundal, Sultan Alyami, Hani Alshahrani, Yousef Asiri, and Asadullah Shaikh. "U-Net Model with Transfer Learning Model as a Backbone for Segmentation of Gastrointestinal Tract." Bioengineering 10, no. 1 (2023): 119.
Bernal, J.; Sánchez, J.; Vilariño, F. Towards Automatic Polyp Detection with a Polyp Appearance Model. Pattern Recognit. 2012, 45, 3166–3182.
Zhou, M.; Bao, G.; Geng, Y.; Alkandari, B.; Li, X. Polyp Detection and Radius Measurement in Small Intestine Using Video Capsule Endoscopy. In Proceedings of the 2014 7th International Conference on Biomedical Engineering and Informatics, Dalian, China, 14–16 October 2014.
Wang, Y.; Tavanapong, W.; Wong, J.; Oh, J.H.; de Groen, P.C. Polyp-Alert: Near Real-Time Feedback during Colonoscopy. Comput. Methods Programs Biomed. 2015, 120, 164–179.
Li, Q.; Yang, G.; Chen, Z.; Huang, B.; Chen, L.; Xu, D.; Zhou, X.; Zhong, S.; Zhang, H.; Wang, T. Colorectal Polyp Segmentation Using a Fully Convolutional Neural Network. In Proceedings of the 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), Shanghai, China, 14–16 October 2017.
Dijkstra, W.; Sobiecki, A.; Bernal, J.; Telea, A. Towards a Single Solution for Polyp Detection, Localization and Segmentation in Colonoscopy Images. In Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, Prague, Czech Republic, 25–27 February 2019.
Lafraxo, S.; El Ansari, M. GastroNet: Abnormalities Recognition in Gastrointestinal Tract through Endoscopic Imagery Using Deep Learning Techniques. In Proceedings of the 2020 8th International Conference on Wireless Networks and Mobile Communications (WINCOM), Reims, France, 27–29 October 2020.
Du, B.; Zhao, Z.; Hu, X.; Wu, G.; Han, L.; Sun, L.; Gao, Q. Landslide Susceptibility Prediction Based on Image Semantic Segmentation. Comput. Geosci. 2021, 155, 104860.
Gonçalves, J.P.; Pinto, F.A.C.; Queiroz, D.M.; Villar, F.M.M.; Barbedo, J.G.A.; Del Ponte, E.M. Deep Learning Architectures for Semantic Segmentation and Automatic Estimation of Severity of Foliar Symptoms Caused by Diseases or Pests. Biosyst. Eng. 2021, 210, 129–142.
Scepanovic, S.; Antropov, O.; Laurila, P.; Rauste, Y.; Ignatenko, V.; Praks, J. Wide-Area Land Cover Mapping with Sentinel-1 Imagery Using Deep Learning Semantic Segmentation Models. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 10357–10374.
Yuan, Y.; Li, D.; Meng, M.Q.H. Automatic polyp detection via a novel unified bottom-up and top-down saliency approach. IEEE J. Biomed. Health Inform. 2017, 22, 1250–1260.
Poorneshwaran, J.M.; Kumar, S.S.; Ram, K.; Joseph, J.; Sivaprakasam, M. Polyp Segmentation Using Generative Adversarial Network. In Proceedings of the 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019; pp. 7201–7204.
Kang, J.; Gwak, J. Ensemble of instance segmentation models for polyp segmentation in colonoscopy images. IEEE Access 2019, 7, 26440–26447.
Cogan, T.; Cogan, M.; Tamil, L. MAPGI: Accurate identification of anatomical landmarks and diseased tissue in gastrointestinal tract using deep learning. Comput. Biol. Med. 2019, 111, 103351.
Öztürk, Ş.; Özkaya, U. Gastrointestinal Tract Classification Using Improved LSTM Based CNN. Multimed. Tools Appl. 2020, 79, 28825–28840.
Öztürk, Ş.; Özkaya, U. Residual LSTM Layered CNN for Classification of Gastrointestinal Tract Diseases. J. Biomed. Inform. 2021, 113, 103638.
Ye, R.; Wang, R.; Guo, Y.; Chen, L. SIA-Unet: A Unet with Sequence Information for Gastrointestinal Tract Segmentation. In Pacific Rim International Conference on Artificial Intelligence; Springer: Cham, Switzerland, 2022; pp. 316–326.
Nemani, P.; Vollala, S. Medical Image Segmentation Using LeViT-UNet++: A Case Study on GI Tract Data. arXiv 2022, arXiv:2209.07515.
Chou, A.; Li, W.; Roman, E. GI Tract Image Segmentation with U-Net and Mask R-CNN. Available online: http://cs231n.stanford.edu/reports/2022/pdfs/164.pdf (accessed on 4 June 2023).
Niu, H.; Lin, Y. SER-UNet: A Network for Gastrointestinal Image Segmentation. In Proceedings of the 2022 2nd International Conference on Control and Intelligent Robotics, Nanjing, China, 24–26 June 2022.
Li, H.; Liu, J. Multi-View Unet for Automated GI Tract Segmentation. In Proceedings of the 2022 5th International Conference on Pattern Recognition and Artificial Intelligence (PRAI), Chengdu, China, 19–21 August 2022.
Chia, B.; Gu, H.; Lui, N. Gastrointestinal Tract Segmentation Using Multi-Task Learning; CS231n: Deep Learning for Computer Vision Stanford Spring. 2022. Available online: http://cs231n.stanford.edu/reports/2022/pdfs/75.pdf (accessed on 4 June 2023).
Georgescu, M.-I.; Ionescu, R.T.; Miron, A.-I. Diversity-Promoting Ensemble for Medical Image Segmentation. arXiv 2022, arXiv:2210.12388.