Histopathological Gastric Cancer Detection on GasHisSDB Dataset: Comparison
Please note this is a comparison between Version 1 by Ming Ping Yong and Version 2 by Catherine Yang.

Gastric cancer is a leading cause of cancer-related deaths worldwide, underscoring the need for early detection to improve patient survival rates. Deep learning pre-trained networks have shown promise in this regard, but each model can only extract a limited number of image features for classification. To overcome this limitation, the use of ensemble models, which combine the decisions of multiple pre-trained networks, proves to be effective.

  • histopathology
  • gastric cancer
  • deep learning
  • convolutional neural network

1. Introduction

Gastric cancer is one of the most common cancers and a leading cause of cancer-related mortality [1]. Gastric cancer is considered a single heterogeneous disease with several histopathologic characteristics [2]: each subtype presents a distinct histologic appearance, making detection a non-trivial task. The clinical gold standard of gastric cancer detection is histopathology screening of a biopsy or surgical specimen under a microscope to identify cancerous features [3]. Conventionally, pathologists manually screen the tissue biopsies, first using a low magnification factor to search for potential cancerous region(s) with the naked eye. Once a suspicious region is identified, the pathologist switches to a high magnification factor to analyze the details of the region. During the diagnostic procedure, the pathologists assess the gigapixel-sized whole slide image (WSI) by repeatedly traversing it to find small abnormal regions of interest (ROIs), as described above, to make diagnostic decisions.
However, this conventional, manual visual analysis of tissue biopsies by pathologists is extremely laborious, time-consuming, and subjective: the conclusion drawn by one pathologist can differ from another's. Correct histopathological analysis depends heavily on the expertise and experience of the pathologist. This makes manual histopathological analysis prone to human errors such as misdetection and misdiagnosis; coupled with a shortage of pathologists, this leads to long backlogs in the processing of patient cases and consequently increases the likelihood of delayed cancer detection.
Since most gastric cancers are adenocarcinomas, there are no apparent symptoms in the early stage, or patients may present with non-specific symptoms such as gastric discomfort that are often mistaken for gastric ulcers and gastritis [4]; this delays gastric cancer detection. Early detection of gastric cancer is the key factor in reducing mortality [5]: patients diagnosed with early-stage gastric cancer have a survival rate above 90% [6], whereas when the cancer is detected at a late stage, the survival rate drops substantially to below 30% [7][8].
The limitations of the manual diagnostic workflow have led to the development of computer-aided diagnosis (CAD) to assist pathologists by making diagnosis more efficient and autonomous. CAD is gaining attention and becoming more accessible due to advancements in digital pathology, which have improved slide scanning quality and reduced digital storage costs [9]. In addition, these systems reduce not only the time and cost of cancer diagnosis but also the inter-pathologist variability in diagnostic decisions [10].
For gastric cancer detection using histopathological images, various CAD techniques based on classification and segmentation models have been explored. Machine learning is the conventional CAD approach used for gastric cancer detection. In this approach, models extract handcrafted features such as color, texture, and shape features for the detection [11][12][13]. Common machine learning classifiers are the support vector machine (SVM), random forest, and AdaBoost [14][15][16].
Later, the deep learning approach was introduced to automate feature selection. Many works have reported that deep convolutional neural networks (CNNs) achieve promising performance in histopathological image classification and segmentation tasks in cancer [17][18][19], metastasis [20][21], and gene mutation [22][23] analysis; some even reported performance comparable to pathologists' assessments [9][24][25][26][27][28].

2. Histopathological Gastric Cancer Detection on GasHisSDB Dataset Using Deep Ensemble Learning

The classical machine learning approach based on handcrafted feature extraction was initially used to automate histopathology tasks. Doyle et al. [29] extracted various combinations of handcrafted textural and graph features such as gray-level features, Haralick features, Gabor filter features, the Voronoi diagram, Delaunay triangulation, minimum spanning tree, and nuclear features. The authors then applied spectral clustering algorithms as dimensionality reduction methods to filter the useful features before passing them to an SVM to classify images as normal or breast cancer. The model achieved an accuracy of 95.8% in cancerous image detection and 93.3% in cancer image grading. In the work of Kather et al. [30], six distinct sets of handcrafted texture descriptors, including lower-order and higher-order histogram features, local binary patterns, the gray-level co-occurrence matrix, Gabor filters, and perception-like features, were combined into a feature set; various classifiers, including the 1-nearest neighbor, linear SVM, radial-basis function SVM, and decision trees, were then used for binary and multiclass colorectal image classification. The proposed work achieved 98.6% accuracy in the binary classification and 87.4% accuracy in the multiclass study. Although the classical machine learning approach can achieve promising performance, it requires in-depth expertise in the histopathology domain to design meaningful features, which is its main shortcoming and a barrier to developing an effective machine learning model. To address this problem, the deep learning approach was introduced for histopathology task automation. Unlike machine learning, deep learning models do not require handcrafted features as input; they learn the required features automatically. However, a huge dataset is usually needed for deep learning models to learn the features effectively and achieve high performance.
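To make the handcrafted-feature idea concrete, the following is a minimal, illustrative sketch (not the cited works' implementations) of lower-order histogram descriptors of the kind combined into such texture feature sets; the function name and feature choices are assumptions for illustration only.

```python
import numpy as np

def histogram_features(gray_patch, bins=16):
    """Compute lower-order histogram texture descriptors (mean, variance,
    skewness, entropy) from a grayscale patch; such vectors would be fed
    to a classical classifier such as an SVM."""
    x = gray_patch.astype(float).ravel()
    mean = x.mean()
    var = x.var()
    skew = ((x - mean) ** 3).mean() / (var ** 1.5 + 1e-12)
    hist, _ = np.histogram(x, bins=bins, range=(0, 256))
    p = hist / hist.sum()          # bin probabilities
    p = p[p > 0]                   # ignore empty bins for the log
    entropy = -(p * np.log2(p)).sum()
    return np.array([mean, var, skew, entropy])

# Usage: a toy 64x64 grayscale "patch" yields a 4-dimensional feature vector.
patch = np.random.default_rng(1).integers(0, 256, size=(64, 64))
feats = histogram_features(patch)
print(feats.shape)  # (4,)
```

In a full classical pipeline, many such descriptors would be concatenated per patch and passed to a classifier after dimensionality reduction.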
Data augmentation and transfer learning are two common methods used to address the huge dataset requirement in training deep learning models. The former generates artificial samples to expand the dataset. In the work of Sharma and Mehra [31], the dataset was augmented using flipping, translation, scaling, and rotation techniques; Han et al. [32] balanced the dataset using augmentation methods including intensity change, rotation, and flipping; Joseph et al. [33] applied translation, scaling, flipping, and rotation with a constant fill mode to expand the dataset. After data augmentation, model accuracies in the respective tasks improved by 2.76–12.28% across various magnifications in [31], by 3.4% at the image level and 5.8% at the patient level in [32], and by 4.52–8.17% across various magnifications in [33]. The second method to overcome the huge dataset requirement is transfer learning, where a model trained for one task is applied as the starting point of a model for a different task. In the work of Al-Haija et al. [34], the pre-trained ResNet50 was fine-tuned for the breast cancer classification task; Mehra [35] compared transfer learning and training from scratch using three models, VGG-16, VGG-19, and ResNet-50; Celik et al. [36] proposed transfer learning using the pre-trained networks DenseNet-161 and ResNet-50. Compared to custom CNNs or training from scratch, the pre-trained networks' accuracies in the respective tasks improved by 5.9–14.76% in [34], 12.67% (between the best performing models) in [35], and 1.96–6.73% in [36]. Although the methods above have achieved relatively good performance in histopathological image analysis, another notable method, ensemble learning, can be integrated with them to further improve classification performance.
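The geometric augmentations mentioned above (flipping and rotation) can be sketched minimally with NumPy; this is an illustrative example, not any cited work's pipeline, and the function name is an assumption.

```python
import numpy as np

def augment_patch(patch):
    """Generate simple geometric variants of a histopathology patch:
    the original, horizontal/vertical flips, and 90/180/270-degree rotations."""
    variants = [patch]
    variants.append(np.fliplr(patch))   # horizontal flip
    variants.append(np.flipud(patch))   # vertical flip
    for k in (1, 2, 3):                 # 90-, 180-, 270-degree rotations
        variants.append(np.rot90(patch, k))
    return variants

# Usage: each input patch yields six variants, expanding the dataset sixfold.
patch = np.random.default_rng(0).random((4, 4))
augmented = augment_patch(patch)
print(len(augmented))  # 6
```

Real pipelines typically also add translation, scaling, and intensity changes, and apply the transforms on the fly during training rather than materializing every variant.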
Ensemble learning involves aggregating the output decisions of multiple base models, which would be the pre-trained networks in this case, through relatively simple ensemble strategies to make the final predictions. The intuition behind the ensemble model is that each base model may have its limitations in feature extraction despite its good performance, and these limitations can be overcome through the strengths of the other base models. Hence, by combining multiple base models, the ensemble model has a wider coverage of extracted features, resulting in better performance. For instance, Ghosh et al. [37] proposed an ensemble model concatenating the results of DenseNet-121, InceptionResNetV2, Xception, and a custom CNN to classify 112,180 colorectal images, resized to 100 × 100 pixels, into multiple classes. Different weights were assigned to the results of each base model depending on their individual performance. The ensemble model ultimately achieved 99.13% balanced accuracy. In the work of Zheng et al. [38], a weighted voting strategy was used as the ensemble method to aggregate pre-trained networks including VGG-16, Xception, ResNet-50, and DenseNet-201 for breast cancer multiclass classification on 7909 images across four magnifications, achieving an accuracy of 98.90%. Paladini et al. [39] proposed a feature concatenation strategy to aggregate the feature outputs of pre-trained networks including ResNet-101, ResNeXt-50, Inception-V3, and DenseNet-161, then processed the aggregated feature vectors through fully connected and classification layers for colorectal image classification on a dataset of 150 × 150 pixel images, achieving an accuracy of 96.16%. The ensemble models' accuracies improved over their corresponding base models by 1.83–2.16% in [37], 0.1–5.25% in [38], and 0.74–2.18% in [39] in the respective tasks.
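The weighted-voting strategy described above can be sketched as a weighted average of the base models' class probabilities; the following is a minimal, self-contained illustration (not the cited works' code), with hypothetical model outputs and weights.

```python
import numpy as np

def weighted_vote(probabilities, weights):
    """Combine per-model class probabilities with a weighted average and
    take the argmax as the final decision (a simple weighted-voting ensemble).

    probabilities: shape (n_models, n_samples, n_classes)
    weights: one weight per model, e.g. proportional to validation accuracy
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                    # normalize the weights
    combined = np.tensordot(w, probabilities, axes=1)  # (n_samples, n_classes)
    return combined.argmax(axis=1)                     # final class per sample

# Usage: three hypothetical base models, two samples, two classes.
probs = np.array([
    [[0.6, 0.4], [0.3, 0.7]],   # model A
    [[0.2, 0.8], [0.4, 0.6]],   # model B
    [[0.7, 0.3], [0.1, 0.9]],   # model C
])
pred = weighted_vote(probs, weights=[0.5, 0.2, 0.3])
print(pred)
```

Here model A's high weight pulls the first sample toward its decision, while the other models dominate the second sample, illustrating how base models compensate for each other's weaknesses.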
A WSI can be as large as 100,000 × 100,000 pixels, making it costly and time-consuming to annotate in detail. A common method to process the WSI is to crop it into smaller patches for artificial intelligence training and classification. Resource-constrained centers usually downsize the WSI before cropping it into smaller patches. This comes at the cost of lower classification performance, because a smaller patch contains less information for classification purposes. Therefore, the selection of patch size demands consideration of the trade-off between computational power and classification performance. Given the promising performance of ensemble models, supported by their capability to extract many important features through multiple base models, ensemble models have the potential to extract sufficient important features from smaller patches while still achieving promising performance. This could have a significant impact by making lower-resolution WSIs amenable to correct classification by deep learning models, consequently reducing the digital scanner specifications, data storage, and high-performance computing required for histopathology tasks. This would translate to more efficient and autonomous histopathological diagnosis, lowering the likelihood of delayed cancer detection.
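The patch-cropping step described above can be sketched as simple non-overlapping tiling; this is an illustrative example under assumed conventions (square patches, edge remainders discarded), not a specific tool's implementation.

```python
import numpy as np

def crop_patches(wsi, patch_size, stride=None):
    """Tile a whole slide image array of shape (H, W, C) into square patches.
    With stride == patch_size the tiling is non-overlapping; edge regions
    smaller than patch_size are discarded, a common simplification."""
    if stride is None:
        stride = patch_size
    h, w = wsi.shape[:2]
    patches = []
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            patches.append(wsi[y:y + patch_size, x:x + patch_size])
    return patches

# Usage: a toy 512x512 "slide" cropped into 128x128 patches yields 16 patches;
# real WSIs are orders of magnitude larger and are usually tiled lazily.
wsi = np.zeros((512, 512, 3), dtype=np.uint8)
patches = crop_patches(wsi, patch_size=128)
print(len(patches))  # 16
```

The patch size chosen here directly controls the trade-off discussed above: smaller patches reduce memory per sample but carry less context for the classifier.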