Artificial Intelligence Advancements in Cervical Cancer: History

Artificial intelligence has yielded remarkably promising results in several medical fields, namely those with a strong imaging component. Gynecology relies heavily on imaging since it offers useful visual data on the female reproductive system, leading to a deeper understanding of pathophysiological concepts. The applicability of artificial intelligence technologies has not been as noticeable in gynecologic imaging as in other medical fields so far. However, due to growing interest in this area, some studies have been performed with exciting results. From urogynecology to oncology, artificial intelligence algorithms, particularly machine learning and deep learning, have shown huge potential to revolutionize the overall healthcare experience for women’s reproductive health.

  • artificial intelligence
  • gynecology
  • deep learning
  • machine learning
  • cervical cancer

2.1. Artificial Intelligence Advancements in Cervical Cancer

Cervical cancer is highly prevalent, with a cumulative worldwide incidence of 13.3 cases per 100,000 women-years, a figure that is higher in low-income countries [24]. It is associated with a mortality rate of 7.2 deaths per 100,000 women-years [24]. Importantly, cervical cancer can be treated effectively if detected at an early stage [25]. In daily practice, cervical cancer screening is based on human papillomavirus (HPV) testing and cytological examination. Cytology, however, depends heavily on the pathologist’s experience, which limits its accuracy and introduces high interobserver variability. Colposcopy is also a critical component of cervical cancer detection, but the increased workload of visual screening leads to misdiagnosis and low diagnostic accuracy [26]. Several authors have advocated the potential of AI-powered cytological examination and colposcopy image analysis to identify abnormal cells or lesions, thus strengthening cervical cancer screening and diagnostics [27]. A see-and-treat approach allows earlier, effective treatment of lesions using minimally invasive procedures, such as thermocoagulation, reducing malignancy and associated mortality [26] while avoiding unnecessary biopsies. Table 1 summarizes the most recent evidence on AI models in colposcopy.
Mehlhorn and colleagues were the first to study the implementation of an AI model in cervical cancer diagnosis, specifically during colposcopy exams. In 2012, the group developed a computer-assisted diagnostic (CAD) system based on image-processing methods to automatically analyze colposcopy images. The CAD system achieved a diagnostic accuracy of 80%, with a sensitivity of 85% and a specificity of 75%, in differentiating normal findings or cervical intraepithelial neoplasia grade 1 (CIN1) from high-grade squamous intraepithelial lesions (HSILs) (CIN2 or CIN3) in colposcopy exams [28]. A second study by the same group confirmed the benefit of applying the CAD system during the evaluation of colposcopy exams, demonstrating an increase in diagnostic accuracy when the exam was evaluated by a less-experienced gynecologist [29]. A Greek group developed and trained a clinical decision support system (CDSS) based on an artificial neural network to triage 740 women before referral to colposcopy, using the cytological diagnosis and the expression of various biomarkers [30]. Women with detected cervical intraepithelial neoplasia grade 2 or worse (CIN2+) were selected to undergo colposcopy. The CDSS presented a sensitivity of 89.4%, a specificity of 97.1%, a positive predictive value of 89.4%, and a negative predictive value of 97.1%. If applied in clinical practice, this system has the potential to reduce the referral rate for colposcopy.
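For context, the sensitivity, specificity, and predictive values quoted throughout this section are all derived from a binary confusion matrix. The sketch below is illustrative only: `diagnostic_metrics` is a hypothetical helper and the counts are invented, not data from any cited study.

```python
# Illustrative only: how sensitivity, specificity, PPV, and NPV are
# derived from a binary confusion matrix. The counts below are made up
# and are NOT the data from the CDSS study cited above.
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, PPV, NPV) as fractions."""
    sensitivity = tp / (tp + fn)   # true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    return sensitivity, specificity, ppv, npv

# Hypothetical screening result: 85 true positives, 10 false positives,
# 340 true negatives, 10 false negatives.
sn, sp, ppv, npv = diagnostic_metrics(tp=85, fp=10, tn=340, fn=10)
print(f"Sn={sn:.1%} Sp={sp:.1%} PPV={ppv:.1%} NPV={npv:.1%}")
```

Note that PPV and NPV, unlike sensitivity and specificity, depend on disease prevalence in the screened population, which is why triage systems are usually compared on all four numbers.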
Sato et al. were the first to develop a preliminary DL model, based on a Keras neural network, using 485 images from 158 individuals who underwent colposcopy [31]. The CNN classified colposcopy images to predict the post-procedure diagnosis, with patients assigned to three groups: severe dysplasia, carcinoma in situ (CIS), and invasive cancer (IC). Rather than evaluating the performance of the AI-based model itself, the authors aimed to establish its feasibility and usefulness in clinical practice as a quick and efficient way to obtain an accurate preoperative diagnosis that could support doctors in the decision-making process. The model reached 50% accuracy on this dataset.
Asiedu et al. extracted color- and texture-based features from visual inspection with acetic acid and with Lugol’s iodine, and then used the data to train a support vector machine (SVM) model to distinguish cervical intraepithelial neoplasia (CIN) from normal and benign tissue [32]. The proposed framework achieved a sensitivity, specificity, and accuracy of 81.3%, 78.6%, and 80.0%, respectively, outperforming expert physicians on the same dataset. In the same year, Miyagi et al. developed a CNN to classify cervical squamous intraepithelial lesions from colposcopy images of 330 patients, 97 with low-grade squamous intraepithelial lesions (LSILs) and 213 with HSILs, who underwent colposcopy and lesion biopsy [33]. The CNN differentiated HSILs from LSILs with higher accuracy (82.3% vs. 79.7%) and specificity (88.2% vs. 77.3%), although with slightly lower sensitivity (80.0% vs. 83.1%). A study by the same group in 2020 added the results of HPV testing [34]. The trained CNN revealed an accuracy of 94.1%, higher than the gynecologists’ global accuracy of 84.3%. This study was one of the first to include additional variables to increase the diagnostic accuracy of a CNN.
In 2020, Yuan and colleagues worked on a database of 22,330 cases, including 10,365 normal cases, 6357 LSIL cases, and 5608 HSIL cases [35]. Using three frames per case, they developed a ResNet CNN to differentiate normal images from dysplastic lesions (LSILs or HSILs). The CNN revealed 85% sensitivity, 82% specificity, and 93% accuracy. They also created a U-Net CNN capable of delineating squamous lesions (LSILs or HSILs) in acetic acid and Lugol’s iodine images. The model had 84.7% sensitivity in acetic acid images and 61.6% in Lugol’s iodine images. Such lesion delineation models are of utmost importance for guiding colposcopy-based biopsies. Finally, the group developed a Mask R-CNN model to detect HSILs. The model detected HSILs with 84.7% sensitivity in both acetic acid and iodine images, accurately identifying lesions that would benefit from treatment.
A Chinese group carried out a study to develop and validate a Colposcopic Artificial Intelligence Auxiliary Diagnostic System (CAIADS) using digital records of 19,435 patients, including colposcopy images and pathological results, which was considered the gold standard [36]. Agreement between CAIADS-graded colposcopy and pathology findings was higher than in expert-interpreted colposcopy (82.2% vs. 65.9%). The CAIADS model was able to increase its diagnostic accuracy after considering patients’ related factors (such as previous cytology results). The new model also revealed a superior ability to predict biopsy sites, with a median mean-intersection-over-union (mIoU) of 0.758.
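The mIoU reported for biopsy-site prediction averages per-image intersection-over-union scores, the standard overlap metric for segmentation. A minimal illustration, using toy numpy masks rather than colposcopy data:

```python
# Illustrative sketch of intersection-over-union (IoU), the overlap
# metric underlying the median mIoU of 0.758 reported for CAIADS
# biopsy-site prediction. Masks here are toy numpy arrays.
import numpy as np

def iou(pred_mask: np.ndarray, true_mask: np.ndarray) -> float:
    """IoU between two boolean masks: |intersection| / |union|."""
    inter = np.logical_and(pred_mask, true_mask).sum()
    union = np.logical_or(pred_mask, true_mask).sum()
    return float(inter / union) if union else 1.0

# Toy example: the prediction covers 3 of the 4 annotated pixels and
# adds 1 spurious pixel, so IoU = 3 / 5 = 0.6.
pred = np.array([[1, 1, 0], [1, 1, 0]], dtype=bool)
true = np.array([[0, 1, 1], [1, 1, 0]], dtype=bool)
print(iou(pred, true))
```

An IoU of 1.0 means the predicted and annotated lesion regions coincide exactly; values above roughly 0.5 are commonly taken to indicate a usable localization.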
In 2021, Fu et al. set out to create a model incorporating the results of HPV typing, cytological examination, and colposcopy analysis [37]. First, they acquired colposcopy images and created a multiple-image-based DL model using multivariable logistic regression (MLR), presenting an area under the curve (AUC) of 0.845. Then, the results of the cytology and HPV tests were used to build an ML model, with an AUC of 0.837. Finally, they built a cross-modal integrated model using ML, by combining the multiple-image-based DL model and the cytology–HPV joint diagnostic model. The authors demonstrated the synergistic benefit of the ensemble model, which presented a higher AUC of 0.921. A ShuffleNet-based cervical precancerous lesion classification method based on colposcopy images was developed by Fang and colleagues [38]. The image dataset was classified into five categories, namely normal, cervical cancer, LSILs (CIN1), HSILs (CIN2/CIN3), and cervical neoplasm. The colposcopy images were augmented to reduce the impact of the uneven distribution across lesion categories. Additionally, the ShuffleNet network was compared with other CNNs (such as ResNet and DenseNet). The new CNN model presented a global accuracy of 81.23%, with an AUC of 0.99. A recent study by Chen et al. collected images from 6002 colposcopy examinations of normal cervixes and those with LSILs and HSILs [39]. A new model based on EfficientNet-B0 with a gated recurrent unit (GRU) was developed to accurately identify HSILs. The CNN revealed a sensitivity of 93.6%, a specificity of 87.6%, and an accuracy of 90.6% in distinguishing between HSILs, LSILs, and normal-cervix images.
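The cross-modal integration described by Fu et al. can be thought of as late fusion: the probability outputs of the single-modality models become inputs to a small combiner. The sketch below is a simplified, hypothetical version of that idea, with synthetic scores and a hand-rolled logistic combiner; it is not the authors' implementation.

```python
# Hedged sketch of "late fusion": combining the probability output of an
# image model with that of a cytology/HPV model via a logistic combiner,
# in the spirit of (not the actual implementation of) a cross-modal
# ensemble. All scores and labels below are synthetic.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_fusion(img_scores, lab_scores, labels, lr=0.5, epochs=2000):
    """Fit sigmoid(w1*img + w2*lab + b) by gradient descent on log-loss."""
    w1 = w2 = b = 0.0
    n = len(labels)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for x1, x2, y in zip(img_scores, lab_scores, labels):
            err = sigmoid(w1 * x1 + w2 * x2 + b) - y
            g1 += err * x1
            g2 += err * x2
            gb += err
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b

# Synthetic per-patient scores from the two single-modality models
# (1 = biopsy-confirmed lesion, 0 = normal).
img = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
lab = [0.8, 0.6, 0.9, 0.4, 0.1, 0.3]
y   = [1, 1, 1, 0, 0, 0]
w1, w2, b = fit_fusion(img, lab, y)
fused = [sigmoid(w1 * a + w2 * c + b) for a, c in zip(img, lab)]
```

The fused score can beat either input when the two modalities make partly independent errors, which is the synergy the AUC improvement (0.845/0.837 to 0.921) reflects.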
Additionally, the diagnosis of cervical cancer can also be guided by magnetic resonance imaging (MRI). Urushibara et al. designed a study including 418 patients who underwent MRI between 2013 and 2020: 177 with pathologically confirmed cervical cancer and 241 without cancer [40]. They compared the performance of a DL architecture called Xception with that of experienced radiologists in diagnosing cervical cancer on sagittal T2-weighted images. The CNN presented higher sensitivity (88.3% vs. 78.3–86.7%) and accuracy (90.8% vs. 86.7–89.2%), with similar specificity.
AI models for cervical cancer diagnosis can also be developed at the cytological level. In 2019, Sompawong and colleagues applied a Mask Regional Convolutional Neural Network (Mask R-CNN) to analyze cervical cells from liquid-based cytology slides, screening for abnormal nuclear features [41]. The proposed algorithm achieved an accuracy, sensitivity, and specificity of 91.7% each. In the same year, a group of Indian pathologists trained a CNN to identify abnormal features in liquid-based cytology (LBC) smears, using 2816 images: 816 presenting abnormal features indicating LSILs or HSILs, and 2000 normal images containing benign epithelial cells and reactive changes [42]. The model yielded a sensitivity of 95.6% with a specificity of 79.8%. In addition, its high negative predictive value of 99.1% makes it a potentially valuable tool for cervical cancer screening. This technological development was accompanied by a multicenter observational study that evaluated the performance of AI-assisted cytology for the detection of CIN or cancer [43]. The group used 188,542 digital cytological images to train a supervised DL algorithm. The DL model detected 92.6% of CIN 2 and 96.1% of CIN 3 cases, showing equivalent sensitivity but higher specificity compared with skilled senior cytologists.
Indeed, a validated AI-assisted cytology system, called Landing CytoScanner®, was evaluated in a cohort study including 0.7 million women [44]. Women with abnormal results in both AI-assisted and manual readings were diagnosed using colposcopy and biopsy. The outcome was histologically confirmed CIN of grade 2 or worse (CIN2+). The agreement rate between AI and manual readings was 94.7%, with a kappa value of 0.92. The large number of images analyzed contributed to the robustness of this experiment. Given the system’s ability to exclude most normal cytology, with increased sensitivity compared with manual cytology readings, the results support AI-based cytology for primary screening of cervical cancer in large-scale populations. More recently, a Chinese group studied the diagnostic performance of artificial intelligence-enabled liquid-based cytology (AI-LBC) in triaging women with HPV [45]. AI-LBC achieved sensitivity for the detection of CIN2+ comparable to that of experienced cytologists (86.49% vs. 83.78%), but with significantly higher specificity (51.33% vs. 40.93%). Similar results were observed for CIN3+. Moreover, AI-LBC reduced colposcopy referrals by 10% compared with cytologists, making the process more efficient by reducing the number of false positives in the cytological evaluation. Despite these positive results, prospective studies are needed to test the triaging performance of the model.
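The kappa value cited above is Cohen's kappa, which corrects raw agreement between two readers for the agreement expected by chance. A short illustration with invented 2x2 counts, not the study's data:

```python
# Illustrative computation of Cohen's kappa, the chance-corrected
# agreement statistic behind the reported kappa of 0.92 between AI and
# manual cytology readings. The 2x2 counts below are invented.
def cohens_kappa(a_pos_b_pos, a_pos_b_neg, a_neg_b_pos, a_neg_b_neg):
    n = a_pos_b_pos + a_pos_b_neg + a_neg_b_pos + a_neg_b_neg
    observed = (a_pos_b_pos + a_neg_b_neg) / n
    # Expected agreement if the two readers were independent.
    a_pos = (a_pos_b_pos + a_pos_b_neg) / n
    b_pos = (a_pos_b_pos + a_neg_b_pos) / n
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)

# Hypothetical counts: both readers abnormal 90, AI-only 5,
# manual-only 5, both normal 900.
print(round(cohens_kappa(90, 5, 5, 900), 3))
```

Because most screening slides are normal, raw agreement is inflated by chance; kappa values above roughly 0.8 are conventionally read as almost perfect agreement.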
To increase the diagnostic accuracy for cervical lesions, new imaging methods have been evaluated. High-resolution endomicroscopy (HRME) consists of a fiber-optic fluorescence microscope capable of acquiring nuclear images in vivo. In 2022, Brenes et al. used a dataset of images from over 1600 patients to train, validate, and test a CNN algorithm to diagnose CIN2+ cases from HRME images [46]. The proposed method consistently outperformed the current gold-standard methods, achieving an accuracy of 87%, with a sensitivity of 94% and a specificity of 58%. By incorporating HPV status, specificity increased to 71%.
Finally, AI models can also provide prognostic information, guiding therapeutic decisions. In 2019, Matsuo et al. compared the performance of a DL model with four survival-analysis models, including the Cox proportional hazards regression model, the mainstay of survival analysis in oncologic research, in predicting survival in women with cervical cancer [47]. The study included 768 women, with a median follow-up time of 40.2 months. The new model exhibited superior performance in predicting overall survival, but similar results in predicting progression-free survival. The prognostic value of DL algorithms was also evaluated in a retrospective study of 157 women who developed recurrent cervical cancer among 431 women with cervical cancer diagnosed between January 2008 and December 2014 [48]. Predictions of 3- and 6-month survival after recurrence were compared between the current approach (a linear regression model) and the experimental approach (a DL neural network model). The DL model, whose inputs included clinical and laboratory parameters, achieved significantly better predictions for 3-month (AUC 0.747 vs. 0.652) and 6-month (AUC 0.724 vs. 0.685) survival. Better prediction of limited life expectancy in women with recurrent cervical cancer paves the way for more personalized clinical decisions, helping clinicians individually adjust the level of care provided.
Table 1. Summary of studies on AI implementation in colposcopy. Sn, sensitivity; Sp, specificity; AUC, area under the curve; CIN, cervical intraepithelial neoplasia; HSIL, high-grade squamous intraepithelial lesion; LSIL, low-grade squamous intraepithelial lesion; N, normal; AF, acid-free; VIA, visual inspection with acetic acid; VILI, visual inspection with Lugol’s iodine; NK, not known.
| Author, Year, Country [Ref] | Study Aim | Patients (n) | Frames (n) | Pathologic Confirmation | AI Method | Dataset Method | Analysis Method | Categories | Sn (%) | Sp (%) | AUC (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mehlhorn, 2012, Germany [28] | Detection of CIN 2/3 lesions | 198 | 375 VIA frames (normal: 39; CIN 1: 41; CIN 2: 99; CIN 3: 19) | Yes | Color texture analysis methods | Frame annotation in VIA (normal vs. CIN 1 vs. CIN 2–3) | n-fold cross-validation | HSIL (CIN 2 or CIN 3) | 85 | 75 | 80 |
| Asiedu, 2019, USA [32] | Differentiating normal vs. abnormal (CIN+) | 134 | Not known (only number of patients per category) | Yes | SVM | Frame annotation in VIA and VILI (VILI/VIA positive vs. negative) | 5-fold cross-validation (80–20%) | Abnormal (LSIL or HSIL) | 81 | 79 | 80 |
| Miyagi, 2019, Japan [33] | Differentiating LSIL vs. HSIL | 330 | 1 frame per colposcopy (VIA); LSIL: 97; HSIL: 213 | Yes | ResNet | Frame labeling in acid-free images (LSIL vs. HSIL) | 5-fold cross-validation | LSIL vs. HSIL | 80 | 88 | 83 |
| Yuan, 2020, China [35] | Differentiating normal vs. abnormal (LSIL+) | 22,330 | 3 frames per colposcopy (AF, VIA, and VILI); normal: 10,365 × 3; LSIL: 6357 × 3; HSIL: 5608 × 3 | Yes | ResNet | Frame annotation in acid-free, VIA, and VILI images (normal vs. LSIL vs. HSIL) | Train–test validation (80–10–10%) | Abnormal (LSIL or HSIL) | 85 | 82 | 93 |
| Yuan, 2020, China [35] | Predicting the area of lesion (LSIL+) | 11,198 | 11,198 VIA frames + 11,198 VILI frames (normal: NK; LSIL: NK; HSIL: NK) | Yes | U-Net | | | VIA | 85 | NK | NK |
| | | | | | | | | VILI | 62 | NK | NK |
| Yuan, 2020, China [35] | Detection of HSIL | 11,198 | | Yes | Mask R-CNN | | | VIA | 85 | NK | NK |
| | | | | | | | | VILI | 85 | NK | NK |
| Xue, 2020, China [36] | Differentiating normal vs. LSIL vs. HSIL vs. cancer | 19,435 | 101,7267 acid-free frames (normal: NK; LSIL: NK; HSIL: NK; cancer: NK) | Yes | U-Net + YOLO | Frame annotation in acid-free images (normal vs. LSIL vs. HSIL vs. cancer) | Train–test validation (70–10–20%) | LSIL+ | 87 | 49 | 69 |
| | | | | | | | | HSIL+ | 66 | 90 | 78 |
| Chen, 2022, China [39] | Differentiating LSIL vs. HSIL | 6002 | 18,006 frames (AF, VIA, and VILI) | Yes | EfficientNet-B0 with GRU | Frame labeling in acid-free, VIA, and VILI images (LSIL vs. HSIL) | Train–test validation (60–20–20%) | LSIL vs. HSIL | 88 | 94 | 91 |
| Fang, 2022, China [38] | Differentiating normal vs. cervical cancer vs. LSIL vs. HSIL vs. cervical neoplasm | 1189 | 6996 acid-free frames (normal: 2352; LSIL: 780; HSIL: 2532; cervical cancer: 408; cervical neoplasm: 924) | Not mentioned | ShuffleNet | Frame labeling in acid-free images (normal vs. LSIL vs. HSIL vs. cervical cancer vs. cervical neoplasm) + data augmentation | Train–test (90–10%) | N vs. all | 90 | NK | NK |
| | | | | | | | | LSIL vs. all | 86 | NK | NK |
| | | | | | | | | HSIL vs. all | 82 | NK | NK |
| | | | | | | | | Cervical neoplasm | NK | NK | NK |
| | | | | | | | | Cervical cancer | NK | NK | NK |

This entry is adapted from the peer-reviewed paper 10.3390/jcm13041061
