Artificial Intelligence Advancements in Cervical Cancer: History

Artificial intelligence has yielded remarkably promising results in several medical fields, namely those with a strong imaging component. Gynecology relies heavily on imaging since it offers useful visual data on the female reproductive system, leading to a deeper understanding of pathophysiological concepts. The applicability of artificial intelligence technologies has not been as noticeable in gynecologic imaging as in other medical fields so far. However, due to growing interest in this area, some studies have been performed with exciting results. From urogynecology to oncology, artificial intelligence algorithms, particularly machine learning and deep learning, have shown huge potential to revolutionize the overall healthcare experience for women’s reproductive health.

  • artificial intelligence
  • gynecology
  • deep learning
  • machine learning
  • cervical cancer

2.1. Artificial Intelligence Advancements in Cervical Cancer

Cervical cancer is highly prevalent, with a cumulative worldwide incidence of 13.3 cases per 100,000 women-years, a figure that is higher in low-income countries [24]. It is associated with a mortality rate of 7.2 deaths per 100,000 women-years [24]. Importantly, cervical cancer can be treated effectively if detected at an early stage [25]. In daily practice, cervical cancer screening is based on human papillomavirus (HPV) testing and cytological examination. Cytology, however, depends heavily on the pathologist’s experience, which limits its accuracy and introduces high interobserver variability. Colposcopy is also a critical component of cervical cancer detection, but the increased workload of visual screening leads to misdiagnosis and low diagnostic accuracy [26]. Several authors have advocated the potential of AI-powered cytological examination and colposcopy image analysis to identify abnormal cells or lesions, thus strengthening cervical cancer screening and diagnostics [27]. A see-and-treat approach allows earlier, effective treatment of lesions using minimally invasive procedures, such as thermocoagulation, reducing malignancy and associated mortality [26] while avoiding unnecessary biopsies. Table 1 summarizes the most recent evidence on AI models in colposcopy.
Mehlhorn and colleagues were the first to study the implementation of an AI model in cervical cancer diagnosis, specifically during colposcopy exams. In 2012, the group developed a computer-assisted diagnostic (CAD) system based on image-processing methods to automatically analyze colposcopy images. The CAD system achieved a diagnostic accuracy of 80%, with a sensitivity of 85% and a specificity of 75%, in differentiating normal findings or cervical intraepithelial neoplasia grade 1 (CIN1) from high-grade squamous intraepithelial lesions (HSILs) (CIN2 or CIN3) in colposcopy exams [28]. A second study by the same group confirmed the benefit of applying the CAD system during the evaluation of colposcopy exams, demonstrating an increase in diagnostic accuracy when the exam was evaluated by a less-experienced gynecologist [29]. A Greek group developed and trained a clinical decision support system (CDSS) based on an artificial neural network to triage 740 women before referral to colposcopy, using the cytological diagnosis and the expression of various biomarkers [30]. Women with detected cervical intraepithelial neoplasia grade 2 or worse (CIN2+) were selected to undergo colposcopy. The CDSS presented a sensitivity of 89.4%, a specificity of 97.1%, a positive predictive value of 89.4%, and a negative predictive value of 97.1%. If applied in clinical practice, this system has the potential to reduce the referral rate for colposcopy.
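For context, the sensitivity, specificity, and predictive values quoted throughout this section are all derived from a binary confusion matrix. The sketch below is illustrative only: `diagnostic_metrics` is a hypothetical helper and the counts are invented, not data from any cited study.

```python
# Illustrative only: how sensitivity, specificity, PPV, and NPV are
# derived from a binary confusion matrix. The counts below are made up
# and are NOT the data from the CDSS study cited above.
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, PPV, NPV) as fractions."""
    sensitivity = tp / (tp + fn)   # true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    return sensitivity, specificity, ppv, npv

# Hypothetical screening result: 85 true positives, 10 false positives,
# 340 true negatives, 10 false negatives.
sn, sp, ppv, npv = diagnostic_metrics(tp=85, fp=10, tn=340, fn=10)
print(f"Sn={sn:.1%} Sp={sp:.1%} PPV={ppv:.1%} NPV={npv:.1%}")
```

Note that PPV and NPV, unlike sensitivity and specificity, depend on disease prevalence in the screened population, which is why triage systems are usually compared on all four numbers.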
Sato et al. were the first to develop a preliminary DL model, based on a Keras neural network, using 485 images from 158 individuals who underwent colposcopy [31]. The CNN classified colposcopy images to predict the post-procedure diagnosis, with patients assigned to three groups: severe dysplasia, carcinoma in situ (CIS), and invasive cancer (IC). Rather than evaluating the performance of the AI-based model itself, the authors aimed to establish its feasibility and usefulness in clinical practice as a quick and efficient way to obtain an accurate preoperative diagnosis that could support doctors in the decision-making process. The model reached 50% accuracy on this dataset.
Asiedu et al. extracted color- and texture-based features from visual inspection with acetic acid and with Lugol’s iodine, and then used the data to train a support vector machine (SVM) model to distinguish cervical intraepithelial neoplasia (CIN) from normal and benign tissue [32]. The proposed framework achieved a sensitivity, specificity, and accuracy of 81.3%, 78.6%, and 80.0%, respectively, outperforming expert physicians on the same dataset. In the same year, Miyagi et al. developed a CNN to classify cervical squamous intraepithelial lesions from colposcopy images of 330 patients, 97 with low-grade squamous intraepithelial lesions (LSILs) and 213 with HSILs, who underwent colposcopy and lesion biopsy [33]. The CNN differentiated HSILs from LSILs with higher accuracy (82.3% vs. 79.7%) and specificity (88.2% vs. 77.3%), although with slightly lower sensitivity (80.0% vs. 83.1%). A study by the same group in 2020 added the results of HPV testing [34]. The trained CNN revealed an accuracy of 94.1%, higher than the gynecologists’ global accuracy of 84.3%. This study was one of the first to include additional variables to increase the diagnostic accuracy of a CNN.
In 2020, Yuan and colleagues worked on a database of 22,330 cases, including 10,365 normal cases, 6357 LSIL cases, and 5608 HSIL cases [35]. Using three frames per case, they developed a ResNet CNN to differentiate normal images from dysplastic lesions (LSILs or HSILs). The CNN revealed 85% sensitivity, 82% specificity, and 93% accuracy. They also created a U-Net CNN capable of delineating squamous lesions (LSILs or HSILs) in acetic acid and Lugol’s iodine images. The model had 84.7% sensitivity in acetic acid images and 61.6% in Lugol’s iodine images. Such lesion delineation models are of utmost importance for guiding colposcopy-based biopsies. Finally, the group developed a Mask R-CNN model to detect HSILs. The model detected HSILs with 84.7% sensitivity in both acetic acid and iodine images, accurately identifying lesions that would benefit from treatment.
A Chinese group carried out a study to develop and validate a Colposcopic Artificial Intelligence Auxiliary Diagnostic System (CAIADS) using digital records of 19,435 patients, including colposcopy images and pathological results, which was considered the gold standard [36]. Agreement between CAIADS-graded colposcopy and pathology findings was higher than in expert-interpreted colposcopy (82.2% vs. 65.9%). The CAIADS model was able to increase its diagnostic accuracy after considering patients’ related factors (such as previous cytology results). The new model also revealed a superior ability to predict biopsy sites, with a median mean-intersection-over-union (mIoU) of 0.758.
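The mIoU reported for biopsy-site prediction averages per-image intersection-over-union scores, the standard overlap metric for segmentation. A minimal illustration, using toy numpy masks rather than colposcopy data:

```python
# Illustrative sketch of intersection-over-union (IoU), the overlap
# metric underlying the median mIoU of 0.758 reported for CAIADS
# biopsy-site prediction. Masks here are toy numpy arrays.
import numpy as np

def iou(pred_mask: np.ndarray, true_mask: np.ndarray) -> float:
    """IoU between two boolean masks: |intersection| / |union|."""
    inter = np.logical_and(pred_mask, true_mask).sum()
    union = np.logical_or(pred_mask, true_mask).sum()
    return float(inter / union) if union else 1.0

# Toy example: the prediction covers 3 of the 4 annotated pixels and
# adds 1 spurious pixel, so IoU = 3 / 5 = 0.6.
pred = np.array([[1, 1, 0], [1, 1, 0]], dtype=bool)
true = np.array([[0, 1, 1], [1, 1, 0]], dtype=bool)
print(iou(pred, true))
```

An IoU of 1.0 means the predicted and annotated lesion regions coincide exactly; values above roughly 0.5 are commonly taken to indicate a usable localization.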
In 2021, Fu et al. set out to create a model incorporating the results of HPV typing, cytological examination, and colposcopy analysis [37]. First, they acquired colposcopy images and created a multiple-image-based DL model using multivariable logistic regression (MLR), presenting an area under the curve (AUC) of 0.845. Then, the results of the cytology and HPV tests were used to build an ML model, with an AUC of 0.837. Finally, they built a cross-modal integrated model using ML, by combining the multiple-image-based DL model and the cytology–HPV joint diagnostic model. The authors demonstrated the synergistic benefit of the ensemble model, which presented a higher AUC of 0.921. A ShuffleNet-based cervical precancerous lesion classification method based on colposcopy images was developed by Fang and colleagues [38]. The image dataset was classified into five categories, namely normal, cervical cancer, LSILs (CIN1), HSILs (CIN2/CIN3), and cervical neoplasm. The colposcopy images were augmented to reduce the impact of the uneven distribution across lesion categories. Additionally, the ShuffleNet network was compared with other CNNs (such as ResNet and DenseNet). The new CNN model presented a global accuracy of 81.23%, with an AUC of 0.99. A recent study by Chen et al. collected images from 6002 colposcopy examinations of normal cervixes and those with LSILs and HSILs [39]. A new model based on EfficientNet-B0 with a gated recurrent unit (GRU) was developed to accurately identify HSILs. The CNN revealed a sensitivity of 93.6%, a specificity of 87.6%, and an accuracy of 90.6% in distinguishing between HSILs, LSILs, and normal-cervix images.
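The cross-modal integration described by Fu et al. can be thought of as late fusion: the probability outputs of the single-modality models become inputs to a small combiner. The sketch below is a simplified, hypothetical version of that idea, with synthetic scores and a hand-rolled logistic combiner; it is not the authors' implementation.

```python
# Hedged sketch of "late fusion": combining the probability output of an
# image model with that of a cytology/HPV model via a logistic combiner,
# in the spirit of (not the actual implementation of) a cross-modal
# ensemble. All scores and labels below are synthetic.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_fusion(img_scores, lab_scores, labels, lr=0.5, epochs=2000):
    """Fit sigmoid(w1*img + w2*lab + b) by gradient descent on log-loss."""
    w1 = w2 = b = 0.0
    n = len(labels)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for x1, x2, y in zip(img_scores, lab_scores, labels):
            err = sigmoid(w1 * x1 + w2 * x2 + b) - y
            g1 += err * x1
            g2 += err * x2
            gb += err
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b

# Synthetic per-patient scores from the two single-modality models
# (1 = biopsy-confirmed lesion, 0 = normal).
img = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
lab = [0.8, 0.6, 0.9, 0.4, 0.1, 0.3]
y   = [1, 1, 1, 0, 0, 0]
w1, w2, b = fit_fusion(img, lab, y)
fused = [sigmoid(w1 * a + w2 * c + b) for a, c in zip(img, lab)]
```

The fused score can beat either input when the two modalities make partly independent errors, which is the synergy the AUC improvement (0.845/0.837 to 0.921) reflects.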
Additionally, the diagnosis of cervical cancer can also be guided by magnetic resonance imaging (MRI). Urushibara et al. designed a study including 418 patients who underwent MRI between 2013 and 2020: 177 with pathologically confirmed cervical cancer and 241 without cancer [40]. They compared the performance of a DL architecture called Xception with that of experienced radiologists in diagnosing cervical cancer on sagittal T2-weighted images. The CNN presented higher sensitivity (88.3% vs. 78.3–86.7%) and accuracy (90.8% vs. 86.7–89.2%), with similar specificity.
AI models for cervical cancer diagnosis can also be developed at the cytological level. In 2019, Sompawong and colleagues applied a Mask Regional Convolutional Neural Network (Mask R-CNN) to analyze cervical cells from liquid-based cytology slides, screening for abnormal nuclear features [41]. The proposed algorithm achieved an accuracy, sensitivity, and specificity of 91.7% each. In the same year, a group of Indian pathologists trained a CNN to identify abnormal features in liquid-based cytology (LBC) smears, using 2816 images: 816 presenting abnormal features indicating LSILs or HSILs, and 2000 normal images containing benign epithelial cells and reactive changes [42]. The model yielded a sensitivity of 95.6% with a specificity of 79.8%. In addition, its high negative predictive value of 99.1% makes it a potentially valuable tool for cervical cancer screening. This technological development was accompanied by a multicenter observational study that evaluated the performance of AI-assisted cytology for the detection of CIN or cancer [43]. The group used 188,542 digital cytological images to train a supervised DL algorithm. The DL model detected 92.6% of CIN 2 and 96.1% of CIN 3 cases, showing equivalent sensitivity but higher specificity compared with skilled senior cytologists.
Indeed, a validated AI-assisted cytology system, called Landing CytoScanner®, was evaluated in a cohort study including 0.7 million women [44]. Women with abnormal results in both AI-assisted and manual readings were diagnosed using colposcopy and biopsy. The outcome was histologically confirmed CIN of grade 2 or worse (CIN2+). The agreement rate between AI and manual readings was 94.7%, with a kappa value of 0.92. The large number of images analyzed contributed to the robustness of this experiment. Given the system’s ability to exclude most normal cytology, with increased sensitivity compared with manual cytology readings, the results support AI-based cytology for primary screening of cervical cancer in large-scale populations. More recently, a Chinese group studied the diagnostic performance of artificial intelligence-enabled liquid-based cytology (AI-LBC) in triaging women with HPV [45]. AI-LBC achieved sensitivity for the detection of CIN2+ comparable to that of experienced cytologists (86.49% vs. 83.78%), but with significantly higher specificity (51.33% vs. 40.93%). Similar results were observed for CIN3+. Moreover, AI-LBC reduced colposcopy referrals by 10% compared with cytologists, making the process more efficient by reducing the number of false positives in the cytological evaluation. Despite these positive results, prospective studies are needed to test the triaging performance of the model.
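The kappa value cited above is Cohen's kappa, which corrects raw agreement between two readers for the agreement expected by chance. A short illustration with invented 2x2 counts, not the study's data:

```python
# Illustrative computation of Cohen's kappa, the chance-corrected
# agreement statistic behind the reported kappa of 0.92 between AI and
# manual cytology readings. The 2x2 counts below are invented.
def cohens_kappa(a_pos_b_pos, a_pos_b_neg, a_neg_b_pos, a_neg_b_neg):
    n = a_pos_b_pos + a_pos_b_neg + a_neg_b_pos + a_neg_b_neg
    observed = (a_pos_b_pos + a_neg_b_neg) / n
    # Expected agreement if the two readers were independent.
    a_pos = (a_pos_b_pos + a_pos_b_neg) / n
    b_pos = (a_pos_b_pos + a_neg_b_pos) / n
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)

# Hypothetical counts: both readers abnormal 90, AI-only 5,
# manual-only 5, both normal 900.
print(round(cohens_kappa(90, 5, 5, 900), 3))
```

Because most screening slides are normal, raw agreement is inflated by chance; kappa values above roughly 0.8 are conventionally read as almost perfect agreement.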
To increase the diagnostic accuracy for cervical lesions, new imaging methods have been evaluated. High-resolution endomicroscopy (HRME) consists of a fiber-optic fluorescence microscope capable of acquiring nuclear images in vivo. In 2022, Brenes et al. used a dataset of images from over 1600 patients to train, validate, and test a CNN algorithm to diagnose CIN2+ cases from HRME images [46]. The proposed method consistently outperformed the current gold-standard methods, achieving an accuracy of 87%, with a sensitivity of 94% and a specificity of 58%. By incorporating HPV status, specificity increased to 71%.
Finally, AI models can also provide prognostic information, guiding therapeutic decisions. In 2019, Matsuo et al. compared the performance of a DL model with four survival-analysis models, including the Cox proportional hazards regression model, the mainstay of survival analysis in oncologic research, in predicting survival in women with cervical cancer [47]. The study included 768 women, with a median follow-up time of 40.2 months. The new model exhibited superior performance in predicting overall survival, but similar results in predicting progression-free survival. The prognostic value of DL algorithms was also evaluated in a retrospective study of 157 women who developed recurrent cervical cancer among 431 women with cervical cancer diagnosed between January 2008 and December 2014 [48]. Predictions of 3- and 6-month survival after recurrence were compared between the current approach (a linear regression model) and the experimental approach (a DL neural network model). The DL model, whose inputs included clinical and laboratory parameters, achieved significantly better predictions for 3-month (AUC 0.747 vs. 0.652) and 6-month (AUC 0.724 vs. 0.685) survival. Better prediction of limited life expectancy in women with recurrent cervical cancer paves the way for more personalized clinical decisions, helping clinicians individually adjust the level of care provided.
Table 1. Summary of studies on AI implementation in colposcopy. Sn, sensitivity; Sp, specificity; AUC, area under the curve; CIN, cervical intraepithelial neoplasia; HSIL, high-grade squamous intraepithelial lesion; LSIL, low-grade squamous intraepithelial lesion; N, normal; AF, acid-free; VIA, visual inspection with acetic acid; VILI, visual inspection with Lugol’s iodine; NK, not known.
| Author, Year, Country [Ref] | Study Aim | Patients (n) | Frames (n) | Pathologic Confirmation | AI Method | Dataset Method | Analysis Method | Categories | Sn (%) | Sp (%) | AUC (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mehlhorn, 2012, Germany [28] | Detection of CIN 2/3 lesions | 198 | 375 VIA frames (normal: 39; CIN 1: 41; CIN 2: 99; CIN 3: 19) | Yes | Color texture analysis methods | Frame annotation in VIA (normal vs. CIN 1 vs. CIN 2–3) | n-fold cross-validation | HSIL (CIN 2 or CIN 3) | 85 | 75 | 80 |
| Asiedu, 2019, USA [32] | Differentiating normal vs. abnormal (CIN+) | 134 | Not known (only number of patients per category) | Yes | SVM | Frame annotation in VIA and VILI (VILI/VIA positive vs. negative) | 5-fold cross-validation (80–20%) | Abnormal (LSIL or HSIL) | 81 | 79 | 80 |
| Miyagi, 2019, Japan [33] | Differentiating LSIL vs. HSIL | 330 | 1 frame per colposcopy (VIA); LSIL: 97; HSIL: 213 | Yes | ResNet | Frame labeling in acid-free images (LSIL vs. HSIL) | 5-fold cross-validation | LSIL vs. HSIL | 80 | 88 | 83 |
| Yuan, 2020, China [35] | Differentiating normal vs. abnormal (LSIL+) | 22,330 | 3 frames per colposcopy (AF, VIA, and VILI); normal: 10,365 × 3; LSIL: 6357 × 3; HSIL: 5608 × 3 | Yes | ResNet | Frame annotation in acid-free, VIA, and VILI images (normal vs. LSIL vs. HSIL) | Train–test validation (80–10–10%) | Abnormal (LSIL or HSIL) | 85 | 82 | 93 |
| Yuan, 2020, China [35] | Predicting the area of lesion (LSIL+) | 11,198 | 11,198 VIA frames + 11,198 VILI frames (normal: NK; LSIL: NK; HSIL: NK) | Yes | U-Net | | | VIA | 85 | NK | NK |
| | | | | | | | | VILI | 62 | NK | NK |
| Yuan, 2020, China [35] | Detection of HSIL | 11,198 | | Yes | Mask R-CNN | | | VIA | 85 | NK | NK |
| | | | | | | | | VILI | 85 | NK | NK |
| Xue, 2020, China [36] | Differentiating normal vs. LSIL vs. HSIL vs. cancer | 19,435 | 101,7267 acid-free frames (normal: NK; LSIL: NK; HSIL: NK; cancer: NK) | Yes | U-Net + YOLO | Frame annotation in acid-free images (normal vs. LSIL vs. HSIL vs. cancer) | Train–test validation (70–10–20%) | LSIL+ | 87 | 49 | 69 |
| | | | | | | | | HSIL+ | 66 | 90 | 78 |
| Chen, 2022, China [39] | Differentiating LSIL vs. HSIL | 6002 | 18,006 frames (AF, VIA, and VILI) | Yes | EfficientNet-B0 with GRU | Frame labeling in acid-free, VIA, and VILI images (LSIL vs. HSIL) | Train–test validation (60–20–20%) | LSIL vs. HSIL | 88 | 94 | 91 |
| Fang, 2022, China [38] | Differentiating normal vs. cervical cancer vs. LSIL vs. HSIL vs. cervical neoplasm | 1189 | 6996 acid-free frames (normal: 2352; LSIL: 780; HSIL: 2532; cervical cancer: 408; cervical neoplasm: 924) | Not mentioned | ShuffleNet | Frame labeling in acid-free images (normal vs. LSIL vs. HSIL vs. cervical cancer vs. cervical neoplasm) + data augmentation | Train–test (90–10%) | N vs. all | 90 | NK | NK |
| | | | | | | | | LSIL vs. all | 86 | NK | NK |
| | | | | | | | | HSIL vs. all | 82 | NK | NK |
| | | | | | | | | Cervical neoplasm | NK | NK | NK |
| | | | | | | | | Cervical cancer | NK | NK | NK |

This entry is adapted from the peer-reviewed paper 10.3390/jcm13041061
