1. Introduction
According to the Global Cancer Observatory, primary liver cancer was the third most frequent cause of cancer death and the sixth most commonly diagnosed cancer in 2020
[1]. Hepatocellular carcinoma (HCC) is the most common type of primary liver cancer, accounting for approximately 75–85% of cases
[2], representing a significant public health burden worldwide. The incidence of HCC is most often linked with chronic liver disease, and cirrhosis is the primary risk factor, with one-third of cirrhotic patients reported to develop liver cancer during their lifetime
[2]. The most common cause of chronic liver disease in Europe is the hepatitis C virus, followed by excessive alcohol intake
[3]. In addition, HCC shows a male predominance, with a male-to-female ratio of 2:1
[2].
HCC has been the main indication for transplantation in patients with oncologic disease. Together with non-alcoholic steatohepatitis (NASH) and non-alcoholic fatty liver disease (NAFLD), it is described as the fastest-rising indication for hepatic transplant
[4][7]. In theory, it is the optimal treatment option because it has a double role of eliminating the underlying liver disease while removing the lesion
[5][8]. However, the selection of transplant candidates that have developed HCC needs to be rigorous as there is a general organ shortage. The United Network for Organ Sharing (UNOS) has described a drop-out of 20% in patients awaiting transplantation
[6][9]. Therefore, extended donor criteria, such as older donors, fatty livers and donation after cardiac arrest, have been adopted to reduce these figures, with inevitably inferior post-transplant outcomes
[6][9]. These factors further stress the importance of patient selection and organ allocation to reduce mortality and improve post-transplantation survival.
The demand for precision medicine and personalized treatments, together with technological advances, has led to an increasing amount of research regarding the application of artificial intelligence (AI) to medical images. The term and technology are not new, as the first artificial neuron was described in 1943
[7][10]. Today, AI is a large field of study incorporating algorithms capable of solving tasks that normally require human intelligence. Machine learning (ML) is a subset of AI that involves extracting patterns from data without explicit programming
[8][11]. When the algorithm has a more complex structure with multiple components, or “layers”, the term deep learning (DL) is used; DL is a subset of ML
[8][11].
2. AI-Aided Evaluations in Candidates for LT with HCC
2.1. Detection
Detection involves applying a bounding box to the region of interest in the processed images (e.g., lesions, organs, etc.). It is often a preliminary step in more complex algorithms that combine detection, segmentation and classification.
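To make the bounding-box idea concrete, the sketch below shows one common way of scoring a predicted box against a reference box, the intersection over union (IoU). It is an illustration only: the `Box` layout, the helper names and the 0.5 threshold are assumptions and are not taken from any of the studies discussed here.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def box_area(box: Box) -> float:
    """Area of a box; inverted or empty boxes count as zero."""
    return max(0.0, box.x_max - box.x_min) * max(0.0, box.y_max - box.y_min)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, in the range [0, 1]."""
    overlap = Box(max(a.x_min, b.x_min), max(a.y_min, b.y_min),
                  min(a.x_max, b.x_max), min(a.y_max, b.y_max))
    inter = box_area(overlap)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def is_detected(reference: Box, predictions: list[Box], threshold: float = 0.5) -> bool:
    """A reference lesion counts as detected if any prediction overlaps it enough.

    The 0.5 threshold is an assumed default, not a value from the cited studies.
    """
    return any(iou(reference, p) >= threshold for p in predictions)
```

A lesion-level detection rate can then be obtained by applying `is_detected` to every annotated lesion and dividing the number of hits by the number of lesions.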
Major international guidelines recommend ultrasound (US) as the main imaging surveillance tool for patients with cirrhosis
[9][10][26,29]. US has a detection sensitivity for HCC of around 84%, but with a substantial drop for early-stage lesions, to almost 47%
[11][46]. This drop matters, as early-stage lesions have the highest likelihood of long-term cure with radical treatments
[5][8]. However, contrast-enhanced (CE) CT and MRI are not cost-effective for the general surveillance of HCC, except for patients awaiting transplants, according to the EASL guidelines
[9][26]. Of the two, CT is more widely used as it has lower costs, faster acquisition times and less susceptibility to motion artifacts. However, it uses ionizing radiation and offers lower soft tissue contrast
[12][47]. Although MRI has increased costs and acquisition times, it offers superior tissue contrast and can also use hepatospecific contrast agents, increasing sensitivity
[12][47].
An AI detection tool would be essential in monitoring patients on the transplant list, since the presence of HCC leads to the granting of MELD (Model for End-Stage Liver Disease) exception points
[13][48], which changes prioritisation on the transplant list and can permit earlier treatment. Early detection also affects survival: more patients will be identified with smaller lesions within the Milan criteria, which carry favourable 5-year survival rates of around 65–80%
[9][26]. Thus, an integrated AI tool for lesion detection could improve diagnostic accuracy and refine transplant patient stratification.
In US imaging, few applications are dedicated to focal liver lesions, mainly because the available datasets are limited and liver lesion characteristics often overlap
[14][49]. However, such a tool can aid those performing US examinations, especially in centres with limited experience. Tiyarattanachai et al.
[15][50] prospectively evaluated US images from 334 patients using a RetinaNet DL model, obtaining detection rates for focal liver lesions as high as 89.8%, surpassing that of clinicians. For HCC, the detection rate was 100%, but only 23 cases were included in the study. Lee et al.
[16][51] used a CNN to detect HCC in multiphase CECT imaging from 302 CT studies using all three phases (arterial, venous, and delayed) with a sensitivity of 93.88%. Using multiphase CECT (pre-contrast, arterial, venous, and delayed), Kim et al.
[17][52] trained and tested a DL model using data from 1320 patients with either cirrhosis or chronic hepatitis B to detect HCC. Sensitivity varied with lesion size: 33.3% for lesions <10 mm, 74.7% for those of 10–20 mm and 95.9% for lesions >20 mm, with an overall sensitivity of 84.8%. The most frequent cause of error was an atypical enhancement pattern. Kim et al.
[18][53] studied data from 549 patients with HCC who underwent MR imaging with gadoxetic acid (Gd-EOB-DTPA) to train and test a DL model for HCC detection. Using the hepatobiliary phase, the application had a sensitivity of 87%. Fabijańska et al.
[19][54] obtained a sensitivity of 90.8% for HCC detection in cirrhotic patients. The DL model integrated T1 dynamic acquisitions with extracellular contrast (non-contrast, arterial and late phase), but the dataset was small, with only nine patients. Integrating all three acquisitions proved superior to using any single acquisition alone. An example of how a detection algorithm works is pictured in
Figure 1, with bounding boxes (red) being applied to detected lesions.
Figure 1. Detection algorithm using ultrasound (A), MRI (B) and CT (C) imaging in the arterial phase.
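Several of the studies above report sensitivity separately for small and large lesions. The following sketch shows how such a size-stratified sensitivity could be computed from per-lesion results. The size bins follow the <10 mm, 10–20 mm and >20 mm groups quoted above; the function names and input format are hypothetical.

```python
from collections import defaultdict


def size_class(diameter_mm: float) -> str:
    """Bin a lesion by diameter into the <10, 10-20 and >20 mm groups."""
    if diameter_mm < 10:
        return "<10 mm"
    if diameter_mm <= 20:
        return "10-20 mm"
    return ">20 mm"


def sensitivity_by_size(lesions: list[tuple[float, bool]]) -> dict[str, float]:
    """Per-bin sensitivity = detected lesions / all reference lesions in the bin.

    Each item is (diameter in mm, whether the model detected the lesion).
    Bins without any reference lesion are omitted from the result.
    """
    hits: dict[str, int] = defaultdict(int)
    totals: dict[str, int] = defaultdict(int)
    for diameter_mm, detected in lesions:
        label = size_class(diameter_mm)
        totals[label] += 1
        hits[label] += int(detected)
    return {label: hits[label] / totals[label] for label in totals}
```

Reporting sensitivity per bin rather than as a single overall figure makes the weaker performance on sub-centimetre lesions visible, which is the clinically relevant group for early detection.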
2.2. Segmentation
Segmentation involves the labelling of pixels in an image to delineate with great precision a region of interest (e.g., lesions, viable tissue in tumour, organs, etc.). The gold standard is represented by manual segmentation done by radiologists. However, this is time-consuming and prone to inter-reader variability
[20][55]. Segmentation performance is most often evaluated using the Dice-Sørensen coefficient, which ranges from 0 to 1, with 1 meaning complete overlap.
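As a minimal illustration of the Dice-Sørensen coefficient, the sketch below computes it for two binary masks that have been flattened to lists of 0s and 1s. The function name and the convention for two empty masks are assumptions made here for the example.

```python
def dice_coefficient(prediction: list[int], reference: list[int]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(prediction) != len(reference):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, r in zip(prediction, reference) if p and r)
    total = sum(1 for p in prediction if p) + sum(1 for r in reference if r)
    # Two empty masks agree perfectly; this convention is a choice made here.
    return 2 * intersection / total if total else 1.0
```

For example, a predicted mask and a manual mask of three pixels each that share two pixels give a Dice score of 2 × 2 / (3 + 3) ≈ 0.67.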
Liver/liver lesion segmentation with CT represents the main interest regarding AI applications to hepatic imaging
[21] as it shows great promise to optimize this process and provide fast, standardized segmentations. Some of the first grand challenges for liver segmentation were organised during the Medical Image Computing and Computer Assisted Intervention Conference (MICCAI) in 2007
[22][56] and 2008
[23][57], where only conventional ML methods were used. A shift was seen during the Liver Tumor Segmentation Challenge (LITS) in 2017
[24][58], where most applications were based on DL. The difficulty lies in the variable liver and liver lesion density and shape, similar densities with surrounding organs like the spleen, gastrointestinal tract, and heart, and the presence of artefacts. Furthermore, anatomical variants are common imaging findings, like accessory fissures or lobes, elongated left liver lobe and Riedel lobe
[25][59]. In cirrhosis, the structure is even more heterogeneous and the contours are irregular, which makes segmentation even more difficult.
Hepatic segmentation is the preferred method for liver volumetry
[20][55]. As living-donor liver transplantation (LDLT) becomes more widespread, volumetric evaluation before surgery is mandatory, as inadequate graft volume is the main contraindication to LDLT
[26][60]. An inadequate graft size can cause small-for-size syndrome with functional insufficiency, which can be fatal
[20][55]. The minimum remnant liver volume in adult donors is 30% of total liver volume, provided there is no underlying liver dysfunction
[27][61]. For the recipient, the ratio of graft volume to standard liver volume, estimated from body surface area, should be over 40–50%
[20][26][55,60]. An example of whole liver segmentation (A) and right-hemiliver/left-hemiliver segmentation (B) is provided in
Figure 2.
Figure 2. Volumetric measurements before LDLT: (A) whole liver segmentation; (B) right hemiliver/left hemiliver segmentation.
Although there is great interest in developing DL models for hepatic segmentation, respecting the Couinaud
[28][62] functional liver segmentation according to vascular supply is mandatory for clinical applications. Tian et al.
[29][63] developed a DL method (GLC-UNet) to segment the liver according to Couinaud using 193 CT scans manually annotated by radiologists. The model obtained a Dice score of 92.46%. Wang et al.
[30][64] used a cascaded neural network (ARH-CNet) to segment the liver according to Couinaud from 193 CT scans manually annotated by radiologists. The model obtained a Dice score of 84%. Using MR imaging, Han et al.
[31][65] developed a 3D convolutional neural network on portal phase acquisitions from 744 scans. The average Dice score was 90.2%, and the dataset included cirrhotic patients. The authors also experimented with localizing lesions according to segments, reaching an accuracy of 93.4%.
Another factor that influences transplant outcome is the presence of steatosis in the donor liver, which may lead to graft dysfunction and biliary and vascular complications
[32][66]. The accepted cut-off for donor steatosis varies between 10 and 30%
[26][60]. The gold standard for steatosis diagnosis is biopsy, which evaluates only a tiny portion of the parenchyma and is subject to inter-pathologist variability
[33][67]. MRI proton density fat fraction (PDFF) has been shown to have a very good diagnostic performance for liver fat assessment and grading
[34][68], evaluating the whole liver structure. Thus, there is a need to develop AI models with more complex roles of both whole liver segmentation and fat quantification. Jimenez-Pastor et al.
[35][69] developed a DL method on 183 multi-echo chemical shift encoded (MECSE) MRI liver studies that segments the liver and quantifies fat and iron. The Dice score for segmentation was 93%, and the model showed a high correlation and low relative error compared to manual fat and iron quantifications. An example of a model for PDFF segmentation and quantification is presented in
Figure 3, with an analysis of the whole liver structure on multiple slices.
Figure 3. Example of automatic fat-fraction quantification on MRI PDFF acquisitions with whole liver segmentation (1) at two different levels (A,B).
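Once the liver has been segmented, a whole-organ fat estimate can be obtained by averaging the fat fraction over the voxels inside the mask. The sketch below assumes that separated water and fat signal values are already available per voxel and uses the standard definition PDFF = fat / (water + fat). The function names and the flattened-list input format are hypothetical, and the cited models may compute this differently.

```python
def pdff_percent(water: float, fat: float) -> float:
    """Proton density fat fraction of one voxel: fat / (water + fat), in percent."""
    total = water + fat
    return 100.0 * fat / total if total > 0 else 0.0


def mean_liver_pdff(water: list[float], fat: list[float], liver_mask: list[int]) -> float:
    """Mean PDFF (percent) over the voxels flagged as liver in a flattened mask."""
    if not (len(water) == len(fat) == len(liver_mask)):
        raise ValueError("water, fat and mask must have the same length")
    values = [pdff_percent(w, f) for w, f, m in zip(water, fat, liver_mask) if m]
    if not values:
        raise ValueError("liver mask is empty")
    return sum(values) / len(values)
```

Unlike a biopsy sample, this average reflects every segmented voxel, which is the advantage of imaging-based quantification noted above.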
HCC lesion segmentation with volumetric data extraction can also help to select patients eligible for transplant, with total tumour volume (TTV) used as an inclusion criterion at a threshold of 115 cm³
[36][38]. Bousabarah et al.
[37][70] analysed 174 patients with HCC scanned with MR imaging using a DL method (U-Net) to segment the liver and the lesions. The model used T1 postcontrast acquisitions (arterial, venous, and delayed) using extracellular agents and obtained a Dice score of 91% for liver segmentation and 68% for HCC segmentation. As volumetric assessment becomes automatic with an AI model, more precise and quantifiable inclusion criteria can be developed. For example, the LITS challenge
[24][58] included a tumor burden metric as part of the AI algorithm segmentation accuracy evaluation (calculated as voxels of tumour/voxels of the liver). An example of HCC segmentation and volumetric measurements using CECT is provided in
Figure 4.
Figure 4. HCC with volumetric measurements: (A) arterial phase; (B) delayed phase segmentation.
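The two volumetric quantities mentioned above, total tumour volume and tumour burden, can be derived directly from segmentation voxel counts. The sketch below is illustrative: the function names are hypothetical, the 115 cm³ threshold is the value quoted in the text, and the liver voxel count is assumed to include the tumour voxels.

```python
TTV_THRESHOLD_CM3 = 115.0  # threshold quoted in the text


def volume_cm3(voxel_count: int, spacing_mm: tuple[float, float, float]) -> float:
    """Volume in cm^3 from a voxel count and voxel spacing in mm."""
    dx, dy, dz = spacing_mm
    return voxel_count * dx * dy * dz / 1000.0


def total_tumour_volume_cm3(lesion_voxel_counts: list[int],
                            spacing_mm: tuple[float, float, float]) -> float:
    """Sum of the volumes of all segmented lesions."""
    return sum(volume_cm3(count, spacing_mm) for count in lesion_voxel_counts)


def within_ttv_threshold(ttv_cm3: float, threshold_cm3: float = TTV_THRESHOLD_CM3) -> bool:
    """True when the total tumour volume does not exceed the threshold."""
    return ttv_cm3 <= threshold_cm3


def tumour_burden(tumour_voxels: int, liver_voxels: int) -> float:
    """Tumour voxels divided by liver voxels (liver assumed to include tumour)."""
    if liver_voxels <= 0:
        raise ValueError("liver mask is empty")
    return tumour_voxels / liver_voxels
```

Because both quantities depend only on voxel counts and spacing, their accuracy is bounded by the segmentation quality, which is why the lower Dice scores reported for lesions matter for this use.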
Automatic segmentation algorithms can also be applied in transplant recipients to assess sarcopenia, characterised by the loss of skeletal muscle mass and function. Sarcopenia impacts survival in the liver transplant setting as it is an independent predictor of orthotopic liver transplantation outcome
[38][71], associated with higher mortality
[39][72]. The quantitative assessment of body composition (defined as the percentage of muscle, fat, bone, and water) is usually done at the level of the third lumbar vertebra
[39][72] by segmentation of muscle, adipose tissue, and bone. AI can reduce segmentation times by providing automatic measurements and more standardised assessment techniques. Most of the research regarding sarcopenia evaluation using AI uses DL methods
[40][73]. For example, Blanc-Durand et al.
[41][74] used a convolutional neural network and obtained a Dice score of 97% in a study done on 1025 CT scans.
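A segmented muscle mask on a single axial slice at the third lumbar vertebra can be turned into a quantitative measure as sketched below. The skeletal muscle index (muscle area divided by height squared) is a commonly used normalisation but is not described in the text above, so it is included here as an assumption, as are the function names.

```python
def muscle_area_cm2(muscle_pixel_count: int, pixel_spacing_mm: tuple[float, float]) -> float:
    """Cross-sectional muscle area in cm^2 from a pixel count (1 cm^2 = 100 mm^2)."""
    row_mm, col_mm = pixel_spacing_mm
    return muscle_pixel_count * row_mm * col_mm / 100.0


def skeletal_muscle_index(area_cm2: float, height_m: float) -> float:
    """Muscle area normalised by height squared, in cm^2/m^2 (assumed metric)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return area_cm2 / (height_m ** 2)
```

No cut-off values are included, since thresholds for sarcopenia differ by sex and population and are not given in the text.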
2.3. Classification
2.3.1. Microvascular Invasion
Microvascular invasion (MVI) is recognised as an essential factor for survival in patients with HCC after LT, and its presence doubles the risk of recurrence
[42][33]. It is defined as tumour present in a vessel lined by endothelium seen by microscopy
[43][75]. The ability to detect this feature before a transplant would allow for better transplant list stratification and risk assessment.
Chen et al.
[44][76] studied 415 patients with small HCC (<3 cm) from three independent institutions. They developed a radiomics signature from DCE MRI with an intracellular agent (Gd-EOB-DTPA) and DWI that predicted MVI with an AUC of 0.971. The HBP and DWI images were most relevant for MVI prediction. Jiang et al.
[45][77] analysed 405 patients with HCC and triple-phase CT acquisitions (arterial, porto-venous, and delayed phase) using a 3D CNN with an AUC of 0.906. They compared these results with a radiomics model that included clinicopathological data, which obtained a lower AUC of 0.887. Sun et al.
[46][78] used DL and AFP information to study 321 patients with HCC and DCE MR imaging with intracellular contrast (Gd-EOB-DTPA). They obtained an AUC of 0.824 in predicting MVI. An ablation study was done to determine which acquisition is most relevant for predicting MVI. A combination of non-contrast T1, delayed, and porto-venous phases showed the best results, while DWI had less impact. Zhou et al.
[47][79] analysed 114 patients undergoing MR imaging using extracellular contrast (Gd-DTPA) with T1, arterial and venous phases. The data from all three acquisitions were processed using a 3D CNN that obtained an AUC of 0.926 for predicting MVI. They also tested the acquisition phases separately and observed that the arterial phase had the best performance (AUC of 0.855).
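The classification studies in this section are compared by their area under the ROC curve (AUC). As a reminder of what this number measures, the sketch below computes it as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The function name is hypothetical and the pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly.

    labels: 1 for the positive class (e.g., MVI present), 0 for the negative class.
    scores: model outputs, where higher means more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported values of 0.82 to 0.97 in context.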
2.3.2. HCC Grading Prediction
Hepatocarcinoma grade is a biological marker for the aggressiveness of tumors, and, like MVI, it is an important prognostic indicator of recurrence for transplanted patients
[48][80]. The most common classification is the Edmondson and Steiner (ES) according to the degree of differentiation (from well to undifferentiated)
[49][81]. In transplant patients, a biopsy is indicated to exclude undifferentiated and poorly differentiated HCC under the Hangzhou criteria (for lesions >8 cm with AFP < 400 ng/mL) and under the Toronto criteria (for lesions beyond Milan). This marker can impact prognosis and allow for better patient stratification, as size and histopathological differentiation are significant independent factors for survival
[48][80].
Mao et al.
[50][82] analysed 297 patients with HCC to develop a radiomics model that classifies lesions according to ES into low-grade or high-grade. CECT imaging data from dual-phase acquisitions (arterial and venous) and clinicopathological data were processed, and the application reached an AUC of 0.801. Even though arterial phase features showed more relevance for the prediction task, using both arterial and venous phases proved superior. Using non-contrast MR imaging and clinical data, Wu et al.
[51][83] studied 170 patients with HCC to train a radiomics model that classifies lesions according to ES grade into low or high. The imaging protocol consisted of non-contrast T1 and T2 weighted images combined with clinical data and obtained an AUC of 0.8. The combined clinico-radiological model proved superior to a model that relied on imaging alone (AUC of 0.800 versus 0.742). Han et al.
[52][84] developed a combined clinical and imaging radiomics model to assess HCC grade using hepatospecific DCE MRI (Gd-EOB-DTPA) with T1-weighted, T2-weighted, hepatobiliary and portovenous imaging. The model analysed 137 patients; the hepatobiliary phase had the most significant impact on prediction, obtaining the highest AUC of 0.8. Zhou et al.
[53][85] developed a model to predict ES grading on DWI MR images using a 3D CNN on a cohort of 98 patients, obtaining an AUC of 0.83. MR acquisitions consisted of DWI using b-values of 0, 100 and 600 s/mm², from which ADC maps were generated. The highest b-values proved to be more valuable for classification. Zhou et al.
[54][86] used a deep neural network (SE-DenseNet) to grade HCC lesions using the ES system from DCE MR images from a dataset of 75 patients. They used arterial, venous, and delayed phase images and obtained an AUC of 0.83. They focused more on comparing their model to other neural networks like DenseNet, ResNet and AlexNet, which performed worse.
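For the DWI-based study above, ADC maps are derived from the signal measured at several b-values. The sketch below fits the standard mono-exponential model S(b) = S0 · exp(−b · ADC) to one voxel by least squares on the logarithm of the signal. This is a generic illustration and not necessarily the fitting procedure used by the authors; the function name is hypothetical.

```python
import math


def fit_adc(b_values: list[float], signals: list[float]) -> float:
    """ADC in mm^2/s for one voxel from a log-linear least-squares fit.

    Assumes the mono-exponential model S(b) = S0 * exp(-b * ADC),
    with b-values in s/mm^2 and strictly positive signal values.
    """
    if len(b_values) != len(signals) or len(b_values) < 2:
        raise ValueError("need at least two matching b-values and signals")
    if any(s <= 0 for s in signals):
        raise ValueError("signals must be positive to take the logarithm")
    logs = [math.log(s) for s in signals]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_log = sum(logs) / n
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    if sxx == 0:
        raise ValueError("b-values must not all be identical")
    sxy = sum((b - mean_b) * (l - mean_log) for b, l in zip(b_values, logs))
    return -sxy / sxx  # slope of ln(S) against b is -ADC
```

Applying `fit_adc` to every voxel with the b-values 0, 100 and 600 s/mm² yields an ADC map of the kind used as input in that study.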
2.3.3. Molecular Evaluation
Several immunohistochemical markers can offer further information regarding prognosis or improve the positive diagnosis of HCC. Unfortunately, these can only be obtained from biopsy specimens or resected lesions.
Glypican 3 (GPC3) is present on the cell surface and has been included in the panel of markers for HCC diagnosis in highly differentiated small lesions
[9][26]. It can also act as a marker of poor prognosis
[55][87]. The presence of GPC3 in HCC lesions impacts survival as it has been associated with a higher incidence of MVI
[56][88], a reduced 5-year survival rate and disease-free survival in patients with LT
[56][57][88,89]. Gu et al.
[58][90] analysed a cohort of 293 patients with HCC who underwent MR imaging and developed a radiomics model based on clinical and imaging data to predict the presence of GPC3. The MR protocol consisted of T1-weighted postcontrast acquisitions using extracellular contrast (Gd-DTPA) with arterial, venous, and delayed phases. The model that used only imaging data obtained an AUC of 0.871, and when combined with AFP, the AUC increased to 0.914. In a recent study, Chong et al.
[59][91] studied 259 patients with HCC who underwent MR imaging with intracellular contrast (Gd-EOB-DTPA) to develop a radiomics model that predicts the presence of GPC3. The study showed that the most relevant imaging features were from T2 weighted images and T1 hepatobiliary phase, and together with clinical data, a nomogram was created that obtained an AUC of 0.943.
Cytokeratin 19 (CK19) is normally expressed in hepatic progenitor cells but not in healthy hepatocytes, and its presence in HCC lesions is a marker of aggressiveness and poor prognosis
[9][60][26,27]. In patients with HCC beyond the Milan criteria who have undergone transplantation, CK19 expression has been associated with recurrence. In contrast, patients without CK19 expression showed survival rates similar to those of patients within the Milan criteria
[61][92]. Zhang et al.
[62][93] studied 214 patients and developed a radiomics model using ultrasound imaging and clinical data to predict the presence of CK19. The combined model obtained an AUC of 0.867, while the one that used only imaging data had a lower AUC of 0.789, showing the relevance of combining multiple types of input data. Yang et al.
[63][94] analysed 257 patients with HCC from multiple centres that underwent MR imaging and developed a radiomics model to determine the presence of CK19+ lesions. The model with the best predictive performance used features from T2 and DWI with an AUC of 0.790. Chen et al.
[64][95] developed a radiomics model using data from 80 patients with HCC to determine the presence of CK19. The imaging protocol consisted of MR imaging with intracellular contrast (Gd-EOB-DTPA), and the data were obtained from two institutions. Adding clinical data such as AFP yielded an AUC of 0.833, while relying on imaging alone resulted in an AUC of 0.82. Analysing the data, they found that targetoid features on imaging were correlated with the presence of CK19.