Distinguishing between benign vs. malignant bone lesions is often difficult on imaging. Many bone lesions are uncommon, and often only specialist radiologists have sufficient expertise to provide an accurate diagnosis. In addition, some benign bone tumours may exhibit potentially aggressive features that mimic malignant bone tumours, making the diagnosis even more difficult. The rapid development of artificial intelligence (AI) techniques has led to remarkable progress in image-recognition tasks, including the classification and characterization of various tumours. The use of AI to discriminate bone lesions on imaging has achieved relatively good performance across various imaging modalities, with high sensitivity, specificity, and accuracy for distinguishing between benign vs. malignant lesions in several cohort studies. However, further research is necessary to test the clinical performance of these algorithms before they can be integrated into routine clinical practice.
1. Introduction
Differentiating between benign vs. malignant bone tumours is critical for clinical decision-making and treatment planning
[1][2]. Routine imaging investigations for evaluating bone lesions include radiographs (X-ray), computed tomography (CT), positron-emission tomography combined with computed tomography (PET/CT), and magnetic resonance imaging (MRI)
Conventional radiography remains a key initial imaging modality and an optimal technique for evaluating primary bone tumours
[5][6]. It is relatively inexpensive and allows for a clear visual assessment of lesion location, margins, internal lesion matrix, and any associated periosteal reaction
[7]. Along with the patient’s age, these radiographic details are often sufficient to provide a reasonable list of differential diagnoses
However, radiographs are often limited by the superimposition of overlying structures, incomplete visualization of cortical bone destruction, and inadequate assessment of adjacent soft tissue involvement
[9][10][11]. Furthermore, diagnosis of bone tumours using imaging can be complicated by other factors such as the presence of pathological fractures, which may result in benign bone lesions having potentially aggressive features that mimic malignant bone tumours
[12][13].
Emerging Artificial Intelligence (AI) tools continue to demonstrate remarkable progress in medical imaging applications, especially in the field of oncology
[14]. These applications include cancer screening and diagnosis
[15][16][17][18], diagnosis and classification
[19][20][21][22], predicting prognosis and treatment response
[23][24][25][26][27], automated segmentation
[28][29][30][31][32], and radiology-pathology correlation (radiogenomics)
[33][34][35][36]. In particular, within the field of diagnosis and classification, the ability of AI models to classify benign vs. malignant tumours has been shown to achieve high accuracy, sensitivity, and specificity in various organs, such as breast
[37][38][39], prostate
[40][41], lung
[16][42][43][44], and brain lesions
[45][46].
2. Machine Learning for Differentiating Bone Malignancy on Imaging
2.1. Machine Learning on Conventional Radiographs
Correctly classifying bone tumours on conventional radiographs is vital for clinical decision-making and guiding subsequent management
[47]. However, this is often difficult, especially in places where there is a shortage of subspecialty radiology expertise. Moreover, many bone lesions are uncommon entities, and often only a few specialist radiologists have sufficient experience to diagnose them accurately
[10][48]. In clinical practice, most radiologists rely on image pattern recognition to distinguish between benign and malignant lesions on radiographs, which is subject to bias and can sometimes lead to an erroneous interpretation
[49][50]. Commonly assessed radiological features include lesion location, cortical destruction, periostitis, lesion orientation or alignment, and the zone of transition between the lesion and the surrounding bone
[51][52][53][54]. However, some benign bone lesions may demonstrate one or more aggressive features which may confound the distinction
[55][56].
Malignant bone lesions are often difficult to differentiate from other aggressive disease processes, including inflammation and infection. Ewing’s sarcoma, an aggressive malignant tumour occurring in children, is a typical example
Differentiating this entity from acute osteomyelitis is often difficult, even for trained musculoskeletal radiologists, because the two entities share similar clinical and radiological features
[58][59][60]. Consalvo et al.
[61] developed an artificial intelligence algorithm that leveraged radiographic features to distinguish between Ewing sarcoma and acute osteomyelitis, achieving an accuracy of up to 94.4% on their validation set and 90.6% on a held-out test set. Although this study is limited by a small sample size, which required the use of cross-validation and loss weighting to achieve statistical significance, it demonstrates the potential feasibility of AI techniques for differentiating infective from malignant bone lesions on routine radiographs.
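As an illustration of the general approach described above (a small classification network trained with a class-weighted loss inside a cross-validation loop), the following Python sketch shows one possible setup. The backbone, hyperparameters, and the image/label tensors are assumptions for demonstration and do not reproduce the architecture or preprocessing of Consalvo et al.

```python
# Minimal sketch: binary radiograph classifier (e.g., Ewing sarcoma vs. acute
# osteomyelitis) trained with a class-weighted loss inside stratified k-fold
# cross-validation. Backbone, hyperparameters, and the input tensors are
# illustrative assumptions, not the published model.
import numpy as np
import torch
import torch.nn as nn
from sklearn.model_selection import StratifiedKFold
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models

def make_model():
    # Small backbone adapted to 1-channel radiographs with 2 output classes.
    m = models.resnet18(weights=None)
    m.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
    m.fc = nn.Linear(m.fc.in_features, 2)
    return m

def cross_validate(images, labels, folds=5, epochs=10):
    """images: (N, 1, H, W) float tensor; labels: (N,) int64 tensor."""
    skf = StratifiedKFold(n_splits=folds, shuffle=True, random_state=0)
    accuracies = []
    for train_idx, val_idx in skf.split(np.zeros(len(labels)), labels.numpy()):
        model = make_model()
        # Loss weighting: inverse class frequency to counter class imbalance.
        counts = torch.bincount(labels[train_idx], minlength=2).float()
        criterion = nn.CrossEntropyLoss(weight=counts.sum() / (2.0 * counts))
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
        loader = DataLoader(TensorDataset(images[train_idx], labels[train_idx]),
                            batch_size=16, shuffle=True)
        model.train()
        for _ in range(epochs):
            for xb, yb in loader:
                optimizer.zero_grad()
                criterion(model(xb), yb).backward()
                optimizer.step()
        model.eval()
        with torch.no_grad():
            preds = model(images[val_idx]).argmax(dim=1)
        accuracies.append((preds == labels[val_idx]).float().mean().item())
    return accuracies
```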
2.2. Machine Learning on Computed Tomography (CT) Imaging
Incidentally detected dense (sclerotic) lesions are a common occurrence on CT examinations in clinical practice [62]. The ability to distinguish a benign sclerotic lesion, such as an enostosis (bone island), from a malignant lesion, such as an osteoblastic metastasis, is crucial as it affects the treatment strategy and patient prognosis [63]. For this task, radiomics-based random forest models created by Hong et al. [64] achieved an AUC of up to 0.96 and an accuracy of up to 86.0% in the test sets, which was comparable to two experienced radiologists (AUCs of 0.95–0.96) and higher than that of a radiologist in training (AUC 0.88, p = 0.03). Having extracted 1218 radiomics features, the authors showed that a model utilizing attenuation and shape-related features achieved the highest AUC, in line with what had been postulated in several prior studies [65][66][67][68][69].
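The general pipeline underlying such models, extracting quantitative intensity, shape, and texture (radiomics) features from a segmented lesion and feeding them to a random forest, can be sketched as follows. The use of pyradiomics, the placeholder file paths, and the cross-validation setup are assumptions for illustration and do not reproduce the exact feature set or modelling choices of Hong et al.

```python
# Minimal sketch of a radiomics + random forest pipeline for sclerotic bone
# lesions on CT (e.g., enostosis vs. osteoblastic metastasis). File paths are
# hypothetical placeholders; in practice many segmented cases per class are
# needed for the cross-validation below to be meaningful.
import numpy as np
from radiomics import featureextractor            # pyradiomics
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

extractor = featureextractor.RadiomicsFeatureExtractor()
extractor.enableAllFeatures()                     # intensity, shape, and texture classes

def lesion_features(image_path, mask_path):
    """Return a numeric feature vector for one segmented lesion."""
    result = extractor.execute(image_path, mask_path)
    return np.array([float(v) for k, v in result.items()
                     if k.startswith("original_")])

# Hypothetical case list: (CT volume, lesion mask, label), 1 = malignant.
cases = [("ct_001.nii.gz", "mask_001.nii.gz", 0),
         ("ct_002.nii.gz", "mask_002.nii.gz", 1)]  # ... more cases in practice

X = np.stack([lesion_features(img, msk) for img, msk, _ in cases])
y = np.array([label for _, _, label in cases])

clf = RandomForestClassifier(n_estimators=500, random_state=0)
print("cross-validated AUC:",
      cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())
```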
Besides differentiating between benign vs. malignant bony lesions, radiomics models can also differentiate between a variety of tumour matrix types with high performance. A deep convolutional neural network (CNN) developed by Y. Li et al.
[70] was able to further classify benign and malignant bone lesions into cartilaginous or osteogenic tumours, using a multi-channel enhancement strategy for image processing to improve accuracy and achieving a top-1 error rate of 0.25.
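One way a multi-channel input can be constructed for such a CNN is to stack several window/level renderings of the same CT slice as separate channels, as sketched below. The specific windows, the backbone, and the four-way output head are assumptions for illustration and are not necessarily the enhancement strategy used by Y. Li et al.

```python
# Illustrative multi-channel CT input: three window/level renderings of the
# same slice stacked as CNN channels, feeding a standard backbone with a
# four-way head (benign/malignant x cartilaginous/osteogenic). All choices
# here are assumptions for demonstration.
import numpy as np
import torch
import torch.nn as nn
from torchvision import models

def window(hu_slice, center, width):
    """Clip a Hounsfield-unit slice to a display window and rescale to [0, 1]."""
    lo, hi = center - width / 2.0, center + width / 2.0
    return np.clip((hu_slice - lo) / (hi - lo), 0.0, 1.0)

def multi_channel(hu_slice):
    """Stack bone, soft-tissue, and wide windows as three input channels."""
    channels = [window(hu_slice, 400, 1800),   # bone window
                window(hu_slice, 50, 400),     # soft-tissue window
                window(hu_slice, 0, 2000)]     # wide window
    return torch.tensor(np.stack(channels), dtype=torch.float32)

model = models.resnet50(weights=None)
model.fc = nn.Linear(model.fc.in_features, 4)

# Dummy slice in Hounsfield units standing in for a cropped lesion.
hu = np.random.randint(-1000, 1500, size=(224, 224)).astype(np.float32)
logits = model(multi_channel(hu).unsqueeze(0))   # shape (1, 4)
```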
2.3. Machine Learning on Magnetic Resonance Imaging (MRI)
MRI plays a key role in aiding clinicians in discriminating between benign vs. primary malignant or metastatic bone tumours
[71]. The conventional pulse sequences
[72][73], diffusion-weighted imaging (DWI)
[74][75] with corresponding apparent diffusion coefficient (ADC) maps
[76][77], and dynamic contrast-enhanced (DCE) sequences
[78][79] can predict potential malignancy with good reliability. However, some imaging features of benign and malignant bone lesions overlap, making formulation of a differential diagnosis challenging
[80]. Machine learning techniques using radiomic features on MRI have been shown to have high performance for predicting benign vs. malignant bone lesions.
Specific to cartilaginous bone lesions, machine learning has been studied for the differentiation of various grades of chondrosarcoma. Conventional chondrosarcoma is usually divided into three categories based on pathology, in which grade 1 lesions, also known as atypical cartilaginous tumours (ACTs), usually have an indolent biological behaviour, whereas grade 2–3 lesions (high-grade chondrosarcomas) are malignant bone tumours
[81] with metastatic potential and a high recurrence rate following surgical resection
Discrepancies in correct tumour grading are widespread even among experienced radiologists and pathologists, owing to overlapping imaging and histological features, which is why more accurate diagnostic aids are required
[83][84].
The use of intravenous contrast media (gadolinium-based) for the evaluation of bone tumours has been shown to add some specificity in tissue characterization
[85][86], although the main advantages are for the accurate evaluation of tumour extent for biopsy and treatment planning and to assess for recurrence
[87][88][89]. Interestingly, a study by Eweje et al.
[90] demonstrated that a deep learning model was able to achieve a performance similar to that of expert radiologists for classifying bone lesions at various skeletal locations without using post-contrast T1-weighted sequences, which were made available to the radiologists. The model had an accuracy of 76% vs. 73% for radiologists (
p = 0.7) for classifying benign (including intermediate lesions as per WHO classification criteria) vs. malignant bone lesions. This preliminary study shows the potential utility of machine learning for the accurate diagnosis of bone tumours without requiring the administration of gadolinium-based MRI contrast media. This could be advantageous, particularly in patients with contraindications to gadolinium-based contrast (due to renal impairment or allergy) and in children, in whom intravenous cannulation can provoke pain-related anxiety and the long-term implications of gadolinium deposition remain uncertain
[91][92]. Larger studies are required to determine whether machine learning using a combination of non-contrast and contrast-enhanced imaging has an advantage over non-contrast imaging alone.
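A study design along these lines can be sketched as training the same classifier on non-contrast features alone and on non-contrast plus post-contrast features, then comparing cross-validated AUC. The feature files and the simple logistic-regression classifier below are hypothetical placeholders, not the deep learning pipeline of Eweje et al.

```python
# Sketch of comparing non-contrast vs. contrast-inclusive MRI inputs: the same
# classifier is cross-validated on two feature sets and the AUCs compared.
# Feature files are hypothetical per-lesion blocks (e.g., radiomics features
# per sequence), with rows aligned across files.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def cv_auc(features, labels):
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    return cross_val_score(clf, features, labels, cv=5, scoring="roc_auc").mean()

t1 = np.load("t1_features.npy")                  # non-contrast T1
t2 = np.load("t2_features.npy")                  # fluid-sensitive sequence
post = np.load("post_contrast_t1_features.npy")  # post-gadolinium T1
labels = np.load("labels.npy")                   # 1 = malignant

print("non-contrast AUC: ", cv_auc(np.hstack([t1, t2]), labels))
print("with contrast AUC:", cv_auc(np.hstack([t1, t2, post]), labels))
```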
2.4. Machine Learning on Positron Emission Tomography with CT (PET/CT) Imaging
PET/CT imaging is a widely used modality to differentiate malignant vs. benign tumours in a host of organ systems [93][94][95]. The most common radiotracer used in PET/CT imaging is 2-deoxy-2-18F-fluoro-β-D-glucose (18F-FDG), an analogue of glucose, with the degree of radiotracer accumulation on PET/CT images proportional to the metabolic activity of tissues, reflecting underlying glucose uptake and metabolism [96]. Fluorine-18 sodium fluoride (18F-NaF) is another radiotracer used more specifically for bone imaging in PET/CT, with uptake proportional to blood flow in the bone and osseous remodeling [97][98][99]. Increased 18F-NaF uptake can be seen in abnormal bone undergoing increased remodeling, such as in osteoblastic or osteolytic processes, and can be used to differentiate various pathologies [100]. For the musculoskeletal system, the utility of PET/CT in distinguishing between malignant and benign bone tumours has also been widely studied and proven effective [101][102][103].
To improve the current diagnostic efficacy of PET/CT interpretation, Fan et al. [104] utilized texture analysis together with SUVmax to construct radiomics models to distinguish between benign vs. malignant bone lesions. Texture analysis extracts information regarding the relationships between adjacent voxels or pixels and quantifies inhomogeneity, which can then be used to predict the likelihood of benign vs. malignant bone lesions [105][106]. By incorporating a subset of texture features along with SUVmax, the developed classification model (using logistic regression) achieved an accuracy of 87.5%, compared with 84.3% for nuclear medicine physicians (p = 0.03), in differentiating spinal osseous metastases from benign osseous lesions with high SUVmax values.
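The idea of combining texture measures with SUVmax can be sketched with grey-level co-occurrence matrix (GLCM) statistics computed from a cropped SUV image and appended to SUVmax before a logistic regression, as below. The quantisation scheme, feature choice, and synthetic data are assumptions for illustration, not Fan et al.'s exact pipeline.

```python
# Sketch: GLCM texture statistics + SUVmax feeding a logistic regression for
# benign vs. malignant PET lesions. Quantisation, features, and the synthetic
# data are illustrative assumptions.
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def texture_features(suv_slice, levels=32):
    """GLCM statistics describing how neighbouring pixel intensities co-occur."""
    bins = np.linspace(suv_slice.min(), suv_slice.max(), levels)
    q = np.clip(np.digitize(suv_slice, bins), 0, levels - 1).astype(np.uint8)
    glcm = graycomatrix(q, distances=[1], angles=[0, np.pi / 2],
                        levels=levels, symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation"]
    return np.array([graycoprops(glcm, p).mean() for p in props])

# Synthetic SUV slices standing in for real cropped lesions (replace with
# actual PET data in practice); label 1 = metastasis.
rng = np.random.default_rng(0)
slices = [rng.gamma(2.0, 2.0, size=(32, 32)) for _ in range(20)]
labels = rng.integers(0, 2, size=20)

# Feature vector per lesion: GLCM statistics plus SUVmax (slice maximum).
X = np.stack([np.append(texture_features(s), s.max()) for s in slices])
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X, labels)
```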
2.5. Potential Clinical Impact and Applications
There is significant clinical value in the ability of machine learning to differentiate between benign vs. malignant bone lesions. A retrospective study by Stacy et al.
[107] found that at least a third of patients with bone lesions referred to orthopedic oncology in a year had images that were diagnosed by radiologists as characteristic of benign tumours or non-malignant entities that did not require follow-up or referrals. Accurate AI models for the characterization of osseous lesions could therefore potentially reduce the rate of unnecessary specialist referrals and follow-up, reducing the associated healthcare costs and patient anxiety regarding a possible cancer diagnosis.
Moreover, more accurate characterization of bone lesions would help radiologists identify high-risk bone lesions that would benefit from biopsy to confirm malignancy with greater certainty. Unnecessary biopsies of benign bone lesions expose patients to the risk of post-procedural complications, and hasty biopsy planning can increase the risk of misdiagnosis, creating unwarranted patient stress
[108][109]. Biopsies can also be non-diagnostic in up to 30% of bone lesions, necessitating repeat biopsies with an associated higher risk of complications
[110]. A robust AI model that can identify benign bone lesions with high specificity could help reduce the rate of unnecessary biopsies. This would be especially helpful for bone tumour subtypes whose benign and malignant counterparts demonstrate similar imaging and histological features, for example, cartilaginous tumours such as enchondroma (benign) vs. chondrosarcoma (malignant)
[63][95].
Beyond differentiating bone lesions into benign vs. malignant, machine learning methods can also help distinguish post-treatment change from residual or recurrent malignant disease. This is a crucial diagnostic challenge because the treatments for the two entities are vastly different, and accurate diagnosis is also important to prevent unnecessary invasive biopsy and/or chemoradiotherapy. However, radiological evaluation of bone tumours after treatment can be quite difficult
[111][112].
The use of multiple imaging modalities to evaluate bone tumours is known to improve the accuracy of diagnosis
[3]. Machine learning models built using different modalities (multimodal) have also been shown to improve diagnostic performance
[113]. In breast radiology, Antropova et al.
[114] developed a CNN method involving fusion-based classification using dynamic contrast-enhanced MRI, full-field digital mammography, and ultrasound. This method outperformed conventional CNN-based and CADx-based classifiers, achieving an AUC of 0.90.
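A generic way to realise such multimodal fusion is to give each imaging modality its own CNN branch and concatenate the branch features before a shared classification head, as sketched below. The branch backbone, feature dimensions, and number of modalities are assumptions for illustration, not Antropova et al.'s specific design.

```python
# Sketch of feature-level multimodal fusion: one CNN branch per imaging
# modality, concatenated features feeding a shared classification head.
import torch
import torch.nn as nn
from torchvision import models

class FusionClassifier(nn.Module):
    def __init__(self, num_modalities=3, num_classes=2):
        super().__init__()
        # One lightweight backbone per modality (e.g., MRI, mammography, US).
        self.branches = nn.ModuleList()
        for _ in range(num_modalities):
            backbone = models.resnet18(weights=None)
            backbone.fc = nn.Identity()          # keep the 512-d feature vector
            self.branches.append(backbone)
        self.head = nn.Sequential(
            nn.Linear(512 * num_modalities, 256), nn.ReLU(),
            nn.Linear(256, num_classes))

    def forward(self, inputs):
        # `inputs` is a list of (B, 3, H, W) tensors, one per modality.
        feats = [branch(x) for branch, x in zip(self.branches, inputs)]
        return self.head(torch.cat(feats, dim=1))

model = FusionClassifier()
dummy = [torch.randn(2, 3, 224, 224) for _ in range(3)]
logits = model(dummy)                            # shape (2, 2)
```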
This entry is adapted from the peer-reviewed paper 10.3390/cancers15061837