Machine Learning and Deep Learning in Spinal Injury
Subjects: Orthopedics

This review highlights the significant impact of machine learning and deep learning in spinal injury care, focusing on diagnostic improvement and fracture prediction. Techniques like convolutional neural networks improve fracture detection, while machine learning methods offer prognostic insights, facilitating personalized care by analyzing imaging and clinical data.

  • spinal injuries
  • cervical fractures
  • thoracolumbar fractures
  • machine learning
  • deep learning

1. Overview of Machine Learning Applications in Spinal Care

The studies reviewed pursue a diverse range of objectives, from diagnostic precision in morphometric vertebral fracture analysis to the prediction of imminent and future vertebral fractures. This range highlights the expansive yet focused applications of machine learning and deep learning in spinal injury care. Deep learning techniques, especially convolutional neural networks (CNNs), have been at the forefront of refining diagnostic approaches: improving the accuracy of fracture detection and classification, and differentiating between benign and malignant fractures using complex imaging data. Machine learning methods such as random forests and support vector machines (SVMs), by contrast, have primarily been applied for prognostic purposes, offering insights into the progression of vertebral collapse and identifying treatment-related risk factors from structured clinical data. Although prognostic studies using machine learning are fewer, they contribute significantly to understanding spinal injury outcomes. This distinction reflects how each technology is matched to the distinct challenges of spinal injury diagnosis and prognosis. The diversity and complexity of spinal injuries, along with their varied treatments, underscore the utility of machine learning and deep learning in facilitating more individualized care.

2. Diagnostic Approaches

2.1. X-ray

Diagnostic Accuracy Using Entire X-rays

Determining the presence of vertebral fractures from plain radiographs of the entire thoracolumbar spine can be challenging, because identifying fractures within such a broad region of interest is a complex task for neural networks. Chen et al. reported an accuracy of 73.59% for vertebral fracture identification with a CNN applied to these X-ray images [15]. Murata et al. trained a CNN to identify thoracolumbar vertebral fractures on X-ray images and achieved an accuracy, sensitivity, and specificity of 86.0%, 84.7%, and 87.3%, respectively [41].
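Accuracy, sensitivity, and specificity all derive from the same confusion matrix. The following minimal Python sketch shows how they relate. The counts are hypothetical and are not taken from any cited study, and the function name `diagnostic_metrics` is illustrative.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp/fn count fractured cases that were detected/missed;
    tn/fp count non-fractured cases that were cleared/flagged.
    """
    if (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one fractured and one non-fractured case")
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # share of fractures that were found
        "specificity": tn / (tn + fp),  # share of normal cases correctly cleared
    }


# Hypothetical test set: 100 fractured and 100 non-fractured radiographs.
metrics = diagnostic_metrics(tp=85, fp=13, tn=87, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy is a prevalence-weighted average of sensitivity and specificity, so its value depends on the proportion of fractured cases in the test set.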

Evaluation of Manually Cropped Regions

Rosenberg et al. focused on using deep learning models to diagnose traumatic thoracolumbar fractures by analyzing individual vertebra regions that were manually cropped from the lateral radiographs, finding that the ResNet18 architecture outperformed VGG16 with higher sensitivity (91%), specificity (89%), and accuracy (88%) [22].

Object Detection and Ensemble Approaches

An alternative and potentially more accurate approach may involve first using object detection algorithms to identify individual vertebrae, followed by classification or segmentation to assess their condition [16,42]. Li et al. used a deep learning ensemble on 941 lateral spine radiographs to detect osteoporotic lumbar vertebral fractures, achieving 93% accuracy, 91% sensitivity, and 93% specificity [42]. Chou et al. employed an ensemble model on 1305 thoracolumbar X-rays to identify vertebral fractures, attaining similar performance metrics—93.36% accuracy, 88.97% sensitivity, and 94.26% specificity [16]. Shen et al. developed and validated a deep-learning-based system (AI_OVF_SH) for diagnosing and grading osteoporotic vertebral fractures using plain radiographs [25]. In external validation, the system demonstrated high performance, achieving an accuracy of 96.85%, sensitivity of 83.35%, and specificity of 94.70% for detecting all types of fractures [25].
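The ensemble idea can be illustrated with soft voting, in which the fracture probabilities from several models are averaged before a threshold is applied. This is one common way to combine models and is an assumption here; the cited studies may combine their networks differently. The probabilities, the 0.5 threshold, and the function name `soft_vote` are all hypothetical.

```python
from statistics import fmean


def soft_vote(member_probs, threshold=0.5):
    """Average per-model fracture probabilities and threshold the mean.

    member_probs: one list of probabilities per model, all over the same cases.
    Returns (mean probabilities, binary labels).
    """
    if not member_probs:
        raise ValueError("at least one model output is required")
    n_cases = len(member_probs[0])
    if any(len(probs) != n_cases for probs in member_probs):
        raise ValueError("all models must score the same cases")
    # zip(*...) regroups the scores so each tuple holds one case's scores.
    mean_probs = [fmean(case_scores) for case_scores in zip(*member_probs)]
    labels = [int(p >= threshold) for p in mean_probs]
    return mean_probs, labels


# Hypothetical outputs of three models on three vertebrae.
model_a = [0.9, 0.4, 0.2]
model_b = [0.8, 0.8, 0.1]
model_c = [0.7, 0.4, 0.3]
mean_probs, labels = soft_vote([model_a, model_b, model_c])
print(labels)
```

In this example the second vertebra is flagged even though two of the three models score it below the threshold, because averaging lets one confident model outweigh two uncertain ones.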

Cervical Spine

Naguib et al. developed a computer-aided diagnosis system based on deep learning algorithms, specifically AlexNet and GoogleNet, to classify cervical spine injuries as either fractures or dislocations using X-ray images [18].

Age-Specific Algorithm Performance

Alqahtani et al. reported that a software program initially trained on adult spinal fracture data, using DXA and X-ray for diagnosis, showed poor sensitivity and specificity when applied to pediatric cases. This underscores the need for training data to reflect the age range of the patients in whom an algorithm will be used [10,11,12].

2.2. DEXA

Mehta et al. demonstrated that support vector machine (SVM) analysis of ancillary numerical data obtained from routine dual-energy X-ray absorptiometry (DEXA, also abbreviated DXA) scans can accurately identify lumbar spine fractures. A linear-kernel SVM yielded a high area under the curve (AUC = 0.9258), indicating that fractures can be detected without additional imaging or radiation exposure [43]. Derkatch et al. found that CNNs, specifically InceptionResNetV2 and DenseNet, can accurately identify vertebral fractures in thoracolumbar regions using images from DXA scans. The CNN ensemble achieved an area under the receiver operating characteristic curve of 0.94, with a sensitivity of 87.4% and a specificity of 88.4% [17]. Monchka et al. showed that CNNs can accurately identify vertebral fractures in images from multiple types of DXA scanners [44]. The final CNN ensemble model achieved a sensitivity of 91.9%, a specificity of 99.0%, and an F1-score of 86.1% [44]. The F1-score is the harmonic mean of precision (positive predictive value) and recall (sensitivity), providing a balance between the two.
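To make the F1 definition concrete, the following Python sketch computes precision, recall, and F1 from confusion-matrix counts. The counts are hypothetical and are not taken from Monchka et al. [44]; `f1_from_counts` is an illustrative name.

```python
def f1_from_counts(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    if tp == 0:
        raise ValueError("F1 is undefined or zero without any true positives")
    precision = tp / (tp + fp)  # share of flagged cases that are real fractures
    recall = tp / (tp + fn)     # share of real fractures that were flagged
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical counts: 80 fractures found, 20 false alarms, 10 fractures missed.
precision, recall, f1 = f1_from_counts(tp=80, fp=20, fn=10)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because F1 ignores true negatives, a model can have high specificity and still a lower F1 when fractures are rare and a small number of false positives reduces precision.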

2.3. CT

Automated Detection Algorithms

Roux et al. employed software from Zebra Medical Vision to perform opportunistic screening for vertebral fractures and osteoporosis in more than 150,000 routine lumbar spine CT scans, achieving success rates of 82% for vertebral fracture assessment and 87% for Hounsfield Unit (HU) measurements, showing the potential to enhance diagnostic accuracy in large-scale screening [23]. Tomita et al. utilized deep neural networks for the automatic detection of osteoporotic vertebral fractures in CT scans, achieving an accuracy of 89.2% and an F1-score of 90.8% [27]. Inoue et al. used Faster R-CNN to detect fractures in the pelvis, ribs, and spine, offering a comprehensive approach for trauma cases [45]. However, the model’s sensitivity is lower than that of specialized spinal fracture models, suggesting room for improvement. Rueckel et al. investigated the utility of AI assistance in detecting missed thoracic findings, including vertebral fractures, in emergency whole-body CT scans [24]. The study found that 57.8% of suspicious vertebral bodies were identified solely by the AI, 29.7% by radiologists alone, and 12.5% by both, suggesting that AI assistance can significantly reduce the rate of missed thoracic findings in emergency settings [24].

Fracture Classification

Some models not only diagnose the presence of fractures but also classify the type of fracture. For example, Chen et al. used deep learning on CT scans for high-accuracy AO (Arbeitsgemeinschaft für Osteosynthesefragen) classification of thoracolumbar fractures [14]. Doerr et al., also using CT scans, developed a model to categorize vertebral morphology and determine posterior ligamentous complex integrity in order to assign a Thoracolumbar Injury Classification and Severity Score (TLICS) [46]. Zhang et al. developed a multistage system using CNNs (U-net, GCN, 3D-ResNet) that automatically detects and classifies acute thoracolumbar vertebral body fractures on CT images according to the AO classification, achieving a sensitivity of 95.23%, an overall accuracy of 97.93%, a specificity of 98.35%, and balanced accuracy rates ranging from 79.56% to 94.5% for the different AO fracture types [34].
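Balanced accuracy for a single fracture type is commonly computed one-vs-rest as the mean of sensitivity and specificity for that type. The Python sketch below assumes this definition, which may differ from the exact procedure in Zhang et al. [34]. The labels "A", "B", and "C" stand in for AO fracture types, and the data and function name are hypothetical.

```python
def balanced_accuracy_per_class(y_true, y_pred):
    """Return {class: (sensitivity + specificity) / 2}, treating each class one-vs-rest."""
    classes = sorted(set(y_true))
    if len(classes) < 2:
        raise ValueError("need at least two classes in y_true")
    pairs = list(zip(y_true, y_pred))
    result = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        result[c] = (sensitivity + specificity) / 2
    return result


# Hypothetical ground truth and predictions for ten fractured vertebrae.
y_true = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "C"]
y_pred = ["A", "A", "A", "B", "B", "B", "C", "C", "C", "A"]
for fracture_type, score in balanced_accuracy_per_class(y_true, y_pred).items():
    print(f"type {fracture_type}: balanced accuracy {score:.3f}")
```

Unlike overall accuracy, this measure is not dominated by the most frequent fracture type, which is why it is often reported per class when class sizes differ.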

Opportunistic Screening and Fracture Liaison

The ability to screen for vertebral fractures from CT scans taken for other purposes highlights the strength of AI in this context. Nicolaes et al. used a 3D CNN to identify thoracolumbar vertebral fractures from CT scans with 94% sensitivity and 93% specificity. The algorithm showed an AUC of 0.94, making it highly effective in opportunistically identifying vertebral fractures in routine CT scans [19]. Ong et al. evaluated the efficacy of a machine learning algorithm in identifying vertebral fractures from CT scans, revealing that the algorithm detected fractures in 19.1% of 4461 patients, outperforming hospital radiologists whose reports only mentioned 49% of these fractures [20]. Valentinitsch et al. used a random forest classifier and 3D texture features for opportunistic osteoporosis screening in thoracolumbar spine multi-detector CT scans, demonstrating high discriminatory power (AUC = 0.88) and outperforming global vertebral bone mineral density (vBMD) alone in identifying vertebral fractures [28].
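The AUC values quoted above can be read as the probability that a randomly chosen fractured case receives a higher score than a randomly chosen non-fractured case. The following Python sketch computes AUC directly from that pairwise definition. The scores and labels are invented for illustration, and `auc_from_scores` is a hypothetical name, not a function from any cited study.

```python
def auc_from_scores(scores, labels):
    """Return ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    correct = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                correct += 1.0
            elif pos == neg:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical model scores for three fractured (1) and three normal (0) scans.
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(f"AUC = {auc_from_scores(scores, labels):.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; rank-based formulas give the same value more efficiently on large screening cohorts.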

3D-Based Algorithms

While 3D volume data may have higher computational costs, they hold the potential for a more accurate assessment of fracture morphology. Burns et al. used 3D volume data from CT scans to achieve high sensitivity and low false-positive rates in detecting and anatomically localizing thoracic and lumbar vertebral body fractures [13]. Zakharov et al. developed an anchor-free vertebra detection network using convolutional neural networks that effectively localizes the vertebral column in 3D CT images and simultaneously detects individual vertebrae and quantifies fractures in 2D, demonstrating strong performance with an AUC of 0.95, sensitivity of 0.85, and specificity of 0.9 on the challenging VerSe dataset containing various unseen vertebra fracture types [33]. Zhang et al. introduced a novel multi-scale attention-guided network (MAGNet) for diagnosing thoracolumbar vertebral fractures and three-column injuries, achieving an AUC of 0.884 for vertebral fracture diagnosis and 0.920 for three-column injury diagnosis, both with high precision on CT images [35].

Distinguishing Benign from Malignant Vertebral Fractures

Differentiating between osteoporotic and malignant vertebral fractures is crucial, especially among elderly patients, for appropriate treatment planning and prognosis. Goller et al. in 2023 used a CNN-based framework to distinguish between benign and malignant thoracolumbar vertebral fractures using CT-based texture features. Their study found statistically significant differences in these features between benign and malignant fractures [47]. Park et al. developed an automated segmentation algorithm for CT scans of thoracolumbar fractures, showing that its performance in predicting fracture malignancy was comparable to human expert segmentation [21].

Cervical Spine

Research on detecting cervical fractures is less abundant than research on thoracolumbar fractures. Golla et al. in 2023 used convolutional neural networks to detect cervical spine fractures from CT scans. Their algorithm detected 87.2% of fractures with an average of 3.5 false positives per case, using spinal-canal-aligned volumes of interest (VOIs) [48]. Small et al. evaluated a CNN developed by Aidoc for detecting cervical spine fractures using CT scans, reporting a 92% accuracy rate with 76% sensitivity and 97% specificity [26]. Voter et al. evaluated the diagnostic performance of a deep learning algorithm by Aidoc for detecting cervical spine fractures in CT scans, finding a relatively low sensitivity of 54.9% but a high specificity of 94.1% [29].

2.4. MRI

Diagnosing Fresh Osteoporotic Vertebral Fracture

Yabu et al. used an ensemble method comprising VGG16, VGG19, DenseNet201, and ResNet50 to diagnose fresh osteoporotic vertebral fractures in thoracolumbar MRI scans, finding that the CNN’s performance metrics were comparable to those of two spine surgeons [30].

Distinguishing Benign from Malignant Vertebral Fractures

Yoda et al. employed a deep convolutional neural network using Xception architecture to differentiate osteoporotic and malignant vertebral fractures using MRI [32]. The model’s accuracy was statistically equal or superior to that of spine surgeons. Yeh et al. applied a ResNet50 deep learning algorithm to MRI scans of the whole spine for distinguishing between benign and malignant vertebral fractures, achieving an accuracy of 92% and potentially improving diagnostic performance for less experienced clinicians [31].

3. Prognostic Approaches

Research focused on predicting prognosis is relatively scarce compared to diagnosis and classification. Most of these few studies do not predict directly from images but apply machine learning to parameters extracted from images. For instance, Cho et al. worked on predicting the progression of vertebral collapse in osteoporotic vertebral fractures. Using manually extracted parameters from X-ray and MRI images, they employed machine learning techniques such as decision trees and random forests [36]. Jiang et al. employed random survival forest and Cox proportional hazards analysis to predict new osteoporotic vertebral compression fractures after vertebral augmentation using T1W MR images [37]. Kong et al. focused on predicting osteoporotic fractures in the lumbosacral spine using X-ray images and deep learning in a longitudinal cohort study [38]. Their DeepSurv model, trained on both images and clinical features, outperformed traditional methods such as FRAX and Cox proportional hazards (CoxPH) in terms of C-index values, suggesting its potential for more accurate fracture prognosis [38]. Takahashi et al. employed machine learning models including logistic regression, decision trees, XGBoost, and random forest (RF) to improve the prediction of nonunion following osteoporotic vertebral fractures in the thoracolumbar region, using MRI data from 505 patients [40]. The study found high prognostic accuracy, with AUC scores of 0.860 and 0.845 for RF and XGBoost, respectively [40]. Leister et al. aimed to identify treatment-related risk factors for nonunion of odontoid fractures in the cervical spine using XGBoost and binary logistic regression [39]. The study found moderate predictive power, with an AUC of 0.68 for the XGBoost model and 0.71 for the binary logistic regression model, suggesting their potential utility in understanding treatment-related risks [39].
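The C-index used to compare survival models measures how often a model ranks patients' risks in the same order as their observed outcomes. The Python sketch below is a simplified version of Harrell's C-index that skips pairs with tied follow-up times. The follow-up data are invented, and the function name is illustrative rather than taken from Kong et al. [38].

```python
def concordance_index(times, events, risk_scores):
    """Return a simplified Harrell's C-index.

    times: follow-up time per patient.
    events: 1 if the fracture was observed, 0 if the patient was censored.
    risk_scores: model output, where a higher score means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # Comparable pair: patient i had the event before patient j's last follow-up.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort: follow-up in years, event flags, and predicted risks.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
print(f"C-index = {concordance_index(times, events, risks):.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the C-index plays a role for time-to-event predictions similar to that of AUC for binary classification.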

4. Advantages and Disadvantages of Each Model

Machine learning and deep learning methods play a crucial role in diagnosing and predicting spinal injuries, each with its own strengths and drawbacks. Techniques such as CNNs, random forests, SVMs, XGBoost, and recurrent neural networks (RNNs) are important in this field. CNNs are particularly good at recognizing patterns in images, including object detection and segmentation, which is helpful for examining complex data like X-ray, CT, and MRI scans [49,50]. Random forests are strong at analyzing big datasets, SVMs work well in handling data with many features, and XGBoost is known for its execution speed. RNNs are especially useful for data that are ordered over time. These methods have their own limitations regarding data requirements, computing power, and how easy it is to understand their results [51]. For the advantages and disadvantages of each model related to spinal injury care, please see Table 1. Choosing the right method depends on the specific needs of spinal injury data, the computing resources available, and finding a balance between precision and interpretability.
Table 1. Advantages and disadvantages of machine learning models in the context of spinal trauma.

This entry is adapted from the peer-reviewed paper 10.3390/jcm13030705
