Knee injuries account for the largest percentage of severe sport-related injuries (i.e., injuries that cause more than 21 days of missed sport participation). Improving the treatment of knee injuries depends critically on accurate and cost-effective detection. Deep-learning-based approaches have come to dominate knee injury detection in MRI studies.

Category | Models | Description |
---|---|---|
Feature extraction | Histogram of oriented gradients (HOG) [35] | A feature descriptor used in computer vision and image processing for object detection. It counts occurrences of gradient orientations in localized portions of an image. |
Feature extraction | GIST descriptor [30] | Represents the holistic spatial scene properties (the spatial envelope) of an image. It summarizes gradient information at different spatial scales and orientations by splitting the image into a grid of cells at several scales and convolving each cell with a Gabor filter bank. |
Feature extraction | Gray-level co-occurrence matrix (GLCM) [36] | A way of extracting second-order statistical texture features. The texture of an image is estimated by calculating how often pairs of pixels with specific values and a certain spatial relationship occur. |
Traditional Machine Learning | k-nearest neighbor (k-NN) [37] | A simple, easy-to-implement supervised ML algorithm that can be used for both classification and regression. It works by (i) computing the distances between a query and all the examples in the data, (ii) selecting the k nearest neighbors of the query, and (iii) voting for the most frequent label (classification) or averaging the labels (regression). |
Traditional Machine Learning | Support vector machines (SVMs) [38] | A supervised method that identifies the hyperplane that best divides the data into two classes. Many hyperplanes could separate the two clouds of data points; the SVM chooses the one with the maximum margin, i.e., the largest distance to the nearest data points of each class. |
Traditional Machine Learning | Shallow artificial neural networks (ANNs) [39] | ANNs loosely simulate the way the human brain analyzes and processes information. They consist of sequential layers: input, hidden and output layers. The hidden layer processes the input information and transmits it to the output layer. |
Deep Learning | Convolutional neural networks (CNNs) [40] | A class of DL algorithms commonly used in computer vision and pattern recognition. CNNs are generally composed of the following layers: (i) an input layer, (ii) convolution layers, (iii) pooling layers and (iv) fully connected layers. The convolution layers apply filters that perform convolution operations as they scan the input along its dimensions. Pooling is a down-sampling operation that is typically applied after a convolution layer. The fully connected layers operate on a flattened input in which each input is connected to all neurons of the next layer; they are usually found towards the end of CNN architectures, where they optimize objectives such as class scores. |
Deep Learning | Region-based convolutional neural networks (R-CNNs) [41] | Object detection is the task of locating and classifying objects in an image. R-CNN (regions with convolutional neural network features) is a deep learning technique that combines rectangular region proposals with CNN features. It is a two-stage detection method. |
Deep Learning | Deep residual networks [42] | A residual neural network (ResNet) is an ANN variant that uses residual mapping and shortcut connections to tackle the vanishing and exploding gradients that are characteristic of deep CNNs. As a consequence, deep residual networks are easier to train and achieve better performance than plain networks of comparable depth. Typical ResNet models are implemented with double- or triple-layer skips that contain nonlinearities such as rectified linear units (ReLUs) and batch normalization in between. |
Deep Learning | 3D-CNNs [43] | A 3D CNN is the 3D generalization of a 2D CNN. It takes as input a 3D volume or a sequence of 2D frames (e.g., slices in an MRI scan). Its kernels move through three dimensions of the data and produce 3D activation maps, which lets the network learn powerful representations of volumetric data. |
Deep Learning | Computer Vision Transformers [44] | When data are modeled as a sequence of embeddings, the Transformer is a simple yet scalable architecture that can be applied to any type of data. Even without the typical convolutional pipeline, transformers can achieve state-of-the-art (SOTA) results in computer vision. A vision transformer is a DL network that extracts the inherent properties of the domain of interest via self-attention. |
Procedure | Training | The standard procedure involves a dataset of paired images and labels (x, y) for training and testing, an optimizer (e.g., stochastic gradient descent, Adam [45]), and a loss function used to update the model parameters. The aim of training is to find the values of the network parameters that minimize the loss function. |
Procedure | Data augmentation | A strategy that artificially generates more training samples to increase the diversity of the training data. This can be done by applying affine transformations (e.g., rotation, scaling), flipping or cropping to the original labeled samples. |
Procedure | Dropout | A regularization method that randomly drops some units from the neural network during training, encouraging the network to learn a sparse representation. It is used to reduce overfitting. |
Procedure | Loss function | The loss function measures the discrepancy between model predictions and labels. Its gradients are used to update the weights of the neural network. |
Procedure | Transfer learning | Aims to transfer knowledge from one task to a different but related target task. This is often achieved by reusing the weights of a pre-trained model to initialize the weights of a new model for the target task. Transfer learning can help to decrease the training time and achieve lower generalization error. |

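
To make the feature-extraction and traditional machine learning rows above more concrete, the following is a minimal sketch, in plain Python, of a GLCM texture descriptor feeding a k-NN classifier. It is not the pipeline of any study cited here: the function names, the tiny toy images, the choice of contrast and energy as texture features, and the labels "smooth" and "textured" are all assumptions made for illustration.

```python
import math
from collections import Counter


def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix: how often gray level i is followed
    by gray level j at the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Two second-order texture statistics: (contrast, energy)."""
    n = len(p)
    contrast = sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))
    energy = sum(p[i][j] ** 2 for i in range(n) for j in range(n))
    return (contrast, energy)


def knn_predict(query, examples, k=3):
    """(i) compute distances, (ii) keep the k nearest, (iii) majority vote.
    `examples` is a list of (feature_vector, label) pairs."""
    nearest = sorted(examples, key=lambda ex: math.dist(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 3x3 "images" with two gray levels (hypothetical data).
uniform = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
striped = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]

# Hypothetical labeled feature vectors (contrast, energy).
training = [
    ((0.0, 1.0), "smooth"),
    ((0.1, 0.9), "smooth"),
    ((1.0, 0.5), "textured"),
    ((0.9, 0.55), "textured"),
]

query = glcm_features(glcm(striped, levels=2))
print(query, knn_predict(query, training, k=3))
```

In a real setting the gray levels of an MRI slice would first be quantized to a small number of bins, and several offsets and statistics would typically be combined into one feature vector.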

No. | Author | Year | AI Model Used | Pretrained CNN | MRI (T) | Localization Technique | Validation | Performance (Accuracy/AUC) | Application Domain |
---|---|---|---|---|---|---|---|---|---|
1 | Awan et al. [46] | 2021 | CNN | ResNet-14 | 1.5 T | Standard region-of-interest (ROI) localization | 5-fold cross-validation | 92%/(healthy = 0.98, partial tear = 0.97, fully ruptured = 0.99) | ACL tear |
2 | Jeon et al. [47] | 2021 | 3D CNN | VGGNet, AlexNet, and SqueezeNet | 3 T & 1.5 T | Custom localization technique | 5-fold cross-validation | N/A/0.983 and 0.980 on the Chiba and Stanford knee datasets, respectively | ACL tear |
3 | Rizk et al. [48] | 2021 | 3D CNN | CNN-based localization model | 1 T (54%)–1.5 T (9.7%)–3 T (36.3%) | Custom localization technique | 10-fold cross-validation | Medial = N/A/0.93, Lateral = N/A/0.84 | Meniscus tear |
4 | Dai et al. [49] | 2021 | TransMed | N/A | 3 T & 1.5 T | N/A | 120 exams | ACL tear = 94.9%/0.98, Abnormality = 91.8%/0.976, Meniscus tear = 85.3%/0.95 | ACL tear; meniscus tear; abnormalities |
5 | Astuto et al. [50] | 2021 | 3D CNN | N/A | 3 T | V-Net | Hold out (15% of sample) | N/A/from 0.83 to 0.93 | ACL tear; meniscus tear; cartilage lesion |
6 | Fritz et al. [15] | 2020 | DCNN | N/A | 1.5 T (64%)–3 T (36%) | To visually localize the tear, the software computes the class activation map (CAM) of the last convolution layer in the CNN and maps it to an axial knee image | Hold out (10% of sample) | Medial = (86%/0.88), Lateral = (84%/0.78), Overall = (N/A/0.96) | Meniscus tear |
7 | Namiri et al. [51] | 2020 | CNN | N/A | 3 T | Three-dimensional V-Net | Hold out (10% of sample) | 3D model = (89%/sensitivity of 89% and specificity of 88%), 2D model = (92%/sensitivity of 93% and specificity of 90%) | ACL tear |
8 | Zhang et al. [6] | 2020 | CNN | 3D DenseNet, VGG16, ResNet | 1.5 T (74%)–3 T (26%) | - | Hold out (20% of sample) | Custom = (95.7%/0.96), ResNet = (N/A/0.95), VGG16 = (N/A/0.86) | ACL tear |
9 | Germann et al. [24] | 2020 | DCNN | N/A | 1.5 T–3 T | Manual cropping | Of the 5802 MRI studies, 4802 were used for training, 500 for validation, and 500 for initial testing | N/A/0.94 | ACL tear |
10 | Azcona et al. [52] | 2020 | CNN | MRNet, ResNet18, ResNet50 and ResNet152, ImageNet | 3 T (56.6%)–1.5 T (43.4%) | - | N/A | N/A/0.96; N/A/0.91; N/A/0.94 | ACL tear; meniscus tear; abnormalities |
11 | Chang et al. [8] | 2019 | CNN | ResNet | 1.5 T–3 T | The object localization CNN was implemented as a fully convolutional network based on the U-Net architecture | 5-fold cross-validation | 96.7%/0.97 | ACL tear |
12 | Liu et al. [53] | 2019 | CNN | LeNet-5, DenseNet, VGG16, AlexNet | N/A | YOLO object detection | 50-subject test set (14% of the sample) | N/A/0.98 | ACL tear |
13 | Couteaux et al. [54] | 2019 | CNN | ResNet-101, ConvNet, R-CNN | N/A | Mask R-CNN framework, used to localize both menisci and identify tears in each meniscus | 54 cases; the model with the highest validation accuracy was selected | N/A/0.90 | Meniscus tear |
14 | Pedoia et al. [55] | 2019 | 2D U-Net, CNN | N/A | 3 T | - | Hold out (20% of sample) | Sensitivity of 89.81% and specificity of 81.98% | Meniscus tear |
15 | Roblot et al. [56] | 2019 | CNN | AlexNet, MRNet | N/A | Fast R-CNN and Faster R-CNN object detection | External validation on a test dataset of 700 images | 72.5%/0.85 | Meniscus tear |
16 | Bien et al. [27] | 2018 | CNN | AlexNet, MRNet | 3 T (56.6%)–1.5 T (43.4%) | - | 120 exams | 86.7%/0.97; 72.5%/0.85; N/A/0.94 | ACL tear; meniscus tear; abnormalities |
17 | Liu et al. [57] | 2018 | CNN | VGG16 | 3 T | - | Fellowship-trained musculoskeletal radiologist (R.K., with 15 years of clinical experience) | N/A/0.92 | Cartilage lesion |
18 | Stajduhar et al. [58] | 2017 | HOG + linSVM, HOG + RF, GIST + rbfSVM, GIST + RF | N/A | 1.5 T | Manual extraction of a rectangular ROI | 10-fold cross-validation | (Injury detection, complete rupture) = (N/A/0.89, N/A/0.94), (N/A/0.88, N/A/0.94), (N/A/0.889, N/A/0.91), (N/A/0.88, N/A/0.90) for the four models, respectively | ACL tear |
19 | Mazlan et al. [59] | 2017 | SVM | N/A | N/A | Cropping | Hold out (10% of sample) | 100%/N/A | ACL tear |
20 | Zarandi et al. [60] | 2016 | IT2FCM, PNN | N/A | N/A | - | Hold out (20% of sample) | 0 and 1 mode: 90%/N/A; Binary mode: 78%/N/A | Meniscus tear |
21 | Fu et al. [61] | 2013 | SVM | N/A | N/A | Active contours without edges, a method that combines active contours with level sets (ACLS) | 5-fold cross-validation | SVM model: N/A/0.73; SFFS + SVM: N/A/0.91 | Meniscus tear |
22 | Abdullah et al. [62] | 2013 | BP ANN, k-NN | N/A | N/A | - | 5-fold and 6-fold cross-validation | BP ANN: 94.44%/N/A; k-NN: 87.83%/N/A | ACL tear |
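
The validation and performance columns above refer to k-fold cross-validation, accuracy and AUC. As a minimal sketch of what these terms compute (not the evaluation code of any listed study), the plain-Python functions below split sample indices into k folds and score a set of predictions. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.
    Each sample lands in the test fold exactly once."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive sample scores higher than a random negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = tear, 0 = no tear) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print(accuracy(labels, scores))  # 5 of 6 correct at threshold 0.5
print(auc(labels, scores))       # 8 of 9 positive/negative pairs ranked correctly

for train, test in k_fold_indices(n_samples=10, k=5):
    print(test)
```

In practice the samples are usually shuffled before splitting, and with medical images the split is normally made per patient so that slices from the same exam never appear in both the training and the test fold. The example also shows why the two metrics can differ: accuracy depends on the chosen threshold, whereas AUC depends only on the ranking of the scores.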