Automated solutions for medical diagnosis based on computer vision form an emerging field of science that aims to enhance diagnosis and early disease detection. The detection and quantification of facial asymmetries enable facial palsy evaluation. Deep learning methods allow the automatic learning of discriminative deep facial features, leading to comparatively higher accuracy.
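To illustrate what quantifying a facial asymmetry can mean in practice, the following is a minimal sketch and not the method of any work listed in this entry. It assumes that paired left/right 2D landmarks and two points on the facial midline are already available from some landmark detector; the function names `reflect_across_line` and `asymmetry_index` are hypothetical, as is the choice of normalizing by the midline length.

```python
import numpy as np

def reflect_across_line(points, a, b):
    """Reflect 2D points across the line through points a and b."""
    points = np.asarray(points, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    d = (b - a) / np.linalg.norm(b - a)   # unit direction of the midline
    rel = points - a                      # points relative to a
    proj = np.outer(rel @ d, d)           # component along the midline
    return a + 2.0 * proj - rel           # flip the perpendicular component

def asymmetry_index(left, right, mid_top, mid_bottom):
    """Mean distance between left landmarks and mirrored right landmarks,
    divided by the midline length (an assumed normalization)."""
    left = np.asarray(left, dtype=float)
    mirrored = reflect_across_line(right, mid_top, mid_bottom)
    dists = np.linalg.norm(left - mirrored, axis=1)
    scale = np.linalg.norm(
        np.asarray(mid_bottom, dtype=float) - np.asarray(mid_top, dtype=float)
    )
    return float(dists.mean() / scale)

# Toy example: midline along x = 0, two landmark pairs (eye corner, mouth corner).
left = [(-2.0, 3.0), (-1.0, 7.0)]
right_symmetric = [(2.0, 3.0), (1.0, 7.0)]
right_shifted = [(2.0, 3.0), (1.0, 6.0)]  # mouth corner displaced by one unit

print(asymmetry_index(left, right_symmetric, (0.0, 0.0), (0.0, 10.0)))
print(asymmetry_index(left, right_shifted, (0.0, 0.0), (0.0, 10.0)))
```

A perfectly mirrored landmark set gives an index of zero, and the index grows as one side deviates from the other. Real systems typically combine such geometric cues with appearance features and a trained classifier.
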
Ref. | Objective | Methodology | Dataset | Performance | Conclusions/Limitations |
---|---|---|---|---|---|
[63] | Smartphone-based FP diagnostic system (five FP grades) | Linear regression model for facial landmark detection and SVM with linear kernel for classification | Private dataset of 36 subjects (23 normal, 13 palsy patients) performing 3 motions | 88.9% classification accuracy | Reproducibility under different experimental conditions, as well as repeatability of measurements over a period of time, were not implemented |
[64] | Facial movement pattern recognition for FP (2 classes, i.e., normal and asymmetric) | Active Shape Models plus Local Binary Patterns (ASMLBP) for feature extraction and SVM for classification | Private dataset of 570 images of 57 subjects with 5 facial movements | Up to 93.33% recognition rate | High robustness and accuracy |
[65] | Quantitative evaluation of FP (HB scale) | Multiresolution extension of uniform LBP and SVM for FP evaluation | Private dataset of 197 subject videos with 5 facial movements | ~94% classification accuracy | Sensitive to out-of-plane facial movements and significant natural bilateral asymmetry |
[51] | Facial landmarks tracking and feedback for FP assessment (HB scale) | Active Appearance Models (AAMs) for facial expression synthesis | Private dataset of frontal images of neutral and smile expressions from 5 healthy subjects | 87% accuracy | Preliminary results to demonstrate a proof of concept |
[66] | FP assessment | ANN | Private dataset of 43 videos from 14 subjects | 1.6% average MSE | Pilot study; general results follow the opinions of experts |
[67] | Facial asymmetry measurement | Measuring 3D asymmetry index | Three-dimensional dynamic scans from Hi4D-ADSIP database (stroke) | - | Extraction of 3D feature points, as well as potential for detecting facial dysfunctions |
[68] | FP classification of real-time facial animation units (seven FP grades) | Ensemble learning SVM classifier | Private dataset of 375 records from 13 patients and 1650 records from 50 control subjects | 96.8% accuracy, 88.9% sensitivity, 99% specificity | Data augmentation for the imbalanced dataset issues |
[69] | FP quantification | Combination of landmarks and intensity HoG-based features and a CNN model for classification | Private dataset of 125 images of left facial weakness, 126 images of right facial weakness, and 186 images of normal subjects | Up to 94.5% accuracy | The combination of landmarks and HoG intensity features produced the best results compared with either landmarks or intensity features alone |
[70] | FP classification (three classes) | HOG features and a voting classifier | Private dataset of 37 videos of left weakness, 38 of right and 60 of normal subjects | 92.9% accuracy, 93.6% precision, 92.8% recall, 94.2% specificity | Comparison with other methods revealed the reliability of HOG features |
[71] | Calculation of a symmetry metric between the two sides of the face | Facial landmark features with cascade regression and SVM | Stroke faces dataset of 1024 images and 1081 images of healthy faces | 76.87% accuracy | Problem-specific machine learning models can lead to improved performance |
[72] | FP assessment (HB scale) | Laser speckle contrast imaging and NN classifiers | Private dataset of 80 FP patients | 97.14% accuracy | Outperforms the state-of-the-art systems and other classifiers |
[73] | FP classification (three classes) | Regional handcrafted features and four classifiers (MLP, SVM, k-NN, MNLR) | YouTube Facial Palsy (YFP) database | Up to 95.58% correct classification | Severity is higher classified in eyes and mouth regions |
[75] | Face symmetry analysis (symmetrical-asymmetrical) | Unified multi-task CNN | AFLW database to fine tune the model and extended Cohn–Kanade (CK+) to learn face symmetry (18,786 images in total) | - | Lack of fully annotated training set, as well as the need for labeling or a synthesized training set |
[76] | FP classification (five grades) | CNN (VGG-16) | Dataset from online sources augmented to 2000 images | 92.6% accuracy, 92.91% precision, 93.14% sensitivity, 93% F1 score | Deep features combined with data augmentation can lead to robust classification |
[5] | FP classification | FCN | AFLFP dataset | Normalized mean error (NME): 11.5% mean, 2.3% standard deviation | Comparative results indicate that deep learning methods are, overall, better than machine learning methods |
[33] | Quantitative analysis of FP | Deep Hierarchical Network | YouTube Facial Palsy (YFP) database | 5.83% NME | Line segment learning contributes deep features that improve the accuracy of facial landmark and palsy region detection |
[77] | Quantitative analysis of FP | Hierarchical Detection Network | YouTube Facial Palsy (YFP) database | Up to 93% precision and 88% recall | Efficient for video-to-description diagnosis |
[78] | Unilateral peripheral FP assessment (HB scale) | Deep CNN | Private dataset of 720 labeled images of four facial expressions | 91.25% classification accuracy | Fine-tuning deep CNNs can learn specific representations from biomedical images |
[79] | FP grading | Fully 3D CNN | Private FP dataset of 696 sequences with 17 subjects | 82% classification accuracy | Very competent at learning spatio-temporal features |
[80] | AR system for FP estimation | Light-Weight Facial Activation Unit model (LW-FAU) | Private dataset from 20 subjects | - | Lack of FP benchmark models and datasets |
[81] | FP assessment (six classes) | FNPARCELM-CCNN method | YouTube Facial Palsy (YFP) database | 85.5% accuracy | Semi-supervised methods can distinguish different degrees of FP, even with little labeled data |
[82] | FP detection and classification | Deep feature extraction with SqueezeNet and ECOC-SVM classifier | YouTube Facial Palsy (YFP) database | 99.34% accuracy | Improvement in FP detection from a small dataset |
[83] | Part segmentation | Point-Net++ and PointCNN | CT images of 33 subjects | 99.19% accuracy, 89.09% IOU | Geometric deep learning can be efficient |
[84] | FP asymmetry analysis | Proposed deep architecture | YouTube Facial Palsy (YFP) database | 93.8% IOU | Poor with bearded faces due to a lack of such training data images |
Several conclusions can be drawn from Table 4. Datasets designated for palsy detection and evaluation are clearly scarce: most research teams build their own private sets to test their algorithms. The most used public dataset among the referenced works is the YFP dataset; however, it is a small, video-based dataset. The videos are converted into image sequences, but mild dysfunctions are not easily visible in a single image, so a sequence of frames must be examined to draw conclusions. Moreover, the dataset is labeled, but facial landmark points are not annotated. Table 4 also indicates that deep learning methods achieve better performance than classical machine learning methods or methods relying on hand-crafted features.
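The Performance column of Table 4 mixes several metrics (accuracy, sensitivity, specificity, precision, IOU, NME), so the figures are not directly comparable across rows. As a reading aid, the following sketch shows the standard textbook definitions of these metrics. The function names are hypothetical, and the normalization distance used for NME is an assumption here, since it varies between works (for example, inter-ocular distance or face bounding-box size).

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall), specificity, and precision for
    binary labels. A zero denominator yields nan."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)     # palsy correctly detected
    tn = np.sum(~y_true & ~y_pred)   # normal correctly rejected
    fp = np.sum(~y_true & y_pred)    # normal flagged as palsy
    fn = np.sum(y_true & ~y_pred)    # palsy missed
    return {
        "accuracy": (tp + tn) / y_true.size,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }

def iou(mask_true, mask_pred):
    """Intersection over union of two binary masks of the same shape."""
    a = np.asarray(mask_true, dtype=bool)
    b = np.asarray(mask_pred, dtype=bool)
    return np.sum(a & b) / np.sum(a | b)

def nme(pred_landmarks, true_landmarks, norm_dist):
    """Mean Euclidean landmark error divided by a normalization distance
    (the choice of norm_dist is an assumption and differs between works)."""
    pred = np.asarray(pred_landmarks, dtype=float)
    true = np.asarray(true_landmarks, dtype=float)
    errors = np.linalg.norm(pred - true, axis=1)
    return errors.mean() / norm_dist

# Toy data for illustration only; not taken from any referenced work.
m = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1])
print(m)
print(iou([1, 1, 0, 0], [0, 1, 1, 0]))
print(nme([[0, 0], [3, 4]], [[0, 0], [0, 0]], norm_dist=10.0))
```

On an imbalanced dataset, accuracy alone can look high even when sensitivity is low, which is one reason to read the reported figures together with the dataset composition.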
This entry is adapted from the peer-reviewed paper 10.3390/axioms12121091