Human Emotion Recognition System: History

Emotion recognition has become an important aspect of the development of human-machine interaction (HMI) systems. Positive emotions have a beneficial effect on our lives, whereas negative emotions may reduce productivity. Emotionally intelligent systems such as chatbots and artificially intelligent assistant modules help make our daily routines effortless. Moreover, a system capable of assessing a person's emotional state would be very helpful for assessing their mental state.

  • electrocardiogram (ECG)
  • emotion classifier
  • emotion recognition system
  • human-machine interaction (HMI)
  • SVM (support vector machine)
  • RF (random forest)

1. Introduction

Emotion is an important aspect of human consciousness and drives our mental state, even subconsciously. The emotional state influences mental well-being as well as our overall health. The human emotional state results from chemical changes in the brain which affect the whole body and its overall expression and actions. These changes shape feelings, which act as an important characteristic differentiating humans from other species. We feel a diverse range of emotions, which are often situational and may be triggered by outside events. A constant state of sadness over a long period causes depression, and severe mental illness may also result in physical illness. When a person is angry, the body temperature rises and the person may even shiver. Blood pressure fluctuates in cases of intense happiness and sadness. Intense fear results in sweating and an increase in heart rate, whereas disgust and surprise may lead to a reduction in heart rate. A pleasant activity relaxes our mood and reduces stress, lowering the heart rate compared with a hyper-excited state. In short, the rhythm of the heart changes with emotion.

Researchers have tried to correlate various facial features, speech signals, and audiovisual features, as well as physiological signals such as the EEG (electroencephalogram), ECG, GSR (galvanic skin response), and respiration signals, with changes in emotion. Broadly, human emotion recognition systems are categorized into non-physiological and physiological systems. Non-physiological systems utilize facial expressions, speech, audio, and video recorded while the subject's emotions are elicited by an external stimulus. Because these features can be masked, for example, a happy person can pretend to have a serious, sad facial expression, just as a sad person can pretend to smile, physiological systems are also of merit. This second approach uses physiological signals such as ECG, EEG, GSR, and breathing signals as feature datasets to classify human emotions. Physiological signals are involuntary in their source of generation; hence, they cannot be masked or controlled by the subject. Much work has been reported using non-physiological methods of emotion recognition.

2. Human Emotion Recognition Systems

With the advancement of technology, healthcare services have become more patient-oriented. The implementation of IoT (Internet of Things) and AI (artificial intelligence) with ML (machine learning)-based systems enables us to provide the required preventive care, and such systems are widely used to develop smart healthcare solutions for biomedical applications. Diseases ranging from the most dangerous to the least dangerous are being detected using ML techniques. Machine-learning models can quickly process huge amounts of patient data, including medical histories from many hospitals, and are used in the detection and classification of diseases. For example, H. Zhu et al. presented an effective Siamese-oriented region proposal network (Siamese-ORPN) for visual tracking and proposed an efficient method of feature extraction and feature fusion [1]. W. Nie et al. illustrated a dialogue emotion detection system based on the variation of utterances represented in the form of graph models; commonsense knowledge, in addition to the dialogue itself, was utilized to enhance emotion detection accuracy, and a self-supervised learning method with optimization techniques was proposed [2]. Zhiyong Xiong et al. developed a psychotherapy tool named SandplayAR and evaluated its impact on 20 participants with mild anxiety disorder, showing the potential of the SandplayAR system with augmented sensory stimulation as an efficient psychotherapy tool [3]. Rehab A. Rayan et al. briefly reviewed the potential of IoT and AI with ML technologies in biomedical applications and established that technology accelerates the transition from hospital-centered care to patient-centered care; with the development of ML techniques, healthcare devices are capable of handling, storing, and analyzing big data automatically and quickly [4]. Giacomo Peruzzi et al. developed a small, portable, microcontroller-based sleep bruxism detection system that distinguishes the sound of bruxism from other audio signals using a CNN-based ML technique and can detect the condition remotely at the patient's house [5]. Yuka Kutsumi et al. collected bowel sound (BS) data from 100 participants using a smartphone and developed a CNN model capable of classifying BSs with an accuracy of 98% to comment on the gut health of a person [6]. Renisha Redij et al. also illustrated the application of AI in classifying BSs and explained the relevance and potential of AI-enabled systems to change gastrointestinal (GI) practice and patient care [7]. Rita Zgheib et al. reviewed the importance of artificial intelligence with machine learning and semantic reasoning in the current scenario of digital healthcare systems, and illustrated and analyzed the relevance of AI and ML technologies in handling the COVID-19 pandemic [8]. Yash Jain et al. developed an ML-based healthcare management system which can act as a virtual doctor, providing a preliminary diagnosis based on the information supplied by the subject; a CNN-based ML technique was utilized, a GUI was developed, and the system also includes an emotion classifier. Such technologies are surely going to contribute to digital healthcare management systems in the future [9].

2.1. Non-Physiological Signal-Based Emotion Classifiers

The non-physiological method of emotion classification takes as inputs responses such as speech, audio, video, and facial expressions corresponding to various emotions. K. P. Seng et al. reported an audiovisual feature-extraction algorithm and emotion classification technique in which an RFA neural classifier was used to fuse kernel and Laplacian matrices of the visual path; emotion recognition for seven types of expressions achieved an accuracy of 96.11% on the CK+ database and 86.67% on the ENTERFACE05 database [10]. Facial expressions, however, can be masked by a subject who controls his/her reactions. T. M. Wani et al. briefly reviewed various speech emotion recognition (SER) techniques, listing the publicly available databases of speech signals in different languages along with the models developed on them, explaining various feature extraction algorithms, and illustrating the relevance and details of classifiers such as GMM, HMM, ANN, SVM, KNN, and DNN in speech-based emotion recognition [11]. The research gap includes the selection of robust features and machine-learning-based classification techniques to improve the accuracy of emotion recognition systems. M. S. Hossain et al. illustrated a real-time mobile-based emotion recognition system with modest computational requirements: facial video acquired with the inbuilt camera of a mobile phone was used as data, and the bandlet transform was combined with the Kruskal–Wallis feature selection method; the CK and JAFFE databases were used to achieve a maximum accuracy of more than 99% [12]. Mobile systems, however, are limited in their data handling capacity and computational efficiency. S. Hamsa et al. utilized a pitch correlogram and a random-forest-classifier-based deep-learning technique to recognize human emotions in noisy and stressful environments; four datasets, the private Arabic ESD dataset and the public English SUSAS, RAVDESS, and SAVEE datasets, were processed to extract features after noise reduction, and an average accuracy of more than 80% was achieved [13]. S. Hamsa et al. also proposed an emotionally intelligent system to identify the emotion of an unknown speaker using energy, time, and spectral features on three distinct speech datasets in two different languages; random forest classifiers were used to classify six kinds of emotions and achieved a maximum accuracy of 89.60% [14]. L. Chen et al. proposed a dynamic emotion recognition system based on facial key features using an Adaboost-KNN adaptive feature optimization technique for human–robot interaction; Adaboost, KNN, and SVM were used for emotion classification, and a maximum accuracy of 94.28% was reported [15]. S. Thuseethan et al. proposed a deep-learning-based technique for recognizing unknown (emerging) facial expressions, presenting a CNN-based architecture whose efficacy was evaluated on a benchmark emotion dataset with a maximum accuracy of 86.22% [16]. Hira Hameed et al. reported a contactless British Sign Language detection system and classified five emotions from spatiotemporal features acquired with a radar system, using deep-learning models such as InceptionV3, VGG16, and VGG19 to achieve a maximum accuracy of 93.33% [17]. In [18], the authors presented a contextual cross-modal transformer module for the fusion of textual and audio modalities, operated on the IEMOCAP and MELD datasets, and achieved a maximum accuracy of 84.27%.
In [19], the authors illustrated a speech emotion recognition technique based on frequency-domain features of an Arabic dataset using SVM, KNN, and MLP techniques, achieving a maximum recognition accuracy of 77.14%. In [20], the authors proposed a fusion model at both the feature level (with an LSTM network) and the decision level for happy-emotion recognition and achieved a maximum accuracy of 95.97%. Non-physiological signals, however, are easily masked by the subject: facial expressions can be controlled, and speech tones can be modulated intentionally.
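
As a concrete illustration of the feature-plus-classifier pattern that recurs in the speech-based studies above, the following sketch trains a random forest on summary MFCC statistics of labeled utterances. It is a minimal, hypothetical example: the synthetic signals, emotion labels, and feature choices are placeholders and do not reproduce any of the cited systems.

```python
# Hypothetical MFCC-summary features with a random forest classifier.
# The synthetic "utterances", labels, and feature choices are placeholders,
# not a reproduction of any system cited above.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

SR = 16000  # assumed sampling rate of the utterances

def mfcc_summary(signal, sr=SR, n_mfcc=13):
    """Summarize one utterance as the mean and std of its MFCCs (a common baseline)."""
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)    # shape: (n_mfcc, frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])   # shape: (2 * n_mfcc,)

# Placeholder corpus: in practice these would be real labeled recordings
# (e.g., loaded with librosa.load); synthetic 2 s signals keep the sketch self-contained.
rng = np.random.default_rng(42)
signals = [rng.standard_normal(2 * SR).astype(np.float32) for _ in range(40)]
labels = np.array(["happy", "sad", "angry", "neutral"] * 10)       # hypothetical emotion tags

X = np.vstack([mfcc_summary(s) for s in signals])
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, stratify=labels, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

Swapping the random forest for an SVM or KNN from scikit-learn changes only the classifier line, which mirrors how the surveyed papers compare classifier families on a fixed feature set.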

2.2. Physiological Signal-Based Emotion Classifiers

Researchers have also explored physiological methods of emotion detection; mainly, EEG and GSR signals have been used to develop classifier models. Even so, unimodal ECG-based human emotion recognition systems offering high accuracy with contactless acquisition of the ECG signal remain relatively unexplored. In contactless systems, acquiring ECG data with minimal artefacts has always been a challenge for researchers.
M. R. Islam et al. conducted an extensive review of EEG-based emotion recognition techniques in two categories, deep-learning- and shallow-learning-based models. A very detailed list of features used by researchers for the development of emotion classification models was reported; the paper analyzed the relevance of features, classifier models, and publicly available datasets, and the authors minutely identified the advantages, as well as the issues, of each technique reported in the domain and suggested possible ways to overcome them [21]. E. P. Torres et al. applied RF and deep-learning algorithms to classify emotional states in stock trading behavior using EEG features (five frequency bands, differential entropy (DE), DASM, and RASM); the relevance of each feature was identified by a chi-square test, and a maximum accuracy of 83.18% was achieved [22]. T. Song et al. developed a multimodal physiological signal database which includes EEG, ECG, GSR, and respiration signals; video clips were selected to induce emotions, SVM and KNN were used to classify them, a novel A-LSTM was proposed to obtain more distinctive features, the Spearman correlation coefficient was used to identify negatively and positively correlated emotions, and the database was shared publicly [23]. L. D. Sharma et al. used publicly available ECG and EEG databases; features were extracted by decomposing the signals into reconstructed components with sliding-mode spectral analysis, machine-learning techniques were used for classification, and the two databases analyzed, DREAMER and AMIGOS, yielded a maximum accuracy of 92.38% [24]. G. Li et al. used the EEG-based SEED dataset and experimentally performed batch normalization; an LR classifier implemented on PSD (power spectral density) features of the EEG signals improved the recognition accuracy of the system to 89.63% [25]. A. Goshvarpour et al. examined the effectiveness of the matching pursuit algorithm for emotion recognition; they acquired ECG and GSR data from 16 students exposed to emotional music clips and developed an emotion recognition system based on machine-learning classification tools (such as PCA-KNN) and discriminant analysis, achieving a 100% accuracy rate and concluding that ECG is a more effective parameter for classifying emotions than GSR [26]. The sample size taken, however, was small. Huanpu Yin et al. presented a contactless IoT user identification and emotion recognition technique in which a multi-scale neural network with a mmWave radar system was designed for accurate and robust sensing, achieving a maximum accuracy of 87.68% [27]. Alex Sepulveda et al. established the use of ECG signal features extracted from the AMIGOS database using the wavelet scattering transform and classified emotions using KNN, SVM, ensemble methods, etc., to achieve a maximum accuracy of 89.40% [28]. Muhammad Anas Hasnul et al. reviewed emotion recognition systems based on ECG signals, discussed the emotional relevance of various heart nodes, highlighted systems with an accuracy of more than 90%, and validated the publicly available databases [29]. In [30], the authors commented on the relationship between emotional state and personality traits, in which EEG, ECG, and GSR together with facial features were utilized to establish a non-linear correlation.
In [31], the authors proposed a deep fusion multimodal model to enhance the accuracy of class separation for emotion recognition on the DEAP and DECAF datasets and achieved a maximum accuracy of 70%. An emotion recognition system with multiple modalities, including facial expression, GSR, and EEG, using the LUMED-2 and DEAP datasets to classify seven emotions with a maximum accuracy of 81.2% has also been reported [32]. In [33], the authors reported a hybrid sensor fusion approach to develop a user-independent emotion recognition system; WMD-DTW (a weighted multi-dimensional dynamic time warping) and KNN were used on the E4 and MAHNOB datasets to achieve a maximum accuracy of 94%. A real-time IoT- and LSTM-based emotion recognition system utilizing physiological signals was reported to classify emotions with an F-score of up to 95% using deep-learning techniques [34]. Many of the above studies either relied on publicly available data or investigated too few subjects, and the classification accuracies achieved in the reported work leave room for improvement. In particular, the subjects investigated in [26] are too few to claim 100% accuracy, and the generalization of that classifier may be questionable.
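
To make the ECG pipeline discussed in this subsection concrete, the following sketch detects R peaks, derives a few heart-rate-variability (HRV) statistics per recording, and trains an SVM on them. It is a minimal illustration under stated assumptions: the synthetic recordings, sampling rate, labels, and feature set are hypothetical placeholders rather than the method of any cited study.

```python
# Hypothetical ECG-based emotion classifier: R-peak detection, simple HRV
# features, and an SVM. All data below are synthetic stand-ins.
import numpy as np
from scipy.signal import find_peaks
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

FS = 256  # assumed ECG sampling rate in Hz

def synth_ecg(mean_rr, seconds=30, fs=FS, rng=None):
    """Toy ECG stand-in: unit spikes at roughly mean_rr intervals plus noise."""
    if rng is None:
        rng = np.random.default_rng(0)
    sig = 0.05 * rng.standard_normal(seconds * fs)
    t = 0.0
    while t < seconds:
        sig[int(t * fs)] += 1.0                                # pseudo R peak
        t += max(0.3, mean_rr + 0.05 * rng.standard_normal())  # jittered RR interval
    return sig

def hrv_features(ecg, fs=FS):
    """R-peak detection followed by basic RR-interval statistics (mean, SDNN, RMSSD)."""
    peaks, _ = find_peaks(ecg, height=0.5, distance=int(0.4 * fs))
    rr = np.diff(peaks) / fs                                   # RR intervals in seconds
    return np.array([rr.mean(), rr.std(), np.sqrt(np.mean(np.diff(rr) ** 2))])

# Hypothetical dataset: faster rhythm labeled "stress", slower rhythm labeled "calm".
rng = np.random.default_rng(1)
recordings = [synth_ecg(0.6, rng=rng) for _ in range(10)] + \
             [synth_ecg(0.9, rng=rng) for _ in range(10)]
labels = np.array(["stress"] * 10 + ["calm"] * 10)

X = np.vstack([hrv_features(r) for r in recordings])
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X, labels)
print("training predictions:", model.predict(X[:4]))
```

Swapping SVC for a random forest, or the synthetic epochs for segments drawn from a public database such as AMIGOS or DREAMER, keeps the same structure, which is essentially the feature-plus-classifier workflow the surveyed ECG studies describe.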

This entry is adapted from the peer-reviewed paper 10.3390/diagnostics13122097

References

  1. Zhu, H.; Xue, M.; Wang, Y.; Yuan, G.; Li, X. Fast Visual Tracking With Siamese Oriented Region Proposal Network. IEEE Signal Process. Lett. 2022, 29, 1437–1441.
  2. Nie, W.; Bao, Y.; Zhao, Y.; Liu, A. Long Dialogue Emotion Detection Based on Commonsense Knowledge Graph Guidance. IEEE Trans. Multimed. 2023, 1–15.
  3. Xiong, Z.; Weng, X.; Wei, Y. SandplayAR: Evaluation of psychometric game for people with generalized anxiety disorder. Arts Psychother. 2022, 80, 101934.
  4. Rayan, R.A.; Zafar, I.; Rajab, H.; Zubair, M.A.M.; Maqbool, M.; Hussain, S. Impact of IoT in Biomedical Applications Using Machine and Deep Learning. In Machine Learning Algorithms for Signal and Image Processing; Wiley: Hoboken, NJ, USA, 2022; pp. 339–360. ISBN 9781119861850.
  5. Peruzzi, G.; Galli, A.; Pozzebon, A. A Novel Methodology to Remotely and Early Diagnose Sleep Bruxism by Leveraging on Audio Signals and Embedded Machine Learning. In Proceedings of the 2022 IEEE International Symposium on Measurements & Networking (M&N), Padua, Italy, 18–20 July 2022; pp. 1–6.
  6. Kutsumi, Y.; Kanegawa, N.; Zeida, M.; Matsubara, H.; Murayama, N. Automated Bowel Sound and Motility Analysis with CNN Using a Smartphone. Sensors 2022, 23, 407.
  7. Redij, R.; Kaur, A.; Muddaloor, P.; Sethi, A.K.; Aedma, K.; Rajagopal, A.; Gopalakrishnan, K.; Yadav, A.; Damani, D.N.; Chedid, V.G.; et al. Practicing Digital Gastroenterology through Phonoenterography Leveraging Artificial Intelligence: Future Perspectives Using Microwave Systems. Sensors 2023, 23, 2302.
  8. Zgheib, R.; Chahbandarian, G.; Kamalov, F.; El Messiry, H.; Al-Gindy, A. Towards an ML-based semantic IoT for pandemic management: A survey of enabling technologies for COVID-19. Neurocomputing 2023, 528, 160–177.
  9. Jain, Y.; Gandhi, H.; Burte, A.; Vora, A. Mental and Physical Health Management System Using ML, Computer Vision and IoT Sensor Network. In Proceedings of the 2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 5–7 November 2020; pp. 786–791.
  10. Seng, K.P.; Ang, L.-M.; Ooi, C.S. A Combined Rule-Based & Machine Learning Audio-Visual Emotion Recognition Approach. IEEE Trans. Affect. Comput. 2018, 9, 3–13.
  11. Wani, T.M.; Gunawan, T.S.; Qadri, S.A.A.; Kartiwi, M.; Ambikairajah, E. A Comprehensive Review of Speech Emotion Recognition Systems. IEEE Access 2021, 9, 47795–47814.
  12. Hossain, M.S.; Muhammad, G. An Emotion Recognition System for Mobile Applications. IEEE Access 2017, 5, 2281–2287.
  13. Hamsa, S.; Iraqi, Y.; Shahin, I.; Werghi, N. An Enhanced Emotion Recognition Algorithm Using Pitch Correlogram, Deep Sparse Matrix Representation and Random Forest Classifier. IEEE Access 2021, 9, 87995–88010.
  14. Hamsa, S.; Shahin, I.; Iraqi, Y.; Werghi, N. Emotion Recognition from Speech Using Wavelet Packet Transform Cochlear Filter Bank and Random Forest Classifier. IEEE Access 2020, 8, 96994–97006.
  15. Chen, L.; Li, M.; Su, W.; Wu, M.; Hirota, K.; Pedrycz, W. Adaptive Feature Selection-Based AdaBoost-KNN With Direct Optimization for Dynamic Emotion Recognition in Human–Robot Interaction. IEEE Trans. Emerg. Top. Comput. Intell. 2021, 5, 205–213.
  16. Thuseethan, S.; Rajasegarar, S.; Yearwood, J. Deep Continual Learning for Emerging Emotion Recognition. IEEE Trans. Multimed. 2021, 24, 4367–4380.
  17. Hameed, H.; Usman, M.; Tahir, A.; Ahmad, K.; Hussain, A.; Imran, M.A.; Abbasi, Q.H. Recognizing British Sign Language Using Deep Learning: A Contactless and Privacy-Preserving Approach. IEEE Trans. Comput. Soc. Syst. 2022, 1–9.
  18. Yang, D.; Huang, S.; Liu, Y.; Zhang, L. Contextual and Cross-Modal Interaction for Multi-Modal Speech Emotion Recognition. IEEE Signal Process. Lett. 2022, 29, 2093–2097.
  19. Aljuhani, R.H.; Alshutayri, A.; Alahdal, S. Arabic Speech Emotion Recognition from Saudi Dialect Corpus. IEEE Access 2021, 9, 127081–127085.
  20. Samadiani, N.; Huang, G.; Hu, Y.; Li, X. Happy Emotion Recognition from Unconstrained Videos Using 3D Hybrid Deep Features. IEEE Access 2021, 9, 35524–35538.
  21. Islam, M.R.; Moni, M.A.; Islam, M.M.; Rashed-Al-Mahfuz, M.; Islam, M.S.; Hasan, M.K.; Hossain, M.S.; Ahmad, M.; Uddin, S.; Azad, A.; et al. Emotion Recognition from EEG Signal Focusing on Deep Learning and Shallow Learning Techniques. IEEE Access 2021, 9, 94601–94624.
  22. Torres, E.P.; Torres, E.A.; Hernandez-Alvarez, M.; Yoo, S.G. Emotion Recognition Related to Stock Trading Using Machine Learning Algorithms With Feature Selection. IEEE Access 2020, 8, 199719–199732.
  23. Song, T.; Zheng, W.; Lu, C.; Zong, Y.; Zhang, X.; Cui, Z. MPED: A Multi-Modal Physiological Emotion Database for Discrete Emotion Recognition. IEEE Access 2019, 7, 12177–12191.
  24. Sharma, L.D.; Bhattacharyya, A. A Computerized Approach for Automatic Human Emotion Recognition Using Sliding Mode Singular Spectrum Analysis. IEEE Sens. J. 2021, 21, 26931–26940.
  25. Li, G.; Ouyang, D.; Yuan, Y.; Li, W.; Guo, Z.; Qu, X.; Green, P. An EEG Data Processing Approach for Emotion Recognition. IEEE Sens. J. 2022, 22, 10751–10763.
  26. Goshvarpour, A.; Abbasi, A.; Goshvarpour, A. An accurate emotion recognition system using ECG and GSR signals and matching pursuit method. Biomed. J. 2017, 40, 355–368.
  27. Yin, H.; Yu, S.; Zhang, Y.; Zhou, A.; Wang, X.; Liu, L.; Ma, H.; Liu, J.; Yang, N. Let IoT Knows You Better: User Identification and Emotion Recognition through Millimeter Wave Sensing. IEEE Internet Things J. 2022, 10, 1149–1161.
  28. Sepúlveda, A.; Castillo, F.; Palma, C.; Rodriguez-Fernandez, M. Emotion Recognition from ECG Signals Using Wavelet Scattering and Machine Learning. Appl. Sci. 2021, 11, 4945.
  29. Hasnul, M.A.; Aziz, N.A.A.; Alelyani, S.; Mohana, M.; Aziz, A.A. Electrocardiogram-Based Emotion Recognition Systems and Their Applications in Healthcare—A Review. Sensors 2021, 21, 5015.
  30. Subramanian, R.; Wache, J.; Abadi, M.K.; Vieriu, R.L.; Winkler, S.; Sebe, N. Ascertain: Emotion and personality recognition using commercial sensors. IEEE Trans. Affect. Comput. 2018, 9, 147–160.
  31. Zhang, X.; Liu, J.; Shen, J.; Li, S.; Hou, K.; Hu, B.; Gao, J.; Zhang, T. Emotion Recognition from Multimodal Physiological Signals Using a Regularized Deep Fusion of Kernel Machine. IEEE Trans. Cybern. 2021, 51, 4386–4399.
  32. Cimtay, Y.; Ekmekcioglu, E.; Caglar-Ozhan, S. Cross-subject multimodal emotion recognition based on hybrid fusion. IEEE Access 2020, 8, 168865–168878.
  33. Albraikan, A.; Tobon, D.P.; El Saddik, A. Toward User-Independent Emotion Recognition Using Physiological Signals. IEEE Sens. J. 2019, 19, 8402–8412.
  34. Awais, M.; Raza, M.; Singh, N.; Bashir, K.; Manzoor, U.; Islam, S.U.; Rodrigues, J.J.P.C. LSTM-Based Emotion Detection Using Physiological Signals: IoT Framework for Healthcare and Distance Learning in COVID-19. IEEE Internet Things J. 2021, 8, 16863–16871.