Confusion Analysis in Learning Based on EEG Signals

Human–computer interaction (HCI) plays a significant role in modern education, and emotion recognition is essential in the field of HCI. The potential of emotion recognition in education remains to be explored. Confusion is the primary cognitive emotion during learning and significantly affects student engagement. Recent studies show that electroencephalogram (EEG) signals, obtained through electrodes placed on the scalp, are valuable for studying brain activity and identifying emotions.

  • human–computer interaction
  • electroencephalography
  • emotion recognition

1. Introduction

In modern education, human–computer interaction (HCI) plays a crucial role, and emotion recognition is particularly significant in the field of HCI. By accurately identifying and understanding students’ emotional states, educational systems can better respond to their needs and provide personalized support. Emotion recognition technology can help educators determine whether students are experiencing confusion, frustration, or focus during the learning process, enabling the timely adoption of appropriate teaching strategies and supportive measures [1,2,3]. Emotion recognition thus optimizes the teaching process, enhances learning outcomes, and provides students with more personalized support and guidance. Confusion is more common than other emotions in the learning process [4,5,6]. Although confusion is an unpleasant emotion, addressing it during controllable periods has been shown to benefit learning [7,8,9], as it promotes active student engagement in learning activities. However, research on learning confusion is still in its early stages and requires further exploration.
Electroencephalography (EEG) is considered a physiological indicator of the aggregated electrical activity of neurons in the human brain’s cortex. EEG is employed to record such activities and, compared to non-physiological indicators like facial expressions and gestures, offers a relatively objective assessment of emotions, making it a reliable tool for emotion recognition [10].
Traditionally, the classification of EEG signals relies on manual feature extraction followed by machine learning classifiers [11], such as Naive Bayes, SVM, and Random Forest. Deep-learning architectures are a more recent introduction and have consistently improved performance [12]. Convolutional Neural Networks (CNNs) and Long Short-Term Memory networks (LSTMs) are the primary architectures employed [13]. However, CNN-based feature extraction focuses primarily on local patterns, which hinders the perception of temporal information. LSTM-based approaches, although they perform well, also struggle to represent global temporal structure. Various end-to-end hybrid networks [14] have been attempted, but these efforts have produced models with excessively intricate architectures, leading to slow convergence or even failure to converge. Furthermore, end-to-end methods forgo the advantages that conventional feature extraction offers for representing EEG signals. The Transformer [15] has demonstrated formidable capabilities in natural language processing (NLP), owing to its significant advantage in capturing global semantics, but its application to EEG systems remains an area that requires further exploration; a minimal sketch of this idea follows.
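As a concrete illustration of this direction, the sketch below applies a standard Transformer encoder to a sequence of per-window EEG feature vectors, so that self-attention can relate every time window to every other. This is a minimal sketch assuming PyTorch; the model, its dimensions, and the window-feature representation are illustrative assumptions, not an architecture from the cited works.

```python
import torch
import torch.nn as nn

class EEGTransformerClassifier(nn.Module):
    """Illustrative Transformer encoder over windowed EEG features (not from the cited works)."""
    def __init__(self, n_features=64, d_model=128, n_heads=4,
                 n_layers=2, n_classes=2, max_len=512):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)                 # project features to model dim
        self.pos = nn.Parameter(torch.zeros(1, max_len, d_model))   # learned positional encoding
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                            # x: (batch, time_windows, n_features)
        h = self.embed(x) + self.pos[:, :x.size(1)]  # add positional information
        h = self.encoder(h)                          # self-attention sees the whole sequence
        return self.head(h.mean(dim=1))              # average over time, then classify

# Hypothetical usage: 8 trials, each split into 20 windows of 64 features.
logits = EEGTransformerClassifier()(torch.randn(8, 20, 64))
```

Unlike a CNN, whose receptive field grows only with depth, every encoder layer here attends across the full sequence of windows, which is precisely the global temporal context discussed above.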

2. Confusion Analysis in Learning Based on EEG Signals

Confusion in learning refers to feeling perplexed or uncertain while absorbing knowledge or solving problems. Given its shared attributes with emotions, it is a nascent study area, primarily concerned with whether confusion should be classified as an emotion or an affective state. Confusion is deemed a cognitive emotion, indicating a state of cognitive imbalance [9,16]. Individuals are encouraged to introspect and deliberate upon the material to redress this imbalance and make progress, enabling a more profound comprehension. Consequently, when confused, individuals tend to activate deep cognitive processes in pursuit of enhanced learning outcomes. The investigation of confusion within the learning context remains in its preliminary stages.

Using EEG to recognize human emotions during various activities, including learning, is an area currently being explored. Recent research has focused on using electroencephalography to study cognitive states and emotions for educational purposes, addressing attention or engagement [17,18], cognitive load, and basic emotions such as happiness and fear. For example, researchers [19] used an EEG-based brain–computer interface (BCI) to record EEG from the FP1 region to track changes in attention. By utilizing visual and auditory cues, such as rhythmic hand raising, adaptive proxy robots can help students refocus when their attention falls below a preset threshold. The results indicate that this BCI can improve learning performance.

Most traditional EEG-based classification methods rely on two steps: feature extraction and classification, and emotion classification is no exception. Many researchers have focused on finding effective EEG features for classification, and advances in machine learning methods have significantly contributed to these traditional approaches; a minimal sketch of such a two-step pipeline is given after this paragraph. There have been attempts based on the Common Spatial Pattern (CSP) algorithm [20], such as the FBCSP algorithm [21], which filters signals through filter banks, computes CSP energy features for each time-filtered output, and then selects and classifies these features. Despite enhancements to the original CSP method, these techniques analyze only the CSP energy dimension and disregard temporal contextual information. Kaneshiro et al. [11] applied Principal Component Analysis (PCA) to extract fixed-size feature vectors from minimally preprocessed EEG signals and then trained a classifier based on Linear Discriminant Analysis (LDA). Karimi-Rouzbahani et al. [22] explored the discriminative power of many statistical and mathematical features; their experiments on three datasets showed that multi-valued features such as wavelet coefficients and the theta frequency band performed better. Zheng et al. [23] investigated the pivotal frequency bands and channels of multi-channel EEG data for emotion recognition. Jensen and Tesche [24] and Bashivan et al. [25] demonstrated experimentally that cortical oscillatory activity associated with memory operations exists primarily in the theta (4–7 Hz), alpha (8–13 Hz), and beta (13–30 Hz) frequency bands. The studies above explore critical frequency bands and channels with traditional machine learning classifiers; nevertheless, such classifiers do not demonstrate any clear performance advantage.
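To make the two-step pipeline concrete, the following sketch extracts mean band power in the theta, alpha, and beta bands named above (step one) and feeds it to an LDA classifier (step two). It assumes SciPy and scikit-learn are available; the sampling rate, trial shapes, and random data are illustrative placeholders, not values from the cited studies.

```python
import numpy as np
from scipy.signal import welch
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Frequency bands as cited above (Hz).
BANDS = {"theta": (4, 7), "alpha": (8, 13), "beta": (13, 30)}

def band_power_features(trials, fs=128):
    """trials: (n_trials, n_channels, n_samples) -> (n_trials, n_channels * n_bands)."""
    freqs, psd = welch(trials, fs=fs, nperseg=fs)       # PSD along the last (time) axis
    feats = [psd[..., (freqs >= lo) & (freqs <= hi)].mean(axis=-1)
             for lo, hi in BANDS.values()]              # mean power per band, per channel
    return np.concatenate(feats, axis=-1)

# Hypothetical data: 100 trials, 14 channels, 2 s at 128 Hz, binary labels.
X = band_power_features(np.random.randn(100, 14, 256))
y = np.random.randint(0, 2, 100)
clf = LinearDiscriminantAnalysis().fit(X, y)            # step two: classification
```

Because the feature extractor and the classifier are fitted independently, nothing ties the chosen features to what the classifier actually needs, which motivates the end-to-end approaches discussed next.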
In addition, optimizing feature extraction and the classifier separately can result in suboptimal global solutions. Compared to traditional methods, end-to-end deep networks eliminate the need for manual feature extraction. For most EEG applications, shallow models have been observed to yield good results, while deeper models may degrade performance [12,13]. CNN-based classifiers in particular, despite their shallow architectures with few parameters, have been widely adopted: DeepConvNet [12], EEGNet [26], ResNet [27], and other variants [28] (a simplified sketch of such a shallow network is given below). However, because of the limitations imposed by kernel size, CNNs learn features with local receptive fields and cannot capture the long-term dependencies that are crucial for time-series analysis. Recurrent Neural Networks (RNNs) and Long Short-Term Memory networks (LSTMs) have therefore been introduced to capture the temporal features of EEG for classification [29,30]. However, these models cannot be trained in parallel, and the dependencies propagated through hidden states quickly vanish after a few time steps, making it challenging to capture global temporal dependencies. Moreover, end-to-end methods insist on learning from raw signals with deep networks, often overlooking the advantages of manual feature extraction, and their complex architectures can make convergence difficult.
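The sketch below shows the shallow temporal-then-spatial convolution pattern common to these EEG CNNs. It is loosely inspired by EEGNet [26] but is not the published architecture; it assumes PyTorch, and the channel count, sample length, and layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class ShallowEEGNet(nn.Module):
    """Simplified shallow EEG CNN (illustrative; not the published EEGNet)."""
    def __init__(self, n_channels=14, n_samples=256, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, (1, 64), padding=(0, 32), bias=False),  # temporal convolution
            nn.BatchNorm2d(8),
            nn.Conv2d(8, 16, (n_channels, 1), bias=False),          # spatial conv across electrodes
            nn.BatchNorm2d(16),
            nn.ELU(),
            nn.AvgPool2d((1, 4)),                                   # temporal pooling
            nn.Dropout(0.5),
        )
        with torch.no_grad():                                       # infer the flattened size
            n_flat = self.features(torch.zeros(1, 1, n_channels, n_samples)).numel()
        self.classify = nn.Linear(n_flat, n_classes)

    def forward(self, x):                      # x: (batch, 1, channels, samples)
        return self.classify(self.features(x).flatten(1))

# Hypothetical usage: 8 raw-EEG trials of 14 channels x 256 samples.
logits = ShallowEEGNet()(torch.randn(8, 1, 14, 256))
```

Note how every learned filter spans at most 64 samples in time: stacking a few such layers still leaves the receptive field local, which is exactly the limitation on long-term dependencies described above.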

References

  1. Xu, T.; Wang, J.; Zhang, G.; Zhang, L.; Zhou, Y. Confused or not: Decoding brain activity and recognizing confusion in reasoning learning using EEG. J. Neural Eng. 2023, 20, 026018.
  2. Peng, T.; Liang, Y.; Wu, W.; Ren, J.; Pengrui, Z.; Pu, Y. CLGT: A graph transformer for student performance prediction in collaborative learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 15947–15954.
  3. Liang, Y.; Peng, T.; Pu, Y.; Wu, W. HELP-DKT: An interpretable cognitive model of how students learn programming based on deep knowledge tracing. Sci. Rep. 2022, 12, 4012.
  4. Baker, R.S.; D’Mello, S.K.; Rodrigo, M.M.T.; Graesser, A.C. Better to be frustrated than bored: The incidence, persistence, and impact of learners’ cognitive–affective states during interactions with three different computer-based learning environments. Int. J. Hum.-Comput. Stud. 2010, 68, 223–241.
  5. Han, Z.M.; Huang, C.Q.; Yu, J.H.; Tsai, C.C. Identifying patterns of epistemic emotions with respect to interactions in massive online open courses using deep learning and social network analysis. Comput. Hum. Behav. 2021, 122, 106843.
  6. Lehman, B.; Matthews, M.; D’Mello, S.; Person, N. What are you feeling? Investigating student affective states during expert human tutoring sessions. In Proceedings of the International Conference on Intelligent Tutoring Systems, Montreal, QC, Canada, 23–27 June 2008; Springer: Berlin/Heidelberg, Germany, 2008; pp. 50–59.
  7. Lehman, B.; D’Mello, S.; Graesser, A. Confusion and complex learning during interactions with computer learning environments. Internet High. Educ. 2012, 15, 184–194.
  8. D’Mello, S.; Lehman, B.; Pekrun, R.; Graesser, A. Confusion can be beneficial for learning. Learn. Instr. 2014, 29, 153–170.
  9. Vogl, E.; Pekrun, R.; Murayama, K.; Loderer, K.; Schubert, S. Surprise, curiosity, and confusion promote knowledge exploration: Evidence for robust effects of epistemic emotions. Front. Psychol. 2019, 10, 2474.
  10. Gunes, H.; Piccardi, M. Bi-modal emotion recognition from expressive face and body gestures. J. Netw. Comput. Appl. 2007, 30, 1334–1345.
  11. Kaneshiro, B.; Perreau Guimaraes, M.; Kim, H.S.; Norcia, A.M.; Suppes, P. A representational similarity analysis of the dynamics of object processing using single-trial EEG classification. PLoS ONE 2015, 10, e0135697.
  12. Schirrmeister, R.T.; Springenberg, J.T.; Fiederer, L.D.J.; Glasstetter, M.; Eggensperger, K.; Tangermann, M.; Hutter, F.; Burgard, W.; Ball, T. Deep learning with convolutional neural networks for EEG decoding and visualization. Hum. Brain Mapp. 2017, 38, 5391–5420.
  13. Roy, Y.; Banville, H.; Albuquerque, I.; Gramfort, A.; Falk, T.H.; Faubert, J. Deep learning-based electroencephalography analysis: A systematic review. J. Neural Eng. 2019, 16, 051001.
  14. Yang, Y.; Wu, Q.; Qiu, M.; Wang, Y.; Chen, X. Emotion recognition from multi-channel EEG through parallel convolutional recurrent neural network. In Proceedings of the IEEE 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–7.
  15. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008.
  16. Xu, T.; Zhou, Y.; Wang, Z.; Peng, Y. Learning emotions EEG-based recognition and brain activity: A survey study on BCI for intelligent tutoring system. Procedia Comput. Sci. 2018, 130, 376–382.
  17. Huang, J.; Yu, C.; Wang, Y.; Zhao, Y.; Liu, S.; Mo, C.; Liu, J.; Zhang, L.; Shi, Y. FOCUS: Enhancing children’s engagement in reading by using contextual BCI training sessions. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Toronto, ON, Canada, 26 April–1 May 2014; pp. 1905–1908.
  18. Xu, T.; Wang, X.; Wang, J.; Zhou, Y. From textbook to teacher: An adaptive intelligent tutoring system based on BCI. In Proceedings of the 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Guadalajara, Mexico, 1–5 November 2021; pp. 7621–7624.
  19. Xu, J.; Zhong, B. Review on portable EEG technology in educational research. Comput. Hum. Behav. 2018, 81, 340–349.
  20. Ramoser, H.; Muller-Gerking, J.; Pfurtscheller, G. Optimal spatial filtering of single trial EEG during imagined hand movement. IEEE Trans. Rehabil. Eng. 2000, 8, 441–446.
  21. Ang, K.K.; Chin, Z.Y.; Wang, C.; Guan, C.; Zhang, H. Filter bank common spatial pattern algorithm on BCI competition IV datasets 2a and 2b. Front. Neurosci. 2012, 6, 39.
  22. Karimi-Rouzbahani, H.; Shahmohammadi, M.; Vahab, E.; Setayeshi, S.; Carlson, T. Temporal codes provide additional category-related information in object category decoding: A systematic comparison of informative EEG features. bioRxiv 2020.
  23. Zheng, W.L.; Lu, B.L. Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks. IEEE Trans. Auton. Ment. Dev. 2015, 7, 162–175.
  24. Jensen, O.; Tesche, C.D. Frontal theta activity in humans increases with memory load in a working memory task. Eur. J. Neurosci. 2002, 15, 1395–1399.
  25. Bashivan, P.; Bidelman, G.M.; Yeasin, M. Spectrotemporal dynamics of the EEG during working memory encoding and maintenance predicts individual behavioral capacity. Eur. J. Neurosci. 2014, 40, 3774–3784.
  26. Lawhern, V.J.; Solon, A.J.; Waytowich, N.R.; Gordon, S.M.; Hung, C.P.; Lance, B.J. EEGNet: A compact convolutional neural network for EEG-based brain–computer interfaces. J. Neural Eng. 2018, 15, 056013.
  27. Tian, T.; Wang, L.; Luo, M.; Sun, Y.; Liu, X. ResNet-50 based technique for EEG image characterization due to varying environmental stimuli. Comput. Methods Programs Biomed. 2022, 225, 107092.
  28. Kalafatovich, J.; Lee, M.; Lee, S.W. Decoding visual recognition of objects from eeg signals based on attention-driven convolutional neural network. In Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Toronto, ON, Canada, 11–14 October 2020; pp. 2985–2990.
  29. Chowdary, M.K.; Anitha, J.; Hemanth, D.J. Emotion recognition from EEG signals using recurrent neural networks. Electronics 2022, 11, 2387.
  30. Lu, P. Human emotion recognition based on multi-channel EEG signals using LSTM neural network. In Proceedings of the IEEE 2022 Prognostics and Health Management Conference (PHM-2022 London), London, UK, 27–29 May 2022; pp. 303–308.