Predictive Modeling of Student Dropout in MOOCs: History

The features of massive open online courses (MOOCs), such as internet-based massiveness, openness, and flexible learning, bring together a large and diverse body of learners, making the prediction of learner success (as well as the provision of support based on these predictions) particularly challenging.

  • Massive Open Online Courses
  • prediction
  • MOOCs

1. Massive Open Online Courses (MOOCs)

MOOCs are a distance education model established in 2008 by George Siemens. Many MOOC platforms offer an educational design built around video lectures, announcements, forums, and assessments (quizzes, assignments, etc.) [1][2]. Some MOOCs allow students to progress at their own pace, while others follow a predetermined schedule [3]. The acquisition of completion certificates serves as a motivation for many students [4]. Most students fail to complete MOOCs successfully, even if they intend to do so [5]. The challenges of MOOCs include the absence of a supporting and guiding instructor [6] and limited social interaction between teachers and students [7]; the most critical challenge, however, is the high dropout rate observed in the MOOC environment [3]. In the literature, dropout rates of 93.5% [8] and 91–93% [9] have been reported.
Many studies have extensively explored the phenomenon of dropout in MOOC courses. Several researchers have reported remarkably low completion rates, falling below 10% and even as low as 5% [3][10][11][12]. Although these figures may appear alarming, such an assessment rests on the assumption that enrollment in a MOOC is comparable to enrolling in a traditional course, which is not always the case: the intentions of individuals enrolling in a MOOC differ, with some seeking professional development while others pursue simple information or entertainment [11]. However, it is essential to acknowledge some positive exceptions to the high dropout rates. For instance, programming MOOCs have demonstrated retention rates above 60% [3][13].
Identifying and exploring the factors directly influencing the attrition of students from MOOCs will enable researchers and educators to examine novel strategies and techniques to enhance students’ persistence and successful course completion. Dalipi et al. [14] have categorized the factors contributing to high dropout rates into those associated with students (such as lack of motivation, poor time management, and inadequate background knowledge and skills) and those related to MOOCs (course design, lack of interactions, hidden costs).
Ihantola et al. [3] conducted a study to investigate the attrition rates of students in MOOCs with flexible versus strict scheduling. The findings revealed that students enrolled in MOOCs with flexible scheduling were more likely to drop out early compared to those in MOOCs with rigid schedules. In their study, approximately 17% of students in the strictly scheduled MOOC abandoned the course within the first week, while the corresponding rate for the flexible MOOC was 50%. However, after the initial week, the dropout behavior between the two versions of MOOCs became nearly similar.
Furthermore, the researchers observed that both versions of MOOCs had students who completed all computer programming assignments within a week but did not continue further, possibly due to perceiving a heavy workload. Therefore, the authors suggest that identifying the profiles of students who benefit from each type of MOOC could lead to novel, more effective methods of organizing and grading courses. Previous studies have also identified that the lack of a sense of community and ineffective social interactions and collaborations contribute to the high attrition rates in MOOCs [7][14][15].
Hone and El Said [16] also investigated factors influencing retention in MOOCs and found that 32.2% of students successfully completed their preferred courses, a rate surpassing the average completion rate. The main driver for their completion was the satisfaction derived from the course content, which was perceived as unique and not readily available elsewhere. However, non-completers identified several reasons for their discontinuation, including feelings of isolation due to inadequate communication channels, perceived complexity and technical difficulties of the courses, and a lack of engagement. A related study by Zhang [17] explored how to enhance MOOC attractiveness by aligning courses with students’ regulatory foci. The observations made in the study indicate that students with promotion-focused mindsets were more influenced by advocates emphasizing gains and positive outcomes, while prevention-focused students responded better to advocates stressing the avoidance of losses.

2. Prediction of Dropout and the SRL Factor

According to Gardner and Brooks (2018) [2], regarding the statistical models used to map features to predictions, supervised learning techniques are used far more extensively in predictive student modeling in MOOCs than unsupervised approaches, because student dropout/stopout is easily observable and can serve as a label. In supervised learning, models are trained on labelled data and, on the basis of those data, predict the output for new, unseen instances. The authors chose to present, as indicative examples, popular techniques for MOOC learner modeling with very good empirical performance in large-scale MOOC modeling studies (e.g., Dass et al., 2021 [18]). Logistic regression (LR) and support vector machines (SVM) are among the most frequently used, while naive Bayes (NB), k-nearest neighbors (kNN), and decision trees (DT) appear less often in surveys [2]. No single algorithm consistently stands out from the others; according to Herrmannova et al., 2015 [19], each model captures different properties of the input data, and their results are complementary.
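As a rough illustration of this supervised setup, the sketch below trains several of the classifiers just mentioned (LR, SVM, NB, kNN, DT) on synthetic activity-count features with a binary dropout label. It is a minimal example built on scikit-learn, not a model from any of the cited studies; the feature names, the data-generating process, and the hyperparameters are hypothetical placeholders.

```python
# Minimal sketch of supervised dropout prediction (hypothetical data and features).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 1000
# Hypothetical per-learner counts: videos watched, quiz attempts, forum posts, active days.
X = rng.poisson(lam=[5, 3, 1, 4], size=(n, 4)).astype(float)
# Hypothetical label: 1 = dropped out, 0 = completed (dropout is directly observable).
y = (rng.random(n) < 1 / (1 + np.exp(0.4 * X.sum(axis=1) - 4))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(probability=True),
    "NB": GaussianNB(),
    "kNN": KNeighborsClassifier(n_neighbors=15),
    "DT": DecisionTreeClassifier(max_depth=5),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: test AUC = {auc:.3f}")
```

Comparing the models by a threshold-independent metric such as AUC mirrors the observation above that no single algorithm dominates: on different feature sets or courses, different classifiers may come out ahead.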
Self-regulated learning (SRL) is a complex multidimensional phenomenon often described by a set of individual cognitive, social, metacognitive, and behavioral processes embedded in a cyclical model. Students with limited application of SRL strategies do not perform well [1][20][21][22]. Zimmerman [23] proposed a cyclical model consisting of three interrelated phases of the learning process. In the forethought phase, self-regulated learners set learning goals and design the strategy for their learning. This is followed by the performance and control phase, during which self-regulated learners employ strategies to process the learning material. They seek help when needed, manage their time, structure their environment, and monitor their learning processes. In the third phase of self-reflection, self-regulated learners evaluate their performance and adjust their strategies to achieve their learning goals [1][18][22][24][25].
Research using questionnaires has shown positive correlations between the mentioned SRL activity and the completion of MOOCs [24]. Despite the limitations of self-reported data [26], the large sample size enhances their usefulness. On the other hand, the use of trace data in measuring SRL has increased, but their interpretation remains challenging [26]. Jansen et al. [24] propose the combined use of SRL data from traces and questionnaires. Timely SRL support interventions in MOOCs should be considered significant pedagogical tools that contribute to achieving positive outcomes for students [20][21].
The features of MOOCs, such as internet-based massiveness, openness, and flexible learning, bring together a large and diverse body of learners, making the prediction of learner success (as well as the provision of support based on these predictions) particularly challenging. Several researchers have developed prediction models by employing machine learning (ML) algorithms [27] and adopting supervised, unsupervised, and semi-supervised architectures [28]. Deep learning methods are also utilized for predicting dropout. For instance, Moreno-Marcos et al. [29] applied a combination of random forest (RF), generalized linear model (GLM), support vector machines (SVM), and decision trees (DT). Feng et al. [10] utilized logistic regression (LR), a support vector machine with a linear kernel (SVM), random forest (RF), a gradient boosting decision tree (GBDT), and a three-layer deep neural network (DNN) for their analysis.
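The sketch below shows, in the same spirit, how ensemble and neural models of the kinds named above can be compared by cross-validation. It is not the pipeline of Feng et al. [10] or Moreno-Marcos et al. [29]; the clickstream features, simulated labels, and hyperparameters are hypothetical, and the "DNN" is only a small multilayer perceptron standing in for a three-layer network.

```python
# Minimal, self-contained comparison of RF, GBDT, and a small neural network
# on hypothetical weekly clickstream counts (1 = dropout).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 2000
# Hypothetical counts of weekly events: video plays, assignment submissions, forum posts.
X = rng.poisson(lam=[6, 2, 1], size=(n, 3)).astype(float)
y = (rng.random(n) < 1 / (1 + np.exp(0.5 * X.sum(axis=1) - 3))).astype(int)

models = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=1),
    "GBDT": GradientBoostingClassifier(random_state=1),
    # Three hidden layers, loosely mirroring the "three-layer DNN" mentioned above.
    "DNN": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32, 16, 8), max_iter=500, random_state=1),
    ),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: mean cross-validated AUC = {auc:.3f}")
```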
Diverse features, even in limited quantities, provide a more comprehensive, multidimensional view of learners and can improve the quality of predictive models. Collecting additional data, especially during the initial weeks of a course, enhances prediction performance [2][28]. For successful, timely interventions, predictive models need to be transferable, meaning they perform well on new course iterations by utilizing historical data [30][31][32] (see the sketch after this paragraph). Some researchers have examined specific aspects of SRL and observed their impact on predicting success, including goal setting and strategic planning [6], student-programmed plans [33], and the combination of self-reported SRL strategies with patterns of interaction sequences, demographic features, and intentions [34]. For example, Kizilcec et al. (2017) [6] investigated which specific SRL strategies predict the attainment of personal course goals and how they map onto specific interactions with MOOCs’ online content. To form a longitudinal account of SRL, the authors combined learners’ self-reported SRL strategies and characteristics, achievement data, and records of individual engagement with the course content. Using multiple linear and logistic regression modeling, Maldonado-Mahauad et al. (2018) [34] concluded that specific self-reported SRL strategies (“goal setting”, “strategic planning”, “elaboration”, and “help seeking”), complex behavioral data from the MOOC platform such as meaningful online activity sequence patterns, self-reported prior experience, level of interest in the MOOC’s assessments, and total time spent online are among the factors that contribute to the prediction of MOOC learners’ success.
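The transferability requirement noted above (fit on a past course run, apply without refitting to the next run so that at-risk learners can be flagged early) can be illustrated with the minimal sketch below. The simulated course "iterations", the feature names, and the risk threshold are hypothetical and are not drawn from the cited studies.

```python
# Minimal sketch of model transferability across course iterations (hypothetical data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)

def simulate_iteration(n, shift=0.0):
    """Simulate one course run: first-weeks activity features and a dropout label."""
    X = rng.poisson(lam=[5 + shift, 2, 3], size=(n, 3)).astype(float)
    y = (rng.random(n) < 1 / (1 + np.exp(0.4 * X.sum(axis=1) - 4))).astype(int)
    return X, y

X_hist, y_hist = simulate_iteration(1500)           # previous run (historical data)
X_new, y_new = simulate_iteration(800, shift=0.5)   # new run, slightly different cohort

# Train only on the historical iteration, then score the new cohort.
model = LogisticRegression(max_iter=1000).fit(X_hist, y_hist)
risk = model.predict_proba(X_new)[:, 1]             # predicted dropout risk

print("AUC on the new iteration:", round(roc_auc_score(y_new, risk), 3))
print("Learners flagged for early intervention:", int((risk > 0.5).sum()))
```

In practice, the drop in performance between the historical and the new iteration indicates how well the model transfers; a large gap suggests that cohort or course-design changes require retraining or feature re-engineering before interventions are triggered.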

This entry is adapted from the peer-reviewed paper 10.3390/computers12100194

References

  1. Hsu, S.Y. An Experimental Study of Self-Regulated Learning Strategies Application in MOOCs. Ph.D. Thesis, Teachers College, Columbia University, New York, NY, USA, 2021.
  2. Gardner, J.; Brooks, C. Student success prediction in MOOCs. User Model. User-Adapt. Interact. 2018, 28, 127–203.
  3. Ihantola, P.; Fronza, I.; Mikkonen, T.; Noponen, M.; Hellas, A. Deadlines and MOOCs: How Do Students Behave in MOOCs with and without Deadlines. In Proceedings of the 2020 IEEE Frontiers in Education Conference (FIE), Uppsala, Sweden, 21–24 October 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–9.
  4. Chuang, I.; Ho, A. HarvardX and MITx: Four years of open online courses-fall 2012-summer 2016. 2016. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2889436 (accessed on 1 June 2023).
  5. Kizilcec, R.F.; Schneider, E. Motivation as a lens to understand online learners: Toward data-driven design with the OLEI scale. ACM Trans. Comput.-Hum. Interact. (TOCHI) 2015, 22, 1–24.
  6. Kizilcec, R.F.; Pérez-Sanagustín, M.; Maldonado, J.J. Self-regulated learning strategies predict learner behavior and goal attainment in Massive Open Online Courses. Comput. Educ. 2017, 104, 18–33.
  7. Zheng, S.; Rosson, M.B.; Shih, P.C.; Carroll, J.M. Designing MOOCs as interactive places for collaborative learning. In Proceedings of the Second (2015) ACM Conference on Learning@ Scale, Vancouver, BC, Canada, 14–18 March 2015; pp. 343–346.
  8. Jordan, K. Initial trends in enrolment and completion of massive open online courses. Int. Rev. Res. Open Distrib. Learn. 2014, 15, 133–160.
  9. Peng, D.; Aggarwal, G. Modeling MOOC dropouts. Entropy 2015, 10, 1–5.
  10. Feng, W.; Tang, J.; Liu, T.X. Understanding dropouts in MOOCs. Proc. AAAI Conf. Artif. Intell. 2019, 33, 517–524.
  11. Eriksson, T.; Adawi, T.; Stöhr, C. “Time is the bottleneck”: A qualitative study exploring why learners drop out of MOOCs. J. Comput. High. Educ. 2017, 29, 133–146.
  12. Reich, J. MOOC completion and retention in the context of student intent. EDUCAUSE Rev. Online 2014.
  13. Lepp, M.; Luik, P.; Palts, T.; Papli, K.; Suviste, R.; Säde, M.; Tõnisson, E. MOOC in programming: A success story. In Proceedings of the International Conference on e-Learning, Belgrade, Serbia, 28–29 September 2017; pp. 138–147.
  14. Dalipi, F.; Imran, A.S.; Kastrati, Z. MOOC dropout prediction using machine learning techniques: Review and research challenges. In Proceedings of the 2018 IEEE Global Engineering Education Conference (EDUCON), Santa Cruz de Tenerife, Spain, 17–20 April 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1007–1014.
  15. Zheng, S.; Rosson, M.B.; Shih, P.C.; Carroll, J.M. Understanding student motivation, behaviors and perceptions in MOOCs. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work and Social Computing, Vancouver, BC, Canada, 13–18 March 2015; pp. 1882–1895.
  16. Hone, K.S.; El Said, G.R. Exploring the factors affecting MOOC retention: A survey study. Comput. Educ. 2016, 98, 157–168.
  17. Zhang, J. Can MOOCs be interesting to students? An experimental investigation from regulatory focus perspective. Comput. Educ. 2016, 95, 340–351.
  18. Dass, S.; Gary, K.; Cunningham, J. Predicting student dropout in self-paced MOOC course using random forest model. Information 2021, 12, 476.
  19. Herrmannova, D.; Hlosta, M.; Kuzilek, J.; Zdrahal, Z. Evaluating weekly predictions of at-risk students at the open university: Results and issues. In Proceedings of the EDEN 2015 Annual Conference Expanding Learning Scenarios: Opening out the Educational Landscape, Barcelona, Spain, 9–12 June 2015.
  20. Callan, G.L.; Longhurst, D.; Ariotti, A.; Bundock, K. Settings, exchanges, and events: The SEE framework of self-regulated learning supportive practices. Psychol. Sch. 2021, 58, 773–788.
  21. Sebesta, A.J.; Bray Speth, E. How should I study for the exam? Self-regulated learning strategies and achievement in introductory biology. CBE—Life Sci. Educ. 2017, 16, ar30.
  22. Zimmerman, B.J. Self-efficacy: An essential motive to learn. Contemp. Educ. Psychol. 2000, 25, 82–91.
  23. Zimmerman, B.J. Investigating self-regulation and motivation: Historical background, methodological developments, and future prospects. Am. Educ. Res. J. 2008, 45, 166–183.
  24. Jansen, R.S.; van Leeuwen, A.; Janssen, J.; Conijn, R.; Kester, L. Supporting learners’ self-regulated learning in Massive Open Online Courses. Comput. Educ. 2020, 146, 103771.
  25. Zimmerman, B.J. Becoming a self-regulated learner: An overview. Theory Into Pract. 2002, 41, 64–70.
  26. Winne, P.H. Learning analytics for self-regulated learning. In Handbook of Learning Analytics; SOLAR, Society for Learning Analytics and Research: New York, NY, USA, 2017; pp. 241–249.
  27. Cunningham, J.A. Predicting Student Success in a Self-Paced Mathematics MOOC. Ph.D. Thesis, Arizona State University, Tempe, AZ, USA, 2017.
  28. Mourdi, Y.; Sadgal, M.; El Kabtane, H.; Fathi, W.B. A machine learning-based methodology to predict learners’ dropout, success or failure in MOOCs. Int. J. Web Inf. Syst. 2019, 15, 489–509.
  29. Moreno-Marcos, P.M.; Munoz-Merino, P.J.; Maldonado-Mahauad, J.; Perez-Sanagustin, M.; Alario-Hoyos, C.; Kloos, C.D. Temporal analysis for dropout prediction using self-regulated learning strategies in self-paced MOOCs. Comput. Educ. 2020, 145, 103728.
  30. Kuzilek, J.; Zdrahal, Z.; Fuglik, V. Student success prediction using student exam behaviour. Future Gener. Comput. Syst. 2021, 125, 661–671.
  31. Wan, H.; Liu, K.; Yu, Q.; Gao, X. Pedagogical intervention practices: Improving learning engagement based on early prediction. IEEE Trans. Learn. Technol. 2019, 12, 278–289.
  32. Kuzilek, J.; Hlosta, M.; Herrmannova, D.; Zdrahal, Z.; Vaclavek, J.; Wolff, A. OU Analyse: Analysing at-risk students at The Open University. Learn. Anal. Rev. 2015, LAK15-1, 1–16.
  33. Yeomans, M.; Reich, J. Planning prompts increase and forecast course completion in massive open online courses. In Proceedings of the Seventh International Learning Analytics and Knowledge Conference, Vancouver, BC, Canada, 13–17 March 2017; pp. 464–473.
  34. Maldonado-Mahauad, J.; Pérez-Sanagustín, M.; Kizilcec, R.F.; Morales, N.; Munoz-Gama, J. Mining theory-based patterns from Big Data: Identifying self-regulated learning strategies in Massive Open Online Courses. Comput. Hum. Behav. 2018, 80, 179–196.