Modeling Student Behavior With Two-Layer Hidden Markov Models



Published Sep 19, 2017
Chase Geigle, ChengXiang Zhai


Massive open online courses (MOOCs) provide educators with an abundance of data describing how students interact with the platform, but this data is highly underutilized today. This is due in part to the lack of sophisticated tools that provide interpretable and actionable summaries of the huge amounts of MOOC activity recorded in log data. To address this problem, we propose a representation of student behavior, together with a method for automatically discovering behavior patterns from the click log data available from the MOOC platform itself. Specifically, we propose the use of a two-layer hidden Markov model (2L-HMM) to extract our desired behavior representation, and show that patterns extracted by such a 2L-HMM are interpretable, meaningful, and unique. We also demonstrate that features extracted from a trained 2L-HMM correlate with educational outcomes.
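
The abstract does not spell out the structure of the 2L-HMM, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that a student's click log is split into sessions of action ids, that each hidden behavior state owns a first-order Markov chain over actions (the lower layer), and that an ordinary HMM governs transitions between hidden states from one session to the next (the upper layer). The function names, parameter layout, and session encoding are all hypothetical.

```python
import numpy as np

def session_log_prob(session, start, trans):
    """Log-probability of one session (a list of action ids) under a
    first-order Markov chain: `start` is the initial action distribution,
    `trans` is a row-stochastic action transition matrix."""
    logp = np.log(start[session[0]])
    for prev, cur in zip(session, session[1:]):
        logp += np.log(trans[prev, cur])
    return logp

def two_layer_log_likelihood(sessions, pi, A, chains):
    """Log-likelihood of a sequence of sessions, computed with the forward
    algorithm in log space.

    pi     : (K,) initial distribution over hidden behavior states
    A      : (K, K) row-stochastic transitions between hidden states
    chains : list of K (start, trans) pairs, one action chain per state
             (assumed structure, not taken from the paper)
    """
    K = len(pi)
    # emit[t, k] = log P(session t | hidden state k)
    emit = np.array([[session_log_prob(s, *chains[k]) for k in range(K)]
                     for s in sessions])
    alpha = np.log(pi) + emit[0]
    for t in range(1, len(sessions)):
        # Sum over previous states (log-sum-exp), then add the emission term.
        alpha = np.logaddexp.reduce(alpha[:, None] + np.log(A), axis=0) + emit[t]
    return np.logaddexp.reduce(alpha)

# Hypothetical toy setup: 3 actions (0 = lecture, 1 = quiz, 2 = forum),
# 2 hidden behavior states, one student with three sessions.
lecture_heavy = (np.array([0.8, 0.1, 0.1]),
                 np.array([[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.5, 0.2, 0.3]]))
quiz_heavy = (np.array([0.2, 0.7, 0.1]),
              np.array([[0.3, 0.6, 0.1], [0.1, 0.8, 0.1], [0.2, 0.5, 0.3]]))
pi = np.array([0.6, 0.4])
A = np.array([[0.8, 0.2], [0.3, 0.7]])
student = [[0, 0, 1], [1, 1, 2], [0, 1]]
log_lik = two_layer_log_likelihood(student, pi, A, [lecture_heavy, quiz_heavy])
```

Under these assumptions, the per-state action chains are what an analyst would inspect as "behavior patterns", while quantities derived from the upper layer (for example, how often a student occupies each hidden state) are the kind of feature that could be compared against outcomes.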

How to Cite

Geigle, C., & Zhai, C. (2017). Modeling Student Behavior With Two-Layer Hidden Markov Models. JEDM | Journal of Educational Data Mining, 9(1), 1-24. Retrieved from


EDM 2017 Journal Track