Supercharging BKT with Multidimensional Generalizable IRT and Skill Discovery

Published Jun 27, 2024
Mohammad M. Khajah

Abstract

Bayesian Knowledge Tracing (BKT) is a popular interpretable computational model in the educational
data mining community that can infer a student’s knowledge state and predict future performance based on
practice history, enabling tutoring systems to adaptively select exercises to match the student’s competency
level. Existing BKT implementations do not scale to large datasets and are difficult to extend
and improve in terms of prediction accuracy. On the other hand, uninterpretable neural network (NN)
student models, such as Deep Knowledge Tracing, enjoy the speed and modeling flexibility of popular
computational frameworks (e.g., PyTorch and TensorFlow), making them easy to develop and extend.
To bridge this gap, we develop a collection of BKT recurrent neural network (RNN) cells that are much
faster than brute-force implementations and are within an order of magnitude of a fast, fine-tuned but
inflexible C++ implementation. We leverage our implementation’s modeling flexibility to create two
novel extensions of BKT that significantly boost its performance. The first merges item response theory
(IRT) and BKT by modeling multidimensional problem difficulties and student abilities without fitting
student-specific parameters, allowing the model to easily generalize to new students in a principled way.
The second extension discovers the discrete assignment matrix of problems to knowledge components
(KCs) via stochastic neural network techniques and supports further guidance via problem input features
and an auxiliary loss objective. Both extensions are learned in an end-to-end fashion; that is, problem
difficulties, student abilities, and assignments to knowledge components are jointly learned with BKT
parameters. In synthetic experiments, the skill discovery model can partially recover the true generating
problem-KC assignment matrix while achieving high accuracy, even in some cases where the true KCs
are structured unfavorably (interleaving sequences). On a real dataset where problem content is available,
the skill discovery model matches BKT with expert-provided skills, despite using fewer KCs. On seven
out of eight real-world datasets, our novel extensions achieve prediction performance that is within 0.04
AUC-ROC points of state-of-the-art models. We conclude by showing visualizations of the parameters
and inferences to demonstrate the interpretability of our BKT RNN models on a real-life dataset.
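
To make the modeling ideas above concrete, the sketch below shows, in PyTorch-style code, how a BKT update can be expressed as a vectorized recurrent step, how guess and slip probabilities could be modulated by an IRT-style ability-difficulty interaction, and how a discrete problem-to-KC assignment can be sampled with the Gumbel-softmax trick (Jang et al., 2017) cited in the references. This is a minimal illustration under assumptions: the function names, tensor shapes, and the particular form of the ability-difficulty interaction are hypothetical and not the paper's actual formulation.

```python
# Illustrative sketch only (not the paper's implementation) of the three ideas
# named in the abstract: a BKT update as a vectorized recurrent step, IRT-style
# modulation of guess/slip, and a Gumbel-softmax problem-to-KC assignment.
import torch
import torch.nn.functional as F


def bkt_step(p_know, correct, p_learn, p_guess, p_slip, eps=1e-6):
    """One BKT observation + learning update for a batch of students.

    p_know:  (batch,) probability the KC is known before this trial
    correct: (batch,) observed response, 1.0 if correct else 0.0
    p_learn, p_guess, p_slip: (batch,) per-trial BKT probabilities
    Returns (p_correct, p_know_next).
    """
    # Predicted probability of answering correctly on this trial.
    p_correct = p_know * (1 - p_slip) + (1 - p_know) * p_guess

    # Posterior over the knowledge state given the observed response.
    post_if_correct = p_know * (1 - p_slip) / p_correct.clamp_min(eps)
    post_if_incorrect = p_know * p_slip / (1 - p_correct).clamp_min(eps)
    posterior = correct * post_if_correct + (1 - correct) * post_if_incorrect

    # Learning transition: an unknown KC may become known after practice.
    p_know_next = posterior + (1 - posterior) * p_learn
    return p_correct, p_know_next


def irt_guess_slip(ability, difficulty, guess_logit, slip_logit):
    """Hypothetical IRT-style modulation of guess/slip probabilities.

    ability:    (batch, dims) multidimensional student ability estimate
    difficulty: (batch, dims) multidimensional difficulty of the current problem
    A better ability-difficulty match raises guess and lowers slip; the paper's
    exact parameterization may differ.
    """
    match = (ability * difficulty).sum(dim=-1)
    return torch.sigmoid(guess_logit + match), torch.sigmoid(slip_logit - match)


def sample_kc_assignment(assignment_logits, tau=1.0):
    """Skill discovery: sample a (near) one-hot problem-to-KC assignment matrix.

    assignment_logits: (num_problems, num_kcs) learnable logits. Gumbel-softmax
    keeps the discrete assignment differentiable, so it can be trained end to end.
    """
    return F.gumbel_softmax(assignment_logits, tau=tau, hard=True)
```

In a sketch like this, the sampled assignment row for a problem would select (or softly mix) that problem's per-KC BKT parameters, mirroring the abstract's description of learning the assignment matrix jointly with the BKT parameters.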

How to Cite

Khajah, M. M. (2024). Supercharging BKT with Multidimensional Generalizable IRT and Skill Discovery. Journal of Educational Data Mining, 16(1), 233–278. https://doi.org/10.5281/zenodo.11235602

Keywords

Bayesian knowledge tracing, generalizable IRT, skill discovery

References
ACKERMAN, T. A. 1989. Unidimensional IRT calibration of compensatory and noncompensatory multidimensional items. Applied Psychological Measurement 13, 2, 113–127.

ANTHONY, L. AND RITTER, S. 2008. Handwriting2/examples spring 2007.

BLOOM, B. S. 1984. The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational researcher 13, 6, 4–16.

BOECK, P. D. AND WILSON, M. 2004. Explanatory Item Response Models: a Generalized Linear and Nonlinear Approach. Springer-Verlag, New York, NY.

BRADLEY, A. P. 1997. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition 30, 7 (Jul), 1145–1159.

CEN, H., KOEDINGER, K., AND JUNKER, B. 2006. Learning factors analysis – a general method for cognitive model evaluation and improvement. In Intelligent Tutoring Systems, M. Ikeda, K. D. Ashley, and T.-W. Chan, Eds. Springer Berlin Heidelberg, Berlin, Heidelberg, 164–175.

CORBETT, A. T. AND ANDERSON, J. R. 1994. Knowledge tracing: Modelling the acquisition of procedural knowledge. User modeling and user-adapted interaction 4, 4, 253–278.

DOZAT, T. 2016. Incorporating Nesterov Momentum into Adam. In Proceedings of the 4th International Conference on Learning Representations (2016). 1–4.

FENG, M., HEFFERNAN, N., AND KOEDINGER, K. 2009. Addressing the assessment challenge with an online system that tutors as it assesses. User Modeling and User-Adapted Interaction 19, 3 (Aug), 243–266.

FINCH, W. H. AND FRENCH, B. F. 2018. Educational and psychological measurement. Routledge.

GERVET, T., KOEDINGER, K., SCHNEIDER, J., MITCHELL, T., ET AL. 2020. When is deep learning the best approach to knowledge tracing? Journal of Educational Data Mining 12, 3, 31–54.

GHOSH, A., HEFFERNAN, N., AND LAN, A. S. 2020. Context-aware attentive knowledge tracing. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. KDD ’20. Association for Computing Machinery, New York, NY, USA, 2330–2339.

GONZÁLEZ-BRENES, J., HUANG, Y., AND BRUSILOVSKY, P. 2014. General features in knowledge tracing to model multiple subskills, temporal item response theory, and expert knowledge. In The 7th International Conference on Educational Data Mining, J. C. Stamper, Z. A. Pardos, M. Mavrikis, and B. M. McLaren, Eds. International Educational Data Mining Society (IEDMS), 84–91.

GONZÁLEZ-BRENES, J. P. AND MOSTOW, J. 2012. Dynamic cognitive tracing: Towards unified discovery of student and cognitive models. In Proceedings of the 5th International Conference on Educational Data Mining, K. Yacef, O. R. Zaïane, A. Hershkovitz, M. Yudelson, and J. C. Stamper, Eds. International Educational Data Mining Society (IEDMS), 49–56.

GOODFELLOW, I., BENGIO, Y., AND COURVILLE, A. 2016. Deep learning. The MIT Press, London, England.

HOCHREITER, S. AND SCHMIDHUBER, J. 1997. Long short-term memory. Neural Computation 9, 8 (Nov), 1735–1780.

HUBERT, L. AND ARABIE, P. 1985. Comparing partitions. Journal of classification 2, 1, 193–218.

JANG, E., GU, S., AND POOLE, B. 2017. Categorical reparameterization with Gumbel-softmax. In 5th International Conference on Learning Representations (ICLR 2017). Curran Associates, Inc., 1920–1931.

JURAFSKY, D. AND MARTIN, J. H. 2009. Speech and language processing, 2nd ed., international ed. Pearson Education International, Prentice Hall, Upper Saddle River.

KHAJAH, M., LINDSEY, R. V., AND MOZER, M. 2016. How deep is knowledge tracing? In Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, T. Barnes, M. Chi, and M. Feng, Eds. International Educational Data Mining Society (IEDMS), 94–101.

KHAJAH, M., WING, R., LINDSEY, R. V., AND MOZER, M. 2014. Integrating latent-factor and knowledge-tracing models to predict individual differences in learning. In Proceedings of the 7th International Conference on Educational Data Mining, EDM 2014, J. C. Stamper, Z. A. Pardos, M. Mavrikis, and B. M. McLaren, Eds. International Educational Data Mining Society (IEDMS), 99–106.

KOEDINGER, K. R., CARVALHO, P. F., LIU, R., AND MCLAUGHLIN, E. A. 2023. An astonishing regularity in student learning rate. Proceedings of the National Academy of Sciences 120, 13, e2221311120.

KOEDINGER, K. R., CORBETT, A. T., AND PERFETTI, C. 2012. The knowledge-learning-instruction framework: Bridging the science-practice chasm to enhance robust student learning. Cognitive Science 36, 5, 757–798.

KOEDINGER, K. R., BAKER, R. S. J. D., CUNNINGHAM, K., SKOGSHOLM, A., LEBER, B., AND STAMPER, J. 2010. A data repository for the EDM community: The PSLC DataShop. In Handbook of Educational Data Mining, C. Romero, S. Ventura, M. Pechenizkiy, and R. S. Baker, Eds. CRC Press, Chapter 4, 43–56.

LI, N., COHEN, W. W., AND KOEDINGER, K. R. 2013. Discovering student models with a clustering algorithm using problem content. In Proceedings of the 6th International Conference on Educational Data Mining, EDM 2013, S. K. D’Mello, R. A. Calvo, and A. Olney, Eds. International Educational Data Mining Society (IEDMS), 98–105.

LINDSEY, R. V., KHAJAH, M., AND MOZER, M. C. 2014. Automatic discovery of cognitive skills to improve the prediction of student learning. In Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 1, Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, Eds. NIPS’14. MIT Press, Cambridge, MA, USA, 1386–1394.

LINDSEY, R. V., SHROYER, J. D., PASHLER, H., AND MOZER, M. C. 2014. Improving students’ long-term knowledge retention through personalized review. Psychological Science 25, 3, 639–647.

LIU, R. AND KOEDINGER, K. R. 2017. Closing the loop: Automated data-driven cognitive model discoveries lead to improved instruction and learning gains. Journal of Educational Data Mining 9, 1 (Sep.), 25–41.

MARTORI, F., CUADROS, J., AND GONZÁLEZ-SABATÉ, L. 2015. Direct estimation of the minimum RSS value for training Bayesian knowledge tracing parameters. In Proceedings of the 8th International Conference on Educational Data Mining, EDM 2015, Madrid, Spain, June 26-29, 2015, O. C. Santos, J. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, P. Mitros, J. M. Luna, M. C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, and M. C. Desmarais, Eds. International Educational Data Mining Society (IEDMS), 364–367.

MOLNAR, C. 2022. Interpretable Machine Learning, 2 ed. https://christophm.github.io/interpretable-ml-book.

MONTERO, S., ARORA, A., KELLY, S., MILNE, B., AND MOZER, M. 2018. Does deep knowledge tracing model interactions among skills? In Proceedings of the 11th International Conference on Educational Data Mining, EDM 2018, K. E. Boyer and M. Yudelson, Eds. International Educational Data Mining Society (IEDMS), 462–466.

PARDOS, Z. A. AND HEFFERNAN, N. T. 2010. Navigating the parameter space of Bayesian knowledge tracing models: Visualizations of the convergence of the expectation maximization algorithm. In Educational Data Mining 2010, The 3rd International Conference on Educational Data Mining, R. S. J. de Baker, A. Merceron, and P. I. Pavlik Jr., Eds. International Educational Data Mining Society (IEDMS), 161–170.

PASZKE, A., GROSS, S., MASSA, F., LERER, A., BRADBURY, J., CHANAN, G., KILLEEN, T., LIN, Z., GIMELSHEIN, N., ANTIGA, L., DESMAISON, A., KOPF, A., YANG, E., DEVITO, Z., RAISON, M., TEJANI, A., CHILAMKURTHY, S., STEINER, B., FANG, L., BAI, J., AND CHINTALA, S. 2019. Pytorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32. Curran Associates, Inc., 8024–8035.

PAVLIK, P. I., CEN, H., AND KOEDINGER, K. R. 2009. Performance factors analysis – a new alternative to knowledge tracing. In Proceedings of the 2009 Conference on Artificial Intelligence in Education: Building Learning Systems That Care: From Knowledge Representation to Affective Modelling, V. Dimitrova, R. Mizoguchi, B. du Boulay, and A. Graesser, Eds. IOS Press, NLD, 531–538.

PELÁNEK, R. 2014. Application of time decay functions and the Elo system in student modeling. In Proceedings of the 7th International Conference on Educational Data Mining, EDM 2014, J. C. Stamper, Z. A. Pardos, M. Mavrikis, and B. M. McLaren, Eds. International Educational Data Mining Society (IEDMS), 21–27.

PELÁNEK, R. 2017. Bayesian knowledge tracing, logistic models, and beyond: An overview of learner modeling techniques. User Modeling and User-Adapted Interaction 27, 3–5 (Dec), 313–350.

PELÁNEK, R. AND ŘIHÁK, J. 2017. Experimental analysis of mastery learning criteria. In Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization. UMAP ’17. Association for Computing Machinery, New York, NY, USA, 156–163.

PIECH, C., BASSEN, J., HUANG, J., GANGULI, S., SAHAMI, M., GUIBAS, L. J., AND SOHL-DICKSTEIN, J. 2015. Deep knowledge tracing. Advances in Neural Information Processing Systems 28, 505–513.

RAVAND, H. AND ROBITZSCH, A. 2018. Cognitive diagnostic model of best choice: a study of reading comprehension. Educational Psychology 38, 10, 1255–1277.

RECKASE, M. 2009. Multidimensional Item Response Theory. Statistics for Social and Behavioral Sciences. Springer New York.

RECKASE, M. D. 1979. Unifactor latent trait models applied to multifactor tests: Results and implications. Journal of Educational Statistics 4, 3, 207–230.

REIMERS, N. AND GUREVYCH, I. 2019. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), K. Inui, J. Jiang, V. Ng, and X. Wan, Eds. Association for Computational Linguistics, 3980–3990.

RITTER, S., ANDERSON, J. R., KOEDINGER, K. R., AND CORBETT, A. 2007. Cognitive tutor: Applied research in mathematics education. Psychonomic Bulletin & Review 14, 2, 249–255.

RITTER, S., HARRIS, T. K., NIXON, T., DICKISON, D., MURRAY, R. C., AND TOWLE, B. 2009. Reducing the knowledge tracing space. In Educational Data Mining 2009: 2nd International Conference on Educational Data Mining, Proceedings, T. Barnes, M. C. Desmarais, C. Romero, and S. Ventura, Eds. International Educational Data Mining Society (IEDMS), 151–160.

STAMPER, J., NICULESCU-MIZIL, A., RITTER, S., GORDON, G., AND KOEDINGER, K. 2010a. Algebra I 2005-2006. Development data set from KDD Cup 2010 Educational Data Mining Challenge. Find it at http://pslcdatashop.web.cmu.edu/KDDCup/downloads.jsp.

STAMPER, J., NICULESCU-MIZIL, A., RITTER, S., GORDON, G., AND KOEDINGER, K. 2010b. Bridge to Algebra 2006-2007. Development data set from KDD Cup 2010 Educational Data Mining Challenge. Find it at http://pslcdatashop.web.cmu.edu/KDDCup/downloads.jsp.

TSUTSUMI, E., KINOSHITA, R., AND UENO, M. 2021. Deep-irt with independent student and item networks. In Proceedings of the 14th International Conference on Educational Data Mining, EDM 2021, S. I. Hsiao, S. S. Sahebi, F. Bouchet, and J. Vie, Eds. International Educational Data Mining Society (IEDMS), 510–517.

VASWANI, A., SHAZEER, N., PARMAR, N., USZKOREIT, J., JONES, L., GOMEZ, A. N., KAISER, L., AND POLOSUKHIN, I. 2017. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, U. von Luxburg, I. Guyon, S. Bengio, H. Wallach, and R. Fergus, Eds. NIPS’17. Curran Associates Inc., Red Hook, NY, USA, 6000–6010.

WILSON, K. H., KARKLIN, Y., HAN, B., AND EKANADHAM, C. 2016. Back to the basics: Bayesian extensions of IRT outperform neural networks for proficiency estimation. In Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, T. Barnes, M. Chi, and M. Feng, Eds. International Educational Data Mining Society (IEDMS), 539–544.

YAO, L. AND SCHWARZ, R. D. 2006. A multidimensional partial credit model with associated item and test statistics: An application to mixed-format tests. Applied Psychological Measurement 30, 6 (Nov), 469–492.

YEUNG, C.-K. 2019. Deep-irt: Make deep learning based knowledge tracing explainable using item response theory. arXiv preprint arXiv:1904.11738.

YUDELSON, M. 2022. A tool for fitting hidden Markov models at scale. https://github.com/myudelson/hmm-scalable.

ZHANG, J., SHI, X., KING, I., AND YEUNG, D.-Y. 2017. Dynamic key-value memory networks for knowledge tracing. In Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 765–774.