Optimizing Bayesian Knowledge Tracing with Neural Network Parameter Generation

##plugins.themes.bootstrap3.article.main##

##plugins.themes.bootstrap3.article.sidebar##

Published Jan 20, 2025
Anirudhan Badrinath Zachary Pardos

Abstract

Bayesian Knowledge Tracing (BKT) is a well-established model for formative assessment, with optimization typically using expectation maximization, conjugate gradient descent, or brute force search. However, one of the flaws of existing optimization techniques for BKT models is convergence to undesirable local minima that negatively impact performance and interpretability of the BKT parameters (i.e., parameter degeneracy). Recently, deep knowledge tracing methods such as context-aware attentive knowledge tracing have proven to be state-of-the-art in performance; however, these methods often lack the inherent interpretability or understanding provided by BKT's skill-level parameter estimates and student-level mastery probability estimates. We propose a novel optimization technique for BKT models using a neural network-based parameter generation approach, OptimNN, that leverages hypernetworks and stochastic gradient descent for training BKT parameters. We extend this approach and propose BKTransformer, a transformer-based sequence modeling technique that generates temporally-evolving BKT parameters for student response correctness prediction. With both approaches, we demonstrate improved performance compared to BKT and deep KT baselines, with minimal hyperparameter tuning. Importantly, we demonstrate that these techniques, despite their state-of-the-art expressive capability, retain the interpretability of skill-level BKT parameter estimates and student-level estimates of mastery and correctness probabilities. Our code and data can be found at https://github.com/abadrinath947/OptimNN.

How to Cite

Badrinath, A., & Pardos, Z. (2025). Optimizing Bayesian Knowledge Tracing with Neural Network Parameter Generation. Journal of Educational Data Mining, 17(1), 41–65. https://doi.org/10.5281/zenodo.14707987
Abstract 6 | HTML Downloads 0 PDF Downloads 0

##plugins.themes.bootstrap3.article.details##

Keywords

knowledge tracing, Bayesian Knowledge Tracing, intelligent tutoring systems, deep learning

References
Abdelrahman, G. and Wang, Q. 2022. Deep graph memory networks for forgetting-robust knowledge tracing. Institute of Electrical and Electronics Engineers Transactions on Knowledge and Data Engineering 35, 8, 7844–7855.

Aleven, V. A. and Koedinger, K. R. 2002. An effective metacognitive strategy: learning by doing and explaining with a computer-based cognitive tutor. Cognitive Science 26, 2, 147–179.

Anzai, Y. and Simon, H. A. 1979. The theory of learning by doing. Psychological Review 86, 2, 124.

Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K.-R., and Samek, W. 2015. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLOS ONE 10, 7 (07), 1–46.

Badrinath, A., Wang, F., and Pardos, Z. 2021. pybkt: An accessible python library of bayesian knowledge tracing models. In Proceedings of the 14th International Conference on Educational Data Mining (Educational Data Mining 2021), S. Hsiao and S. Sahebi, Eds. International Educational Data Mining Society, 468–474.

Baker, R. S. J. d., Corbett, A. T., and Aleven, V. 2008. More accurate student modeling through contextual estimation of slip and guess probabilities in bayesian knowledge tracing. In Intelligent Tutoring Systems, B. P. Woolf, E. Aïmeur, R. Nkambou, and S. Lajoie, Eds. Springer Berlin Heidelberg, Berlin, Heidelberg, 406–415.

Bloom, B. S. 1984. The 2 sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational Researcher 13, 6, 4–16.

Chang, K.-m., Beck, J., Mostow, J., and Corbett, A. 2006. A bayes net toolkit for student modeling in intelligent tutoring systems. In Intelligent Tutoring Systems, M. Ikeda, K. D. Ashley, and T.-W. Chan, Eds. Springer Berlin Heidelberg, Berlin, Heidelberg, 104–113.

Chollet, F. et al. 2015. Keras. https://keras.io.

Corbett, A. T. and Anderson, J. R. 1994. Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling and User-Adapted Interaction 4, 4, 253–278.

Gardner, J. and Brooks, C. 2017. Statistical approaches to the model comparison task in learning analytics. In Joint Proceedings of the Workshop on Methodology in Learning Analytics (MLA) and the Workshop on Building the Learning Analytics Curriculum (BLAC) co-located with 7th International Learning Analytics and Knowledge Conference (LAK 2017), Vancouver, Canada, March 13th-14th, 2017, Y. Bergner, C. Lang, G. Gray, S. D. Teasley, and J. C. Stamper, Eds. CEUR Workshop Proceedings, vol. 1915. CEUR-WS.org.

Ghosh, A., Heffernan, N., and Lan, A. S. 2020. Context-aware attentive knowledge tracing. In Proceedings of the 26th Association for Computing Machinery Special Interest Group on Knowledge Discovery in Data International Conference on Knowledge Discovery & Data Mining. Association for Computing Machinery, New York, NY, USA, 2330–2339.

Ha, D., Dai, A. M., and Le, Q. V. 2017. Hypernetworks. In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. OpenReview.net.

Hestenes, M. R. and Stiefel, E. 1952. Methods of conjugate gradients for solving. Journal of Research of the National Bureau of Standards 49, 6, 409.

Hochreiter, S. and Schmidhuber, J. 1997. Long short-term memory. Neural Comput. 9, 8 (Nov.), 1735–1780.

Khajah, M., Lindsey, R. V., and Mozer, M. C. 2016. How deep is knowledge tracing? In Proceedings of the 9th International Conference on Educational Data Mining, T. Barnes, M. Chi, and M. Feng, Eds. International Educational Data Mining Society, 94–101.

Kingma, D. P. and Ba, J. 2015. Adam: A method for stochastic optimization. In International Conference on Learning Representations (ICLR). ArXiv, Ithaca, NY, 1–13.

Labra, C. and Santos, O. C. 2023. Exploring cognitive models to augment explainability in deep knowledge tracing. In Adjunct Proceedings of the 31st Association for Computing Machinery Conference on User Modeling, Adaptation and Personalization. Association for Computing Machinery, New York, NY, USA, 220–223.

Levinson, S. E., Rabiner, L. R., and Sondhi, M. M. 1983. An introduction to the application of the theory of probabilistic functions of a markov process to automatic speech recognition. Bell System Technical Journal 62, 4, 1035–1074.

Liu, Q., Huang, Z., Yin, Y., Chen, E., Xiong, H., Su, Y., and Hu, G. 2019. Ekt: Exercise-aware knowledge tracing for student performance prediction. Institute of Electrical and Electronics Engineers Transactions on Knowledge and Data Engineering 33, 1, 100–115.

Lu, Y., Wang, D., Meng, Q., and Chen, P. 2020. Towards interpretable deep learning models for knowledge tracing. In Artificial Intelligence in Education, I. I. Bittencourt, M. Cukurova, K. Muldner, R. Luckin, and E. Millán, Eds. Springer International Publishing, Cham, 185–190.

Nagatani, K., Zhang, Q., Sato, M., Chen, Y.-Y., Chen, F., and Ohkuma, T. 2019. Augmenting knowledge tracing by considering forgetting behavior. In The World Wide Web Conference. Association for Computing Machinery, New York, NY, USA, 3101–3107.

Nakagawa, H., Iwasawa, Y., and Matsuo, Y. 2019. Graph-based Knowledge Tracing: Modeling Student Proficiency Using Graph Neural Network . In 2019 Institute of Electrical and Electronics Engineers/Web Intelligence Consortium/Association for Computing Machinery International Conference on Web Intelligence (WI). Institute of Electrical and Electronics Engineers Computer Society, Los Alamitos, CA, USA, 156–163.

Pandey, S. and Karypis, G. 2019. A self-attentive model for knowledge tracing. In Educational Data Mining 2019 - Proceedings of the 12th International Conference on Educational Data Mining, C. Lynch, A. Merceron, M. Desmarais, and R. Nkambou, Eds. International Educational Data Mining Society, 384–389.

Pardos, Z. A. and Heffernan, N. T. 2009. Detecting the learning value of items in a randomized problem set. In Proceedings of the 2009 Conference on Artificial Intelligence in Education: Building Learning Systems That Care: From Knowledge Representation to Affective Modelling, V. Dimitrova, R. Mizoguchi, B. du Boulay, and A. Graesser, Eds. IOS Press, NLD, 499–506.

Pardos, Z. A. and Heffernan, N. T. 2010. Navigating the parameter space of bayesian knowledge tracing models: Visualizations of the convergence of the expectation maximization algorithm. In Educational Data Mining 2010, The 3rd International Conference on Educational Data Mining, Pittsburgh, PA, USA, June 11-13, 2010. Proceedings, R. S. J. de Baker, A. Merceron, and P. I. Pavlik, Eds. www.educationaldatamining.org, 161–170.

Pardos, Z. A. and Heffernan, N. T. 2011. Kt-idem: Introducing item difficulty to the knowledge tracing model. In User Modeling, Adaption and Personalization, J. A. Konstan, R. Conejo, J. L. Marzo, and N. Oliver, Eds. Springer Berlin Heidelberg, Berlin, Heidelberg, 243–254.

Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Köpf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., and Chintala, S. 2019. Pytorch: an imperative style, high-performance deep learning library. In Proceedings of the 33rd International Conference on Neural Information Processing Systems. Curran Associates Inc., Red Hook, NY, USA.

Pelánek, R. 2017. Bayesian knowledge tracing, logistic models, and beyond: an overview of learner modeling techniques. User Modeling and User-Adapted Interaction 27, 3 (Dec), 313–350.

Piech, C., Bassen, J., Huang, J., Ganguli, S., Sahami, M., Guibas, L., and Sohl-Dickstein, J. 2015. Deep knowledge tracing. In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1. MIT Press, Cambridge, MA, USA, 505–513.

Rai, D., Gong, Y., and Beck, J. 2009. Using dirichlet priors to improve model parameter plausibility. In Educational Data Mining - EDM 2009, Cordoba, Spain, July 1-3, 2009. Proceedings of the 2nd International Conference on Educational Data Mining, T. Barnes, M. C. Desmarais, C. Romero, and S. Ventura, Eds. International Educational Data Mining Society, 141–150.

Ritter, S., Anderson, J. R., Koedinger, K. R., and Corbett, A. 2007. Cognitive tutor: Applied research in mathematics education. Psychonomic Bulletin & Review 14, 2, 249–255.

Santoro, A., Bartunov, S., Botvinick, M., Wierstra, D., and Lillicrap, T. 2016. Meta-learning with memory-augmented neural networks. In Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48. JMLR.org, 1842–1850.

Tong, S., Liu, Q., Huang, W., Hunag, Z., Chen, E., Liu, C., Ma, H., and Wang, S. 2020. Structure-based knowledge tracing: An influence propagation view. In 2020 Institute of Electrical and Electronics Engineers International Conference on Data Mining (ICDM). Institute of Electrical and Electronics Engineers, 541–550.

van de Sande, B. 2013. Properties of the bayesian knowledge tracing model. Journal of Educational Data Mining 5, 2 (Jul.), 1–10.

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. 2017. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems. Curran Associates Inc., Red Hook, NY, USA, 6000–6010.

Wang, Z., Feng, X., Tang, J., Huang, G. Y., and Liu, Z. 2019. Deep knowledge tracing with side information. In Artificial Intelligence in Education, S. Isotani, E. Millán, A. Ogan, P. Hastings, B. McLaren, and R. Luckin, Eds. Springer International Publishing, Cham, 303–308.

Yeung, C.-K. 2019. Deep-irt: Make deep learning based knowledge tracing explainable using item response theory. In Proceedings of The 12th International Conference on Educational Data Mining (EDM 2019), C. F. Lynch, A. Merceron, M. Desmarais, and R. Nkambou, Eds. International Educational Data Mining Society, 683–686.

Yeung, C. K. and Yeung, D. Y. 2018. Addressing two problems in deep knowledge tracing via prediction-consistent regularization. In Proceedings of the 5th Association for Computing Machinery Conference on Learning @ Scale. Association for Computing Machinery, 5:1–5:10.

Yin, Y., Liu, Q., Huang, Z., Chen, E., Tong, W., Wang, S., and Su, Y. 2019. Quesnet: A unified representation for heterogeneous test questions. In Proceedings of the 25th Association for Computing Machinery Special Interest Group on Knowledge Discovery in Data International Conference on Knowledge Discovery & Data Mining. Association for Computing Machinery, New York, NY, USA, 1328–1336.

Yudelson, M. V. 2016. Individualizing bayesian knowledge tracing. are skill parameters more important than student parameters?. In Proceedings of the 9th International Conference on Educational Data Mining, EDM 2016, Raleigh, North Carolina, USA, June 29 - July 2, 2016, T. Barnes, M. Chi, and M. Feng, Eds. International Educational Data Mining Society, 46–53.

Yudelson, M. V., Koedinger, K. R., and Gordon, G. J. 2013. Individualized bayesian knowledge tracing models. In Artificial Intelligence in Education, H. C. Lane, K. Yacef, J. Mostow, and P. Pavlik, Eds. Springer Berlin Heidelberg, Berlin, Heidelberg, 171–180.

Zhang, L., Xiong, X., Zhao, S., Botelho, A., and Heffernan, N. T. 2017. Incorporating rich features into deep knowledge tracing. In Proceedings of the Fourth (2017) Association for Computing Machinery Conference on Learning @ Scale. Association for Computing Machinery, New York, NY, USA, 169–172.
Section
Articles