Most of the Time, It Works Every Time: Limitations in Refining Domain Models with Learning Curves


Published Oct 25, 2018
Ilya Goldin, April Galyardt

Abstract

Data from student learning provide learning curves that, ideally, demonstrate improvement in student performance over time. Existing data mining methods can leverage these data to characterize and improve the domain models that support a learning environment, and these methods have been validated both with already-collected data and in close-the-loop studies that actually modify instruction. However, these methods may be less general than previously thought, because they have not been evaluated under a wide range of data conditions. We describe a problem space of 90 distinct scenarios within which data mining methods may be applied to recognize posited domain model improvements. The scenarios are defined by two kinds of domain model modifications, five kinds of learning curves, and 15 types of skill combinations under three ways of interleaving skill practice. These extensive tests are made possible by the use of simulated data. In each of the 90 scenarios, we test three predictive models that aim to recognize domain model improvements, and evaluate their performance. Results show that the conditions under which an automated method tests a proposed domain model improvement can drastically affect the method’s accuracy in accepting or rejecting the proposed improvement, and the conditions can be affected by learning curve shapes, method of interleaving, choice of predictive model, and the threshold for predictive model comparison. Further, results show consistent problems with accuracy in accepting a proposed improvement by the Additive Factors Model, made popular in the DataShop software. Other models, namely Performance Factors Analysis and Recent-Performance Factors Analysis, are much more accurate, but still struggle under some conditions, such as when distinguishing curves from two skills where students have a high rate of errors after substantial practice. These findings bear on how to evaluate proposed refinements to a domain model. In light of these results, historical attempts to test domain model refinements may need to be reexamined.
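
The scenario count can be checked with a short enumeration. The sketch below is illustrative only: it assumes that a skill combination is an unordered pair of learning curve kinds, with a kind allowed to pair with itself, which is an assumption and not something the abstract states. All labels are hypothetical placeholders, since the abstract gives only the counts.

```python
from itertools import combinations_with_replacement, product

# Hypothetical placeholder labels: the abstract names none of these,
# it only states how many of each there are.
MODIFICATIONS = ["modification_A", "modification_B"]
CURVE_KINDS = ["curve_1", "curve_2", "curve_3", "curve_4", "curve_5"]
INTERLEAVINGS = ["interleaving_1", "interleaving_2", "interleaving_3"]


def skill_combinations(curve_kinds):
    """Unordered pairs of curve kinds, same-kind pairs included (assumed)."""
    return list(combinations_with_replacement(curve_kinds, 2))


def build_scenarios():
    """Cross every modification with every skill pair and interleaving."""
    pairs = skill_combinations(CURVE_KINDS)
    return list(product(MODIFICATIONS, pairs, INTERLEAVINGS))


if __name__ == "__main__":
    print(len(skill_combinations(CURVE_KINDS)))  # 15 skill combinations
    print(len(build_scenarios()))                # 2 * 15 * 3 = 90 scenarios
```

Under this reading, the five curve kinds yield 15 skill combinations, and crossing them with two modifications and three interleaving methods gives the 90 scenarios.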

How to Cite

Goldin, I., & Galyardt, A. (2018). Most of the Time, It Works Every Time: Limitations in Refining Domain Models with Learning Curves. Journal of Educational Data Mining, 10(2), 55–92. https://doi.org/10.5281/zenodo.3554693


Keywords

domain model, predictive model, Q-matrix, learning curve

Section
EDM 2018 Journal Track