Using Data Mining Results to Improve Educational Video Game Design



Published Jun 9, 2015
Deirdre Kerr


This study uses information about in-game strategy use, identified through cluster analysis of actions in an educational video game, to make data-driven modifications to the game that reduce construct-irrelevant behavior. Examination of the student strategies identified through cluster analysis indicated that (a) students commonly passed certain levels using incorrect mathematical strategies and (b) throughout the game, a large number of students solved problems using order-based strategies rather than strategies based on mathematics, which made their mathematical ability difficult to measure. To address the construct-irrelevant variance produced by these issues, two minor changes were made to the game, and students were randomly assigned to either the original version or the revised version. Students who played the revised version (a) solved levels using incorrect mathematical strategies significantly less often and (b) used order-based strategies significantly less often than students who played the original version. Additionally, student perception of the revised version was more positive than student perception of the original version, though there were no significant differences in either in-game or paper-and-pencil posttest performance. These findings indicate that data mining results can be used to make targeted modifications to a game that increase the interpretability of the resulting data without negatively impacting student perception or performance.
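As a rough illustration of how cluster analysis can group level attempts into candidate strategies, the sketch below runs a plain k-means on a handful of attempts. Everything in it is an assumption for illustration only: the abstract does not name the clustering algorithm or the features used in the study, so k-means is swapped in as a stand-in, and the three features and six data points are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(points: Sequence[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(column) / n for column in zip(*points))


def kmeans(points: Sequence[Vector],
           initial_centroids: Sequence[Vector],
           max_iter: int = 100) -> Tuple[List[int], List[Vector]]:
    """Plain k-means with caller-supplied (deterministic) starting centroids."""
    centroids = list(initial_centroids)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assign each attempt to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Move each centroid to the mean of the attempts assigned to it.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = mean_vector(members)
    return labels, centroids


# Hypothetical features per attempt (not taken from the study):
# (pieces placed strictly left to right, correct denominator used, resets)
attempts: List[Vector] = [
    (1, 0, 0), (1, 0, 1), (1, 0, 0),  # look order-based
    (0, 1, 0), (0, 1, 1), (0, 1, 0),  # look mathematics-based
]

labels, centroids = kmeans(attempts, [attempts[0], attempts[3]])
for attempt, label in zip(attempts, labels):
    print(attempt, "-> cluster", label)
```

Each resulting cluster still has to be interpreted by hand, for example by inspecting its centroid to decide whether the group reflects an order-based or a mathematics-based strategy.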

How to Cite

Kerr, D. (2015). Using Data Mining Results to Improve Educational Video Game Design. Journal of Educational Data Mining, 7(3), 1–17.



Keywords: educational video game, cluster analysis, mathematical strategies, order-based strategies
