Mining Diagnostic Assessment Data for Concept Similarity



Published Oct 1, 2009
Tara Madhyastha, Earl Hunt


This paper introduces a method for mining multiple-choice assessment data for similarity of the concepts represented by the multiple-choice responses. The resulting similarity matrix can be used to visualize the distance between concepts in a lower-dimensional space, which gives an instructor a picture of the relative difficulty of concepts among the students in the class. The matrix may also be used to cluster concepts, helping to interpret unknown responses in the context of previously identified concepts.
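The abstract does not state how the similarity matrix is computed, so the following is only a minimal sketch of the general idea under stated assumptions. It treats each multiple-choice response option as a concept, measures similarity between two options with the Jaccard index over the sets of students who chose them (an assumption, not the authors' measure), and groups options whose similarity meets a threshold. The student data, the option labels such as `Q1:A`, and the function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical data: the response option each student chose on each question.
responses = {
    "s1": ["Q1:A", "Q2:B", "Q3:A"],
    "s2": ["Q1:A", "Q2:B", "Q3:C"],
    "s3": ["Q1:C", "Q2:D", "Q3:C"],
    "s4": ["Q1:C", "Q2:D", "Q3:A"],
    "s5": ["Q1:A", "Q2:B", "Q3:A"],
}


def choosers(responses):
    """Map each response option to the set of students who selected it."""
    by_option = {}
    for student, chosen in responses.items():
        for option in chosen:
            by_option.setdefault(option, set()).add(student)
    return by_option


def jaccard_similarity(by_option):
    """Return the sorted option labels and a symmetric similarity lookup.

    Assumption: similarity is the Jaccard index of the student sets.
    """
    options = sorted(by_option)
    sim = {(o, o): 1.0 for o in options}
    for a, b in combinations(options, 2):
        union = by_option[a] | by_option[b]
        score = len(by_option[a] & by_option[b]) / len(union) if union else 0.0
        sim[(a, b)] = sim[(b, a)] = score
    return options, sim


def cluster(options, sim, threshold):
    """Single-linkage grouping: link options with similarity >= threshold."""
    parent = {o: o for o in options}

    def find(o):
        # Follow parent links to the group representative (with path halving).
        while parent[o] != o:
            parent[o] = parent[parent[o]]
            o = parent[o]
        return o

    for a, b in combinations(options, 2):
        if sim[(a, b)] >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for o in options:
        groups.setdefault(find(o), []).append(o)
    return sorted(groups.values())


options, sim = jaccard_similarity(choosers(responses))
groups = cluster(options, sim, threshold=0.9)
```

With this toy data, options that are always chosen by the same students (for example `Q1:A` and `Q2:B`) end up in one group, while options with mixed choosers stay separate. Converting each similarity `s` to a distance `1 - s` would give a matrix that could be passed to a multidimensional scaling routine for the lower-dimensional visualization the abstract mentions.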

How to Cite

Madhyastha, T., & Hunt, E. (2009). Mining Diagnostic Assessment Data for Concept Similarity. Journal of Educational Data Mining, 1(1), 72–91.



student modeling, diagnostic assessment, misconceptions, concept similarity, visualization, individual differences
