Putting ECD into Practice: The Interplay of Theory and Data in Evidence Models within a Digital Learning Environment



Published Oct 1, 2012
André A. Rupp, Roy Levy, Kristen E. Dicerbo, Shauna J. Sweet, Aaron V. Crawford, Tiago Caliço, Martin Benson, Derek Fay, Katie L. Kunze, Robert J. Mislevy, John T. Behrens


In this paper, we describe the development and refinement of evidence rules and measurement models within the evidence model of the evidence-centered design (ECD) framework, in the context of the Packet Tracer digital learning environment of the Cisco Networking Academy. Using Packet Tracer, learners design, configure, and troubleshoot computer networks within an interactive interface. This activity yields two types of data: product data, which result from the students' final submitted network configurations, and process data, which are log file entries detailing how the students arrived at those configurations. We discuss how an iterative cycle of empirical analyses and discussions with subject-matter experts is essential for identifying and accumulating evidence about learners' skill profiles and their development. We present results from descriptive, exploratory, and confirmatory diagnostic modeling analyses for both data types, which required a diverse set of tools from multivariate statistics, modern psychometrics, and educational data mining. We close the paper with a discussion of the implications of this work for evidence-based argumentation guided by ECD principles within digital learning environments more generally.

How to Cite

Rupp, A. A., Levy, R., Dicerbo, K. E., Sweet, S. J., Crawford, A. V., Caliço, T., Benson, M., Fay, D., Kunze, K. L., Mislevy, R. J., & Behrens, J. T. (2012). Putting ECD into Practice: The Interplay of Theory and Data in Evidence Models within a Digital Learning Environment. Journal of Educational Data Mining, 4(1), 49–110. https://doi.org/10.5281/zenodo.3554643

educational data mining, evidence-centered design, log files, diagnostic classification models, Bayesian networks