Leveraging Educational Data Mining for Real-time Performance Assessment of Scientific Inquiry Skills within Microworlds



Published Oct 1, 2012
Janice D. Gobert, Michael A. Sao Pedro, Ryan S.J.d. Baker, Ermal Toto, Orlando Montalvo


We present Science Assistments, an interactive environment that assesses students’ inquiry skills as they engage in inquiry using science microworlds. We frame our variables, tasks, assessments, and methods of analyzing data in terms of evidence-centered design. Specifically, we focus on the student model, the task model, and the evidence model in the conceptual assessment framework. To support both assessment and the provision of scaffolding, the environment makes inferences about student inquiry skills using models developed through a combination of educational data mining and text replay tagging [cf. Sao Pedro et al. 2011], a method for rapid manual coding of student log files. Models were developed for multiple inquiry skills, with particular focus on detecting whether students are testing their articulated hypotheses and whether they are designing controlled experiments. Student-level cross-validation was applied to validate that this approach can automatically and accurately identify these inquiry skills for new students. The resulting detectors can also be applied at run-time to drive scaffolding intervention.
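
To illustrate the student-level cross-validation mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes each labeled log clip carries a student identifier; the function name `student_level_folds` and the toy data are hypothetical. The point it shows is that folds are formed over students rather than over individual clips, so a detector is always evaluated on students it did not see during training.

```python
from collections import defaultdict


def student_level_folds(student_ids, k):
    """Yield (train_idx, test_idx) pairs with no student in both sets.

    student_ids[i] is the (hypothetical) student who produced clip i.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    # Group clip indices by the student who produced them.
    clips_by_student = defaultdict(list)
    for idx, sid in enumerate(student_ids):
        clips_by_student[sid].append(idx)
    students = sorted(clips_by_student)
    if k > len(students):
        raise ValueError("k cannot exceed the number of distinct students")
    # Assign whole students to folds round-robin (deterministic for this sketch).
    for fold in range(k):
        test_students = set(students[fold::k])
        test_idx = [i for s in students if s in test_students
                    for i in clips_by_student[s]]
        train_idx = [i for s in students if s not in test_students
                     for i in clips_by_student[s]]
        yield train_idx, test_idx


# Toy example (assumed data): six clips produced by three students.
clip_students = ["s1", "s1", "s2", "s3", "s3", "s2"]
folds = list(student_level_folds(clip_students, k=3))
```

In practice each `(train_idx, test_idx)` pair would be used to fit a detector on the training clips and score it on the held-out students' clips; which learner and which metric to use is left open here.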

How to Cite

Gobert, J. D., Sao Pedro, M. A., Baker, R. S., Toto, E., & Montalvo, O. (2012). Leveraging Educational Data Mining for Real-time Performance Assessment of Scientific Inquiry Skills within Microworlds. Journal of Educational Data Mining, 4(1), 111–143. https://doi.org/10.5281/zenodo.3554645



performance assessment, inquiry skills, educational data mining, machine learning, text replay tagging

AGRAWAL, R., AND SRIKANT, R. 1994. Fast Algorithms for Mining Association Rules. In Proceedings of the 20th VLDB Conference, Santiago, Chile, 487-499.

ALMOND, R.G., WILLIAMSON, D.M., MISLEVY, R.J., AND YAN, D. In press. Bayes nets in educational assessment. Springer, New York, NY.

ALONZO, A., AND ASCHBACHER, P.R. 2004. Value Added? Long assessment of students’ scientific inquiry skills. Presented at the Annual Meeting of the American Educational Research Association, San Diego, CA.

AMERSHI, S., AND CONATI, C. 2009. Combining Unsupervised and Supervised Machine Learning to Build User Models for Exploratory Learning Environments. Journal of Educational Data Mining, 1 (1), 71-81.

ANDERSON, J.R., AND LEBIERE, C. 1998. The atomic components of thought. Erlbaum, Mahwah, NJ.

BACHMANN, M. 2012. Biology Microworld to Assess Students’ Content Knowledge and Inquiry Skills and Leveraging Student modeling to Prescribe Design Features for Scaffolding Learning. Unpublished Master's thesis. Worcester Polytechnic Institute, Worcester, MA.

BACHMANN, M., GOBERT, J.D., AND BECK, J. 2010. Tracking Students’ Inquiry Paths through Student Transition Analysis. In Proceedings of the 3rd International Conference on Educational Data Mining, 269-270.

BACHMANN, M., GOBERT, J., AND BECK, J. 2011. Do Differences in Student’s Exploration Behavior lead to differences in Domain Learning or Inquiry Skills? Presented at the Annual Meeting of the American Educational Research Association, New Orleans, LA.

BAKER, R.S.J.D., CORBETT, A.T., AND ALEVEN, V. 2008a. More Accurate Student modeling Through Contextual Estimation of Slip and Guess Probabilities in Bayesian Knowledge Tracing. In Proceedings of the 9th International Conference on Intelligent Tutoring Systems, 406-415.

BAKER, R.S.J.D., CORBETT, A.T., GOWDA, S.M., WAGNER, A.Z., MACLAREN, B.M., KAUFFMAN, L.R., MITCHELL, A.P., AND GIGUERE, S. 2010. Contextual Slip and Prediction of Student Performance After Use of an Intelligent Tutor. In Proceedings of the 18th Annual Conference on User Modeling, Adaptation, and Personalization, 52-63.

BAKER, R.S., CORBETT, A.T., ROLL, I., AND KOEDINGER, K.R. 2008b. Developing a Generalizable Detector of When Students Game the System. User Modeling and User-Adapted Interaction, 18 (3), 287-314.

BAKER, R., CORBETT, A., AND WAGNER, A. 2006. Human Classification of Low-Fidelity Replays of Student Actions. In Proceedings of the Educational Data Mining Workshop at the 8th International Conference on Intelligent Tutoring Systems, 29-36.

BAKER, R., AND DE CARVALHO, A. 2008. Labeling Student Behavior Faster and More Precisely with Text Replays. In Proceedings of the 1st International Conference on Educational Data Mining, EDM 2008, R.S. BAKER, T. BARNES, AND J.E. BECK, Eds. Montreal, Quebec, Canada, 38-47.

BAKER, R.S., MITROVIC, A., AND MATHEWS, M. 2010. Detecting Gaming the System in Constraint-Based Tutors. In Proceedings of the 18th Annual Conference on User Modeling, Adaptation, and Personalization, UMAP 2010. LNCS 6075, P. DE BRA, P. KOBSA, AND D. CHIN, Eds. Springer-Verlag, Big Island of Hawaii, HI, 267-278.

BAKER, R.S.J.D., PARDOS, Z., GOWDA, S., NOORAEI, B., AND HEFFERNAN, N. 2011. Ensembling Predictions of Student Knowledge within Intelligent Tutoring Systems. Proceedings of the 19th International Conference on User Modeling, Adaptation, and Personalization, 13-24.

BAKER, R., AND YACEF, K. 2009. The State of Educational Data Mining in 2009: A Review and Future Visions. Journal of Educational Data Mining, 1 (1), 3-17.

BAUM, L.E., AND PETRIE, T. 1966. Statistical Inference for Probabilistic Functions of Finite State Markov Chains. The Annals of Mathematical Statistics, 37 (6), 1554-1563.

BAXTER, G., AND SHAVELSON, R. 1994. Science performance assessments: benchmarks and surrogates. International Journal of Educational Research, 21 (3), 279-298.

BERNARDINI, A., AND CONATI, C. 2010. Discovering and Recognizing Student Interaction Patterns in Exploratory Learning Environments. In Proceedings of the 10th International Conference of Intelligent Tutoring Systems, ITS 2010, Part 1, V. ALEVEN, J. KAY, AND J. MOSTOW, Eds. Springer-Verlag, Berlin Heidelberg, 125-134.

BLACK, P. 1999. Testing: Friend or Foe? Theory and Practice of Assessment and Testing. Falmer Press, New York, NY.

BRYSON, A.E., AND HO, Y.-C. 1969. Applied Optimal Control. Blaisdell, New York.

BUCKLEY, B. C., GOBERT, J.D., AND HORWITZ, P. 2006. Using log files to track students' model-based inquiry. In Proceedings of the 7th International Conference on Learning Sciences, ICLS 2006, Erlbaum, Bloomington, IN, 57-63.

BUCKLEY, B., GOBERT, J., HORWITZ, P., AND O’DWYER, L. 2010. Looking Inside the Black Box: Assessments and Decision-making in BioLogica. International Journal of Learning Technology, 5 (2), 166-190.

BULL, S., BRNA, P., AND PAIN, H. 1995. Extending the scope of the student model. User Modeling and User-Adapted Interaction, 5 (1), 45-65.

CHAMPAGNE, A., BERGIN, K., BYBEE, R., DUSCHL, R., AND GALLAGHER, J. 2004. NAEP 2009 science framework development: Issues and recommendations. Paper commissioned by the National Assessment Governing Board, Washington, DC.

CHEN, Z., AND KLAHR, D. 1999. All Other Things Being Equal: Acquisition and Transfer of the Control of Variables Strategy. Child Development, 70 (5), 1098-1120.

CHI, M. 2000. Self-explaining Expository Texts: The Dual Process of Generating Inferences and Repairing Mental Models. In Advances in Instructional Psychology, R. GLASER, Ed. Lawrence Erlbaum Associates, Inc., Mahwah, NJ, 161-238.

CHI, M., BASSOK, M., LEWIS, M.W., REIMANN, P., AND GLASER, R. 1989. Self-explanations: How students study and use examples in learning to solve problems. Cognitive Science, 13, 145-182.

CHI, M., DELEEUW, N., CHIU, M., AND LAVANCHER, C. 1994. Eliciting Self-Explanations Improves Understanding. Cognitive Science, 18, 439-477.

CHINN, C.A., AND BREWER, W.F. 1993. The role of anomalous data in knowledge acquisition: A theoretical framework and implications for science instruction. Review of Educational Research, 63, 1-49.

COHEN, J. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20 (1), 37-46.

CORBETT, A., AND ANDERSON, J. 1995. Knowledge-Tracing: Modeling the Acquisition of Procedural Knowledge. User Modeling and User-Adapted Interaction, 4, 253-278.

CROCKER, L., AND ALGINA, J. 2006. Introduction to Classical and Modern Test Theory. Cengage Learning, Independence, KY.

DE AYALA, R.J. 2009. The Theory and Practice of Item Response Theory. The Guilford Press, New York, NY.

DE JONG, T. 2006. Computer simulations - Technological advances in inquiry learning. Science, 312, 532-533.

DE JONG, T., BEISHUIZEN, J., HULSHOF, C., PRINS, F., VAN RIJN, H., VAN SOMEREN, M., ET AL. 2005. Determinants of Discovery Learning in a Complex Simulation Learning Environment. In Cognition, Education and Communication Technology, P. GARDENFORS, AND P. JOHANSSON, Eds. Lawrence Erlbaum Associates, Mahwah, NJ, 257-283.

DE JONG, T., VAN JOOLINGEN, W., GIEMZA, A., GIRAULT, I., HOPPE, U., KINDERMANN, J., ET AL. 2010. Learning by creating and exchanging objects: The SCY experience. British Journal of Educational Technology, 41 (6), 909-921.

FADEL, C., HONEY, M., AND PASNICK, S. 2007. Assessment in the Age of Innovation. Education Week, 26 (38), 34-40.

FENG, M., HEFFERNAN, N.T., AND KOEDINGER, K.R. 2009. Addressing the assessment challenge in an online system that tutors as it assesses. User Modeling and User-Adapted Interaction: The Journal of Personalization Research (UMUAI), 19 (3), 243-266.

GHAZARIAN, A., AND NOORHOSSEINI, S.M. 2010. Automatic Detection of Users' Skill Levels Using High-Frequency User Interface Events. User Modeling and User-Adapted Interaction, 20 (2), 109-146.

GLASER, R., SCHAUBLE, L., RAGHAVAN, K., AND ZEITZ, C. 1992. Scientific Reasoning Across Different Domains. In Computer-based Learning Environments and Problem-Solving, E. DECORTE, M. LINN, H. MANDL, AND L. VERSCHAFFEL, Eds. Springer-Verlag, Heidelberg, Germany, 345-371.

GOBERT, J. 2005a. Leveraging Technology and Cognitive Theory on Visualization to Promote Students' Science Learning and Literacy. In Visualization in Science Education, J. GILBERT, Ed. Springer-Verlag Publishers, Dordrecht, The Netherlands, 73-90.

GOBERT, J. 2005b. The effects of different learning tasks on conceptual understanding in science: teasing out representational modality of diagramming versus explaining. Journal of Geoscience Education, 53 (4), 444-455.

GOBERT, J., AND BAKER, R. 2010. Empirical Research: Emerging Research: Using Automated Detectors to Examine the Relationships Between Learner Attributes and Behaviors During Inquiry in Science. Proposal Awarded July 1, 2010 by the National Science Foundation.

GOBERT, J., AND BAKER, R. 2012. The Development of an Intelligent Pedagogical Agent for Physical Science Inquiry Driven by Educational Data Mining. Proposal (R305A120778) awarded May, 2012 by the U.S. Dept. of Education.

GOBERT, J., HEFFERNAN, N., KOEDINGER, K., AND BECK, J. 2008. ASSISTments Meets Science Learning (AMSL). Proposal (R305A090170) funded February 1, 2009 by the U.S. Dept. of Education.

GOBERT, J., HEFFERNAN, N., RUIZ, C., AND RYUNG, K. 2007. AMI: ASSISTments Meets Inquiry. Proposal NSF-DRL 0733286 funded by the National Science Foundation.

GOBERT, J., AND KOEDINGER, K. 2011. Using model-tracing to conduct performance assessment of students’ inquiry skills within a Microworld. Presented at the Society for Research on Educational Effectiveness, Washington, D.C., September 8-10.

GOBERT, J., RAZIUDDIN, J., AND MONTALVO, O. In prep. Warranting claims as an epistemological driver. Manuscript in preparation.

GOTWALS, A., AND SONGER, N. 2006. Measuring Students’ Scientific Content and Inquiry Reasoning. In Proceedings of the 7th International Conference of the Learning Sciences, ICLS 2006, S. BARAB, K. HAY, AND D. HICKEY, Eds. Lawrence Erlbaum Associates, Bloomington, IN, 196-202.

HAMBLETON, R., AND JONES, R. 1993. Comparison of Classical Test Theory and Item Response Theory and their applications to test development. Educational Measurement: Issues & Practice, 12 (3), 38-47.

HANLEY, J., AND MCNEIL, B. 1982. The Meaning and Use of the Area under a Receiver Operating Characteristic (ROC) Curve. Radiology, 143, 29-36.

HARRISON, A.M., AND SCHUNN, C.D. 2004. The transfer of logically general scientific reasoning skills. In Proceedings of the 26th Annual Conference of the Cognitive Science Society, K. FORBUS, D. GENTNER, AND T. REGIER, Eds. Erlbaum, Mahwah, NJ, 541-546.

HEFFERNAN, N., TURNER, T., LOURENCO, A., MACASEK, M., NUZZO-JONES, G., AND KOEDINGER, K. 2006. The ASSISTment builder: Towards an analysis of cost effectiveness of ITS creation. In Proceedings of the 19th International FLAIRS Conference, Melbourne Beach, FL, 515-520.

HERSHKOVITZ, A., WIXON, M., BAKER, R.S.J.D., GOBERT, J., AND SAO PEDRO, M. 2011. Carelessness and Goal Orientation in a Science Microworld. Proceedings of the 15th International Conference on Artificial Intelligence in Education, 462-465.

HMELO-SILVER, C.E., DUNCAN, R.G., AND CHINN, C.A. 2007. Scaffolding and Achievement in Problem-Based and Inquiry Learning: A Response to Kirschner, Sweller, and Clark (2006). Educational Psychologist, 42 (2), 99-107.

KIRSCHNER, P.A., SWELLER, J., AND CLARK, R.E. 2006. Why minimal guidance during instruction does not work: An analysis of the failure of constructivist, discovery, problem-based, experiential, and inquiry-based teaching. Educational Psychologist, 41 (2), 75-86.

KLAHR, D., AND DUNBAR, K. 1988. Dual search space during scientific reasoning. Cognitive Science, 12 (1), 1-48.

KOEDINGER, K., ANDERSON, J., HADLEY, W., AND MARK, M. 1997. Intelligent Tutoring Goes to School in the Big City. International Journal of Artificial Intelligence in Education, 8, 30-43.

KOEDINGER, K., AND CORBETT, A. 2006. Cognitive Tutors: Technology Bringing Learning Sciences to the Classroom. In The Cambridge Handbook of the Learning Sciences, R. SAWYER, Ed. Cambridge University Press, New York, NY, 61-77.

KOEDINGER, K., SUTHERS, D., AND FORBUS, K. 1998. Component-Based Construction of a Science Learning Space. International Journal of Artificial Intelligence in Education (IJAIED), 10, 292-313.

KRAJCIK, J., BLUMENFELD, P., MARX, R., BASS, K., FREDRICKS, J., AND SOLOWAY, E. 1998. Inquiry in project-based science classrooms: Initial attempts by middle school students. Journal of the Learning Sciences, 7, 313-350.

KUHN, D. 1991. The skills of argument. Cambridge Press, Cambridge, MA.

KUHN, D. 2005. Education for thinking. Harvard University Press, Cambridge, MA.

KUHN, D., GARCIA-MILA, M., ZOHAR, A., AND ANDERSEN, C. 1995. Strategies of knowledge acquisition. Monographs of the Society for Research in Child Development, 60 (4, Serial No. 245).

KUHN, D., SCHAUBLE, L., AND GARCIA-MILA, M. 1992. Cross-domain development of scientific reasoning. Cognition and Instruction, 9 (4), 285-327.

LIU, B., HSU, W., AND MA, Y. 1998. Integrating Classification and Association Rule Mining. In Proceedings of the Fourth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, 80-86.

MARTIN, J., AND VANLEHN, K. 1995. Student assessment using Bayesian nets. International Journal of Human-Computer Studies, 42 (6), 575-591.

MASSACHUSETTS DEPARTMENT OF EDUCATION. 2006. Massachusetts Science and Technology/Engineering Curriculum Framework. Massachusetts Department of Education, Malden, MA.

MCELHANEY, K., AND LINN, M. 2008. Impacts of Students' Experimentation Using a Dynamic Visualization on their Understanding of Motion. In Proceedings of the 8th International Conference of the Learning Sciences, ICLS 2008, Volume 2, International Society of the Learning Sciences, Inc., Utrecht, The Netherlands, 51-58.

MCELHANEY, K., AND LINN, M. 2010. Helping Students Make Controlled Experiments More Informative. In Learning in the Disciplines: Proceedings of the 9th International Conference of the Learning Sciences (ICLS 2010) - Volume 1, Full Papers, K. GOMEZ, L. LYONS, AND J. RADINSKY, Eds. International Society of the Learning Sciences, Chicago, IL, 786-793.

MCNEILL, K.L., AND KRAJCIK, J. 2007. Middle school students' use of appropriate and inappropriate evidence in writing scientific explanations. In Thinking with data, M. LOVETT AND P. SHAH, Eds. Taylor & Francis Group, LLC, New York, NY, 233-265.

MESSICK, S. 1994. The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23 (2), 13-23.

MISLEVY, R.J., BEHRENS, J.T., DICERBO, K.E., AND LEVY, R. This issue. Design and discovery in educational assessment: Evidence centered design, psychometrics, and data mining. Journal of Educational Data Mining.

MISLEVY, R., CHUDOWSKY, N., DRANEY, K., FRIED, R., GAFFNEY, T., AND HAERTEL, G. 2003. Design Patterns for Assessing Science Inquiry, SRI International, Menlo Park, CA.

MISLEVY, R.J., AND HAERTEL, G.D. 2006. Implications of ECD for educational testing. Educational Measurement: Issues and Practice, 25 (4), 6-20.

MISLEVY, R.J., STEINBERG, L.S., AND ALMOND, R.G. 2003. On the Structure of Educational Assessments. Measurement: Interdisciplinary Research and Perspectives, 1, 3-67.

MISLEVY, R., STEINBERG, L., ALMOND, R., AND LUKAS, J. 2006. Concepts, Terminology, and Basic Models of ECD. Automated Scoring of Complex Tasks in Computer-Based Testing, D. WILLIAMSON, R. MISLEVY, AND I. BEJAR, Eds. Lawrence Erlbaum Associates, Mahwah, NJ.

MITROVIC, A., MAYO, M., SURAWEERA, P., AND MARTIN, B. 2001. Constraint-Based Tutors: A Success Story. In Proceedings of the 14th International Conference on Industrial and Engineering Application of Artificial Intelligence and Expert Systems: Engineering of Intelligent Systems, IEA/AIE-2001. LNCS 2070, L. MONOSTORI, J. VANCZA, AND M. ALI, Eds. Springer-Verlag, Budapest, Hungary, 931-940.

MONTALVO, O., BAKER, R., SAO PEDRO, M., NAKAMA, A., AND GOBERT, J. 2010. Identifying Students' Inquiry Planning Using Machine Learning. In Proceedings of the 3rd International Conference on Educational Data Mining, R. BAKER, A. MERCERON, AND P. PAVLIK, Eds. Pittsburgh, PA, 141-150.

NATIONAL RESEARCH COUNCIL. 1996. National Science Education Standards. National Academy Press, Washington, D.C.

NATIONAL RESEARCH COUNCIL. 2011. A framework for K-12 science education: Practices, crosscutting concepts, and core ideas. National Academy Press, Washington, D.C.

NEWELL, A. 1990. Unified Theories of Cognition. Harvard University Press, Cambridge, MA.

NEWELL, A., AND SIMON, H.A. 1972. Human problem solving. Prentice-Hall, Englewood Cliffs, NJ.

NJOO, M., AND DE JONG, T. 1993. Exploratory Learning with a Computer Simulation for Control Theory: Learning Processes and Instructional Support. Journal of Research in Science Teaching, 30, 821-844.

PAPERT, S. 1980. Computer-based Microworlds as Incubators for Powerful Ideas. In The Computer in the School: Tutor, Tool, Tutee, R. TAYLOR, Ed. Teachers College Press, New York, NY, 203-201.

PARDOS, Z.A., HEFFERNAN, N.T., ANDERSON, B., AND HEFFERNAN, L. 2010. Using Fine-Grained Skill Models to Fit Student Performance with Bayesian Networks. Handbook of Educational Data Mining, C. ROMERO, S. VENTURA, S.R. VIOLA, M. PECHENIZKIY, AND R.S.J. BAKER, Eds. Chapman & Hall/CRC Press.

PEA, R., AND KURLAND, D. 1984. On the Cognitive Effects of Learning Computer Programming. New Ideas in Psychology, 2, 137-168.

PELLEGRINO, J., CHUDOWSKY, N., AND GLASER, R. 2001. Knowing What Students Know: The Science and Design of Educational Assessment. National Academy Press, Washington, DC.

PERKINS, D. 1986. Knowledge as design. Erlbaum, Hillsdale, NJ.

QUELLMALZ, E., KREIKEMEIER, P., DEBARGER, A. H., AND HAERTEL, G. 2007. A study of the alignment of the NAEP, TIMSS, and New Standards Science Assessments with the inquiry abilities in the National Science Education Standards. Presented at the Annual Meeting of the American Educational Research Association, Chicago, IL, April 9-13.

QUELLMALZ, E., TIMMS, M., AND SCHNEIDER, S. 2009. Assessment of Student Learning in Science Simulations and Games. National Research Council Report, Washington, D.C.

RAZZAQ, L., FENG, M., NUZZO-JONES, G., HEFFERNAN, N.T., KOEDINGER, K.R., JUNKER, B., RITTER, S., KNIGHT, A., ANISZCZYK, C., CHOKSEY, S., LIVAK, T., MERCADO, E., TURNER, T.E., UPALEKAR, R., WALONOSKI, J.A., MACASEK, M.A., AND RASMUSSEN, K.P. 2005. The Assistment Project: Blending Assessment and Assisting. In Proceedings of the 12th Artificial Intelligence In Education, C.K. LOOI, G. MCCALLA, B. BREDEWEG, AND J. BREUKER, Eds. IOS Press, Amsterdam, The Netherlands, 555-562.

REIMANN, P. 1991. Detecting functional relations in a computerized discovery environment. Learning and Instruction, 1 (1), 45-65.

RESNICK, M. 1997. Turtles, Termites, and Traffic Jams: Explorations in Massively Parallel Microworlds. MIT Press, Cambridge, MA.

REYE, J. 2004. Student modeling Based on Belief Networks. International Journal of Artificial Intelligence in Education, 14 (1), 1-33.

RICHARDSON, J. 2008. Science ASSISTments: Tutoring Inquiry Skills in Middle School Students. Unpublished Interactive Qualifying Project, Worcester Polytechnic Institute, Worcester, MA.

RITTER, S., HARRIS, T., NIXON, T., DICKINSON, D., MURRAY, R. C., AND TOWLE, B. 2009. Reducing the Knowledge-Tracing Space. In Proceedings of the 2nd International Conference on Educational Data Mining, EDM 2009, T. BARNES, M. DESMARAIS, C. ROMERO, AND S. VENTURA, Eds. Cordoba, Spain, 151-160.

ROMERO, C., AND VENTURA, S. 2010. Educational Data Mining: A Review of the State-of-the-Art. IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and Reviews, 40 (6), 601-618.

ROWE, J., AND LESTER, J. 2010. Modeling User Knowledge with Dynamic Bayesian Networks in Interactive Narrative Environments. In Proceedings of the 6th Annual AI and Interactive Digital Entertainment Conference, AIIDE 2010, C.G. YOUNGBLOOD AND V. BULITKO, Eds. AAAI Press, Palo Alto, CA, 57-62.

RUIZ-PRIMO, M., AND SHAVELSON, R. 1996. Rhetoric and reality in science performance assessment. Journal of Research in Science Teaching, 33 (10), 1045-1063.

RUPP, A.A., GUSHTA, M., MISLEVY, R.J., AND SHAFFER, D.W. 2010. Evidence-centered design of epistemic games: Measurement principles for complex learning environments. The Journal of Technology, Learning, and Assessment, 8 (4), 3-47.

RUPP, A.A., LEVY, R., DICERBO, K.E., SWEET, S., ET AL. This issue. Putting ECD into practice: The interplay of theory and data in evidence models within a digital learning environment. Journal of Educational Data Mining.

RUSSELL, S., AND NORVIG, P. 2009. Artificial Intelligence: A Modern Approach, 3rd Edition. Prentice Hall, Upper Saddle River, NJ.

SAO PEDRO, M., BAKER, R., GOBERT, J., MONTALVO, O., AND NAKAMA, A. 2011. Leveraging Machine-Learned Detectors of Systematic Inquiry Behavior to Estimate and Predict Transfer of Inquiry Skill. User Modeling and User-Adapted Interaction, DOI: 10.1007/s11257-011-9101-0.

SAO PEDRO, M.A., BAKER, R.S., MONTALVO, O., NAKAMA, A., AND GOBERT, J.D. 2010. Using Text Replay Tagging to Produce Detectors of Systematic Experimentation Behavior Patterns. In Proceedings of the 3rd International Conference on Educational Data Mining, R. BAKER, A. MERCERON, AND P. PAVLIK, Eds. Pittsburgh, PA, 181-190.

SCALISE, K., TIMMS, M., CLARK, L., AND MOORJANI, A. 2009. Student learning in science simulations: What makes a difference? Paper presented at the American Educational Research Association, San Diego, CA.

SCHAUBLE, L., GLASER, R., DUSCHL, R.A., SCHULZE, S., AND JOHN, J. 1995. Students' Understanding of the Objectives and Procedures of Experimentation in the Science Classroom. The Journal of the Learning Sciences, 4, 131-166.

SCHUNN, C.D., AND ANDERSON, J.R. 1998. Scientific Discovery. In The Atomic Components of Thought, J.R. ANDERSON, Ed. Lawrence Erlbaum Associates Inc., Mahwah, NJ, 385-428.

SCHUNN, C.D., AND ANDERSON, J.R. 1999. The generality/specificity of expertise in scientific reasoning. Cognitive Science, 23 (3), 337-370.

SHAVELSON, R., WILEY, E.W., AND RUIZ-PRIMO, M. 1999. Note On Sources of Sampling Variability in Science Performance Assessments. Journal of Educational Measurement, 36 (1), 61-71.

SHUTE, V., AND GLASER, R. 1990. A large-scale evaluation of an intelligent discovery world: Smithtown. Interactive Learning Environments, 1, 55-71.

STEVENS, R., SOLLER, A., COOPER, M., AND SPRANG, M. 2004. Modeling the Development of Problem Solving Skills in Chemistry with a Web-Based Tutor. In Proceedings of the 7th International Conference on Intelligent Tutoring Systems, ITS 2004. LNCS 3220, J.C. LESTER, R.M. VICARI, AND F. PARAGUACU, Eds. Springer, Maceio, Alagoas, Brazil, 580-591.

TSCHIRGI, J. 1980. Sensible Reasoning: A Hypothesis about Hypotheses. Child Development, 51, 1-10.

VAN JOOLINGEN, W., AND DE JONG, T. 1991. Supporting Hypothesis Generation by Learners Exploring an Interactive Computer Simulation. Instructional Science, 20, 389-404.

VAN JOOLINGEN, W.R., AND DE JONG, T. 1997. An extended dual search space model of scientific discovery learning. Instructional Science, 25 (5), 307-346.

VAN JOOLINGEN, W.R., AND DE JONG, T. 2003. SimQuest, Authoring Educational Simulations. Authoring Tools for Advanced Technology Learning Environments: Toward Cost-effective Adaptive, Interactive, and Intelligent Educational Software, T. MURRAY, S. BLESSING, AND S. AINSWORTH, Eds. Kluwer, Dordrecht, The Netherlands, 1-31.

WALONOSKI, J., AND HEFFERNAN, N. 2006. Detection and Analysis of Off-Task Gaming Behavior in Intelligent Tutoring Systems. In Proceedings of the 8th International Conference on Intelligent Tutoring Systems, ITS 2006. LNCS 4053, M. IKEDA, K. ASHLEY, AND T.-W. CHAN, Eds. Springer-Verlag, Jhongli, Taiwan, 382-391.

WENGER, E. 1987. Artificial Intelligence and Intelligent Tutoring of Knowledge. Morgan Kaufmann, Los Altos, CA.

WHITE, B., AND FREDERIKSEN, J. 1998. Inquiry, Modeling and Metacognition: Making Science Accessible to All Students. Cognition and Instruction, 16 (1), 3-118.

WILLIAMSON, D., MISLEVY, R., AND BEJAR, I. 2006. Automated Scoring of Complex Tasks in Computer-Based Testing. Lawrence Erlbaum Associates, Mahwah, NJ.

WITTEN, I., AND FRANK, E. 2005. Data Mining: Practical Machine Learning Tools and Techniques, 2nd Edition. Morgan Kaufmann, San Francisco, CA.