Response process data have the potential to provide a rich description of test-takers’ thinking processes.
However, retrieving insights from these data is a challenge for educational assessment and educational
data mining because the data are complex and not well annotated. The present study addresses this
challenge by developing a computational model that simulates how different problem-solving strategies
would behave while searching for a solution to a Program for International Student Assessment (PISA)
2012 problem-solving item, and uses n-gram processing of data together with a naïve Bayesian classifier
to infer latent problem-solving strategies from the test-takers’ response process data. The retrieval of
simulated strategies improved with increased n-gram length, reaching an accuracy of 0.72 on the original
PISA task. Applying the model to generalized versions of the task showed that classification accuracy increased
with problem size and the mean number of actions, reaching a classification accuracy of 0.90 for
certain task versions. The strategy that was most efficient and effective in the PISA Traffic task evaluated
paths based on the labeled travel time. However, in generalized versions of the task, a straight line strategy
was more effective. When the classifier was applied to empirical data, the largest share of test-takers (46%) was classified as
using a random path strategy. Test-takers classified as using the travel time strategy had the highest
probability of solving the task. The test-takers classified as using the random actions strategy
had the lowest probability of solving the task. The effect of (classified) strategy on general
PISA problem-solving performance was overall weak, except for a negative effect for the random actions
strategy (β ≈ −65, CI95% ≈ [−96, −36]). The study contributes a novel approach to inferring
latent problem-solving strategies from action sequences. The study also illustrates how simulations can
provide valuable information about item design by exploring how changing item properties could affect
the accuracy of inferences about unobserved problem-solving strategies.
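
The pipeline summarized above (simulate strategies, cut the resulting action sequences into n-grams, and classify with a naïve Bayesian classifier) can be illustrated with a minimal sketch. It is not the study's model: the road network `ROADS`, the two strategy rules, the `slip` probability, and all function names are hypothetical choices made for illustration only, and the classifier is a plain multinomial naïve Bayes with add-one smoothing.

```python
import math
import random
from collections import Counter

# Hypothetical toy road network (NOT the PISA item): node -> {neighbor: travel time}.
ROADS = {
    "A": {"B": 2, "C": 5},
    "B": {"A": 2, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 3},
    "D": {"B": 4, "C": 3},
}

def simulate(strategy, rng, start="A", goal="D", slip=0.1):
    """Simulate one solver; return its action sequence (road segments taken)."""
    node, visited, actions = start, {start}, []
    while node != goal:
        # Only roads leading to places not yet visited are considered.
        options = {m: t for m, t in ROADS[node].items() if m not in visited}
        if strategy == "travel_time" and rng.random() >= slip:
            nxt = min(options, key=options.get)   # road with the lowest labeled time
        else:
            nxt = rng.choice(sorted(options))     # uniformly random road
        actions.append(node + nxt)
        visited.add(nxt)
        node = nxt
    return actions

def ngrams(actions, n):
    """All contiguous runs of n actions, as tuples."""
    return [tuple(actions[i:i + n]) for i in range(len(actions) - n + 1)]

def train(labeled_sequences, n):
    """Count n-grams per strategy label and the number of sequences per label."""
    counts, docs = {}, Counter()
    for label, actions in labeled_sequences:
        counts.setdefault(label, Counter()).update(ngrams(actions, n))
        docs[label] += 1
    vocab = set().union(*counts.values())
    return counts, docs, vocab

def classify(actions, model, n):
    """Return the label with the highest naive Bayes log-posterior."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    scores = {}
    for label in counts:
        total = sum(counts[label].values())
        score = math.log(docs[label] / total_docs)          # log prior
        for gram in ngrams(actions, n):                     # add-one smoothed likelihood
            score += math.log((counts[label][gram] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Train on simulated sequences, then label an observed sequence.
rng = random.Random(0)
train_set = [(s, simulate(s, rng))
             for s in ("travel_time", "random_path") for _ in range(200)]
model = train(train_set, n=2)
print(classify(["AB", "BC", "CD"], model, n=2))
```

Because the classifier is trained only on simulated sequences, the same loop can be rerun on a modified network or with a different `n` to explore how item properties and n-gram length affect how well the simulated strategies are recovered.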
Keywords: process data, computational cognitive modeling, PISA, problem-solving, educational assessment
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.