A Framework for Considering Exploration, Interpretation, and Confirmation During Data Analysis: Computationally Assisted Analysis of Teacher-Group Interactions
Abstract
Education researchers increasingly analyze heterogeneous, multimodal data with computational tools, yet reports rarely make explicit who (human or computer) leads meaning-making at each point in the analysis. We introduce a framework for analytic agency that distinguishes three stages (exploration, interpretation, and confirmation) and classifies each as primarily human- or computer-led; considering leadership at the stage level can clarify the assumptions built into an analysis. We demonstrate the framework in a multimodal case study of teacher-student group interactions in high school mathematics classrooms. Using 15 classroom videos from three teachers, we selected 21 student groups and developed a pose-based detector that flags interactions. The pipeline aligned group-level audio and word-level transcripts to each detected window and computed acoustic/prosodic features and large-language-model indicators for question-asking, confusion, help-seeking, and math talk. Across the corpus, the detector surfaced 317 interaction events (M = 15.10 per group, SD = 12.42; mean duration = 32.73 s). We compared before, during, and after segments using paired tests and mixed-effects models. As expected, the mixed-effects models showed significant before-to-during and before-to-after shifts in the keypoints emphasized by the detection approach, whereas audio features showed no significant changes. One transcript indicator, confusion, decreased after interactions (β = -0.061, p = .049). The pipeline thus favored spatial co-presence over changes in interaction discourse, which illustrates how leadership in exploration shaped what became detectable and, consequently, how interpretation proceeded. We conclude by outlining hybrid, iterative variants and discussing limitations. Making stage-level agency explicit can help researchers align methodological choices with theoretical aims and produce more transparent, auditable analyses of complex classroom data.
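The before, during, and after comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the `context` parameter (the length of the before and after segments), and the toy data are all assumptions made for illustration, and only a paired t statistic is shown, not the mixed-effects models.

```python
from math import sqrt
from statistics import mean, stdev


def segment_means(samples, start, end, context):
    """Mean feature value before, during, and after one detected window.

    samples: list of (time_in_seconds, value) pairs for one group.
    start, end: bounds of a detected interaction window, in seconds.
    context: hypothetical length (s) of the before and after segments;
             the abstract does not say how these segments were defined.
    Returns None for a segment that contains no samples.
    """
    def avg(lo, hi):
        vals = [v for t, v in samples if lo <= t < hi]
        return mean(vals) if vals else None

    return avg(start - context, start), avg(start, end), avg(end, end + context)


def paired_t(xs, ys):
    """Paired t statistic for the differences ys - xs, with degrees of freedom."""
    diffs = [y - x for x, y in zip(xs, ys)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Toy feature series (illustrative only): one sample per second for 30 s.
toy = [(t, 1.0 if t < 10 else 3.0 if t < 20 else 2.0) for t in range(30)]
before, during, after = segment_means(toy, start=10, end=20, context=10)
print(before, during, after)

# Paired comparison of "before" versus "during" means across four toy events.
t_stat, df = paired_t([1, 2, 3, 4], [2, 4, 5, 8])
print(round(t_stat, 3), df)
```

In practice the per-event segment means would be collected across all detected events and groups before testing, and a mixed-effects model would additionally account for events nested within groups and teachers.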
Keywords
student group interactions, computational grounded theory, hybrid analysis, classroom video data

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
https://orcid.org/0000-0002-4431-7339