Evaluating the Effects of Assignment Report Usage on Student Outcomes in an Intelligent Tutoring System: A Randomized-Encouragement Design

Published May 8, 2025

Abstract

As online learning platforms become more popular and deeply integrated into education, understanding their effectiveness and what drives that effectiveness becomes increasingly important. While there is extensive prior research illustrating the benefits of intelligent tutoring systems (ITS) for student learning, there is comparatively less focus on how teachers' use of ITS affects student outcomes. Much existing research on teachers' ITS usage relies on qualitative studies, small-scale experiments, or survey data, making it difficult to identify the causal effects of their engagement with these systems. To bridge this gap, we conducted a study using a randomized encouragement design on an online mathematics platform, where teachers were randomly assigned to one of two groups: an encouragement group or a control group. Teachers in the encouragement group received a popup prompt urging them to explore the assignment report after they created an assignment, while those in the control group did not receive any additional prompts. The study focused exclusively on teachers new to the platform, as this group was expected to be most influenced by the encouragement prompt. The findings show that viewing the assignment report did not significantly impact the percentage of students who started the next assignment or their value-added scores. However, it led to a notable increase in the percentage of students completing the next assignment. This effect, confirmed using the Anderson-Rubin test (which is robust to weak instruments), demonstrates a measurable causal relationship between teachers' use of assignment reports and student outcomes. Based on data from 330 teachers, this large-scale study sheds light on the causal effects of teachers engaging with ITS data on student learning and adds to the growing evidence base for effective teaching strategies in online learning environments.
The pre-registration for the paper is available at https://osf.io/5u2n3/?view_only=39c4416ed9c04666885873b82c23f734, while data and code are available at https://osf.io/4nqxu/?view_only=1dcf5157005c4f82b815dad1fc67514a.
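
As a rough illustration of the logic behind a randomized encouragement design, the sketch below treats the encouragement prompt as a binary instrument and computes a Wald (instrumental variable) estimate, together with an Anderson-Rubin style check. It is not the authors' analysis code: the function names, the eight toy records, and the large-sample normal approximation for the p-value are all assumptions made for this example. Here `z` marks whether a teacher was encouraged, `d` whether the teacher viewed the assignment report, and `y` a hypothetical completion rate on the next assignment.

```python
from statistics import NormalDist, mean, variance


def split_by_instrument(z, values):
    """Split values into (encouraged, control) lists using the 0/1 instrument z."""
    encouraged = [v for zi, v in zip(z, values) if zi == 1]
    control = [v for zi, v in zip(z, values) if zi == 0]
    return encouraged, control


def wald_estimate(z, d, y):
    """Return (itt_y, itt_d, wald) for a binary instrument z.

    itt_y: difference in mean outcome between encouraged and control groups.
    itt_d: difference in mean report viewing between the two groups.
    wald:  itt_y / itt_d, the instrumental variable estimate.
    """
    y1, y0 = split_by_instrument(z, y)
    d1, d0 = split_by_instrument(z, d)
    itt_y = mean(y1) - mean(y0)
    itt_d = mean(d1) - mean(d0)
    return itt_y, itt_d, itt_y / itt_d


def anderson_rubin_p(z, d, y, beta0=0.0):
    """Approximate Anderson-Rubin p-value for H0: effect of d on y equals beta0.

    Under H0 the adjusted outcome y - beta0 * d should not differ by
    instrument arm, so the two arms are compared directly. The p-value
    uses a normal approximation (an assumption of this sketch).
    """
    adjusted = [yi - beta0 * di for di, yi in zip(d, y)]
    r1, r0 = split_by_instrument(z, adjusted)
    se = (variance(r1) / len(r1) + variance(r0) / len(r0)) ** 0.5
    t_stat = (mean(r1) - mean(r0)) / se
    return 2 * (1 - NormalDist().cdf(abs(t_stat)))


# Hypothetical toy data: 4 encouraged teachers, 4 control teachers.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 1, 0, 0, 0]
y = [0.9, 0.8, 0.7, 0.5, 0.7, 0.5, 0.4, 0.4]

itt_y, itt_d, wald = wald_estimate(z, d, y)
p_null = anderson_rubin_p(z, d, y, beta0=0.0)
p_at_wald = anderson_rubin_p(z, d, y, beta0=wald)
print(f"ITT on outcome: {itt_y:.3f}, ITT on viewing: {itt_d:.3f}, Wald: {wald:.3f}")
print(f"AR p-value at 0: {p_null:.3f}, at the Wald estimate: {p_at_wald:.3f}")
```

Because the Anderson-Rubin check only compares the two randomized arms, it never divides by the first-stage difference `itt_d`, which is why tests of this kind stay valid when the encouragement moves report viewing only slightly.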

How to Cite

Lim, W. C., Heffernan, N. T., & Sales, A. (2025). Evaluating the Effects of Assignment Report Usage on Student Outcomes in an Intelligent Tutoring System: A Randomized-Encouragement Design. Journal of Educational Data Mining, 17(1). https://doi.org/10.5281/zenodo.15366697

Keywords

randomized encouragement design, instrumental variable, teaching practices, intelligent tutoring systems

Section
EDM 2025 Journal Track