Human-AI Collaboration for Qualitative Analysis in Participatory Design: Refining the Writing Analytics Tool

Published February 20, 2026
Andrew Potter, Zeinab Serhan, Nishad Patne, Püren Öncel, Ishrat Ahmed, Tracy Arner, Rezwana Islam, Rod Roscoe, Laura Allen, Scott Crossley, Danielle McNamara

Abstract

This study introduces a hybrid human-AI workflow for qualitative data analysis within the participatory design of the Writing Analytics Toolkit (WAT), an open-source platform that provides formative feedback on student writing using natural language processing. The toolkit includes a classroom-facing implementation (WAT Classroom; WAT-C), designed to support instruction, and a researcher-facing implementation (WAT Researcher; WAT-R), designed to support analytic and validation workflows. Nine experienced college writing instructors (with 97 cumulative years of teaching) participated in focus group sessions to evaluate an early WAT-C prototype, offering formative input on usability, instructional alignment, and feedback clarity. To analyze the resulting qualitative data, we employed a novel AI-augmented analytic process in which GPT-4o, integrated within a secure retrieval-augmented system, generated inductive codes and preliminary themes from transcripts. These AI-generated outputs were iteratively reviewed, critiqued, refined, and synthesized by researchers, supporting both analytical scalability and interpretive rigor. This human-AI partnership enabled efficient thematic exploration while preserving methodological transparency and researcher judgment. Findings from both qualitative and complementary survey data identified four key design priorities: (1) clearer, more concise feedback; (2) increased instructor customization; (3) reduced administrative burden; and (4) a simplified user interface. These insights directly informed subsequent revisions to WAT-C, including a redesigned feedback interface, customizable metric targets, learning management system integration, and a more intuitive layout. This work illustrates how large language models (LLMs) can support inductive qualitative analysis within participatory design workflows, and the results demonstrate how such a workflow can inform iterative educational technology development. Implications include the need to ensure ethical oversight, researcher-led interpretation, and alignment with instructional priorities when incorporating AI into the design of educational technologies.
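As a loose illustration only (not the authors' implementation), the review loop the abstract describes — an LLM proposing inductive codes, with researchers accepting, revising, or rejecting each one — could be sketched as below. `propose_codes` is a hypothetical offline stand-in for the GPT-4o call, and the theme keywords are invented for the example:

```python
from dataclasses import dataclass

@dataclass
class Code:
    """One AI-proposed inductive code tied to a transcript excerpt."""
    label: str
    excerpt: str
    status: str = "proposed"  # proposed -> accepted / revised / rejected

def propose_codes(transcript_chunk: str) -> list[Code]:
    """Hypothetical stand-in for an LLM call (e.g., GPT-4o over retrieved
    transcript chunks). A trivial keyword heuristic keeps the sketch runnable."""
    themes = {
        "feedback": "feedback clarity",
        "customize": "instructor customization",
        "interface": "usability / interface design",
        "grading": "administrative burden",
    }
    return [Code(label, transcript_chunk)
            for kw, label in themes.items() if kw in transcript_chunk.lower()]

def human_review(codes: list[Code], decisions: dict[str, str]) -> list[Code]:
    """Researchers decide on each AI-proposed code; codes without an explicit
    decision remain 'proposed' rather than being silently accepted."""
    for code in codes:
        code.status = decisions.get(code.label, "proposed")
    return [c for c in codes if c.status != "rejected"]

chunk = "Instructors wanted to customize targets and asked for clearer feedback."
proposed = propose_codes(chunk)
kept = human_review(proposed, {"feedback clarity": "accepted",
                               "instructor customization": "revised"})
```

The design point the sketch tries to capture is that the AI output is never terminal: every code carries an explicit researcher decision, which preserves the interpretive authority and audit trail the study emphasizes.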

How to Cite

Human-AI Collaboration for Qualitative Analysis in Participatory Design: Refining the Writing Analytics Tool. (2026). Journal of Educational Data Mining, 18(1), 113-155. https://doi.org/10.5281/zenodo.18714068

Details

Keywords

writing analytics, participatory design, generative AI, qualitative data analysis, natural language processing, educational technology

Section
Special Section: Human-AI Partnership for Qualitative Analysis