Integrating Topic Modeling and LLM Prompt Engineering into a Human-driven Approach to Analyze Interview Transcripts

Published February 22, 2026
Teresa M. Ober, Karyssa A. Courey, Michael Flor

Abstract

Topic modeling has become a widely used unsupervised machine learning method for extracting latent themes from large textual datasets. However, the interpretability of these themes often relies heavily on human judgment, which can limit transparency and reproducibility. Recent advances in large language models (LLMs) and prompt engineering offer new opportunities to enhance the interpretability and scalability of topic modeling outputs. This study presents a hybrid, human-in-the-loop methodological framework that integrates topic modeling, LLM prompting, and human-derived codes to support rigorous qualitative analysis. We apply this framework to focus group interviews with 13 U.S. teachers discussing the conceptualization and assessment of communication and digital literacy skills within competency-based education (CBE) contexts. The multi-stage process includes semantic clustering, LLM-assisted topic labeling, and iterative codebook refinement, enabling both scale and interpretive depth. Our findings demonstrate that this approach supports construct alignment, thematic stability, and methodological transparency, while preserving the contextual richness of qualitative data. We also highlight the importance of human oversight in guiding LLM outputs and ensuring theoretical coherence. This work contributes to emerging best practices for integrating AI tools into qualitative educational research by offering a replicable approach for analyzing complex, open-ended data that maintains both scalability and interpretability. The framework demonstrates how computational tools can augment human interpretive expertise while maintaining the epistemological integrity essential to qualitative inquiry. Supplemental materials are available at: https://doi.org/10.17605/osf.io/4q6w8 

How to Cite

Ober, T. M., Courey, K. A., & Flor, M. (2026). Integrating Topic Modeling and LLM Prompt Engineering into a Human-driven Approach to Analyze Interview Transcripts. Journal of Educational Data Mining, 18(1), 156-179. https://doi.org/10.5281/zenodo.18733521

Details

Keywords

large language models, topic modeling, qualitative analysis, skills assessment, human-AI

Section
Special Section: Human-AI Partnership for Qualitative Analysis