Integrating Topic Modeling and LLM Prompt Engineering into a Human-driven Approach to Analyze Interview Transcripts
Abstract
Topic modeling has become a widely used unsupervised machine learning method for extracting latent themes from large textual datasets. However, the interpretability of these themes often relies heavily on human judgment, which can limit transparency and reproducibility. Recent advances in large language models (LLMs) and prompt engineering offer new opportunities to enhance the interpretability and scalability of topic modeling outputs. This study presents a hybrid, human-in-the-loop methodological framework that integrates topic modeling, LLM prompting, and human-derived codes to support rigorous qualitative analysis. We apply this framework to focus group interviews with 13 U.S. teachers discussing the conceptualization and assessment of communication and digital literacy skills within competency-based education (CBE) contexts. The multi-stage process includes semantic clustering, LLM-assisted topic labeling, and iterative codebook refinement, enabling both scale and interpretive depth. Our findings demonstrate that this approach supports construct alignment, thematic stability, and methodological transparency, while preserving the contextual richness of qualitative data. We also highlight the importance of human oversight in guiding LLM outputs and ensuring theoretical coherence. This work contributes to emerging best practices for integrating AI tools into qualitative educational research by offering a replicable approach for analyzing complex, open-ended data that maintains both scalability and interpretability. The framework demonstrates how computational tools can augment human interpretive expertise while maintaining the epistemological integrity essential to qualitative inquiry. Supplemental materials are available at: https://doi.org/10.17605/osf.io/4q6w8
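As a rough illustration of the first two stages named in the abstract (semantic clustering and LLM-assisted topic labeling), the sketch below groups excerpts by similarity and assembles a labeling prompt restricted to human-derived codes. It is not the authors' implementation: the word-count vectors stand in for real sentence embeddings, the greedy threshold clustering stands in for whichever clustering algorithm the study used, and the function names, the 0.3 threshold, the example excerpts, and the prompt wording are all assumptions made for illustration. No LLM is called here; the prompt would be sent to a model and its answer reviewed by a human coder.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Toy stand-in for a sentence embedding: lowercase word counts (assumption)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(excerpts, threshold=0.3):
    """Greedy single-pass clustering (hypothetical stand-in for the study's method).

    Each excerpt joins the first cluster whose seed excerpt is at least
    `threshold` similar; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of excerpt indices.
    """
    vectors = [vectorize(e) for e in excerpts]
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def build_label_prompt(cluster_excerpts, codebook):
    """Assemble a labeling prompt that limits the model to human-derived codes."""
    codes = "\n".join(f"- {code}" for code in codebook)
    quotes = "\n".join(f'{n}. "{text}"' for n, text in enumerate(cluster_excerpts, 1))
    return (
        "You are assisting a qualitative researcher.\n"
        "Choose the single best-fitting code for the excerpts below, "
        "or answer 'none' if no code fits.\n\n"
        f"Codes:\n{codes}\n\nExcerpts:\n{quotes}\n"
    )


# Invented example excerpts, not taken from the study's transcripts.
excerpts = [
    "Students present their ideas clearly in class discussions",
    "I assess how clearly students present ideas to peers",
    "Students evaluate online sources for credibility",
    "We teach students to check the credibility of online sources",
]
codebook = ["communication", "digital literacy"]

for members in cluster(excerpts):
    print(build_label_prompt([excerpts[i] for i in members], codebook))
```

Limiting the prompt to an existing codebook, with an explicit 'none' option, is one way to keep the model's labels tied to human-derived constructs; excerpts that come back as 'none' would be candidates for the iterative codebook refinement the abstract describes.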
Keywords: large language models, topic modeling, qualitative analysis, skills assessment, human-AI

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.