Understanding Student Language: An Unsupervised Dialogue Act Classification Approach



Published Feb 24, 2015
Aysu Ezen-Can, Kristy Elizabeth Boyer


Within the landscape of educational data, textual natural language is an increasingly vast source of learning-centered interactions. In natural language dialogue, student contributions hold important information about their knowledge and goals. Automatically modeling the dialogue acts of these student utterances is crucial for scaling natural language understanding of educational dialogues. Automatic dialogue act modeling has long been addressed with supervised classification techniques that require substantial manual time and effort. Recently, interest has emerged in unsupervised dialogue act classification, which addresses the challenges of manually labeling corpora. This paper builds on the growing body of work in unsupervised dialogue act classification and reports on the novel application of an information retrieval technique, the Markov Random Field, to this task. Evaluation against manually labeled dialogue acts on a tutorial dialogue corpus in the domain of introductory computer science demonstrates that the proposed technique outperforms existing approaches to education-centered unsupervised dialogue act classification. Unsupervised dialogue act classification techniques have broad application in educational data mining in areas such as collaborative learning, online message boards, classroom discourse, and intelligent tutoring systems.

How to Cite

Ezen-Can, A., & Boyer, K. E. (2015). Understanding Student Language: An Unsupervised Dialogue Act Classification Approach. JEDM | Journal of Educational Data Mining, 7(1), 51-78. Retrieved from https://jedm.educationaldatamining.org/index.php/JEDM/article/view/JEDM095

