Understanding Student Language: An Unsupervised Dialogue Act Classification Approach
Abstract
Within the landscape of educational data, textual natural language is an increasingly vast source of learning-centered interactions. In natural language dialogue, student contributions hold important information about knowledge and goals. Automatically modeling the dialogue acts of these student utterances is crucial for scaling natural language understanding of educational dialogues. Automatic dialogue act modeling has long been addressed with supervised classification techniques that require substantial manual time and effort. Recently, interest has emerged in unsupervised dialogue act classification, which addresses the challenges of manually labeling corpora. This paper builds on the growing body of work in unsupervised dialogue act classification and reports on the novel application of an information retrieval technique, the Markov Random Field, to the task of unsupervised dialogue act classification. Evaluation against manually labeled dialogue acts on a tutorial dialogue corpus in the domain of introductory computer science demonstrates that the proposed technique outperforms existing approaches to education-centered unsupervised dialogue act classification. Unsupervised dialogue act classification techniques have broad application in educational data mining in areas such as collaborative learning, online message boards, classroom discourse, and intelligent tutoring systems.
unsupervised dialogue act classification, Markov Random Field, natural language dialogue