The Pinyin Tutor has been used over the past few years at more than thirty institutions around the world to teach students to transcribe spoken Chinese phrases into Pinyin. The program has collected large amounts of data on the types of errors students make on this task. We analyze these data to discover what makes the task difficult and use our findings to iteratively improve the tutor. For instance, does a particular set of consonants, vowels, or tones cause the most difficulty? Or do certain challenges arise from the context in which these sounds are spoken? Since each Pinyin phrase can be broken down into a set of features (for example, consonants, vowel sounds, and tones), we apply machine learning techniques to uncover the most confounding aspects of the task. We then use what we have learned to construct and maintain an accurate representation of what the student knows, so that instruction can be tailored to the individual. Our goal is to allow learners to focus on the aspects of the task with which they are having the most difficulty, thereby accelerating their understanding of spoken Chinese beyond what would be possible without such focused "intelligent" instruction.
Keywords: Pinyin Tutor, least angle regression (LARS), LIBLINEAR-trained model, understanding of spoken Chinese, knowledge tracing, Hidden Markov Model