Published Apr 1, 2014
John Kowalski, Yanhui Zhang, Geoffrey J. Gordon


The Pinyin Tutor has been used for the past few years at over thirty institutions around the world to teach students to transcribe spoken Chinese phrases into Pinyin. The program has collected large amounts of data on the types of errors students make on this task. We analyze these data to discover what makes the task difficult and use our findings to iteratively improve the tutor. For instance, does a particular set of consonants, vowels, or tones cause the most difficulty? Or do certain challenges arise from the context in which these sounds are spoken? Since each Pinyin phrase can be broken down into a set of features (for example, consonants, vowel sounds, and tones), we apply machine learning techniques to uncover the most confounding aspects of the task. We then use what we learned to construct and maintain an accurate representation of what the student knows, so that instruction can be tailored to the individual. Our goal is to allow the learner to focus on the aspects of the task with which he or she is having the most difficulty, thereby accelerating his or her understanding of spoken Chinese beyond what would be possible without such focused "intelligent" instruction.

