Multi-Armed Bandits for Intelligent Tutoring Systems
Abstract
We present an approach to Intelligent Tutoring Systems that adaptively personalizes sequences of learning activities to maximize the skills students acquire, taking into account their limited time and motivational resources. At each point in time, the system proposes to the student the activity expected to yield the fastest progress. We introduce two algorithms that rely on the empirical estimation of learning progress: RiARiT, which uses information about the difficulty of each exercise, and ZPDES, which uses much less knowledge about the problem. The system combines three approaches. First, it leverages recent models of intrinsically motivated learning by transposing them to active teaching, relying on empirical estimation of the learning progress that specific activities provide to particular students. Second, it uses state-of-the-art Multi-Armed Bandit (MAB) techniques to efficiently manage the exploration/exploitation trade-off of this optimization process. Third, it leverages expert knowledge to constrain and bootstrap the initial exploration of the MAB, while requiring only coarse guidance from the expert and allowing the system to cope with didactic gaps in its knowledge. The system is evaluated in a scenario where 7- to 8-year-old schoolchildren learn how to decompose numbers while manipulating money. Systematic experiments with simulated students are presented, followed by the results of a user study across a population of 400 schoolchildren.
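The idea of selecting activities by empirically estimated learning progress can be illustrated with a minimal sketch. This is not the RiARiT or ZPDES algorithm from the paper. The class name `ProgressBandit`, the sliding-window progress estimate, and the parameters `window`, `beta`, `eta`, and `gamma` are all assumptions introduced here for illustration. Progress on an activity is approximated as the success rate in the recent half of a window minus the success rate in the older half, and activities are sampled in proportion to their weights, mixed with uniform exploration.

```python
import random
from collections import deque


class ProgressBandit:
    """Toy bandit favouring activities with high recent learning progress.

    Illustrative only: names and parameter values are assumptions,
    not the authors' implementation.
    """

    def __init__(self, n_activities, window=4, beta=0.6, eta=0.4,
                 gamma=0.2, seed=0):
        self.n = n_activities
        self.window = window          # number of past answers kept per activity
        self.beta = beta              # how much of the old weight is retained
        self.eta = eta                # how strongly new progress is weighted
        self.gamma = gamma            # share of uniform exploration
        self.weights = [1.0] * n_activities
        self.history = [deque(maxlen=window) for _ in range(n_activities)]
        self.rng = random.Random(seed)

    def probabilities(self):
        """Mix normalized weights with a uniform exploration term."""
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n
                for w in self.weights]

    def choose(self):
        """Sample the index of the next activity to propose."""
        return self.rng.choices(range(self.n), weights=self.probabilities())[0]

    def learning_progress(self, activity):
        """Recent success rate minus older success rate (0 until window is full)."""
        answers = list(self.history[activity])
        if len(answers) < self.window:
            return 0.0
        half = self.window // 2
        older, recent = answers[:half], answers[half:]
        return sum(recent) / len(recent) - sum(older) / len(older)

    def update(self, activity, correct):
        """Record an answer and move the weight toward the observed progress."""
        self.history[activity].append(1.0 if correct else 0.0)
        reward = max(0.0, self.learning_progress(activity))
        new_weight = self.beta * self.weights[activity] + self.eta * reward
        # Keep a small floor so that the weights never sum to zero.
        self.weights[activity] = max(new_weight, 1e-6)


if __name__ == "__main__":
    bandit = ProgressBandit(n_activities=3)
    activity = bandit.choose()
    bandit.update(activity, correct=True)
    print(bandit.probabilities())
```

In this sketch, an activity the student already masters (all answers correct) and one that is still too hard (all answers wrong) both show zero progress, so their weights decay and the sampler drifts toward activities where the success rate is currently rising. The uniform term `gamma` keeps every activity reachable so that progress estimates can be refreshed.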
intelligent tutoring systems, multi-armed bandits, personalization, intrinsic motivation, active teaching, active learning