We review the history and current trends in the field of Educational Data Mining (EDM). We compare the methodological profile of research in the early years of EDM with that of 2008 and 2009, and discuss trends and shifts in the research conducted by this community. In particular, we discuss the increased emphasis on prediction, the emergence of work using existing models to make scientific discoveries ("discovery with models"), and the reduced frequency of relationship mining within the EDM community. We discuss two ways that researchers have attempted to categorize the diversity of research in educational data mining, and review the types of research problems that these methods have been used to address. The most cited papers in EDM between 1995 and 2005 are listed, and their influence on the EDM community (and beyond it) is discussed.
educational data mining, visualization, prediction, clustering, relationship mining, discovery with models