AutoML Feature Engineering for Student Modeling Yields High Accuracy, but Limited Interpretability
Abstract
Automatic machine learning (AutoML) methods automate the time-consuming feature engineering process so that researchers can produce accurate student models more quickly and easily. In this paper, we compare two AutoML feature engineering methods in the context of the National Assessment of Educational Progress (NAEP) data mining competition. The methods we compare, Featuretools and TSFRESH (Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests), have rarely been applied to student interaction log data. We therefore address research questions regarding the accuracy of models built with AutoML features, how the AutoML feature types compare to each other and to expert-engineered features, and how interpretable the resulting features are. Additionally, we developed a novel feature selection method that addresses problems that arise when applying AutoML feature engineering in this context, where there were many heterogeneous features (over 4,000) and relatively few students. Our entry to the NAEP competition placed 3rd overall on the final held-out dataset and 1st on the public leaderboard, with a final Cohen's kappa = .212 and area under the receiver operating characteristic curve (AUC) = .665 when predicting whether students would manage their time effectively on a math assessment. We found that TSFRESH features were significantly more effective than either Featuretools features or expert-engineered features in this context; however, based on a survey of six experts' judgments, they were also among the most difficult features to interpret. Finally, we discuss the tradeoffs between effort and interpretability that arise in AutoML-based student modeling.
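To make the TSFRESH feature extraction step concrete, the sketch below shows one way time-series features could be extracted from long-format interaction logs and then filtered with the library's built-in hypothesis-test selection. It is a minimal illustration of TSFRESH's documented API under assumed inputs, not the pipeline used in this paper: the columns (student_id, time, response_time), the synthetic log data, and the binary labels are all hypothetical.

```python
# Minimal sketch: TSFRESH feature extraction on toy, long-format event logs.
# All column names and data here are hypothetical stand-ins for real log data.
import numpy as np
import pandas as pd
from tsfresh import extract_features, select_features
from tsfresh.feature_extraction import EfficientFCParameters
from tsfresh.utilities.dataframe_functions import impute

rng = np.random.default_rng(0)
n_students, n_events = 20, 30

# One row per logged action: which student, when, and how long they took.
log = pd.DataFrame({
    "student_id": np.repeat(np.arange(n_students), n_events),
    "time": np.tile(np.arange(n_events), n_students),
    "response_time": rng.exponential(scale=5.0, size=n_students * n_events),
})

# Extract one wide row of automatically generated time-series features per student.
features = extract_features(
    log,
    column_id="student_id",
    column_sort="time",
    default_fc_parameters=EfficientFCParameters(),
    n_jobs=0,  # disable multiprocessing for this small toy example
)
impute(features)  # replace NaN/inf from features that are undefined for some series

# Hypothetical binary outcome per student (e.g., effective time management).
labels = pd.Series(rng.integers(0, 2, size=n_students), index=features.index)

# TSFRESH's built-in filtering: per-feature hypothesis tests with
# Benjamini-Hochberg false discovery rate control.
selected = select_features(features, labels, fdr_level=0.05)
print(features.shape, selected.shape)
```

Note that select_features implements TSFRESH's standard Benjamini-Hochberg filtering; it stands in for, and differs from, the custom feature selection method developed in the paper.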
Keywords
AutoML, feature engineering, feature selection, student modeling
References
ALYUZ, N., OKUR, E., GENC, U., ASLAN, S., TANRIOVER, C., AND ESME, A.A. 2017. An unobtrusive and multimodal approach for behavioral engagement detection of students. In Proceedings of the 1st ACM SIGCHI International Workshop on Multimodal Interaction for Education. Association for Computing Machinery, New York, NY, USA, 26–32.
BAKER, B., GUPTA, O., NAIK, N., AND RASKAR, R. 2017. Designing neural network architectures using reinforcement learning. arXiv:1611.02167 [cs].
BENJAMINI, Y. AND HOCHBERG, Y. 1995. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological) 57, 1, 289–300.
BREIMAN, L. 2001. Random forests. Machine Learning 45, 1, 5–32.
BREIMAN, L., FRIEDMAN, J., STONE, C.J., AND OLSHEN, R.A. 1984. Classification and regression trees. CRC Press.
CHEN, F. AND CUI, Y. 2020. Utilizing student time series behaviour in learning management systems for early prediction of course performance. Journal of Learning Analytics 7, 2, 1–17.
CHEN, T. AND GUESTRIN, C. 2016. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York, NY, 785–794.
CHRIST, M., BRAUN, N., NEUFFER, J., AND KEMPA-LIEHR, A.W. 2018. Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh – A Python package). Neurocomputing 307, 72–77.
COHEN, J. 1988. Statistical Power Analysis for the Behavioral Sciences. Lawrence Erlbaum, Hillsdale, NJ.
DANG, S.C. AND KOEDINGER, K.R. 2020. Opportunities for human-AI collaborative tools to advance development of motivation analytics. In Companion Proceedings of the 10th International Conference on Learning Analytics & Knowledge (LAK20). SoLAR, 322–329.
EYBEN, F., WÖLLMER, M., AND SCHULLER, B. 2010. openSMILE: The Munich versatile and fast open-source audio feature extractor. In Proceedings of the 18th ACM International Conference on Multimedia. ACM, New York, NY, USA, 1459–1462.
FEI, M. AND YEUNG, D.-Y. 2015. Temporal models for predicting student dropout in massive open online courses. In 2015 IEEE International Conference on Data Mining Workshop (ICDMW). IEEE, 256–263.
FEURER, M., EGGENSPERGER, K., FALKNER, S., LINDAUER, M., AND HUTTER, F. 2020. Auto-sklearn 2.0: The next generation. arXiv:2007.04074 [cs, stat].
FISCHER, C., PARDOS, Z.A., BAKER, R.S., ET AL. 2020. Mining big data in education: Affordances and challenges. Review of Research in Education 44, 1, 130–160.
FULCHER, B.D. AND JONES, N.S. 2017. hctsa: A computational framework for automated time-series phenotyping using massive feature extraction. Cell Systems 5, 5, 527-531.e3.
GERVET, T., KOEDINGER, K., SCHNEIDER, J., AND MITCHELL, T. 2020. When is deep learning the best approach to knowledge tracing? Journal of Educational Data Mining 12, 3, 31–54.
GEURTS, P., ERNST, D., AND WEHENKEL, L. 2006. Extremely randomized trees. Machine Learning 63, 1, 3–42.
GOSWAMI, M., MANUJA, M., AND LEEKHA, M. 2020. Towards social & engaging peer learning: Predicting backchanneling and disengagement in children. arXiv:2007.11346 [cs].
HEAD, T., MECHCODER, LOUPPE, G., ET AL. 2018. scikit-optimize/scikit-optimize: v0.5.2.
HOLLANDS, F. AND BAKIR, I. 2015. Efficiency of automated detectors of learner engagement and affect compared with traditional observation methods. New York, NY: Center for Benefit-Cost Studies of Education, Teachers College, Columbia University.
HORN, F., PACK, R., AND RIEGER, M. 2020. The autofeat Python library for automated feature engineering and selection. In Machine Learning and Knowledge Discovery in Databases, P. Cellier and K. Driessens, Eds. Springer International Publishing, Cham, CH, 111–120.
HUR, P., BOSCH, N., PAQUETTE, L., AND MERCIER, E. 2020. Harbingers of collaboration? The role of early-class behaviors in predicting collaborative problem solving. In Proceedings of the 13th International Conference on Educational Data Mining (EDM 2020). International Educational Data Mining Society, 104–114.
HUTTER, F., KOTTHOFF, L., AND VANSCHOREN, J. 2019. Automated Machine Learning: Methods, Systems, Challenges. Springer Nature, Cham, CH.
JIANG, Y., BOSCH, N., BAKER, R.S., ET AL. 2018. Expert feature-engineering vs. deep neural networks: Which is better for sensor-free affect detection? In Proceedings of the 19th International Conference on Artificial Intelligence in Education (AIED 2018), C.P. Rosé, R. Martínez-Maldonado, H.U. Hoppe, et al., Eds. Springer, Cham, CH, 198–211.
KANTER, J.M. AND VEERAMACHANENI, K. 2015. Deep feature synthesis: Towards automating data science endeavors. In 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA). IEEE, 1–10.
KARUMBAIAH, S., OCUMPAUGH, J., LABRUM, M., AND BAKER, R.S. 2019. Temporally rich features capture variable performance associated with elementary students’ lower math self-concept. In Companion Proceedings of the 9th International Learning Analytics and Knowledge Conference (LAK’19). Society for Learning Analytics Research (SoLAR), Tempe, AZ, USA, 384–388.
KAY, J. 2000. Stereotypes, student models and scrutability. In Proceedings of the 5th International Conference on Intelligent Tutoring Systems, G. Gauthier, C. Frasson and K. VanLehn, Eds. Springer, Berlin, Heidelberg, 19–30.
KHAJAH, M., LINDSEY, R.V., AND MOZER, M.C. 2016. How deep is knowledge tracing? In Proceedings of the 9th International Conference on Educational Data Mining (EDM 2016), T. Barnes, M. Chi and M. Feng, Eds. International Educational Data Mining Society, 94–101.
KONONENKO, I. 1994. Estimating attributes: Analysis and extensions of RELIEF. In European Conference on Machine Learning (ECML 94), F. Bergadano and L.D. Raedt, Eds. Springer, Berlin, Heidelberg, 171–182.
KUHN, M. 2008. Building predictive models in R using the caret package. Journal of Statistical Software 28, 5, 1–26.
LANG, M., BINDER, M., RICHTER, J., ET AL. 2019. mlr3: A modern object-oriented machine learning framework in R. Journal of Open Source Software 4, 44, 1903.
LE, T.T., FU, W., AND MOORE, J.H. 2020. Scaling tree-based automated machine learning to biomedical big data with a feature set selector. Bioinformatics 36, 1, 250–256.
LECUN, Y., BENGIO, Y., AND HINTON, G. 2015. Deep learning. Nature 521, 7553, 436–444.
LUNDBERG, S.M. AND LEE, S.-I. 2017. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 30, I. Guyon, U.V. Luxburg, S. Bengio, et al., Eds. Curran Associates, Inc., 4765–4774.
MOHAMAD, N., AHMAD, N.B., JAWAWI, D.N.A., AND HASHIM, S.Z.M. 2020. Feature engineering for predicting MOOC performance. IOP Conference Series: Materials Science and Engineering 884, 012070.
OLSON, R.S., URBANOWICZ, R.J., ANDREWS, P.C., LAVENDER, N.A., KIDD, L.C., AND MOORE, J.H. 2016. Automating biomedical data science through tree-based pipeline optimization. In Applications of Evolutionary Computation, G. Squillero and P. Burelli, Eds. Springer International Publishing, Cham, CH, 123–137.
PAQUETTE, L., BAKER, R.S., DECARVALHO, A., AND OCUMPAUGH, J. 2015. Cross-system transfer of machine learned and knowledge engineered models of gaming the system. In Proceedings of the 23rd International Conference on User Modeling, Adaptation and Personalization (UMAP 2015), F. Ricci, K. Bontcheva, O. Conlan and S. Lawless, Eds. Springer International Publishing, Cham, CH, 183–194.
PAQUETTE, L., DECARVALHO, A.M.J.A., BAKER, R.S., AND OCUMPAUGH, J. 2014. Reengineering the feature distillation process: A case study in detection of gaming the system. In Proceedings of the 7th International Conference on Educational Data Mining (EDM 2014). International Educational Data Mining Society, 284–287.
PARDOS, Z.A., FAN, Z., AND JIANG, W. 2019. Connectionist recommendation in the wild: On the utility and scrutability of neural networks for personalized course guidance. User Modeling and User-Adapted Interaction 29, 2, 487–525.
PARDOS, Z.A., TANG, S., DAVIS, D., AND LE, C.V. 2017. Enabling real-time adaptivity in MOOCs with a personalized next-step recommendation framework. In Proceedings of the Fourth (2017) ACM Conference on Learning @ Scale. Association for Computing Machinery, New York, NY, 23–32.
PEDREGOSA, F., VAROQUAUX, G., GRAMFORT, A., ET AL. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, 2825–2830.
PIECH, C., BASSEN, J., HUANG, J., ET AL. 2015. Deep knowledge tracing. In Advances in Neural Information Processing Systems 28 (NIPS 2015), C. Cortes, N.D. Lawrence, D.D. Lee, M. Sugiyama and R. Garnett, Eds. Curran Associates, Inc., 505–513.
RICKER, N. 1953. The form and laws of propagation of seismic wavelets. Geophysics 18, 1, 10–40.
ROSÉ, C.P., MCLAUGHLIN, E.A., LIU, R., AND KOEDINGER, K.R. 2019. Explanatory learner models: Why machine learning (alone) is not the answer. British Journal of Educational Technology 50, 6, 2943–2958.
SANYAL, D., BOSCH, N., AND PAQUETTE, L. 2020. Feature selection metrics: Similarities, differences, and characteristics of the selected models. In Proceedings of the 13th International Conference on Educational Data Mining (EDM 2020). International Educational Data Mining Society, 212–223.
SEGEDY, J.R., KINNEBREW, J.S., AND BISWAS, G. 2015. Using coherence analysis to characterize self-regulated learning behaviours in open-ended learning environments. Journal of Learning Analytics 2, 1, 13–48.
SEN, A., PATEL, P., RAU, M.A., ET AL. 2018. Machine beats human at sequencing visuals for perceptual-fluency practice. In Proceedings of the 11th International Conference on Educational Data Mining (EDM 2018), K.E. Boyer and M. Yudelson, Eds. International Educational Data Mining Society.
SHAHROKHIAN GHAHFAROKHI, B., SIVARAMAN, A., AND VANLEHN, K. 2020. Toward an automatic speech classifier for the teacher. In Proceedings of the 21st International Conference on Artificial Intelligence in Education (AIED 2020), I.I. Bittencourt, M. Cukurova, K. Muldner, R. Luckin and E. Millán, Eds. Springer International Publishing, Cham, CH, 279–284.
SIMARD, P.Y., AMERSHI, S., CHICKERING, D.M., ET AL. 2017. Machine teaching: A new paradigm for building machine learning systems. arXiv:1707.06742 [cs, stat].
STANDEN, P.J., BROWN, D.J., TAHERI, M., ET AL. 2020. An evaluation of an adaptive learning system based on multimodal affect recognition for learners with intellectual disabilities. British Journal of Educational Technology 51, 5, 1748–1765.
THORNTON, C., HUTTER, F., HOOS, H.H., AND LEYTON-BROWN, K. 2013. Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge discovery and data mining. Association for Computing Machinery, New York, NY, USA, 847–855.
TSIAKMAKI, M., KOSTOPOULOS, G., KOTSIANTIS, S., AND RAGOS, O. 2020. Implementing AutoML in educational data mining for prediction tasks. Applied Sciences 10, 1, 90.
VISWANATHAN, S.A. AND VANLEHN, K. 2019. Collaboration detection that preserves privacy of students’ speech. In Proceedings of the 20th International Conference on Artificial Intelligence in Education (AIED 2019), S. Isotani, E. Millán, A. Ogan, P. Hastings, B. McLaren and R. Luckin, Eds. Springer International Publishing, Cham, CH, 507–517.
XIONG, X., ZHAO, S., VANINWEGEN, E.G., AND BECK, J.E. 2016. Going deeper with deep knowledge tracing. In Proceedings of the 9th International Conference on Educational Data Mining (EDM 2016). International Educational Data Mining Society, 545–550.
ZEHNER, F., HARRISON, S., EICHMANN, B., ET AL. 2020. The NAEP EDM competition: On the value of theory-driven psychometrics and machine learning for predictions based on log data. In Proceedings of the 13th International Conference on Educational Data Mining (EDM 2020), A.N. Rafferty, J. Whitehill, V. Cavalli-Sforza and C. Romero, Eds. International Educational Data Mining Society, 302–312.
ZOPH, B. AND LE, Q.V. 2017. Neural architecture search with reinforcement learning. arXiv:1611.01578 [cs].
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.