Structural Neural Networks Meet Piecewise Exponential Models for Interpretable College Dropout Prediction


Published Jun 27, 2024
Chuan Cai, Adam Fleischhacker

Abstract

We propose a novel approach to address the issue of college student attrition by developing a hybrid model
that combines a structural neural network with a piecewise exponential model. This hybrid model not only
shows the potential to robustly identify students who are at high risk of dropout, but also provides insights
into which factors are most influential in dropout prediction. To evaluate its effectiveness, we compared the
predictive performance of our hybrid model with two other survival analysis models: the piecewise
exponential model and a hybrid model combining a fully-connected neural network with a piecewise
exponential model. Additionally, we compared it to five other cross-sectional models: Ridge Logistic
Regression, Lasso Logistic Regression, CART decision tree, Random Forest, and XGBoost decision tree. Our
findings demonstrate that the hybrid model outperforms or performs comparably to the other models when
predicting dropout among students at the University of Delaware in Spring 2020, Spring 2021, and Spring
2022. Moreover, by categorizing predictors into three distinct groups (academic, economic, and social-demographic),
we found that academic predictors play a prominent role in distinguishing students who drop out
from those who are retained, while other predictors may indirectly influence predictions by impacting
academic variables. Consequently, we recommend implementing an intervention program aimed at
identifying at-risk students based on their academic performance and activities, with additional consideration
for economic and social-demographic factors in customized intervention plans.
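
To make the piecewise exponential component concrete, the following is a minimal sketch of how a survival (retention) probability can be computed when the hazard is constant within each time interval. It is an illustration under stated assumptions, not the authors' implementation: the function name, the `risk_score` parameter, and the example numbers are all hypothetical. In a hybrid model, a neural network could supply `risk_score` as a log-hazard shift; the abstract does not state how the authors connect the two parts.

```python
import math


def survival_probability(hazards, cut_points, t, risk_score=0.0):
    """Survival probability S(t) under a piecewise constant hazard.

    hazards[j] is the baseline hazard on the interval that starts at
    cut_points[j] and ends at cut_points[j + 1] (the last interval is
    open-ended). risk_score is a hypothetical log-hazard shift, e.g. the
    output of a neural network; 0.0 leaves the baseline unchanged.
    """
    cumulative_hazard = 0.0
    for j, baseline in enumerate(hazards):
        start = cut_points[j]
        end = cut_points[j + 1] if j + 1 < len(cut_points) else math.inf
        if t <= start:
            break  # t falls before this interval, so nothing more accrues
        # Time actually spent in interval j before reaching t
        exposure = min(t, end) - start
        cumulative_hazard += baseline * exposure
    # Proportional hazards: the covariate effect scales the whole hazard
    return math.exp(-cumulative_hazard * math.exp(risk_score))


# Hypothetical example: two intervals (e.g. semesters) with made-up hazards
baseline_hazards = [0.1, 0.2]
interval_starts = [0.0, 1.0]
print(survival_probability(baseline_hazards, interval_starts, 2.0))
```

Here a student's estimated probability of still being enrolled at time `t` decreases as either the baseline hazards or the risk score increase, which is the quantity a dropout early-warning system would rank students by.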

How to Cite

Cai, C., & Fleischhacker, A. (2024). Structural Neural Networks Meet Piecewise Exponential Models for Interpretable College Dropout Prediction. Journal of Educational Data Mining, 16(1), 279–302. https://doi.org/10.5281/zenodo.11236277


Keywords

college dropout, structural neural network, piecewise exponential model, interpretable hybrid model
