Structural Neural Networks Meet Piecewise Exponential Models for Interpretable College Dropout Prediction
Abstract
We propose a novel approach to address the issue of college student attrition by developing a hybrid model
that combines a structural neural network with a piecewise exponential model. This hybrid model not only
shows the potential to robustly identify students who are at high risk of dropout, but also provides insights
into which factors are most influential in dropout prediction. To evaluate its effectiveness, we compared the
predictive performance of our hybrid model with two other survival analysis models: the piecewise
exponential model and a hybrid model combining a fully-connected neural network with a piecewise
exponential model. Additionally, we compared it to five other cross-sectional models: Ridge Logistic
Regression, Lasso Logistic Regression, CART decision tree, Random Forest, and XGBoost decision tree. Our
findings demonstrate that the hybrid model outperforms or performs comparably to the other models when
predicting dropout among students at the University of Delaware in Spring 2020, Spring 2021, and Spring
2022. Moreover, by categorizing predictors into three distinct groups (academic, economic, and social-demographic),
we found that academic predictors play a prominent role in distinguishing between
dropout and retained students, while other predictors may indirectly influence predictions by impacting
academic variables. Consequently, we recommend implementing an intervention program aimed at
identifying at-risk students based on their academic performance and activities, with additional consideration
for economic and social-demographic factors in customized intervention plans.
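
As a rough illustration of the piecewise exponential component named in the abstract, the sketch below assumes a hazard that is constant within each interval (for example, one semester) and computes the resulting retention probabilities. The function name, interval widths, and hazard values are hypothetical and are not taken from the paper. In a hybrid model, the per-interval hazards would presumably be produced by a covariate-dependent function such as a neural network output rather than fixed constants.

```python
import math


def survival_curve(hazards, widths):
    """Survival probability at the end of each interval.

    Under a piecewise exponential model the hazard is constant within each
    interval, so S(t) = exp(-sum_j hazard_j * time_spent_in_interval_j).
    """
    if len(hazards) != len(widths):
        raise ValueError("hazards and widths must have the same length")
    cumulative_hazard = 0.0
    curve = []
    for hazard, width in zip(hazards, widths):
        cumulative_hazard += hazard * width
        curve.append(math.exp(-cumulative_hazard))
    return curve


# Hypothetical per-semester dropout hazards for one student (illustrative only).
hazards = [0.05, 0.03, 0.02]
widths = [1.0, 1.0, 1.0]  # each interval is one semester long

curve = survival_curve(hazards, widths)
# Cumulative dropout probability by the end of each semester.
dropout_risk = [1.0 - s for s in curve]
```

Here `curve[k]` is the probability that the student is still enrolled at the end of interval `k`, and `dropout_risk[k]` is its complement.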
college dropout, structural neural network, piecewise exponential model, interpretable hybrid model
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.