A Comprehensive Study on Evaluating and Mitigating Algorithmic Unfairness with the MADD Metric


Published Jun 27, 2024
Mélina Verger, Chunyang Fan, Sébastien Lallé, François Bouchet, Vanda Luengo

Abstract

Predictive student models are increasingly used in learning environments due to their ability to enhance educational outcomes and support stakeholders in making informed decisions. However, predictive models can be biased and produce unfair outcomes, leading to potential discrimination against certain individuals and harmful long-term implications. This has prompted research on fairness metrics meant to capture and quantify such biases. Nonetheless, current metrics primarily compare predictive performance between groups, without considering the behavior of the models or the severity of the biases in the outcomes. To address this gap, in previous work (Verger et al., 2023) we proposed a novel metric named Model Absolute Density Distance (MADD), which measures algorithmic unfairness as the distance between the probability distributions of the model's outcomes for different groups. In this paper, we extend that work with two major additions. First, we provide theoretical and practical considerations on a hyperparameter of MADD, named bandwidth, that is key to measuring fairness accurately with this metric. Second, we demonstrate how MADD can be used not only to measure unfairness but also to mitigate it, through post-processing of the model's outcomes while preserving its accuracy. We experimented with our approach on the same task as in our previous work, predicting student success in online courses, and obtained successful results. To facilitate replication and future uses of MADD in different contexts, we developed an open-source Python package called maddlib (https://pypi.org/project/maddlib/). Altogether, our work contributes to advancing research on fair student models in education.
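The core idea described above, measuring unfairness as the distance between the outcome distributions of two demographic groups, can be sketched in a few lines. This is a minimal illustration only, not the paper's exact formulation or the maddlib API: the function name `madd` and the binning scheme are illustrative, with the bin width standing in for the bandwidth hyperparameter, and the metric taken as the L1 distance between the groups' normalized density vectors (0 for identical distributions, up to 2 for fully disjoint ones).

```python
import numpy as np

def madd(probs_g0, probs_g1, bandwidth=0.05):
    """Illustrative density-distance unfairness score.

    Bins each group's predicted probabilities (assumed non-empty,
    in [0, 1]) into bins of width `bandwidth`, normalizes the
    histograms into density vectors, and returns their L1 distance.
    """
    bins = np.arange(0.0, 1.0 + bandwidth, bandwidth)
    d0, _ = np.histogram(probs_g0, bins=bins)
    d1, _ = np.histogram(probs_g1, bins=bins)
    d0 = d0 / d0.sum()  # normalized density vector, group 0
    d1 = d1 / d1.sum()  # normalized density vector, group 1
    return float(np.abs(d0 - d1).sum())  # 0 = identical, 2 = disjoint
```

For example, two groups whose predicted probabilities occupy disjoint ranges score the maximum of 2, while a group compared with itself scores 0. The choice of bin width matters, which is why the bandwidth hyperparameter studied in the paper is important in practice.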

How to Cite

Verger, M., Fan, C., Lallé, S., Bouchet, F., & Luengo, V. (2024). A Comprehensive Study on Evaluating and Mitigating Algorithmic Unfairness with the MADD Metric. Journal of Educational Data Mining, 16(1), 365–409. https://doi.org/10.5281/zenodo.12180668


Keywords

fairness metric, unfairness mitigation, classification, student modeling, models' behaviors, sensitive features

References
ANDERSON, H., BOODHWANI, A., AND BAKER, R. 2019. Assessing the Fairness of Graduation Predictions. In Proceedings of The 12th International Conference on Educational Data Mining (EDM), C. F. Lynch, A. Merceron, M. Desmarais, and R. Nkambou, Eds. International Educational Data Mining Society, Montreal, Canada, 488–491.

BAKER, R. S., ESBENSHADE, L., VITALE, J., AND KARUMBAIAH, S. 2023. Using Demographic Data as Predictor Variables: a Questionable Choice. Journal of Educational Data Mining (JEDM) 15, 2 (Jun.), 22–52.

BAKER, R. S. AND HAWN, A. 2021. Algorithmic Bias in Education. International Journal of Artificial Intelligence in Education (IJAIED) 32, 4 (Nov.), 1052–1092.

BAROCAS, S., HARDT, M., AND NARAYANAN, A. 2019. Fairness and Machine Learning: Limitations and Opportunities. MIT Press, Cambridge, MA. http://www.fairmlbook.org.

BOLUKBASI, T., CHANG, K.-W., ZOU, J., SALIGRAMA, V., AND KALAI, A. 2016. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. In Proceedings of the 30th Conference on Neural Information Processing Systems (NIPS 2016), D. D. Lee, M. Sugiyama, U. von Luxburg, I. Guyon, and R. Garnett, Eds. Curran Associates Inc., Barcelona, Spain, 4356–4364.

BUOLAMWINI, J. AND GEBRU, T. 2018. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency (ACM FAccT). Proceedings of Machine Learning Research, vol. 81. PMLR, New York, NY, USA, 77–91.

CALVI, A. AND KOTZINOS, D. 2023. Enhancing AI fairness through impact assessment in the European Union: a legal and computer science perspective. In Proceedings of the 6th Conference on Fairness, Accountability, and Transparency (ACM FAccT). Association for Computing Machinery, Chicago, IL, USA, 1229–1245.

CASTELNOVO, A., CRUPI, R., GRECO, G., REGOLI, D., PENCO, I. G., AND COSENTINI, A. C. 2022. A clarification of the nuances in the fairness metrics landscape. Scientific Reports 12, 1–21. Nature Publishing Group.

CATON, S. AND HAAS, C. 2024. Fairness in Machine Learning: A Survey. ACM Computing Surveys 56, 7 (Apr.), 1–38.

CHA, S.-H. AND SRIHARI, S. N. 2002. On measuring the distance between histograms. Pattern Recognition 35, 1355–1370.

CHRISTIE, S. T., JARRATT, D. C., OLSON, L. A., AND TAIJALA, T. T. 2019. Machine-Learned School Dropout Early Warning at Scale. In Proceedings of The 12th International Conference on Educational Data Mining (EDM), C. F. Lynch, A. Merceron, M. Desmarais, and R. Nkambou, Eds. International Educational Data Mining Society, Montreal, Canada, 726–731.

D’ALESSANDRO, B., O’NEIL, C., AND LAGATTA, T. 2017. Conscientious Classification: A Data Scientist’s Guide to Discrimination-Aware Classification. Big Data 5, 2 (Jun.), 120–134.

DASTIN, J. 2018. Amazon scraps secret AI recruiting tool that showed bias against women. Reuters. https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G.

DEHO, O. B., ZHAN, C., LI, J., LIU, J., LIU, L., AND DUY LE, T. 2022. How do the existing fairness metrics and unfairness mitigation algorithms contribute to ethical learning analytics? British Journal of Educational Technology 53, 4, 822–843.

DEVROYE, L. 1986. Non-Uniform Random Variate Generation. Springer, New York, NY.

DEVROYE, L. AND GYORFI, L. 1985. Nonparametric Density Estimation: The L1 View. In Wiley Series in Probability and Statistics. John Wiley and Sons, New York. https://www.szit.bme.hu/~gyorfi/L1bookBW.pdf.

GARDNER, J., BROOKS, C., AND BAKER, R. 2019. Evaluating the Fairness of Predictive Student Models Through Slicing Analysis. In Proceedings of the 9th International Conference on Learning Analytics & Knowledge. ACM, Tempe, AZ, USA, 225–234.

HOLSTEIN, K. AND DOROUDI, S. 2021. Equity and Artificial Intelligence in Education: Will “AIEd” Amplify or Alleviate Inequities in Education? CoRR abs/2104.12920.

HU, Q. AND RANGWALA, H. 2020. Towards Fair Educational Data Mining: A Case Study on Detecting At-risk Students. In Proceedings of the 13th International Conference on Educational Data Mining (EDM), A. N. Rafferty, J. Whitehill, V. Cavalli-Sforza, and C. Romero, Eds. International Educational Data Mining Society, Fully virtual, 431–437.

HUTCHINSON, B. AND MITCHELL, M. 2019. 50 Years of Test (Un)fairness: Lessons for Machine Learning. In Proceedings of the 2nd Conference on Fairness, Accountability, and Transparency (ACM FAccT). Association for Computing Machinery, Atlanta GA USA, 49–58.

HUTT, S., GARDNER, M., DUCKWORTH, A. L., AND D’MELLO, S. K. 2019. Evaluating Fairness and Generalizability in Models Predicting On-Time Graduation from College Applications. In Proceedings of the 12th International Conference on Educational Data Mining (EDM), C. F. Lynch, A. Merceron, M. Desmarais, and R. Nkambou, Eds. International Educational Data Mining Society, Montreal, Canada, 79–88.

JIANG, W. AND PARDOS, Z. A. 2021. Towards Equity and Algorithmic Fairness in Student Grade Prediction. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, I. Roll, D. McNamara, S. Sosnovsky, R. Luckin, and V. Dimitrova, Eds. Association for Computing Machinery, New York, NY, USA, 608–617.

KAI, S., ANDRES, J. M. L., PAQUETTE, L., BAKER, R. S., MOLNAR, K., WATKINS, H., AND MOORE, M. 2017. Predicting Student Retention from Behavior in an Online Orientation Course. In Proceedings of the 10th International Conference on Educational Data Mining (EDM), X. Hu, T. Barnes, A. Hershkovitz, and L. Paquette, Eds. International Educational Data Mining Society, Wuhan, Hubei, China.

KIZILCEC, R. F. AND LEE, H. 2022. Algorithmic Fairness in Education. In Ethics in Artificial Intelligence in Education, W. Holmes and K. Porayska-Pomsta, Eds. Taylor & Francis, New York.

KUZILEK, J., HLOSTA, M., AND ZDRAHAL, Z. 2017. Open University Learning Analytics dataset. Scientific data 4, 1, 1–8.

LALLÉ, S., BOUCHET, F., VERGER, M., AND LUENGO, V. 2024. Fairness of MOOC Completion Predictions Across Demographics and Contextual Variables. In Proceedings of the 25th International Conference on Artificial Intelligence in Education (AIED). Springer, Recife, Brazil. In press.

LARSON, J., MATTU, S., KIRCHNER, L., AND ANGWIN, J. 2016. How We Analyzed the COMPAS Recidivism Algorithm. ProPublica. https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm.

LE QUY, T., ROY, A., IOSIFIDIS, V., ZHANG, W., AND NTOUTSI, E. 2022. A survey on datasets for fairness-aware machine learning. WIREs Data Mining and Knowledge Discovery 12, 3, e1452.

LEE, H. AND KIZILCEC, R. F. 2020. Evaluation of Fairness Trade-offs in Predicting Student Success. FATED (Fairness, Accountability, and Transparency in Educational Data) Workshop at EDM 2020. https://doi.org/10.48550/arXiv.2007.00088.

LI, C., XING, W., AND LEITE, W. 2021. Yet another predictive model? Fair predictions of students’ learning outcomes in an online math learning platform. In Proceedings of the 11th International Learning Analytics and Knowledge Conference. Association for Computing Machinery, Irvine, CA, USA, 572–578.

LOPEZ, P. 2021. Bias does not equal bias: a socio-technical typology of bias in data-based algorithmic systems. Internet Policy Review 10, 4 (Dec.), 1–29.

MAKHLOUF, K., ZHIOUA, S., AND PALAMIDESSI, C. 2021. Machine learning fairness notions: Bridging the gap with real-world applications. Information Processing & Management 58, 5 (Sep.), 102642.

MEHRABI, N., MORSTATTER, F., SAXENA, N., LERMAN, K., AND GALSTYAN, A. 2022. A Survey on Bias and Fairness in Machine Learning. ACM Computing Surveys 54, 6, 1–35.

PESSACH, D. AND SHMUELI, E. 2023. Algorithmic Fairness. In Machine Learning for Data Science Handbook: Data Mining and Knowledge Discovery Handbook, L. Rokach, O. Maimon, and E. Shmueli, Eds. Springer, Cham, Switzerland, 867–886.

ROMERO, C. AND VENTURA, S. 2020. Educational data mining and learning analytics: An updated survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 10, 3, e1355.

SELBST, A. D., BOYD, D., FRIEDLER, S. A., VENKATASUBRAMANIAN, S., AND VERTESI, J. 2019. Fairness and Abstraction in Sociotechnical Systems. In Proceedings of the 2nd Conference on Fairness, Accountability, and Transparency (ACM FAccT). Association for Computing Machinery, Atlanta GA USA, 59–68.

SHA, L., RAKOVIC, M., WHITELOCK-WAINWRIGHT, A., CARROLL, D., YEW, V. M., GASEVIC, D., AND CHEN, G. 2021. Assessing algorithmic fairness in automatic classifiers of educational forum posts. In Proceedings of the 22nd International Conference on Artificial Intelligence in Education (AIED), I. Roll, D. McNamara, S. Sosnovsky, R. Luckin, and V. Dimitrova, Eds. Springer, Utrecht, The Netherlands, 381–394.

SOVRANO, F., SAPIENZA, S., PALMIRANI, M., AND VITALI, F. 2022. A Survey on Methods and Metrics for the Assessment of Explainability under the Proposed AI Act. In Legal Knowledge and Information Systems: JURIX 2021: The Thirty-fourth Annual Conference, E. Schweighofer, Ed. IOS Press, Vilnius, Lithuania, 235–242.

SURESH, H. AND GUTTAG, J. V. 2021. A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle. In Proceedings of the 1st ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization. Association for Computing Machinery, New York, NY, USA, 1–9.

VASQUEZ VERDUGO, J., GITIAUX, X., ORTEGA, C., AND RANGWALA, H. 2022. FairEd: A Systematic Fairness Analysis Approach Applied in a Higher Educational Context. In LAK22: 12th International Learning Analytics and Knowledge Conference. Association for Computing Machinery, Online USA, 271–281.

VERGER, M., LALLÉ, S., BOUCHET, F., AND LUENGO, V. 2023. Is Your Model "MADD"? A Novel Metric to Evaluate Algorithmic Fairness for Predictive Student Models. In Proceedings of the 16th International Conference on Educational Data Mining, M. Feng, T. Käser, and P. Talukdar, Eds. International Educational Data Mining Society, Bengaluru, India, 91–102.

VERMA, S. AND RUBIN, J. S. 2018. Fairness Definitions Explained. In Proceedings of the 2018 IEEE/ACM International Workshop on Software Fairness (FairWare). Association for Computing Machinery, Gothenburg Sweden, 1–7.

YU, R., LEE, H., AND KIZILCEC, R. F. 2021. Should college dropout prediction models include protected attributes? In Proceedings of the Eighth ACM Conference on Learning @ Scale. L@S '21. Association for Computing Machinery, New York, NY, USA, 91–100.

YU, R., LI, Q., FISCHER, C., DOROUDI, S., AND XU, D. 2020. Towards accurate and fair prediction of college success: Evaluating different sources of student data. In Proceedings of The 13th International Conference on Educational Data Mining (EDM), A. N. Rafferty, J. Whitehill, V. Cavalli-Sforza, and C. Romero, Eds. International Educational Data Mining Society, Fully virtual, 292–301.

ŠVÁBENSKÝ, V., VERGER, M., RODRIGO, M. M. T., MONTEROZO, C. J. G., BAKER, R. S., SAAVEDRA, M. Z. N. L., LALLÉ, S., AND SHIMADA, A. 2024. Evaluating Algorithmic Bias in Models for Predicting Academic Performance of Filipino Students. In Proceedings of The 17th International Conference on Educational Data Mining (EDM). International Educational Data Mining Society, Atlanta, Georgia, USA. In press.
Section
Extended Articles from the EDM 2023 Conference