Who's Learning? Using Demographics in EDM Research
Abstract
The growing use of machine learning for the data-driven study of social issues and the implementation of data-driven decision processes has required researchers to re-examine the often implicit assumption that data-driven models are neutral and free of biases. The careful examination of machine-learned models has identified examples of how existing biases can inadvertently be perpetuated in fields such as criminal justice, where failing to account for racial prejudices in the prediction of recidivism can perpetuate or exacerbate them, and natural language processing, where algorithms trained on human language corpora have been shown to reproduce strong biases in gendered descriptions. These examples highlight the importance of thinking about how biases might impact the study of educational data and how data-driven models used in educational contexts may perpetuate inequalities. To address this question, we ask whether and how demographic information, including age, educational level, gender, race/ethnicity, socioeconomic status (SES), and geographical location, is used in Educational Data Mining (EDM) research. Specifically, we conduct a systematic survey of the last five years of EDM publications that investigates whether and how demographic information about the students is reported in EDM research and how this information is used to 1) investigate issues related to demographics, 2) provide input features for data-driven analyses, or 3) test and validate models. This survey shows that, although a majority of publications reported at least one category of demographic information, the frequency of reporting for different categories of demographic information is very uneven (ranging from 5% to 59%), and only 15% of publications used demographic information in their analyses.
machine learning bias, equity, fairness, meta-analysis
ALVI, M., ZISSERMAN, A., AND NELLAKER, C. 2018. Turning a blind eye: Explicit removal of biases and variation from deep neural network embeddings. In L. Leal-Taixé, & S. Roth (Eds.) Proceedings of the European Conference on Computer Vision 2018 Workshops, Munich, Germany, 556-572.
AULCK, L., NAMBI, D., VELAGAPUDI, N., BLUMENSTOCK, J., AND WEST, J. 2019. Mining university registrar records to predict first-year undergraduate attrition. In C.F. Lynch, A. Merceron, M. Desmarais, & R. Nkambou (Eds.) Proceedings of the 12th International Conference on Educational Data Mining, Montreal, Canada, 9-18.
BACKENKOHLER, M., SCHERZINGER, F., SINGLA, A., AND WOLF, V. 2018. Data-driven approach towards a personalized curriculum. In K.E. Boyer, & M. Yudelson (Eds.) Proceedings of the 11th International Conference on Educational Data Mining, Buffalo, NY, 246-251.
BAKHTIN, M.M. 2010. The dialogic imagination: Four essays (Vol. 1). University of Texas Press.
BANDURA, A. 2001. Social cognitive theory: An agentic perspective. Annual Review of Psychology 52, 1, 1-26.
BAROCAS, S., AND SELBST, A.D. 2016. Big data’s disparate impact. California Law Review 104, 671.
BHARTIYA, D., CONTRACTOR, D., BISWAS, S., SENGUPTA, B., AND MOHANIA, M. 2016. Document segmentation for labeling with academic learning objectives. In T. Barnes, M. Chi, & M. Feng (Eds.) Proceedings of the 9th International Conference on Educational Data Mining, Raleigh, NC, 282-287.
BHATNAGAR, S., DESMARAIS, M., WHITTAKER, C., LASRY, N., DUGDALE, M., LENTON, K., AND CHARLES, E. 2015. An analysis of peer-submitted and peer-reviewed answer rationales in a web-based peer instruction based learning environment. In O.C. Santos, J.G. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, J.M. Luna, C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. Desmarais (Eds.) Proceedings of the 8th International Conference on Educational Data Mining, Madrid, Spain, 456-459.
BIERDMAN, J., FARAONE, S.V., AND MONUTEAUX, M.C. 2002. Differential effect of environmental adversity by gender: Rutter’s index of adversity in a group of boys and girls with and without ADHD. American Journal of Psychiatry 159, 9, 1556-1562.
BLODGETT, S.L., AND O’CONNOR, B. 2017. Racial disparity in natural language processing: A case study of social media African-American English. In Workshop on Fairness, Accountability and Transparency in Machine Learning (FATML), Halifax, Nova Scotia, arXiv:1707.00061.
BOLUKBASI, T., CHANG, K.W., ZOU, J., Y., SALIGRAMA, B., AND KALAI, A.T. 2016. Man is to computer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems, Barcelona, Spain, 4349-4357.
BOSTROM, N. 2016. Superintelligence: Paths, dangers, strategies. Oxford University Press: Oxford.
BRAVO, J., ROMERO, S.J., LUNA, M., AND PAMPLONA, S. 2015. Exploring the influence of ICT in online education through data mining tools. In O.C. Santos, J.G. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, J.M. Luna, C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. Desmarais (Eds.) Proceedings of the 8th International Conference on Educational Data Mining, Madrid, Spain, 540-543.
BROSCH, T., BAR-DAVID, E., AND PHELPS, E.A. 2013. Implicit race bias decreases the similarity of neural representations of black and white faces. Psychological Science 24, 2, 160-166.
BURR, C., CRISTIANINI, N., AND LADYMAN, J. 2018. An analysis of the interaction between intelligent software agents and human users. Minds and Machines 28, 4, 735-774.
BYDŽOVSKÁ, H. 2016. Course enrollment recommender system. In T. Barnes, M. Chi, & M. Feng (Eds.) Proceedings of the 9th International Conference on Educational Data Mining, Raleigh, NC, 312-317.
CALDERS, T., AND VERWER, S. 2010. Three naïve Bayes approaches for discrimination-free classification. Data Mining and Knowledge Discovery 21, 2, 277-292.
CALISKKAN, A., BRYSON, J.J., AND NARAYANAN, A. 2017. Semantics derived automatically from language corpora contain human-line biases. Science 356, 6334, 183-186.
CELLIS, L.E., HUANG, L., KESWANI, V., AND VISHNOI, N.K. 2019. Classification with fairness constraints: A meta-algorithm with provable guarantees. In Proceedings of the Conference on Fairness, Accountability, and Transparency, Atlanta, GA, 319-328.
CHEN, L., MA, R., HANNÁK, A., AND WILSON, C. 2018. Investigating the impact of gender on rank in resume search engines. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, Montréal, Canada, 651.
CHILDS, D.S. 2017. Effects of math identity and learning opportunities on racial differences in math engagement, advanced course-taking, and STEM aspiration. PhD Dissertation, Temple University.
CHOPRA, S., GAUTREAU, H., KHAN, A., MIRSAFIAN, M., AND GOLAB, L. 2018. Gender differences in undergraduate engineering applicants: A text mining approach. In K.E. Boyer, & M. Yudelson (Eds.) Proceedings of the 11th International Conference on Educational Data Mining, Buffalo, NY, 44-54.
CHOULDECHOVA, A. 2017. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data 5, 2, 153-163.
CORBETT-DAVIES, S., AND GOEL, S. 2018. The measure and mismeasure of fairness: A critical review of fair machine learning. arXiv preprint, arXiv:1808.00023.
CROSSLEY, S., ALLEN, L.K., SNOW, E.L., AND MCNAMARA, D.S. 2016. Incorporating learning characteristics into automatic essay scoring models: What individual differences and linguistic features tell us about writing quality. Journal of Educational Data Mining 8, 2, 1-19.
CRUES, R.W., BOSCH, N., ANDERSON, C.J., PERRY, M., BHAT, S., AND SHAIK, N. 2018. Who they are and what they want: Understanding the reasons for MOOC enrollment. In K.E. Boyer, & M. Yudelson (Eds.) Proceedings of the 11th International Conference on Educational Data Mining, Buffalo, NY, 177-186.
D’ALESSANDRO, B., O’NEIL, C., AND LAGATTA, T. 2017. Conscientious classification: A data scientist’s guide to discrimination-aware classification. Big Data 5, 2, 120-134.
DEE MILLER, L., SOH, L.-K., SAMAL, A., KUPZYK, K., AND NUGENT, G. 2015. A comparison of educational statistics and data mining approaches to identify characteristics that impact online learning. Journal of Educational Data Mining 7, 3, 117-150.
DOROUDI, S., AND BRUNSKILL, E. 2017. The misidentified identifiability problem of bayesian knowledge tracing. In X. Hu, T. Barnes, A. Hershkovitz, & Paquette, L. (Eds.) Proceedings of the 10th International Conference on Educational Data Mining, Wuhan, China, 143-149.
DU, X., DUIVESTEIJN, W., KLABBERS, M., AND PECHENIZKIY, M. 2018. ELBA: Exceptional Learning Behavior Analysis. In K.E. Boyer, & M. Yudelson (Eds.) Proceedings of the 11th International Conference on Educational Data Mining, Buffalo, NY, 312-317.
ECKERT, P. 1989. The whole woman: Sex and gender differences in variation. Language Variation and Change 1, 3, 245-267.
EZEN-CAN, A., AND BOYER, K.E. 2015. Choosing to interact: Exploring the relationship between learner personality, attitudes, and tutorial dialogue participation. In O.C. Santos, J.G. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, J.M. Luna, C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. Desmarais (Eds.) Proceedings of the 8th International Conference on Educational Data Mining, Madrid, Spain, 125-128.
FELDMAN, M., FRIEDLER, S.A., MOELLER, J., SCHEIDEGGER, C., AND VENKATASUBRAMANIAN, S. 2015. Certifying and removing disparate impact. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, Australia, 259-268.
FENG, M., ROSCHELLE, J., MASON, C., AND BHANOT, R. 2016. Investigating gender difference on homework in middle school mathematics. In T. Barnes, M. Chi, & M. Feng (Eds.) Proceedings of the 9th International Conference on Educational Data Mining, Raleigh, NC, 364-369.
FERGUSON, R. 2012. Learning Analytics: Drivers, developments and challenges. International Journal of Technology Enhanced Learning 4, 5/6, 304-317.
FLETCHER, A.C., HUNTER, A.G. 2003. Strategies for obtaining parental consent to participate in research. Family Relations 52, 3, 216-221.
FRIEDMAN, B. 1996. Value-sensitive design. Interactions 3, 6, 16-23.
GAIANE, P., AND PECHENIZKIY, M. 2018. On formalizing fairness in prediction with machine learning. arXiv preprint, arXiv:1710.03184.
GALHOTRA, S., BRUN, Y., AND MELIOU, A. 2017. Fairness testing: testing software for discrimination. In Proceedings of the 11th Joint Meeting on Foundations of Software Engineering, Paderborn, Germany, 498-510.
GARDNER, J., BROOKS, C., AND BAKER, R. 2019. Evaluating the fairness of predictive student models. In Proceedings of the 9th International Conference on Learning Analytics and Knowledge, Tempe, AZ.
GAUB, M., CARLSON, C.L. 1997. Gender differences in ADHD: A meta-analysis and critical review. Journal of the American Academy of Child & Adolescent Psychiatry 36, 8, 1036-1045.
GELLERT R., DE VRIES, K., DE HERT, P., AND GUTWIRTH, S. 2013. A comparative analysis of anti-discrimination and data protection legislations. In Discrimination and Privacy in the Information Society, B. Custers, T. Calders, B. Schermer, and T. Zarsky, Eds. Springer, Berlin, 61-89.
GIANFRANCESCO, M.A., TAMANG, S., YAZDANY, J., AND SCHMAJUK, G. 2018. Potential biases in machine learning algorithms using electronic health record data. JAMA Internal Medicine 178, 11, 1544-1547.
GOODMAN, B., AND FLAXMAN, S. 2017. European Union regulations on algorithmic decision-making and a “right to explanation”. AI Magazine 38, 3, 50-57.
GOODMAN, S.N., GOEL, S., AND CULLEN, M.R. 2018. Machine learning, health disparities, and causal reasoning. Annals of Internal Medicine 169, 12, 883-884.
GOSSE, D., AND ARNOCKY, S. 2012. The state of Canadian boyhood–beyond literacy to a holistic approach. in Education 18, 2, 67-97.
HACKER, P., AND WIEDMANN, E. 2017. A continuous framework for fairness. arXiv preprint, arXiv:1712.07924.
GUTIÉRREZ, K.D., AND ROGOFF, B. 2003. Cultural ways of learning: Individual traits or repertoires of practice. Educational Researcher 32, 5, 19-25.
HADFIELD-MENELL, D., RUSSELL, S.J., ABBEEL, P., AND DRAGAN, A. 2016. Cooperative inverse reinforcement learning. In Advances in Neural Information Processing Systems, Barcelona, Spain, 3909-3917.
HAJIAN, S., AND DOMINGO-FERRER. 2013. A methodology for direct and indirect discrimination prevention in data mining. IEEE Transactions on Knowledge and Data Engineering 25, 7, 1445-1459.
HENDRIX, L.A., BURNS, K., SAENKO, K., DARRELL, T., AND ROHRBACH, A. 2018. Women also snowboard: Overcoming bias in captioning models. In European Conference on Computer Vision, Munich, Germany, 793-811.
HOLSTEIN, K., WORTMAN VAUGHAN, J., DAUMÉ III, H., DUDIK, M., WALLACH, H. 2019. Improving fairness in machine learning systems: What do industry practitioners need? In Proceedings of the ACM CHI Conference on Human Factors in Computer Systems, Glasgow, UK, 1-16.
HOVY, D. 2015. Demographic factors improve classification performance. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics at the 7th International Joint Conference on Natural Language Processing volume 1, Beijing, China, 752-762.
HOWARD, A., ZHANG, C., AND HORVITZ, E. 2017. Addressing bias in machine learning algorithms: A pilot study on emotion recognition for intelligent systems. In IEEE Workshop on Advanced Robotics and Its Social Impacts, Genoa, Italy, 1-7.
HUTT, S., GARDNER, M., DUCKWORTH, A.L., AND D’MELLO, S. 2019. Evaluating fairness and generalizability in models of predicting on-time graduation from college applications. In C.F. Lynch, A. Merceron, M. Desmarais, & R. Nkambou (Eds.) Proceedings of the 12th International Conference on Educational Data Mining, Montreal, Canada, 79-88.
JATON, F. 2017. We get the algorithms of our ground truths: Designing referential databases in digital image processing. Social Studies of Science 47, 6, 811-840.
JENSEN, E., HUTT, S., AND D’MELLO, S.K. 2019. Generalizability of sensor-free affect detection models in a longitudinal dataset of tens of thousands of students. In C.F. Lynch, A. Merceron, M. Desmarais, & R. Nkambou (Eds.) Proceedings of the 12th International Conference on Educational Data Mining, Montreal, Canada, 324-329.
JURGENS, D., TSVETKOV, Y., AND JURAFSKY, D. 2017. Incorporating dialectal variability for socially equitable language identification. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Vancouver, Canada, 51-57.
KAI, S., ANDRES, J.M., PAQUETTE, L., BAKER, R., MOLNAR, K., WATKINS, H., MOORE, M. 2017. Course enrollment recommender system. In Proceedings of the 10th International Conference on Educational Data Mining, Wuhan, China, 250-255.
KAMIRAN, F., AND CALDERS, T. 2012. Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems 33, 1, 1-33.
KAMIRAN, F., CALDERS, T., AND PECHENIZKIY, M. 2012. Discrimination aware decision tree learning. In Proceedings of the 10th IEEE International Conference on Data Mining, Sydney, Australia, 869-874.
KAMISHIMA, T., AKAHO, S., AND SAKUMA, J. 2011. Fairness-aware learning through regularization approach. In Proceedings of the 11th IEEE International Conference on Data Mining Workshops, Vancouver, Canada, 643-650.
KAMISHIMA, T., AKAHO, S., ASOH, H., AND SAKUMA, J. 2012. Fairness-aware classifier with prejudice remover regularizer. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Bristol, United Kingdom, 35-50.
KARUMBAIAH, S., OCUMPAUGH, J., AND BAKER, R.S. 2019. The influence of school demographics on the relationship between students’ help-seeking behavior and performance and motivational measures. In C.F. Lynch, A. Merceron, M. Desmarais, & R. Nkambou (Eds.) Proceedings of the 12th International Conference on Educational Data Mining, Montreal, Canada, 99-108.
KHAJAH, M., LINDSEY, R.B., & MOZER, M.C. 2016. How deep is knowledge tracing? In T. Barnes, M. Chi, & M. Feng (Eds.) Proceedings of the 9th International Conference on Educational Data Mining, Raleigh, NC, 94-102.
KINZIE, AND M.B., JOSEPH, D.R. 2008. Gender differences in game activity preferences of middle school children: Implications for educational game design. Educational Technology Research and Development 56, 5-6, 643-663.
KIRITCHENKO, S., AND MOHAMMAD, S.M. 2018. Examining gender and race bias in two hundred sentiment analysis systems. In Proceedings of the 7th Joint Conference on Lexical and Computational Semantics, New Orleans, LA, 43-53.
KLARE, B.F., BURGE, M.J., KLONTZ, J.C., VORDER BRUEGGE, R.W., AND JAIN, A.K. 2012. Face recognition performance: Role of demographic information. IEEE Transactions on Information Forensics and Security 7, 6, 1789-1801.
KNOWLES, J.E. 2015. Of needles and haystacks: Building an accurate statewide dropout early warning system in Wisconsin. Journal of Educational Data Mining 7, 3, 1-17.
LABARTHE, H., BOUCHET, F., BACHELET, R., AND YACEF, K. 2016. Does a peer recommender foster students’ engagement in MOOCs? In T. Barnes, M. Chi, & M. Feng (Eds.) Proceedings of the 9th International Conference on Educational Data Mining, Raleigh, NC, 418-423.
LABARTHE, H., LUENGO, V., AND BOUCHET, F. 2018. Analyzing the relationships between learning analytics, educational data mining and A.I. for education. In Workshop on Learning Analytics: Building Bridges Between the Education and Computing Communities at the 14th International Conference on Intelligent Tutoring Systems, Montreal, Canada, 10-19.
LADSON-BILLINGS, G. 1995. Towards a theory of culturally relevant pedagogy. American Educational Research Journal 32, 3, 465-491.
LADSON-BILLINGS, G. 1998. Just what is critical race theory and what’s it doing in a nice field like education? International Journal of Qualitative Studies in Education 11, 1, 7-24.
LABUTOV, V., AND LIPSON, H. 2016. Web as a textbook: Curating targeted learning paths through the heterogeneous learning resources on the web. In T. Barnes, M. Chi, & M. Feng (Eds.) Proceedings of the 9th International Conference on Educational Data Mining, Raleigh, NC, 110-118.
LAUDERDALE, B.E., AND CLARK, T.S. 2014. Scaling politically meaningful dimensions using texts and votes. American Journal of Political Science 58, 3, 754-771.
LAVE, J. 1991. Situating learning in communities of practice. Perspectives on Socially Shared Cognition 2, 63-82.
LIU, Z., BROWN, R., LYNCH, C., BARNES, T., BAKER, R.S., BERGNER, Y., AND MCNAMARA, D. 2016. MOOC Learners behaviors by country and culture; an exploratory analysis. In T. Barnes, M. Chi, & M. Feng (Eds.) Proceedings of the 9th International Conference on Educational Data Mining, Raleigh, NC, 127-134.
LIU, Z., CODY, C., BARNES, T., LYNCH, C., AND RUTHERFORD, T. 2017. The antecedent of and associations with elective replay in an educational game: Is replay worth it? In X. Hu, T. Barnes, A. Hershkovitz, & Paquette, L. (Eds.) Proceedings of the 10th International Conference on Educational Data Mining, Wuhan, China, 40-47.
LUO, L., KOPRINSKA, I., AND LIU, W. 2015. Discrimination-aware classifiers for student performance prediction. In O.C. Santos, J.G. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, J.M. Luna, C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. Desmarais (Eds.) Proceedings of the 8th International Conference on Educational Data Mining, Madrid, Spain, 384-387.
LYNCH, J., AND STUCKLER, D. 2012. In God we trust, all others (must) bring data. International Journal of Epidemiology 41, 6, 1503-1506.
MADAIO, M., LASKO, R., CASSELL, J., AND OGAN, A. 2017. Using temporal association rule mining to predict dyadic rapport in peer tutoring. In X. Hu, T. Barnes, A. Hershkovitz, & Paquette, L. (Eds.) Proceedings of the 10th International Conference on Educational Data Mining, Wuhan, China, 318-323.
MADDOX, T.M., RUMSFELD, J.S., AND PAYNE, P.R.O. 2018. Questions for artificial intelligence in health care. JAMA 321, 1, 31-32.
MAKKONEN, T., 2007. Measuring discrimination: Data collection and the E.U. Equality Law. European Network of Legal Experts in Anti-Discrimination. (http://www.migpolgroup.com)
MCBRIDE, L., AND NICHOLS, A. 2016. Retooling poverty targeting using out-of-sample validation and machine learning. World Bank Economic Review 32, 3, 531-550.
MILNER IV, H.R. 2012. Beyond a test score: Explaining opportunity gaps in educational practice. Journal of Black Studies 43, 6, 693-718.
MISRA, I., ZITNICK, C.L., MITCHELL, M., AND GIRSHICK, R. 2016. Seeing through the human reporting bias: Visual classifiers from noisy humancentric labels. In Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, 2930-2939.
MITCHELL, S., POTASH, E., AND BAROCAS, S. 2018. Prediction-based decisions and fairness: A catalogue of choices, assumptions and definitions. arXiv preprint, arXiv:1811.07867.
NAISMITH, B., HAN, N.-R., JUFFS, A., HILL, B., AND ZHENG, D. 2018. Accurate measurement of lexical sophistication with reference to ESL learning data. In K.E. Boyer, & M. Yudelson (Eds.) Proceedings of the 11th International Conference on Educational Data Mining, Buffalo, NY, 259-265.
NASIR, N.I.S., AND HAND, V.M. 2006. Exploring sociocultural perspective on race, culture, and learning. Review of Educational Research 76, 4, 449-475.
NGUYEN, H., AND LIEW, C.W. 2018. Using student logs to build Bayesian models of student knowledge and skills. In K.E. Boyer, & M. Yudelson (Eds.) Proceedings of the 11th International Conference on Educational Data Mining, Buffalo, NY, 312-317.
NIEVA, R. 2015. Google apologizes for algorithm mistakenly calling black people ‘gorillas’. CNet, July 4, 2015.
NIŽNAN, J., PELÁNEK, R., AND RIHÁK, J. 2015. Student models for prior knowledge estimation. In O.C. Santos, J.G. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, J.M. Luna, C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. Desmarais (Eds.) Proceedings of the 8th International Conference on Educational Data Mining, Madrid, Spain, 109-116.
NOLL, R.B., ZELLER, M.H., VANNATTA, K., BUKOWSKI, W.M., AND DAVIES, W.H. 1997. Potential bias in classroom research: Comparison of children with permission and those who do not receive permission to participate. Journal of Clinical Child Psychology 26, 36-42.
OCUMPAUGH, J., BAKER, R., GOWDA, S., HEFFERNAN, N., AND HEFFERNAN, C. 2014. Population validity for Educational Data Mining models: A case study in affect detection. British Journal of Educational Technology 45, 3, 487-501.
PALAEZ, K., LEVINE, R., FAN, J., GUARCELLO, M., AND LAUMAKIS, M. 2019. Using a latent class forest to identify at-risk students in higher education. Journal of Educational Data Mining 11, 1, 18-46.
PARIS, S.G., AND BYRNES, J.P. 1989. The constructivist approach to self-regulation and learning in the classroom. In B.J. Zimmerman, D.H. Schunk (Eds.) Self-Regulated Learning and Academic Achievement, Springer, NY, 169-200.
PARK, J., YU, R., RODRIGUEZ, F., BAKER, R., SMYTH, P., AND WARSCHAUER, M. 2018. Understanding student procrastination via mixture models. In K.E. Boyer, & M. Yudelson (Eds.) Proceedings of the 11th International Conference on Educational Data Mining, Buffalo, NY, 187-197.
PITLARZ, I., PU, S., PATEL, M., AND PRABHU, R. 2018. What can we learn from college students’ network transactions? Constructing useful features for student success prediction. In K.E. Boyer, & M. Yudelson (Eds.) Proceedings of the 11th International Conference on Educational Data Mining, Buffalo, NY, 444-448.
PINKARD, N. 2005. How the perceived masculinity and/or femininity of software applications influences students’ software preferences. Journal of Educational Computing Research 32, 1, 57-78.
PEDRESCHI, D., RUGGIERI, AND S., TURINI, F. 2008. Discrimination-aware data mining. In Proceedings of the 14th International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, 560-568.
RAJKOMAR, A., HARDT, M., HOWELL, M.D., CORRADO, G., AND CHIN, M.H. 2018. Ensuring fairness in machine learning to advance health equity. Annals of Internal Medicine 169, 12, 866-872.
REN, Z., NING, X., LAN, A., AND RANGWALA, H. 2019. Grade prediction based on cumulative knowledge and co-taken courses. In C.F. Lynch, A. Merceron, M. Desmarais, & R. Nkambou (Eds.) Proceedings of the 12th International Conference on Educational Data Mining, Montreal, Canada, 158-167.
RIDDLE, T., BHAGAVATULA, S., GUO, W., MURESAN, S., COHEN, G., COOK, J., AND PURDIE-VAUGHNS, V. 2015. Mining a written values affirmation intervention to identify the unique linguistic features of stigmatized groups. In O.C. Santos, J.G. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, J.M. Luna, C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. Desmarais (Eds.) Proceedings of the 8th International Conference on Educational Data Mining, Madrid, Spain, 274-281.
ROMEI, A., AND RUGGIERI, S. 2014. A multidisciplinary survey on discrimination analysis. Engineering Review 29, 5, 582-638.
ROSENBAUM, P.R. 2001. Replicating effect and biases. The American Statistician 55, 3, 223-227.
ROTH, W.M., AND LEE, Y.J. 2007. “Vygotsky’s neglected legacy”: Cultural-historical activity theory. Review of Educational Research 77, 2, 186-232.
ROWE, E., ASBELL-CLARKE, J., EAGLE, M., HICKS, A., BARNES, T., BROWN, R. AND EDWARDS, T. 2016. Validating game-based measures of implicit science learning. In T. Barnes, M. Chi, & M. Feng (Eds.) Proceedings of the 9th International Conference on Educational Data Mining, Raleigh, NC, 490-495.
ROWE, E., BAKER, R., AND ASBELL-CLARKE, J. 2015. Strategic game moves mediate implicit science learning. In O.C. Santos, J.G. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, J.M. Luna, C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. Desmarais (Eds.) Proceedings of the 8th International Conference on Educational Data Mining, Madrid, Spain, 432-435.
RUGGIERI, S., PEDRESCHI, D., AND TURINI, F. 2010a. Data mining for discrimination discovery. ACM Transactions on Knowledge Discovery from Data (TKDD) 4, 2, 1-40.
RUGGIERI, S., PEDRESCHI, D., AND TURINI, F. 2010b. DCUBE: Discrimination discovery in databases. In Proceedings of the 2010 SIGMOD International Conference on Management of Data, Indianapolis, Indiana, 1127-1130.
SAARELA, M., AND KÄRKKÄINEN, T. 2015. Do country stereotypes exist in educational data? A clustering approach for large, sparse, and weighted data. In O.C. Santos, J.G. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, J.M. Luna, C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. Desmarais (Eds.) Proceedings of the 8th International Conference on Educational Data Mining, Madrid, Spain, 156-163.
SAMEI, B., OLNEY, A., KELLY, S., NYSTRAND, M., D’MELLO, S., BLANCHARD, N., AND GRAESSER, A.C. 2015. Modeling classroom discourse: Do models of predicting dialogic instruction properties generalize across populations? In O.C. Santos, J.G. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, J.M. Luna, C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. Desmarais (Eds.) Proceedings of the 8th International Conference on Educational Data Mining, Madrid, Spain, 444-447.
SANDVIG, C. 2015. Seeing the sort: The aesthetic and industrial defence of ‘the algorithm’. Media-N 11, 1, 35-51.
SCHNEIDER, B., AND BLIKSTEIN, P. 2015. Unraveling students’ interaction around a tangible interface using multimodal learning analytics. Journal of Educational Data Mining 7, 3, 89-116.
SHAFFER, I.R. 2018. Exploring the performance of facial expression recognition technologies on deaf adults and their children. In Proceedings of the 20th International ACM SIGACCESS Conference on Computers and Accessibility, Galway, Ireland, 474-476.
SIEMENS, G. 2005. Connectivism: Learning as network-creation. ASTD Learning News 10, 1, 1-28.
SIEMENS, G., AND BAKER, R.S. 2012. Learning analytics and educational data mining: Towards communication and collaboration. In Proceedings of the 2nd International Conference on Learning Analytics and Knowledge, Vancouver, Canada, 252-254.
SIMOIU, C., CORBETT-DAVIES, AND GOEL, S. 2017. The problem of infra-marginality in outcome tests for discrimination. The Annals of Applied Statistics 11, 3, 1193-1216.
SLIM, A., HUSH, D., OJAH, T., AND BABBITT, T. 2018. Predicting student enrollment based on student and college characteristics. In K.E. Boyer, & M. Yudelson (Eds.) Proceedings of the 11th International Conference on Educational Data Mining, Buffalo, NY, 383-389.
SPOON, K., BEEMER, J., WHITMER, J.C., FAN, J., FRAZEE, J.P., STRONACH, J., BOHONAK, A.J., AND LEVINE, R.A. 2016. Random forests for evaluating pedagogy and informing personalized learning. Journal of Educational Data Mining 8, 2, 20-50.
STOYANOVICH, J., HOWE, B., JAGADISH, H.V., AND MIKLAY, G. 2018. Panel: A debate on data and algorithmic ethics. Proceedings of the VLDB Endowment 11, 12, 2165-2167.
STRECHT, P., CRUZ, L., SOARES, C., MENDES-MOREIRA, J., AND ABREU, R. 2015. A comparative study of regression and classification algorithms for modelling students’ academic performance. In Proceedings of the 8th International Conference on Educational Data Mining, Madrid, Spain, 392-395.
SWEENEY, L. 2013. Discrimination in online ad delivery. Queue 11, 3, 10-29.
SWEENEY, M., LESTER, J., RANGWALA, H., AND JOHRI, A. 2016. Next-term student performance prediction: A recommender systems approach. Journal of Educational Data Mining 8, 1, 22-51.
TATMAN, R. 2017. Gender and dialect bias in YouTube’s automatic captions. In Workshop on Ethics in Natural Language Processing volume 1, Valencia, Spain, 53-59.
TODA, A.M., OLIVEIRA, W., SHI, L., BITTENCOURT, I.I., ISOTANI, S., AND CRISTEA, A. 2019. Planning gamification strategies based on user characteristics and D.M.: A gender-based case study. In C.F. Lynch, A. Merceron, M. Desmarais, & R. Nkambou (Eds.) Proceedings of the 12th International Conference on Educational Data Mining, Montreal, Canada, 438-443.
ÜLTANIR, E. 2012. An epistemological glance at the constructivist approach: Constructivist learning in Dewey, Piaget, and Montessori. International Journal of Instruction 5, 2, 195-212.
VAN MILTERNBURG, E. 2016. Stereotyping and bias in the flickr30k dataset. In J. Edlund, D. Heylen, & P. Paggio (Eds.) Multimodal Corpora: Computer Vision and Language Processing (MMC 2016) Workshop, 1-4.
VEALE, M., AND BINNS, R. 2017. Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data. Big Data & Society 4, 2, 1-17.
VERBERT, K., MANOUSELIS, N., DRACHSLER, H., AND DUVAL, E. 2012. Dataset-driven research to support learning and knowledge analytics. Journal of Educational Technology & Society 15, 3, 133-148.
VERGHESE, A., SHAH, N.H., AND HARRINGTON, R.A. 2018. What this computer needs is a physician: Humanism and artificial intelligence. JAMA 319, 1, 19-20.
WADSWORTH, B.J. 1996. Piaget’s theory of cognitive and affective development: Foundations of constructivism. Longman Publishing.
WARNER, J., DOORENBOS, J., MILLER, B., AND GUO, P. 2015. How high school, college, and online students differentially engage with an interactive digital textbook. In O.C. Santos, J.G. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, J.M. Luna, C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. Desmarais (Eds.) Proceedings of the 8th International Conference on Educational Data Mining, Madrid, Spain, 528-531.
WELNER, K.G., AND CARTER, P.L. 2013. Achievement gaps arise from opportunity gaps. In P.L. Carter, & K.G. Welner (Eds.) Closing the Opportunity Gap: What America Must Do to Give Every Child an Even Chance, Oxford University Press, U.K., 1-10.
WILTZ, C. 2017. Bias in, bias out: How A.I. can become racist. https://www.designnews.com/bias-bias-out-how-ai-can-become-racist
WOLFF, A., ZDRAHAL, Z., NIKOLOV, A., AND PANTUCEK, M. 2013. Improving retention: Predicting at-risk students by analyzing clicking behaviour. In Proceedings of the 3rd International Conference on Learning Analytics and Knowledge, Leuven, Belgium, 145-149.
YANG, D., KRAUT, R., AND ROSE, C. 2016. Exploring the effect of student confusion in massive open online courses. Journal of Educational Data Mining 8, 1, 52-83.
YANG, K., AND STOYANOVICH, J. 2017. Measuring fairness in ranked outputs. In Proceedings of the 29th International Conference on Scientific and Statistical Database Management, Chicago, IL, 1-6.
YAO, S., AND HUANG, B. 2017. New fairness metrics for recommendation that embrace differences. In Workshop on Fairness, Accountability and Transparency in Machine Learning (FATML), Halifax, Nova Scotia. arXiv:1706.09838.
ZADROZNY, W., BUDZIKOWSKA, M., CHAI, J., KAMBHATLA, N., LEVESQUE, S., AND NICOLOV, N. 2000. Natural language dialogue for personalized interaction. Communications of the ACM 43, 8, 116-120.
ZAFAR, M.B., VALERA, I., RODRIGUEZ, M.G., AND GUMMADI, K.P. 2015. Fairness constraints: Mechanisms for fair classification. In A. Singh, & J. Zhu (Eds.) Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 962-970.
ZAFAR, M.B., VALERA, I., RODRIGUEZ, M.G., AND GUMMADI, K.P. 2017. Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 1171-1180.
ZEMEL, R., WU, Y., SWERSKY, K., PITASSI, T., AND DWORK, C. 2013. Learning fair representations. In Proceedings of the International Conference on Machine Learning, Atlanta, GA, 325-333.
ZHENG, Z., VOGELSANG, T., AND PINKWART, N. 2015. The impact of small learning group composition on drop-out rate and learning performance in a MOOC. In O.C. Santos, J.G. Boticario, C. Romero, M. Pechenizkiy, A. Merceron, J.M. Luna, C. Mihaescu, P. Moreno, A. Hershkovitz, S. Ventura, & M. Desmarais (Eds.) Proceedings of the 8th International Conference on Educational Data Mining, Madrid, Spain, 500-503.
ZIEWITZ, M. 2016. Governing algorithms: Myth, mess, and methods. Science, Technology, & Human Values 41, 1, 3-16.
ZIMMERMANN, J., BRODERSEN, K.H., HEINIMANN, H.R., AND BUHMANN, J.M. 2015. A model-based approach to predicting graduate-level performance using indicators of undergraduate-level performance. Journal of Educational Data Mining 7, 3, 151-176.
ŽLIOBAITĖ, I. 2017. Measuring discrimination in algorithmic decision making. Data Mining and Knowledge Discovery 31, 4, 1060-1089.
ŽLIOBAITĖ, I., AND CUSTERS, B. 2016. Using sensitive personal data may be necessary for avoiding discrimination in data-driven decision models. Artificial Intelligence and Law 24, 2, 183-201.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.