Incorporating learning characteristics into automatic essay scoring models: What individual differences and linguistic features tell us about writing quality.



Published Dec 25, 2016
Scott Crossley, Laura K. Allen, Erica L. Snow, Danielle S. McNamara


This study investigates a new approach to automatically assessing essay quality that combines traditional methods based on textual features with newer methods that assess individual differences in writers, such as demographic information, standardized test scores, and survey results. The results demonstrate that combining text features and individual differences yields more accurate automated essay scores than using either individual differences or text features alone. These findings have important implications for both educators and researchers because they reveal that essay scoring methods can benefit from incorporating features taken not only from the essay itself (e.g., features related to lexical and syntactic complexity), but also from the writer (e.g., vocabulary knowledge and writing attitudes). Such findings expand our knowledge of the textual and non-textual features that are predictive of writing success.
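The core idea of the study can be illustrated with a small sketch. This is not the authors' actual model or data; it simply shows, on fabricated numbers, why a regression fit on text features and individual-difference features combined can explain more score variance than either feature set alone.

```python
import numpy as np

# Illustrative sketch only (fabricated data, not the study's model or dataset).
rng = np.random.default_rng(0)
n = 200

# Hypothetical text features (e.g., lexical sophistication, syntactic complexity).
text = rng.normal(size=(n, 3))
# Hypothetical individual-difference features (e.g., vocabulary score, writing attitudes).
indiv = rng.normal(size=(n, 2))

# Simulated essay scores that depend on both feature sets, plus noise.
scores = 2.0 + text @ [0.8, 0.5, 0.3] + indiv @ [0.6, 0.4] + rng.normal(scale=0.5, size=n)

def r_squared(X, y):
    """Fit ordinary least squares (with intercept) and return in-sample R^2."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid.var() / y.var()

r2_text = r_squared(text, scores)
r2_indiv = r_squared(indiv, scores)
r2_both = r_squared(np.column_stack([text, indiv]), scores)

print(f"text only: {r2_text:.2f}, individual differences only: {r2_indiv:.2f}, "
      f"combined: {r2_both:.2f}")
```

Because the combined model nests both single-source models, its in-sample fit can only match or exceed theirs; the study's contribution is showing that, on real essays, the individual-difference features carry predictive signal that the text features do not.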

How to Cite

Crossley, S., Allen, L. K., Snow, E. L., & McNamara, D. S. (2016). Incorporating learning characteristics into automatic essay scoring models: What individual differences and linguistic features tell us about writing quality. JEDM | Journal of Educational Data Mining, 8(2), 1-19.


Allen, L. K., Crossley, S. A., Snow, E. L., & McNamara, D. S. (2014). Game-based writing strategy tutoring for second language learners: Game enjoyment as a key to engagement. Language Learning and Technology, 18, 124-150.

Allen, L. K., & McNamara, D. S. (in press). You are your words: Modeling students’ vocabulary knowledge with natural language processing. To appear in the Proceedings of the 8th International Conference on Educational Data Mining (EDM 2015).

Allen, L. K., Snow, E.L., Crossley, S. A., Jackson, G. T., & McNamara, D. S. (2014). Reading comprehension components and their relation to the writing process. L'année psychologique/Topics in Cognitive Psychology, 114, 663-691.

Applebee, A. N., Langer, J. A., Jenkins, L. B., Mullis, I., & Foertsch, M. A. (1990). Learning to write in our nation’s schools. Washington, DC: U.S. Department of Education, Office of Educational Research and Improvement.

Attali, Y., & Burstein, J. (2006). Automated essay scoring with e-rater V.2. Journal of Technology, Learning, and Assessment, 4(3).

Attali, Y., & Powers, D. (2008). A developmental writing scale. ETS Research Report Series, 2008(1). Princeton, NJ: ETS.

Biber, D. (1988). Variation across speech and writing. Cambridge: Cambridge University Press.

Biber, D. (1995). Dimensions of register variation: A cross-linguistic comparison. Cambridge, UK: Cambridge University Press.

Bereiter, C. (2003). Foreword. In Mark D. Shermis, & Jill C. Burstein (Eds.), Automated essay scoring: a cross-disciplinary approach (pp. vii–ix). Mahwah, NJ: Lawrence Erlbaum Associates.

Burstein, J. (2003). The e-rater scoring engine: Automated Essay Scoring with natural language processing. In M. D. Shermis and J. C. Burstein (Eds.), Automated Essay Scoring: A cross-disciplinary approach (pp. 113–121). Mahwah, NJ: Lawrence Erlbaum Associates.

Burstein, J., Chodorow, M., & Leacock, C. (2004). Automated essay evaluation: The Criterion online writing system. AI Magazine, 25, 27-36.

College Board (2011). Essay scoring guide: A framework for scoring SAT essays. Retrieved November 15, 2011, from /testing/satreasoning/scores/essay/guide

Crossley, S. A., Allen, L. K., & McNamara, D. S. (2014). Analyzing discourse processing using a simple natural language processing tool (SiNLP). Discourse Processes, 51, 511-534.

Crossley, S. A., Roscoe, R., & McNamara, D. S. (2013). Using automatic scoring models to detect changes in student writing in an intelligent tutoring system. In McCarthy, P. M. & Youngblood, G. M. (Eds.), Proceedings of the 26th International Florida Artificial Intelligence Research Society (FLAIRS) Conference (pp. 208-213). Menlo Park, CA: The AAAI Press.

Crossley, S. A., Roscoe, R., & McNamara, D. S. (2014). What is quality writing? An investigation into the multiple ways writers can write high quality essays. Written Communication, 31, 184-214.

Crossley, S. A., Roscoe, R. D., McNamara, D. S., & Graesser, A. C. (2011). Predicting human scores of essay quality using computational indices of linguistic and textual features. In G. Biswas, S. Bull, J. Kay, & A. Mitrovic (Eds.), Proceedings of the 15th International Conference on Artificial Intelligence in Education (pp. 438-440). Auckland, New Zealand: AIED.

Daly, J. A., & Miller, M. D. (1975). Apprehension of writing as a predictor of message intensity. Journal of Psychology, 89, 175-177.

Deane, P. (2013). On the relation between automated essay scoring and modern views of the writing construct. Assessing Writing, 18, 7-24.

Dikli, S. (2006). An overview of automated scoring of essays. Journal of Technology, Learning, and Assessment, 5.

Enright, M. K., & Quinlan, T. (2010). Complementing human judgment of essays written by English language learners with e-rater scoring. Language Testing, 27, 317-334. doi: 10.1177/0265532210363144

Ferrari, M., Bouffard, T., & Rainville, L. (1998). What makes a good writer? Differences in good and poor writers’ self-regulation of writing. Instructional Science, 26, 473-488. doi:10.1023/A:1003202412203

Fitzgerald, J. & Shanahan, T. (2000). Reading and writing relations and their development. Educational Psychologist, 35, 39-50.

Graham, S. (2006). Writing. In P. Alexander & P. Winne (Eds.), Handbook of educational psychology (pp. 457-477). Mahwah, NJ: Erlbaum.

Grimes, D., & Warschauer, M. (2010). Utility in a fallible tool: A multi-site case study of automated writing evaluation. Journal of Technology, Learning, and Assessment, 8, 4-43.

Haswell, R. H. (2006). Automatons and automated scoring: Drudges, black boxes, and dei ex machina. In: P. F. Ericsson & R. H. Haswell (Eds.), Machine scoring of student essays: Truth and consequences (pp. 57–78). Logan, UT: Utah State University Press.

Hearst, M. (2002). The debate on automated essay scoring. IEEE Intelligent Systems and their Applications, 15, 22-37.

MacGinitie, W. H., & MacGinitie, R. K. (1989). Gates-MacGinitie reading tests. Chicago: Riverside.

McNamara, D. S., Crossley, S. A., & McCarthy, P. M. (2010). The linguistic features of writing quality. Written Communication, 27, 57-86.

McNamara, D. S., Crossley, S. A., & Roscoe, R. D. (2013). Natural language processing in an intelligent writing strategy tutoring system. Behavior Research Methods, 45, 499-515.

McNamara, D. S., Crossley, S. A., Roscoe, R. D., Allen, L. K., & Dai, J. (2015). Hierarchical classification approach to automated essay scoring. Assessing Writing, 23, 35-59.

McNamara, D. S., Graesser, A. C., McCarthy, P., & Cai, Z. (2014). Automated evaluation of text and discourse with Coh-Metrix. Cambridge: Cambridge University Press.

McNamara, D. S., Raine, R., Roscoe, R., Crossley, S., Jackson, G. T., Dai, J., Cai, Z., Renner, A., Brandon, R., Weston, J., Dempsey, K., Carney, D., Sullivan, S., Kim, L., Rus, V., Floyd, R., McCarthy, P. M., & Graesser, A. C. (2012). The Writing-Pal: Natural language algorithms to support intelligent tutoring on writing strategies. In P. M. McCarthy & C. Boonthum-Denecke (Eds.), Applied natural language processing and content analysis: Identification, investigation, and resolution (pp. 298-311). Hershey, PA: IGI Global.

O'Reilly, T., & McNamara, D. S. (2007). The impact of science knowledge, reading skill, and reading strategy knowledge on more traditional “High-Stakes” measures of high school students’ science achievement. American Educational Research Journal, 44, 161-196.

O'Reilly, T., Best, R., & McNamara, D. S. (2004). Self-explanation reading training: Effects for low-knowledge readers. In K. Forbus, D. Gentner, & T. Regier (Eds.), Proceedings of the 26th Annual Cognitive Science Society (pp. 1053-1058). Mahwah, NJ: Erlbaum.

Perelman, L. (2012). Construct validity, length, score, and time in holistically graded writing assessments: The case against automated essay scoring (AES). In C. Bazerman, C. Dean, J. Early, K. Lunsford, S. Null, P. Rogers, & A. Stansell (Eds.), International advances in writing research: Cultures, places, measures (pp. 121– 131). Fort Collins, Colorado: WAC Clearinghouse/Anderson, SC: Parlor Press.

Reid, J. (1992). A computer text analysis of four cohesion devices in English discourse by native and nonnative writers. Journal of Second Language Writing, 1, 79-107.

Roscoe, R. D., Allen, L. K., Weston, J. L., Crossley, S. A., & McNamara, D. S. (2014). The Writing Pal intelligent tutoring system: Usability testing and development. Computers and Composition, 34, 39-59.

Roscoe, R. D., Crossley, S. A., Snow, E. L., Varner (Allen), L. K., & McNamara, D. S. (2014). Writing quality, knowledge, and comprehension correlates of human and automated essay scoring. In W. Eberle & C. Boonthum-Denecke (Eds.), Proceedings of the 27th International Florida Artificial Intelligence Research Society (FLAIRS) Conference (pp. 393-398). Palo Alto, CA: AAAI Press.

Rudner, L., Garcia, V., & Welch, C. (2006). An evaluation of the IntelliMetric essay scoring system. Journal of Technology, Learning, and Assessment, 4(4).

Saddler, B., & Graham, S. (2007). The relationship between writing knowledge and writing performance among more and less skilled writers. Reading and Writing Quarterly, 23, 231-247.

Scardamalia, M., & Bereiter, C. (1987). Knowledge telling and knowledge transforming in written composition. In S. Rosenberg (Ed.), Advances in applied psycholinguistics: Reading, writing, and language learning (Vol. 2, pp. 142-175). New York: Cambridge University Press.

Staehr, L. S. (2008). Vocabulary size and the skills of listening, reading and writing. Language Learning Journal, 36, 139-152.

Tierney, R. J., & Shanahan, T. (1991). Research on the reading-writing relationship: Interactions, transactions, and outcomes. In R. Barr, M. L. Kamil, P. Mosenthal, & P. D. Pearson (Eds.), The handbook of reading research (Vol. 2, pp. 246-280). New York: Longman.

Varner, L. K., Roscoe, R. D., & McNamara, D. S. (2013). Evaluative misalignment of 10th-grade student and teacher criteria for essay quality: An automated textual analysis. Journal of Writing Research, 5, 35-59.

Warschauer, M., & Ware, P. (2006). Automated writing evaluation: defining the classroom research agenda. Language Teaching Research, 10, 1-24.

Witte, S., & Faigley, L. (1981). Coherence, cohesion, and writing quality. College Composition and Communication, 32, 189-204.