Computerized classification of student answers offers the possibility of instant feedback and improved learning. Open response (OR) questions provide greater insight into student thinking and understanding than more constrained multiple choice (MC) questions, but development of automated classifiers is more difficult, often requiring training a machine learning system with many human-classified answers. Here we explore a novel intermediate constraint question format called WordBytes (WB) where students assemble one-sentence answers to two different college evolutionary biology questions by choosing, then ordering, fixed tiles containing words and phrases. We found WB allowed students to construct hundreds to thousands of different answers from ≤20 tiles, with multiple ways to express correct answers and incorrect answers reflecting different misconceptions. We found humans could specify rules for an automated WB grader that could accurately classify answers as correct/incorrect with Cohen's kappa ≥ 0.88, near the measured inter-rater reliability of two human graders and the performance of machine classification of OR answers (Nehm et al., 2012). Finer-grained classification to identify the specific misconception had lower accuracy (Cohen's kappa < 0.75), which could be improved either by using a machine learner or by revising the rules, but both would require considerably more development effort. Our results indicate that WB may allow rapid development of automated correct/incorrect answer classification without collecting and hand-grading hundreds of student answers.
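The accuracy figures above are reported as Cohen's kappa (Cohen, 1960), a chance-corrected measure of agreement between two classifications of the same items. As a generic illustration only (not code from the study, and with made-up gradings), kappa for two graders can be computed as:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    # observed agreement: fraction of items both raters labelled identically
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # expected agreement if each rater labelled independently at their own base rates
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    p_e = sum((counts_a[lbl] / n) * (counts_b[lbl] / n) for lbl in labels)
    if p_e == 1:  # both raters always give the same single label
        return 1.0
    return (p_o - p_e) / (1 - p_e)

# Hypothetical correct(1)/incorrect(0) gradings of eight answers by two graders:
grader_1 = [1, 1, 0, 1, 0, 1, 0, 0]
grader_2 = [1, 0, 0, 1, 0, 1, 1, 0]
print(cohens_kappa(grader_1, grader_2))  # 0.5: 75% raw agreement, 50% expected by chance
```

Raw agreement here is 6/8 = 0.75, but with both graders marking half the answers correct, 0.5 agreement is expected by chance alone, so kappa discounts to 0.5; this is why kappa, rather than raw percent agreement, is the standard benchmark for grader reliability.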
Keywords: student answers, assessment, constructed response, reliability
BEGGROW E. P., HA M., NEHM R. H., PEARL D., AND BOONE W. J. 2014. Assessing scientific practices using machine-learning methods: How closely do they match clinical interview performance? Journal of Science Education and Technology, 23, 160-182.
BEJAR I. I. 1991. A methodology for scoring open-ended architectural design problems. Journal of Applied Psychology, 76, 4, 522-532.
BENNETT R. E. 1993. On the meaning of constructed response. In Bennett R. E. and Ward W. C. (Eds.), Construction versus choice in cognitive measurement: Issues in constructed response, performance testing, and portfolio assessment. Lawrence Erlbaum Associates, Hillsdale, NJ. 1-27.
BLACK P. AND WILIAM D. 1998. Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice, 5, 1, 7-74.
Journal of Educational Data Mining, Volume 9, No 2, 2017
CHANG C. C. AND LIN C. J. 2011. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2, 3, 27.
COHEN J. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 1, 37-46.
HA M., NEHM R. H., URBAN-LURAIN M., AND MERRILL J. E. 2011. Applying computerized scoring models of written biological explanations across courses and colleges: Prospects and limitations. CBE Life Sciences Education, 10, 379.
HA M. AND NEHM R. H. 2016. The impact of misspelled words on automated computer scoring: a case study of scientific explanations. Journal of Science Education and Technology, 25, 3, 358.
HERRON J., ABRAHAM J., AND MEIR E. 2014. Mendelian Pigs. Simbio.com.
HERRON J. AND MEIR E. 2014. Darwinian Snails. Simbio.com.
HOFMANN M. AND KLINKENBERG R. (eds) 2013. RapidMiner: Data mining use cases and business analytics applications (Chapman & Hall/CRC Data Mining and Knowledge Discovery Series), CRC Press.
HSU C. W., CHANG C. C., AND LIN C. J. 2003. A practical guide to support vector classification. https://www.cs.sfu.ca/people/Faculty/teaching/726/spring11/svmguide.pdf
KLEIN S. P. 2008. Characteristics of hand and machine-assigned scores to college students' answers to open-ended tasks. In Nolan D. and Speed T. (Eds.), Probability and statistics: Essays in honor of David A. Freedman. Beachwood, OH. 76-89.
KLOPFER E. 2008. Augmented learning: Research and design of mobile educational games. MIT Press, Cambridge, MA.
KRIPPENDORFF K. 1980. Content analysis: An introduction to its methodology. Sage Publications.
LANDIS J. R. AND KOCH G. G. 1977. The measurement of observer agreement for categorical data. Biometrics, 33, 159-174.
LEELAWONG K. AND BISWAS G. 2008. Designing learning by teaching agents: The Betty’s Brain system. International Journal of Artificial Intelligence in Education, 18, 3, 181-208.
LUKHOFF B. 2010. The design and validation of an automatically-scored constructed-response item type for measuring graphical representation skill. Doctoral dissertation, Stanford University, Stanford, CA.
LUCKIE D. B., HARRISON S. H., WALLACE J. L., AND EBERT-MAY D. 2008. Studying C-TOOLS: Automated grading for online concept maps. Conference Proceedings from Conceptual Assessment in Biology II, 2, 1, 1-13.
MAYFIELD E., ADAMSON D., AND ROSE C. P. 2014. LightSide researcher’s workbench user manual.
MOHARRERI K., HA M., AND NEHM R. H. 2014. EvoGrader: an online formative assessment tool for automatically evaluating written evolutionary explanations. Evolution: Education and Outreach, 7, 15.
National Research Council. 2001. Knowing what students know: The science and design of educational assessment, Washington DC: National Academies Press.
NEHM R. H., HA M., AND MAYFIELD E. 2012. Transforming biology assessment with machine learning: Automated scoring of written evolutionary explanations. Journal of Science Education and Technology, 21, 183.
NEHM R. H. AND HAERTIG H. 2012. Human vs. computer diagnosis of students' natural selection knowledge: testing the efficacy of text analytic software. Journal of Science Education and Technology, 21, 1, 56-73.
QUINLAN R. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, San Mateo, CA.
RITTHOFF O., KLINKENBERG R., MIERSWA I., AND FELSKE S. 2001. YALE: Yet Another Learning Environment. LLWA'01: Tagungsband der GI-Workshop-Woche Lernen-Lehren-Wissen-Adaptivität. University of Dortmund, Dortmund, Germany. Technical Report, 763, 84-92.
ROMERO C., VENTURA S., PECHENIZKIY M., AND BAKER R. S. 2010. Handbook of Educational Data Mining. CRC Press.
SCALISE K. AND GIFFORD B. 2006. Computer based assessment in E-Learning: A framework for constructing “intermediate constraint” questions and tasks for technology platforms. Journal of Technology, Learning, and Assessment, 4, 6, 4-44.
SHUTE V. J. 2008. Focus on formative feedback. Review of Educational Research, 78, 1, 153-189.
SMITH M. K., WOOD W. B., AND KNIGHT J. K. 2008. The genetics concept assessment: A new concept inventory for gauging student understanding of genetics. CBE Life Sciences Education, 7, 4, 422-430.
SPSS INC. 2006. SPSS text analysis for surveys™ 2.0 user’s guide. SPSS Inc, Chicago, IL.
THE CARNEGIE CLASSIFICATION OF INSTITUTIONS OF HIGHER EDUCATION. n.d. About Carnegie Classification. Retrieved (Dec 15, 2016) from http://carnegieclassifications.iu.edu/
VOSNIADOU S. 2008. Conceptual change research: An introduction. In Vosniadou S. (Ed.), International Handbook of Research on Conceptual Change (1st ed.). Routledge, New York/Abingdon. xiii-xxviii.
YANG Y., BUCKENDAHL C. W., JUSZKIEWICZ P. J., AND BHOLA D. S. 2002. A review of strategies for validating computer automated scoring. Applied Measurement in Education, 15, 4, 391-412.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Authors who publish with this journal agree to the following terms:
- The Author retains copyright in the Work, where the term “Work” shall include all digital objects that may result in subsequent electronic publication or distribution.
- Upon acceptance of the Work, the author shall grant to the Publisher the right of first publication of the Work.
- The Author shall grant to the Publisher and its agents the nonexclusive perpetual right and license to publish, archive, and make accessible the Work in whole or in part in all forms of media now or hereafter known under a Creative Commons 4.0 License (Attribution-Noncommercial-No Derivatives 4.0 International), or its equivalent, which, for the avoidance of doubt, allows others to copy, distribute, and transmit the Work under the following conditions:
- Attribution—other users must attribute the Work in the manner specified by the author as indicated on the journal Web site;
- Noncommercial—other users (including Publisher) may not use this Work for commercial purposes;
- No Derivative Works—other users (including Publisher) may not alter, transform, or build upon this Work, with the understanding that any of the above conditions can be waived with permission from the Author and that where the Work or any of its elements is in the public domain under applicable law, that status is in no way affected by the license.
- The Author is able to enter into separate, additional contractual arrangements for the nonexclusive distribution of the journal's published version of the Work (e.g., post it to an institutional repository or publish it in a book), as long as there is provided in the document an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post online a pre-publication manuscript (but not the Publisher’s final formatted PDF version of the Work) in institutional repositories or on their Websites prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (see The Effect of Open Access). Any such posting made before acceptance and publication of the Work shall be updated upon publication to include a reference to the Publisher-assigned DOI (Digital Object Identifier) and a link to the online abstract for the final published Work in the Journal.
- Upon Publisher’s request, the Author agrees to furnish promptly to Publisher, at the Author’s own expense, written evidence of the permissions, licenses, and consents for use of third-party material included within the Work, except as determined by Publisher to be covered by the principles of Fair Use.
- The Author represents and warrants that:
- the Work is the Author’s original work;
- the Author has not transferred, and will not transfer, exclusive rights in the Work to any third party;
- the Work is not pending review or under consideration by another publisher;
- the Work has not previously been published;
- the Work contains no misrepresentation or infringement of the Work or property of other authors or third parties; and
- the Work contains no libel, invasion of privacy, or other unlawful matter.
- The Author agrees to indemnify and hold Publisher harmless from Author’s breach of the representations and warranties contained in Paragraph 6 above, as well as any claim or proceeding relating to Publisher’s use and publication of any content contained in the Work, including third-party content.