References
AVDIU, D., BUI, V., AND KLIMČÍKOVÁ, K. P. 2019. Predicting learner knowledge of individual words using machine learning. In Proceedings of the 8th Workshop on NLP for Computer Assisted Language Learning, D. Alfter, E. Volodina, L. Borin, I. Pilan, and H. Lange, Eds. LiU Electronic Press, 1–9.
BEINBORN, L., ZESCH, T., AND GUREVYCH, I. 2014. Predicting the difficulty of language proficiency tests. Transactions of the Association for Computational Linguistics 2, 517–530.
BENEDETTO, L., ARADELLI, G., CREMONESI, P., CAPPELLI, A., GIUSSANI, A., AND TURRIN, R. 2021. On the application of transformers for estimating the difficulty of multiple-choice questions from text. In Proceedings of the 16th Workshop on Innovative Use of NLP for Building Educational Applications, J. Burstein, A. Horbach, E. Kochmar, R. Laarmann-Quante, C. Leacock, N. Madnani, I. Pilán, H. Yannakoudakis, and T. Zesch, Eds. Association for Computational Linguistics, 147–157.
CHEN, Y., LIU, Q., HUANG, Z., WU, L., CHEN, E., WU, R., SU, Y., AND HU, G. 2017. Tracking knowledge proficiency of students with educational priors. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. Association for Computing Machinery, 989–998.
CHENG, S., LIU, Q., CHEN, E., HUANG, Z., HUANG, Z., CHEN, Y., MA, H., AND HU, G. 2019. Dirt: Deep learning enhanced item response theory for cognitive diagnosis. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management. Association for Computing Machinery, 2397–2400.
CULLIGAN, B. 2015. A comparison of three test formats to assess word difficulty. Language Testing 32, 4, 503–520.
DE LA TORRE, J. 2009. Dina model and parameter estimation: A didactic. Journal of educational and behavioral statistics 34, 1, 115–130.
DE LA TORRE, J. AND DOUGLAS, J. A. 2004. Higher-order latent trait models for cognitive diagnosis. Psychometrika 69, 3, 333–353.
DESMARAIS, M. C. 2012. Mapping question items to skills with non-negative matrix factorization. ACM SIGKDD Explorations Newsletter 13, 2, 30–36.
EMBRETSON, S. E. AND REISE, S. P. 2013. Item response theory. Psychology Press.
FUSI, N., SHETH, R., AND ELIBOL, M. 2018. Probabilistic matrix factorization for automated machine learning. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, S. Bengio, H. M. Wallach, H. Larochelle, K. Grauman, and N. Cesa-Bianchi, Eds. Curran Associates Inc., 3352–3361.
GAO, W., LIU, Q., HUANG, Z., YIN, Y., BI, H., WANG, M.-C., MA, J., WANG, S., AND SU, Y. 2021. Rcd: Relation map driven cognitive diagnosis for intelligent education systems. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. Association for Computing Machinery, 501–510.
GRAVE, É., BOJANOWSKI, P., GUPTA, P., JOULIN, A., AND MIKOLOV, T. 2018. Learning word vectors for 157 languages. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), N. Calzolari, K. Choukri, C. Cieri, T. Declerck, S. Goggi, K. Hasida, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis, and T. Tokunaga, Eds. European Language Resources Association (ELRA), 3483–3487.
HUANG, Z., LIU, Q., CHEN, Y., WU, L., XIAO, K., CHEN, E., MA, H., AND HU, G. 2020. Learning or forgetting? a dynamic approach for tracking the knowledge proficiency of students. ACM Transactions on Information Systems (TOIS) 38, 2, 1–33.
KILICKAYA, F. ET AL. 2019. Assessing l2 vocabulary through multiple-choice, matching, gap-fill, and word formation items. Lublin Studies in Modern Languages and Literature 43, 3, 155–166.
KINGMA, D. P. AND BA, J. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
KREMMEL, B. AND SCHMITT, N. 2016. Interpreting vocabulary test scores: What do various item formats tell us about learners’ ability to employ words? Language Assessment Quarterly 13, 4, 377–392.
LEE, D. AND SEUNG, H. S. 2000. Algorithms for non-negative matrix factorization. In Proceedings of the 13th International Conference on Neural Information Processing Systems, T. K. Leen, T. G. Dietterich, and V. Tresp, Eds. MIT Press, 535–541.
LI, J., WANG, F., LIU, Q., ZHU, M., HUANG, W., HUANG, Z., CHEN, E., SU, Y., AND WANG, S. 2022. Hiercdf: A bayesian network-based hierarchical cognitive diagnosis framework. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, 904–913.
LIU, Q., HUANG, Z., YIN, Y., CHEN, E., XIONG, H., SU, Y., AND HU, G. 2019. Ekt: Exercise-aware knowledge tracing for student performance prediction. IEEE Transactions on Knowledge and Data Engineering 33, 1, 100–115.
LIU, Q., WU, R., CHEN, E., XU, G., SU, Y., CHEN, Z., AND HU, G. 2018. Fuzzy cognitive diagnosis for modelling examinee performance. ACM Transactions on Intelligent Systems and Technology (TIST) 9, 4, 1–26.
LORD, F. M. 1980. Applications of Item Response Theory to Practical Testing Problems. Routledge.
LOUKINA, A., YOON, S.-Y., SAKANO, J., WEI, Y., AND SHEEHAN, K. 2016. Textual complexity as a predictor of difficulty of listening items in language proficiency tests. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Y. Matsumoto and R. Prasad, Eds. The COLING 2016 Organizing Committee, 3245–3253.
MA, B., HETTIARACHCHI, G. P., AND ANDO, Y. 2022. Format-aware item response theory for predicting vocabulary proficiency. In Proceedings of the 15th International Conference on Educational Data Mining, A. Mitrovic and N. Bosch, Eds. International Educational Data Mining Society, 695–700.
MA, B., HETTIARACHCHI, G. P., FUKUI, S., AND ANDO, Y. 2023a. Each encounter counts: Modeling language learning and forgetting. In LAK23: 13th International Learning Analytics and Knowledge Conference. Association for Computing Machinery, 79–88.
MA, B., HETTIARACHCHI, G. P., FUKUI, S., AND ANDO, Y. 2023b. Exploring the effectiveness of vocabulary proficiency diagnosis using linguistic concept and skill modeling. In Proceedings of the 16th International Conference on Educational Data Mining, M. Feng, T. Käser, and P. Talukdar, Eds. International Educational Data Mining Society, 149–159.
MA, H., ZHU, J., YANG, S., LIU, Q., ZHANG, H., ZHANG, X., CAO, Y., AND ZHAO, X. 2022. A prerequisite attention model for knowledge proficiency diagnosis of students. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management. Association for Computing Machinery, 4304–4308.
MNIH, A. AND SALAKHUTDINOV, R. R. 2007. Probabilistic matrix factorization. In Advances in neural information processing systems, J. Platt, D. Koller, Y. Singer, and S. Roweis, Eds. Vol. 20. Curran Associates, Inc.
NATION, I. S. 2001. Learning vocabulary in another language. Vol. 10. Cambridge University Press, Cambridge.
RECKASE, M. D. 2009. Multidimensional item response theory models. In Multidimensional item response theory. Springer, 79–112.
ROBERTSON, F. 2021. Word discriminations for vocabulary inventory prediction. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), R. Mitkov and G. Angelova, Eds. INCOMA Ltd., 1188–1195.
SETTLES, B., LAFLAIR, G. T., AND HAGIWARA, M. 2020. Machine learning–driven language assessment. Transactions of the Association for Computational Linguistics 8, 247–263.
SONG, L., HE, M., SHANG, X., YANG, C., LIU, J., YU, M., AND LU, Y. 2023. A deep cross-modal neural cognitive diagnosis framework for modeling student performance. Expert Systems with Applications, 120675.
STÆHR, L. S. 2008. Vocabulary size and the skills of listening, reading and writing. Language Learning Journal 36, 2, 139–152.
SUN, Y., YE, S., INOUE, S., AND SUN, Y. 2014. Alternating recursive method for q-matrix learning. In Proceedings of the 7th International Conference on Educational Data Mining, J. Stamper, Z. Pardos, M. Mavrikis, and B. M. McLaren, Eds. International Educational Data Mining Society, 14–20.
SUSANTI, Y., NISHIKAWA, H., TOKUNAGA, T., OBARI, H., ET AL. 2016. Item difficulty analysis of english vocabulary questions. In Proceedings of the 8th International Conference on Computer Supported Education (CSEDU 2016), J. Uhomoibhi, G. Costagliola, S. Zvacek, and B. M. McLaren, Eds. Vol. 1. SCITEPRESS - Science and Technology Publications, Lda, 267–274.
THAI-NGHE, N. AND SCHMIDT-THIEME, L. 2015. Multi-relational factorization models for student modeling in intelligent tutoring systems. In 2015 Seventh international conference on knowledge and systems engineering (KSE). IEEE, 61–66.
TONG, S., LIU, J., HONG, Y., HUANG, Z., WU, L., LIU, Q., HUANG, W., CHEN, E., AND ZHANG, D. 2022. Incremental cognitive diagnosis for intelligent education. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, 1760–1770.
TONG, S., LIU, Q., YU, R., HUANG, W., HUANG, Z., PARDOS, Z. A., AND JIANG, W. 2021. Item response ranking for cognitive diagnosis. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI), Z.-H. Zhou, Ed. International Joint Conferences on Artificial Intelligence, 1750–1756.
TOSCHER, A. AND JAHRER, M. 2010. Collaborative filtering applied to educational data mining. KDD Cup 2010 Workshop, Held as part of 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2010).
TRAUB, R. E. 1993. On the equivalence of the traits assessed by multiple-choice and constructed-response tests. In Construction versus choice in cognitive measurement: Issues in constructed response, performance testing, and portfolio assessment, W. C. Ward and R. E. Bennett, Eds. Routledge, 29–44.
VAN DER LINDEN, W. J. AND HAMBLETON, R. 1997. Handbook of item response theory. Vol. 1. Taylor & Francis Group.
VAN DER MAATEN, L. AND HINTON, G. 2008. Visualizing data using t-sne. Journal of machine learning research 9, 11.
WANG, F., LIU, Q., CHEN, E., HUANG, Z., CHEN, Y., YIN, Y., HUANG, Z., AND WANG, S. 2020. Neural cognitive diagnosis for intelligent education systems. In Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 34. AAAI Press, 6153–6161.
WANG, F., LIU, Q., CHEN, E., HUANG, Z., YIN, Y., WANG, S., AND SU, Y. 2023. Neuralcd: A general framework for cognitive diagnosis. IEEE Transactions on Knowledge and Data Engineering 35, 8, 8312–8327.
WANG, X., HUANG, C., CAI, J., AND CHEN, L. 2021. Using knowledge concept aggregation towards accurate cognitive diagnosis. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management. Association for Computing Machinery, 2010–2019.
WANG, Z., GU, Y., LAN, A., AND BARANIUK, R. 2020. Varfa: A variational factor analysis framework for efficient bayesian learning analytics. In Proceedings of the 13th International Conference on Educational Data Mining, A. N. Rafferty, J. Whitehill, V. Cavalli-Sforza, and C. Romero, Eds. International Educational Data Mining Society, 696–699.
YAO, L. AND SCHWARZ, R. D. 2006. A multidimensional partial credit model with associated item and test statistics: An application to mixed-format tests. Applied psychological measurement 30, 6, 469–492.
ZYLICH, B. AND LAN, A. 2021. Linguistic skill modeling for second language acquisition. In LAK21: 11th International Learning Analytics and Knowledge Conference. Association for Computing Machinery, 141–150.