3,718 research outputs found

    Automatic assessment of text-based responses in post-secondary education: A systematic review

    Full text link
    Text-based open-ended questions in academic formative and summative assessments help students become deep learners and prepare them to understand concepts for subsequent conceptual assessment. However, grading text-based questions, especially in large courses, is tedious and time-consuming for instructors. Text-processing models continue to progress with the rapid development of Artificial Intelligence (AI) tools and Natural Language Processing (NLP) algorithms. Especially after breakthroughs in Large Language Models (LLMs), there is immense potential to automate rapid assessment of, and feedback on, text-based responses in education. This systematic review adopts a scientific and reproducible literature search strategy based on the PRISMA process, using explicit inclusion and exclusion criteria to study text-based automatic assessment systems in post-secondary education, screening 838 papers and synthesizing 93 studies. To understand how text-based automatic assessment systems have been developed and applied in education in recent years, three research questions are considered. All included studies are summarized and categorized according to a proposed comprehensive framework covering the input and output of each system, the research motivation, and the research outcomes, in order to answer the research questions. Additionally, the typical studies of automated assessment systems, the research methods used, and the application domains in these studies are investigated and summarized. This systematic review provides an overview of recent educational applications of text-based assessment systems, offering insight into the latest AI/NLP developments assisting text-based assessment in higher education. The findings will particularly benefit researchers and educators incorporating LLMs such as ChatGPT into their educational activities. Comment: 27 pages, 4 figures, 6 tables

    Automated Japanese essay scoring system: jess

    Full text link
    We have developed an automated Japanese essay scoring system named jess. The system evaluates an essay on three features: (1) Rhetoric: ease of reading, diversity of vocabulary, percentage of big words (long, difficult words), and percentage of passive sentences; (2) Organization: characteristics associated with the orderly presentation of ideas, such as rhetorical features and linguistic cues; (3) Contents: vocabulary related to the topic, such as relevant information and precise or specialized vocabulary. The final evaluated score is calculated by deducting from a perfect score assigned by a learning process using editorial…
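    The deduction-based scoring rule the abstract sketches lends itself to a small illustration. The snippet below is a minimal sketch in the spirit of jess: start from a perfect score and subtract weighted penalties per feature group. The feature names and weights are illustrative assumptions, not jess's actual features or coefficients.

```python
# Minimal sketch of a deduction-style essay score in the spirit of jess:
# start from a perfect score and subtract a weighted penalty per feature.
# Feature names and weights are illustrative assumptions, not jess's values.

def score_essay(features: dict, perfect: float = 10.0) -> float:
    """features maps feature names (0..1, higher = worse) to observed values."""
    weights = {
        "hard_to_read": 2.0,         # rhetoric: ease of reading
        "low_vocab_diversity": 1.5,  # rhetoric: diversity of vocabulary
        "passive_ratio": 1.0,        # rhetoric: percentage of passive sentences
        "poor_ordering": 3.0,        # organization: orderly presentation of ideas
        "off_topic_vocab": 2.5,      # contents: vocabulary related to the topic
    }
    penalty = sum(weights[k] * features.get(k, 0.0) for k in weights)
    return max(perfect - penalty, 0.0)

print(score_essay({"passive_ratio": 0.4, "off_topic_vocab": 0.2}))  # -> 9.1
```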

    Development of online system checkable for Japanese writing tasks

    Get PDF
    Online learning environments have attracted the attention of many educators in recent years, particularly while COVID-19 remains an ongoing situation, and a growing variety of resources has become available online. In this study, available online resources were used to build a system that checks writing ability and depth of understanding in Japanese writing tasks. The system was also designed to provide evaluation scores that do not depend on the number of characters written. After the integration and implementation of several modules customized from online resources, the system was demonstrated; its data sheet ultimately stored the written content of 67 students. The writing task asked each student to write a summary of what they understood in a class. The analytical findings from the system demonstrated the effectiveness of the available online resources for checking writing ability and depth of understanding in Japanese writing tasks, and confirmed that the evaluation scores it provides do not depend on the number of characters
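    As a rough illustration of an evaluation score that does not depend on the number of characters, the sketch below compares content-word overlap between a student's summary and a reference using precision/recall ratios rather than raw counts. This is an assumed reconstruction, not the authors' system; real Japanese text would also require a morphological analyzer (e.g., MeCab) rather than whitespace tokenization.

```python
# Illustrative sketch (not the authors' code) of a score that does not grow
# with text length: compare the set of words in a student's summary against
# a reference using overlap ratios, so padding the summary is not rewarded.
# Whitespace tokenization is a simplification; Japanese would need MeCab etc.

def length_independent_score(summary: str, reference: str) -> float:
    summary_words = set(summary.lower().split())
    reference_words = set(reference.lower().split())
    if not summary_words or not reference_words:
        return 0.0
    overlap = summary_words & reference_words
    if not overlap:
        return 0.0
    recall = len(overlap) / len(reference_words)   # coverage of key content
    precision = len(overlap) / len(summary_words)  # padding is not rewarded
    return 2 * precision * recall / (precision + recall)  # harmonic mean (F1)

print(length_independent_score("the model checks writing quality",
                               "the system checks student writing quality"))
```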

    DeepEval: An Integrated Framework for the Evaluation of Student Responses in Dialogue Based Intelligent Tutoring Systems

    Get PDF
    The automatic assessment of student answers is one of the critical components of an Intelligent Tutoring System (ITS), because accurate assessment of student input is needed in order to provide effective feedback that leads to learning. This is a very challenging task, however, because it requires natural language understanding capabilities: the process involves various components such as concept identification, co-reference resolution, and ellipsis handling. As part of this thesis, we thoroughly analyzed a set of student responses obtained from an experiment with the intelligent tutoring system DeepTutor, in which college students interacted with the tutor to solve conceptual physics problems, designed an automatic answer assessment framework (DeepEval), and evaluated the framework after implementing several important components. To evaluate our system, we annotated 618 responses from 41 students for correctness. Our system performs better than the typical similarity-calculation method. We also discuss various issues in automatic answer evaluation
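    The "typical similarity calculation method" used as a baseline can be illustrated with a bag-of-words cosine similarity between the student response and an expected answer, thresholded for correctness. This is a hedged reconstruction of a common baseline, not DeepEval itself; the threshold value is an assumption.

```python
# Sketch of a word-overlap similarity baseline (an assumed reconstruction,
# not DeepEval): vectorize the student response and the expected answer,
# then call the response correct if cosine similarity exceeds a threshold.

import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def is_correct(student: str, expected: str, threshold: float = 0.5) -> bool:
    return cosine(Counter(student.lower().split()),
                  Counter(expected.lower().split())) >= threshold

print(is_correct("the net force on the ball is gravity",
                 "gravity is the only force acting on the ball"))  # -> True
```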

    Intelligent CALL

    Get PDF
    This chapter describes the provision of corrective feedback in Tutorial CALL, sketching the challenges in the research and development of computational parsers and grammars. The automatic evaluation and assessment of free-form learner texts, paying attention to linguistic accuracy, rhetorical structures, textual complexity, and written fluency, is the centre of attention in the section on Automatic Writing Evaluation. The section on Reading and Incidental Vocabulary Learning Aids looks at the advantages of lexical glosses, or look-up information in electronic dictionaries, for reading material aimed at language learners. The conclusion considers the role of ICALL in the context of general trends in CALL

    A robust methodology for automated essay grading

    Get PDF
    None of the available automated essay grading systems can be used to grade essays according to the National Assessment Program – Literacy and Numeracy (NAPLAN) analytic scoring rubric used in Australia. This thesis is a humble effort to address this limitation. Its objective is to develop a robust methodology for automatically grading essays against the NAPLAN rubric, using heuristics and rules based on the English language together with neural network modelling
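    The two-stage idea the abstract describes, hand-crafted heuristic features feeding a neural network that predicts a rubric score, can be sketched as follows. The features, toy training data, and model configuration are illustrative assumptions, not the thesis's actual methodology.

```python
# Hedged sketch of rule-based features feeding a small neural network that
# predicts a rubric score. Features, data, and model are assumptions made
# for illustration, not the thesis's actual NAPLAN methodology.

from sklearn.neural_network import MLPRegressor

def heuristic_features(essay: str) -> list[float]:
    sentences = [s for s in essay.split(".") if s.strip()]
    words = essay.split()
    return [
        len(words),                                               # length proxy
        len(set(w.lower() for w in words)) / max(len(words), 1),  # vocab diversity
        sum(len(w) for w in words) / max(len(words), 1),          # avg word length
        len(words) / max(len(sentences), 1),                      # avg sentence length
    ]

# Toy (essay, human rubric score) pairs stand in for a real training corpus.
train = [("Short text.", 1.0),
         ("A longer essay with several varied sentences. It develops an idea.", 3.0)]
X = [heuristic_features(e) for e, _ in train]
y = [s for _, s in train]
model = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0).fit(X, y)
print(model.predict([heuristic_features("Another short piece.")]))
```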

    Analyzing Grammarly software for corrective feedback: Teacher’s perspective on affordances, limitations and implementation

    Get PDF
    Providing support and feedback in the development of ESL writing skills is imperative for engineering students. The goal of the current study is to assess the potential of Grammarly software for editing the writing of ESP students, taking into account current technological advancements in computer-mediated corrective feedback and the propensity of engineering students to use digital tools. The study examined 35 short essays submitted by first-year students at the University of Novi Sad's Faculty of Technical Sciences, selected at random from a pool of online essays written during the academic year 2021/2022. To compare Grammarly-provided suggestions with the teacher's corrections, the selected essays were corrected by both the teacher and the Grammarly software. To determine the affordances and limitations of using this digital tool for corrective feedback, the authors examined the differences between Grammarly-suggested corrections and teacher-made corrections, classifying them into five groups. According to the results, this tool can benefit ESP classes to some extent, but teacher feedback still plays an important role

    A review of research on student self-assessment in second/foreign language writing

    Get PDF
    The present article reviews the research on writing Self-Assessment (SA) conducted in the period 2000–2020. The article discusses the theoretical foundation for SA, following a review of how various researchers have conceptualized it. We were particularly interested in (i) examining whether the concept of SA has expanded over the two decades in English as a foreign/second language (EFL/ESL) writing and (ii) determining the components found to be interconnected with the concept of SA in the writing context. The findings related to the first objective indicate that SA has expanded in its conceptualization; however, its definition and application are expected to broaden further. In analyzing the studies for the second objective, the following themes emerged: SA and training students, SA and the dialogue between students and teachers, SA and teacher training, SA and affective variables, SA and cultural components, SA and age, SA and instrumentation, SA and exemplars, SA and teacher feedback, SA and prior experience, SA and conducive environments, and SA and contextualizing SA items. The review shows that these components play an important role in the concept of SA in the EFL/ESL writing context; however, studies in this regard are scarce. Another group of studies that emerged examined perceptions towards SA. We conclude with a critical reflection on the reviewed literature and recommend new directions for further studies

    Automated Scoring of Speaking and Writing: Starting to Hit its Stride

    Get PDF
    This article reviews recent literature (2011–present) on the automated scoring (AS) of writing and speaking. Its purpose is to first survey the current research on automated scoring of language, then highlight how automated scoring impacts the present and future of assessment, teaching, and learning. The article begins by outlining the general background of AS issues in language assessment and testing. It then positions AS research with respect to technological advancements. Section two details the literature review search process and criteria for article inclusion. In Section three, the three main themes emerging from the review are presented: automated scoring design considerations, the role of humans and artificial intelligence, and the accuracy of automated scoring with different groups. Two tables show how specific articles contributed to each of the themes. Following this, each of the three themes is presented in further detail, with a sequential focus on writing, speaking, and a short summary. Section four addresses AS implementation with respect to current assessment, teaching, and learning issues. Section five considers future research possibilities related to both the research and current uses of AS, with implications for the Canadian context in terms of the next steps for automated scoring

    Automated Writing Evaluation for non-native speaker English academic writing: The case of IADE and its formative feedback

    Get PDF
    This dissertation presents an innovative approach to the development and empirical evaluation of Automated Writing Evaluation (AWE) technology used for teaching and learning. It introduces IADE (Intelligent Academic Discourse Evaluator), a new web-based AWE program that analyzes research article Introduction sections and generates immediate, individualized, discipline-specific feedback. The major purpose of the dissertation was to implement IADE as a formative assessment tool complementing L2 graduate-level academic writing instruction and to investigate the effectiveness and appropriateness of its automated evaluation and feedback. To achieve this goal, the study sought evidence of IADE's Language Learning Potential, Meaning Focus, Learner Fit, and Impact qualities, as outlined in Chapelle's (2001) CALL evaluation conceptual framework. A mixed-methods approach with a concurrent transformative strategy was employed. Quantitative data consisted of Likert-scale, yes/no, and open-ended survey responses; automated and human scores for first and last drafts; pre-/post-test scores; and frequency counts for draft submission and for access to IADE's Help Options. Qualitative data contained students' first and last drafts as well as transcripts of think-aloud protocols and Camtasia computer screen recordings, observations, and semi-structured interviews. The findings indicate that IADE can be considered an effective formative assessment tool suitable for implementation in the targeted instructional context. Its effectiveness was a result of the combined strengths of its Language Learning Potential, Meaning Focus, Learner Fit, and Impact qualities, all of which were enhanced by the program's automated feedback. The strength of Language Learning Potential was supported by evidence of noticing of and focus on discourse form, improved rhetorical quality of writing, increased learning gains, and the relative helpfulness of practice and modified interaction. Learners' focus on the functional meaning of discourse, and their construction of such meaning, served as evidence of strong Meaning Focus. IADE's automated feedback characteristics and Help Options were appropriate for the targeted learners, which indicates adequate Learner Fit. Finally, despite some negative effects caused by IADE's numerical feedback, overall Impact, exerted at affective, intrinsic, pragmatic, and cognitive levels, was found to be positive due to the color-coded type of feedback. The results of this study provide valuable empirical knowledge to the areas of L2 academic writing, AWE, formative assessment, and I/CALL. They have important practical and theoretical implications and are informative for future research as well as for the design and application of new learning technologies