
    Automated Essay Evaluation Using Natural Language Processing and Machine Learning

    The goal of automated essay evaluation is to assign grades to essays and provide feedback using computers. Automated evaluation is increasingly being used in classrooms and online exams. The aim of this project is to develop machine learning models for performing automated essay scoring and to evaluate their performance. In this research, a publicly available essay dataset was used to train and test the adopted techniques. Natural language processing techniques were used to extract features from the essays in the dataset. Three existing machine learning algorithms were applied to the chosen dataset, with the data divided into two parts: training data and testing data. The inter-rater reliability and performance of these models were compared with each other and with human graders. Among the three machine learning models, the random forest performed best in terms of agreement with human scorers, achieving the lowest mean absolute error on the test dataset.
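    As a concrete illustration of the pipeline described above, the following is a minimal sketch assuming scikit-learn and pandas, with TF-IDF standing in for the extracted NLP features and an illustrative "essays.csv" file with "essay" and "score" columns; the feature set and file layout are assumptions, not details from the paper.

        # Minimal AES sketch: NLP features + random forest, scored by MAE.
        import pandas as pd
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics import mean_absolute_error
        from sklearn.model_selection import train_test_split

        data = pd.read_csv("essays.csv")  # hypothetical dataset file
        X = TfidfVectorizer(max_features=2000).fit_transform(data["essay"])
        y = data["score"]

        # Divide the data into training and testing parts, as in the abstract.
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.2, random_state=0
        )

        model = RandomForestRegressor(n_estimators=200, random_state=0)
        model.fit(X_train, y_train)

        # Agreement with human scorers, measured as mean absolute error.
        print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))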

    Evaluating Centering for Information Ordering Using Corpora

    In this article we discuss several metrics of coherence defined using centering theory and investigate the usefulness of such metrics for information ordering in automatic text generation. We estimate empirically which metric is the most promising and how useful it is, using a general methodology applied to several corpora. Our main result is that the simplest metric (which relies exclusively on NOCB transitions) sets a robust baseline that cannot be outperformed by other metrics that make use of additional centering-based features. This baseline can be used for the development of both text-to-text and concept-to-text generation systems.
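    Since the abstract's key notion is the NOCB transition, a rough illustration may help: assuming each sentence has already been reduced to the set of its discourse entities (entity extraction is omitted here), an NOCB transition occurs between adjacent sentences that share no entity, and the baseline metric simply prefers orderings with fewer such transitions. This sketch is an interpretation, not the authors' code.

        def nocb_count(sentences):
            """Count NOCB transitions: adjacent sentences sharing no entity."""
            return sum(
                1 for prev, curr in zip(sentences, sentences[1:])
                if not prev & curr
            )

        # Among candidate orderings, prefer the one with fewest NOCB transitions.
        ordering = [{"centering", "metric"}, {"metric", "corpus"}, {"baseline"}]
        print(nocb_count(ordering))  # 1: only the last pair shares no entity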

    Prompt- and Trait Relation-aware Cross-prompt Essay Trait Scoring

    Automated essay scoring (AES) aims to score essays written for a given prompt, which defines the writing topic. Most existing AES systems assume that the essays to be graded come from the same prompt used in training, and they assign only a holistic score. However, such settings conflict with real educational situations: pre-graded essays for a particular prompt are often lacking, and detailed trait scores for sub-rubrics are required. Thus, predicting various trait scores for unseen-prompt essays (called cross-prompt essay trait scoring) remains a challenge for AES. In this paper, we propose a robust model: a prompt- and trait relation-aware cross-prompt essay trait scorer. We encode a prompt-aware essay representation through essay-prompt attention and by utilizing a topic-coherence feature extracted by a topic-modeling mechanism without access to labeled data; therefore, our model considers the prompt adherence of an essay even in a cross-prompt setting. To facilitate multi-trait scoring, we design a trait-similarity loss that encapsulates the correlations among traits. Experiments prove the efficacy of our model, showing state-of-the-art results for all prompts and traits. Significant improvements on low-resource prompts and inferior traits further indicate our model's strength. Comment: Accepted at ACL 2023 (Findings, long paper).
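    The trait-similarity loss is only named in the abstract, so the following is one plausible reading, sketched in PyTorch: pull the model's per-trait representations toward each other for pairs of traits whose human scores correlate strongly. The correlation threshold and weighting here are illustrative assumptions, not the paper's exact formulation.

        import torch
        import torch.nn.functional as F

        def trait_similarity_loss(trait_reprs, trait_corr, threshold=0.7):
            """trait_reprs: (num_traits, dim) per-trait essay representations.
            trait_corr: (num_traits, num_traits) human trait-score correlations."""
            # Pairwise cosine similarity between trait representations.
            sim = F.cosine_similarity(
                trait_reprs.unsqueeze(1), trait_reprs.unsqueeze(0), dim=-1
            )
            mask = (trait_corr > threshold).float()  # strongly correlated pairs only
            mask.fill_diagonal_(0.0)                 # ignore self-pairs
            # Penalize dissimilar representations for correlated traits.
            return ((1.0 - sim) * mask).sum() / mask.sum().clamp(min=1.0)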

    Automated writing evaluation tools for Indonesian undergraduate English as a foreign language students' writing

    Nowadays, many computer programs are used in the teaching of writing in the context of English as a foreign language (EFL). One of the functions of these computer programs is to provide feedback on EFL students' writing so that its quality can be improved. This study aimed to investigate whether the use of free automated writing evaluation (AWE) tools affects undergraduate EFL students' writing skills. In this experimental study, 35 Indonesian undergraduate students of an English education department were asked to use two AWE tools, Grammarly and Grammark, in a writing course over four months. Data for this study were collected using tests and a questionnaire. A pre-test, middle test, and post-test were administered to examine the students' writing skill improvement. The findings indicate that the sequenced use of the two AWE tools, Grammarly followed by Grammark, had a beneficial effect on students' writing skill improvement. This study confirms the benefits of free AWE tools in enhancing EFL students' writing skills.

    Machine Scoring of Student Essays: Truth and Consequences

    The current trend toward machine scoring of student work, Ericsson and Haswell argue, has created an emerging issue with implications for higher education across the disciplines, but with particular importance for those in English departments and in administration. The academic community has been silent on the issue (some would say excluded from it) while the commercial entities who develop essay-scoring software have been very active. Machine Scoring of Student Essays is the first volume to seriously consider the educational mechanisms and consequences of this trend, and it offers important discussions from some of the leading scholars in writing assessment.

    Entropy and Graph Based Modelling of Document Coherence using Discourse Entities: An Application

    We present two novel models of document coherence and their application to information retrieval (IR). Both models approximate document coherence using discourse entities, e.g. the subject or object of a sentence. Our first model views text as a Markov process generating sequences of discourse entities (entity n-grams); we use the entropy of these entity n-grams to approximate the rate at which new information appears in text, reasoning that as more new words appear, the topic increasingly drifts and text coherence decreases. Our second model extends the work of Guinaudeau & Strube [28], which represents text as a graph of discourse entities linked by different relations, such as their distance or adjacency in text. We use several graph topology metrics to approximate different aspects of the discourse flow that can indicate coherence, such as the average clustering or betweenness of discourse entities in text. Experiments with several instantiations of these models show that: (i) our models perform on a par with two other well-known models of text coherence even without any parameter tuning, and (ii) reranking retrieval results according to their coherence scores gives notable performance gains, confirming a relation between document coherence and relevance. This work contributes two novel models of document coherence, the application of which to IR complements recent work on integrating document cohesiveness or comprehensibility into ranking [5, 56].
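    To make the graph-based model concrete, here is a small sketch assuming networkx, with sentences as nodes and edges linking sentences that share a discourse entity (the cited work builds a bipartite entity graph with distance- and adjacency-based relations, so this projection is a simplification). Entity extraction is again assumed given.

        import networkx as nx

        # Each sentence reduced to its set of discourse entities (assumed given).
        sentences = [{"model", "coherence"}, {"coherence", "entity"}, {"entity", "graph"}]

        G = nx.Graph()
        G.add_nodes_from(range(len(sentences)))
        for i, a in enumerate(sentences):
            for j in range(i + 1, len(sentences)):
                shared = a & sentences[j]
                if shared:
                    # Weight edges by shared-entity count (an illustrative choice).
                    G.add_edge(i, j, weight=len(shared))

        # Graph topology metrics used as coherence proxies in the abstract.
        print("average clustering:", nx.average_clustering(G))
        print("betweenness:", nx.betweenness_centrality(G))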