Automated Essay Evaluation Using Natural Language Processing and Machine Learning
The goal of automated essay evaluation is to assign grades to essays and provide feedback using computers. Automated evaluation is increasingly being used in classrooms and online exams. The aim of this project is to develop machine learning models for automated essay scoring and to evaluate their performance. A publicly available essay dataset was used to train and test the adopted techniques. Natural language processing techniques were used to extract features from the essays, and three existing machine learning algorithms were applied to the chosen dataset. The data was divided into two parts: training data and testing data. The inter-rater reliability and performance of these models were compared with each other and with human graders. Among the three models, the random forest performed best in terms of agreement with human scorers, achieving the lowest mean absolute error on the test dataset.
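The evaluation setup this abstract describes (extracted features, a train/test split, several regressors compared by mean absolute error against human scores) can be sketched roughly as follows. The feature matrix and scores below are synthetic stand-ins, and since the abstract names only the random forest, the other two models here (linear regression and support-vector regression) are assumptions for illustration.

```python
# Hedged sketch: comparing regressors on essay features by mean absolute error.
# Real features would come from NLP preprocessing of the essays; here they are
# random stand-ins so the comparison pipeline itself can run end to end.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                                  # stand-in essay feature vectors
y = X @ rng.normal(size=8) + rng.normal(scale=0.5, size=200)   # stand-in human scores

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "linear": LinearRegression(),
    "svr": SVR(),
    "random_forest": RandomForestRegressor(random_state=0),
}
maes = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    maes[name] = mean_absolute_error(y_te, model.predict(X_te))

best = min(maes, key=maes.get)
print(best, round(maes[best], 3))
```

On real essay data the relative ranking of the models would depend on the extracted features; the abstract reports that the random forest achieved the lowest test-set MAE.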
Evaluating Centering for Information Ordering Using Corpora
In this article we discuss several metrics of coherence defined using centering theory and investigate the usefulness of such metrics for information ordering in automatic text generation. Using a general methodology applied to several corpora, we estimate empirically which metric is the most promising and how useful it is. Our main result is that the simplest metric (which relies exclusively on NOCB transitions) sets a robust baseline that cannot be outperformed by other metrics which make use of additional centering-based features. This baseline can be used for the development of both text-to-text and concept-to-text generation systems.
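The NOCB baseline the abstract highlights can be sketched very simply: an ordering is penalized once for every pair of adjacent sentences that share no discourse entities (a NOCB transition), and orderings with fewer such transitions are preferred. This is a minimal sketch assuming entity extraction has already been done upstream; each sentence is represented only as a set of entity strings.

```python
# Minimal sketch of the NOCB-based coherence metric: count adjacent sentence
# pairs with no discourse entity in common. Lower counts suggest a more
# coherent information ordering.
def nocb_count(sentences):
    """Count NOCB transitions in an ordered list of entity sets."""
    return sum(
        1
        for prev, curr in zip(sentences, sentences[1:])
        if not (prev & curr)
    )

# Hypothetical orderings of the same three sentences:
ordering_a = [{"court", "case"}, {"case", "judge"}, {"judge"}]
ordering_b = [{"court", "case"}, {"judge"}, {"case"}]
print(nocb_count(ordering_a))  # 0: every adjacent pair shares an entity
print(nocb_count(ordering_b))  # 2: no adjacent pair shares an entity
```

A generation system could use such a metric to rank candidate orderings, preferring the one with the fewest NOCB transitions.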
Prompt- and Trait Relation-aware Cross-prompt Essay Trait Scoring
Automated essay scoring (AES) aims to score essays written for a given
prompt, which defines the writing topic. Most existing AES systems assume
that the essays to be graded come from the same prompt used in training, and
they assign only a holistic score. Such settings, however, conflict with real
educational situations:
pre-graded essays for a particular prompt are lacking, and detailed trait
scores of sub-rubrics are required. Thus, predicting various trait scores of
unseen-prompt essays (called cross-prompt essay trait scoring) is a remaining
challenge of AES. In this paper, we propose a robust model: prompt- and trait
relation-aware cross-prompt essay trait scorer. We encode a prompt-aware essay
representation via essay-prompt attention and a topic-coherence feature
extracted by a topic-modeling mechanism without access to labeled data;
therefore, our model considers the prompt adherence of an essay, even in
a cross-prompt setting. To facilitate multi-trait scoring, we design
a trait-similarity loss that encapsulates the correlations among traits. Experiments
prove the efficacy of our model, showing state-of-the-art results for all
prompts and traits. Significant improvements on low-resource prompts and
weakly scored traits further indicate our model's strength.

Comment: Accepted at ACL 2023 (Findings, long paper).
An overview of three approaches to scoring written essays by computer
Automated writing evaluation tools for Indonesian undergraduate English as a foreign language students' writing
Nowadays, many computer programs are used in the teaching of writing in the context of English as a foreign language (EFL). One function of these programs is to provide feedback on EFL students' writing so that its quality can be improved. This study investigated whether the use of free automated writing evaluation (AWE) tools affects undergraduate EFL students' writing skills. In this experimental study, 35 Indonesian undergraduate students in an English education department were asked to use two AWE tools, Grammarly and Grammark, in a writing course over four months. Data were collected using tests and a questionnaire. A pre-test, middle test, and post-test were administered to examine the students' improvement in writing skill. The findings indicate that the sequenced use of the two AWE tools, Grammarly followed by Grammark, had a beneficial effect on the students' writing skill. This study confirms the benefits of free AWE tools in enhancing EFL students' writing skills.
Machine Scoring of Student Essays: Truth and Consequences
The current trend toward machine scoring of student work, Ericsson and Haswell argue, has created an emerging issue with implications for higher education across the disciplines, but with particular importance for those in English departments and in administration. The academic community has been silent on the issue (some would say excluded from it) while the commercial entities who develop essay-scoring software have been very active. Machine Scoring of Student Essays is the first volume to seriously consider the educational mechanisms and consequences of this trend, and it offers important discussions from some of the leading scholars in writing assessment.
Entropy and Graph Based Modelling of Document Coherence using Discourse Entities: An Application
We present two novel models of document coherence and their application to
information retrieval (IR). Both models approximate document coherence using
discourse entities, e.g. the subject or object of a sentence. Our first model
views text as a Markov process generating sequences of discourse entities
(entity n-grams); we use the entropy of these entity n-grams to approximate the
rate at which new information appears in text, reasoning that as more new words
appear, the topic increasingly drifts and text coherence decreases. Our second
model extends the work of Guinaudeau & Strube [28] that represents text as a
graph of discourse entities, linked by different relations, such as their
distance or adjacency in text. We use several graph topology metrics to
approximate different aspects of the discourse flow that can indicate
coherence, such as the average clustering or betweenness of discourse entities
in text. Experiments with several instantiations of these models show that: (i)
our models perform on a par with two other well-known models of text coherence
even without any parameter tuning, and (ii) reranking retrieval results
according to their coherence scores gives notable performance gains, confirming
a relation between document coherence and relevance. This work contributes two
novel models of document coherence, the application of which to IR complements
recent work in the integration of document cohesiveness or comprehensibility to
ranking [5, 56]
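The first model in this abstract treats text as a Markov process over discourse entities and uses the entropy of entity n-grams as a proxy for how fast new information appears (higher entropy suggesting more topic drift and lower coherence). A rough sketch of that idea, assuming entity extraction has already been done and using bigrams over a flat entity sequence:

```python
# Rough sketch: Shannon entropy of the entity-bigram distribution as a
# coherence proxy. A repetitive entity sequence yields low entropy; a
# drifting one, where new entities keep appearing, yields higher entropy.
import math
from collections import Counter

def bigram_entropy(entities):
    """Shannon entropy (in bits) of the entity-bigram distribution."""
    bigrams = list(zip(entities, entities[1:]))
    counts = Counter(bigrams)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical entity sequences extracted from two documents:
focused = ["court", "case", "court", "case", "court", "case", "court"]
drifting = ["court", "case", "judge", "jury", "verdict", "appeal", "press"]
print(bigram_entropy(focused) < bigram_entropy(drifting))  # True
```

The second model's graph topology metrics (average clustering, betweenness of discourse entities) could be computed analogously on an entity graph, but are omitted here for brevity.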