
    Automatic text scoring using neural networks

    Automated Text Scoring (ATS) provides a cost-effective and consistent alternative to human marking. However, in order to achieve good performance, the predictive features of the system need to be manually engineered by human experts. We introduce a model that forms word representations by learning the extent to which specific words contribute to the text’s score. Using Long Short-Term Memory networks to represent the meaning of texts, we demonstrate that a fully automated framework is able to achieve excellent results compared with similar approaches. In an attempt to make our results more interpretable, and inspired by recent advances in visualizing neural networks, we introduce a novel method for identifying the regions of the text that the model has found most discriminative. This is the accepted manuscript; it is currently embargoed pending publication.
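    A minimal sketch of the kind of LSTM-based scorer the abstract describes: token embeddings are fed through an LSTM and the final hidden state is mapped to a single score. The vocabulary size, dimensions, and regression head below are illustrative assumptions, not the authors' configuration.

```python
# Sketch only: an LSTM that maps a token-ID sequence to a single essay score.
import torch
import torch.nn as nn

class LSTMScorer(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 1)   # regression head: one score per text

    def forward(self, token_ids):
        embedded = self.embed(token_ids)          # (batch, seq_len, embed_dim)
        _, (hidden, _) = self.lstm(embedded)      # final hidden state summarises the text
        return self.out(hidden[-1]).squeeze(-1)   # (batch,) predicted scores

# Usage: train with mean-squared-error loss against human-assigned scores.
model = LSTMScorer()
scores = model(torch.randint(0, 10000, (4, 50)))  # 4 dummy essays of 50 tokens each
```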

    Exploring Automated Essay Scoring for Nonnative English Speakers

    Automated Essay Scoring (AES) has become quite popular and is widely used. However, the lack of an appropriate methodology for rating nonnative English speakers' essays has meant lopsided advancement in this field. In this paper, we report initial results of our experiments with nonnative AES that learns from manual evaluation of nonnative essays. For this purpose, we conducted an exercise in which essays written by nonnative English speakers in a test environment were rated both manually and by the automated system designed for the experiment. In the process, we experimented with a few features to learn about nuances linked to nonnative evaluation. The proposed methodology of automated essay evaluation yielded a correlation coefficient of 0.750 with the manual evaluation. Comment: Accepted for publication at EUROPHRAS 201
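    The reported figure is a correlation between automated and manual scores; a minimal sketch of how such a Pearson correlation can be computed is shown below, using placeholder score lists rather than the paper's data.

```python
# Sketch only: Pearson correlation between manual and automated essay scores.
import numpy as np

manual_scores    = np.array([6.0, 4.5, 7.0, 5.5, 8.0])   # human ratings (illustrative)
automated_scores = np.array([5.8, 4.9, 6.5, 5.0, 7.6])   # system predictions (illustrative)

r = np.corrcoef(manual_scores, automated_scores)[0, 1]
print(f"Pearson correlation: {r:.3f}")
```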

    Experiments with Universal CEFR Classification

    The Common European Framework of Reference (CEFR) guidelines describe learners' language proficiency on a scale of six levels. While the CEFR guidelines are generic across languages, the development of automated proficiency classification systems for different languages follows different approaches. In this paper, we explore universal CEFR classification using domain-specific and domain-agnostic, theory-guided as well as data-driven features. We report the results of our preliminary experiments in monolingual, cross-lingual, and multilingual classification with three languages: German, Czech, and Italian. Our results show that monolingual and multilingual models achieve similar performance, and that cross-lingual classification yields lower but comparable results to monolingual classification. Comment: to appear in the proceedings of The 13th Workshop on Innovative Use of NLP for Building Educational Applications
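    As an illustration of what a language-agnostic setup can look like, the sketch below trains a CEFR-level classifier on character n-gram features with logistic regression. The feature choice, model, and example texts are assumptions for illustration, not the paper's exact pipeline.

```python
# Sketch only: a simple language-agnostic CEFR-level classifier.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts  = ["Ich wohne in Berlin und arbeite als Lehrer.", "Il mio gatto dorme sempre."]
levels = ["B1", "A2"]   # illustrative labels; real data would cover A1-C2

clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 3)),  # domain-agnostic features
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, levels)
print(clf.predict(["Questo è un esempio."]))  # predicted CEFR level
```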

    Manhattan Distance and Dice Similarity Evaluation on Indonesian Essay Examination System

    Each learning process requires an evaluation tool to measure students' level of understanding. Evaluations can take the form of multiple-choice questions, short answers, or essays, and some studies indicate that essay exams assess understanding better than the other types. Automatic essay assessment is needed to save teachers time in correcting answers, but the development of essay assessment methods is still ongoing, with the aim of achieving better accuracy than the methods currently used. Based on these problems, this study presents a comparative analysis of similarity methods for scoring online essay exams. The methods compared are Dice similarity and Manhattan distance. Both produce coefficient values that are then scaled and compared against manual assessments on the same scale. The dataset comprised 2,162 answers, obtained from 50 students who each answered 40 questions (politics, sports, lifestyle, and technology); the data can be accessed at www.indonesian-ir.org to support other research. The results show that the Dice similarity scheme is more accurate than Manhattan distance.
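    A minimal sketch of the two measures being compared, applied to bag-of-words representations of a student answer and a reference answer; the tokenisation and example sentences are illustrative assumptions, not the paper's data.

```python
# Sketch only: Dice similarity and Manhattan distance over token counts.
from collections import Counter

def dice_similarity(a_tokens, b_tokens):
    a, b = set(a_tokens), set(b_tokens)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))   # overlap of the two token sets

def manhattan_distance(a_tokens, b_tokens):
    a, b = Counter(a_tokens), Counter(b_tokens)
    vocab = set(a) | set(b)
    return sum(abs(a[t] - b[t]) for t in vocab)  # sum of per-term count differences

student   = "presiden dipilih oleh rakyat".split()
reference = "presiden dipilih langsung oleh rakyat".split()
print(dice_similarity(student, reference))     # higher = more similar
print(manhattan_distance(student, reference))  # lower = more similar
```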

    Automated Feedback Generation for a Chemistry Database and Abstracting Exercise

    Timely feedback is an important part of teaching and learning. Here we describe how a readily available neural-network transformer (machine-learning) model (BERT) can be used to give feedback on the structure of responses to an abstracting exercise, in which students are asked to summarise the contents of a published article after finding it in a publication database. The dataset contained 207 submissions from two consecutive years of the course, summarising a total of 21 different papers from the primary literature. The model was pre-trained using an available dataset (approx. 15,000 samples) and then fine-tuned on 80% of the submitted dataset. This fine-tuning was seen to be important. The sentences in the student submissions are classified into three classes - background, technique and observation - which allows a comparison of how each submission is structured. Comparing the structure of the students' abstracts with a large collection of abstracts from the PubMed database shows that students in this exercise concentrate more on the background to the paper and less on the techniques and results than the abstracts of the papers themselves. The results allowed feedback for each submitted assignment to be generated automatically. Comment: 9 pages, 1 figure, 3 tables
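    A minimal sketch of the sentence-classification setup the abstract describes, using the Hugging Face transformers API to label a sentence as background, technique, or observation. The model name, label set, and inference-only usage are illustrative assumptions; the authors' fine-tuning procedure is not reproduced here.

```python
# Sketch only: a BERT-style sequence classifier with three labels.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

labels = ["background", "technique", "observation"]
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(labels)
)

sentence = "The compound was characterised by NMR spectroscopy."
inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(labels[logits.argmax(dim=-1).item()])  # predicted class (random until fine-tuned)
```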