118 research outputs found

    Sobre los efectos de combinar Análisis Semántico Latente con otras técnicas de procesamiento de lenguaje natural para la evaluación de preguntas abiertas

    Full text link
    This article presents the combination of Latent Semantic Analysis (LSA) with other natural language processing techniques (stemming, removal of closed-class words, and word sense disambiguation) to improve the automatic assessment of students' free-text answers. The combinational schema has been tested in the experimental framework provided by the free-text Computer Assisted Assessment (CAA) system Atenea (Alfonseca & Pérez, 2004), a system able to pose open-ended questions, chosen randomly or according to the student's profile, and to assign each answer a numeric score. The results show that for all datasets in which the NLP techniques were combined with LSA, the Pearson correlation between the scores given by Atenea and the scores given by the teachers for the same set of questions improves. We believe this is due to the complementarity between LSA, which works at a shallow semantic level, and the rest of the NLP techniques used in Atenea, which are more focused on the lexical and syntactic levels.
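    As a concrete illustration of this schema (a minimal sketch over invented data, not Atenea's implementation), the snippet below scores answers by LSA cosine similarity to a reference answer, with stop-word filtering standing in for the closed-class-word removal, and then computes the Pearson correlation against hypothetical teacher grades.

```python
# Minimal sketch: LSA similarity scoring plus Pearson correlation against
# teacher grades. Reference, answers, and grades are invented placeholders;
# stemming and word sense disambiguation are omitted for brevity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity
from scipy.stats import pearsonr

reference = "photosynthesis converts light energy into chemical energy"
student_answers = [
    "plants turn light into chemical energy",
    "photosynthesis makes food from sunlight",
    "it is about respiration in animals",
]
teacher_scores = [0.9, 0.8, 0.1]  # hypothetical human grades

# Stop-word removal stands in for closed-class-word filtering.
tfidf = TfidfVectorizer(stop_words="english").fit_transform(
    [reference] + student_answers
)

# LSA: project the TF-IDF space onto a low-rank "semantic" space.
vectors = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

system_scores = cosine_similarity(vectors[:1], vectors[1:])[0]
r, _ = pearsonr(system_scores, teacher_scores)
print(f"Pearson r between system and teacher grades: {r:.2f}")
```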

    Adapting the automatic assessment of free-text answers to the students

    Get PDF
    In this paper, we present the first approach in the field of Computer Assisted Assessment (CAA) of students' free-text answers to model the student profiles. This approach has been implemented in a new version of Atenea, a system able to automatically assess students' short answers. The system has been improved so that it can now take the students' preferences and personal features into account, both to adapt the assessment process and to personalize the appearance of the interface. In particular, it can now accept students' answers written in either Spanish or English, by means of Machine Translation. Moreover, we have observed that Atenea's performance does not decrease drastically when combined with automatic translation, provided that the translation does not greatly reduce the variability of the vocabulary.
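    A minimal sketch of this translate-then-assess idea follows; every function in it is a hypothetical stand-in written for illustration, not part of Atenea's actual interface.

```python
# Toy translate-then-assess pipeline. All functions are invented stand-ins:
# a real system would use a proper language identifier, a machine
# translation engine, and a full NLP/LSA scoring pipeline.
def detect_language(text: str) -> str:
    # Toy heuristic; a real system would use a language identifier
    # or the student's profile.
    es_markers = ("el", "la", "energía")
    return "es" if any(w in text.split() for w in es_markers) else "en"

def translate_to_english(text: str) -> str:
    # Placeholder for a machine translation call.
    toy_dictionary = {"la energía": "the energy"}
    for es, en in toy_dictionary.items():
        text = text.replace(es, en)
    return text

def score_answer(answer: str, reference: str) -> float:
    # Stand-in scorer: plain word overlap instead of the NLP pipeline.
    a, r = set(answer.lower().split()), set(reference.lower().split())
    return len(a & r) / len(r)

def assess(answer: str, reference: str) -> float:
    if detect_language(answer) == "es":
        answer = translate_to_english(answer)  # the MT step
    return score_answer(answer, reference)

print(assess("la energía is converted", "the energy is converted"))  # 1.0
```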

    A Computer-Based Approach For Identifying Student Conceptual Change

    Get PDF
    Misconceptions are commonly encountered in many areas of science and engineering where a to-be-learned concept conflicts with prior knowledge. Conceptual change is an approach for identifying and repairing misconceptions, and one way to promote it is to provide students with ontological schema training. However, assessment of conceptual change relies on qualitative analysis of student responses, and with the exponential growth of qualitative data in the form of graphical representations or written responses, analysis by human experts has become time-consuming and costly. This study took advantage of natural language processing and machine learning techniques to analyze the responses effectively. In addition, we identified how students described complex phenomena in thermal and transport science, and compared the descriptions of students who received ontological schema training to address misconceptions with those of students who took a different course about the nature of science. After comparing the effectiveness of three text classification methods for identifying conceptual change (a query-based approach, a Naive Bayes classifier, and a support vector machine), the SVM classifier was chosen to assess student responses from a corpus collected by Streveler and her research group in previous studies. Based on this automatic assessment of student conceptual change, the research found that training students with an appropriate ontological schema promotes conceptual change.
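    For readers unfamiliar with the classifiers compared above, the sketch below shows the general shape of such a text-classification experiment in scikit-learn; the labeled responses are invented stand-ins, not the corpus used in the study.

```python
# Illustrative TF-IDF + SVM text classifier of the kind compared in the
# study, with a Naive Bayes variant as baseline. Responses and labels
# are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

responses = [
    "heat flows because molecules transfer kinetic energy",
    "heat is a substance that moves from hot to cold objects",
    "energy spreads through random molecular collisions",
    "cold gets into the object and pushes the heat out",
]
labels = ["change", "no_change", "change", "no_change"]

svm = make_pipeline(TfidfVectorizer(), LinearSVC()).fit(responses, labels)
print(svm.predict(["molecular collisions spread thermal energy"]))

# Swapping LinearSVC for MultinomialNB gives the Naive Bayes baseline.
nb = make_pipeline(TfidfVectorizer(), MultinomialNB()).fit(responses, labels)
```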

    DeepEval: An Integrated Framework for the Evaluation of Student Responses in Dialogue Based Intelligent Tutoring Systems

    Get PDF
    The automatic assessment of student answers is one of the critical components of an Intelligent Tutoring System (ITS), because accurate assessment of student input is needed in order to provide effective feedback that leads to learning. This is a very challenging task because it requires natural language understanding capabilities: the process involves various components, including concept identification, co-reference resolution, and ellipsis handling. As part of this thesis, we thoroughly analyzed a set of student responses obtained from an experiment with the intelligent tutoring system DeepTutor, in which college students interacted with the tutor to solve conceptual physics problems, designed an automatic answer assessment framework (DeepEval), and evaluated the framework after implementing several important components. To evaluate our system, we annotated 618 responses from 41 students for correctness. Our system performs better than the typical similarity-calculation method. We also discuss various issues in automatic answer evaluation.
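    The "typical similarity-calculation method" used as a baseline can be pictured as plain lexical overlap; the sketch below (invented texts and threshold, not DeepEval code) shows such a baseline, which a framework like DeepEval improves on with components such as concept identification and co-reference resolution.

```python
# Word-overlap (Jaccard) baseline for answer assessment; the texts and
# the 0.4 decision threshold are invented for illustration.
def overlap_similarity(student: str, expected: str) -> float:
    s, e = set(student.lower().split()), set(expected.lower().split())
    return len(s & e) / len(s | e) if s | e else 0.0

expected = "the net force on the ball is gravity acting downward"
student = "gravity is the only force acting on the ball"

sim = overlap_similarity(student, expected)
print("correct" if sim >= 0.4 else "incorrect", round(sim, 2))
```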

    Abstract syntax as interlingua: Scaling up the grammatical framework from controlled languages to robust pipelines

    Get PDF
    Abstract syntax is an interlingual representation used in compilers. Grammatical Framework (GF) applies the abstract syntax idea to natural languages. The development of GF started in 1998, first as a tool for controlled language implementations, where it has gained an established position in both academic and commercial projects. GF provides grammar resources for over 40 languages, enabling accurate generation and translation, as well as grammar engineering tools and components for mobile and Web applications. On the research side, the focus in the last ten years has been on scaling up GF to wide-coverage language processing. The concept of abstract syntax offers a unified view of many other approaches: Universal Dependencies, WordNets, FrameNets, Construction Grammars, and Abstract Meaning Representations. This makes it possible for GF to utilize data from the other approaches and to build robust pipelines. In return, GF can contribute to data-driven approaches with methods to transfer resources from one language to others, to augment data by rule-based generation, to check the consistency of hand-annotated corpora, and to pipe analyses into high-precision semantic back ends. This article gives an overview of the use of abstract syntax as interlingua through both established and emerging NLP applications involving GF.
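    GF grammars are written in GF's own formalism; purely to illustrate the interlingua idea, the Python sketch below linearizes one language-neutral abstract tree into two languages. The one-rule grammar and the tiny lexicons are invented for the sketch and are not GF resources.

```python
# Toy abstract-syntax-as-interlingua demo: one abstract tree, several
# concrete linearizations. Invented grammar fragment, not GF code.
from dataclasses import dataclass

@dataclass
class Pred:
    """Abstract syntax node: a subject-verb predication."""
    subj: str  # abstract function name, e.g. "Cat"
    verb: str  # abstract function name, e.g. "Sleep"

# Concrete syntaxes map abstract functions to per-language strings.
ENGLISH = {"Cat": "the cat", "Sleep": "sleeps"}
SPANISH = {"Cat": "el gato", "Sleep": "duerme"}

def linearize(tree: Pred, lexicon: dict) -> str:
    return f"{lexicon[tree.subj]} {lexicon[tree.verb]}"

tree = Pred("Cat", "Sleep")
print(linearize(tree, ENGLISH))  # the cat sleeps
print(linearize(tree, SPANISH))  # el gato duerme
# Translation = parse in one concrete syntax, linearize in another,
# with the abstract tree serving as the interlingua.
```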

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    Get PDF
    This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; and (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them. Published in the Journal of AI Research (JAIR), volume 61, pp. 75-170; 118 pages, 8 figures, 1 table.
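    The core tasks such surveys synthesize are traditionally organised as a three-stage pipeline (document planning, microplanning, surface realization). The sketch below walks one invented weather record through toy versions of those stages; it illustrates the architecture only and is not code from any system discussed in the survey.

```python
# Toy three-stage NLG pipeline: document planning -> microplanning ->
# surface realization. Data and templates are invented placeholders.
data = {"city": "Lisbon", "temp_c": 31, "trend": "rising"}

def document_planning(d):
    # Content selection: decide which messages to convey, in what order.
    messages = ["temperature"]
    if d["trend"] == "rising":
        messages.append("trend")
    return messages

def microplanning(messages, d):
    # Lexicalization: choose the wording for each selected message.
    templates = {
        "temperature": f"it is {d['temp_c']} degrees in {d['city']}",
        "trend": "temperatures are still climbing",
    }
    return [templates[m] for m in messages]

def realization(phrases):
    # Surface realization: aggregate phrases into one grammatical sentence.
    text = ", and ".join(phrases) + "."
    return text[0].upper() + text[1:]

print(realization(microplanning(document_planning(data), data)))
# -> It is 31 degrees in Lisbon, and temperatures are still climbing.
```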

    Automatic Short Answer Grading Using Transformers

    Get PDF
    Assessment of short natural language answers is a prevailing trend in any educational environment: it helps teachers better understand the successes and failures of their students. Other types of questions, such as multiple-choice or fill-in-the-gap questions, do not provide adequate clues for exhaustively evaluating students' proficiency. However, they are common means of student evaluation, especially in Massive Open Online Course (MOOC) environments, one of the major reasons being that they are fairly easy to grade automatically. By contrast, understanding and marking short answers manually is a more challenging and time-consuming task, especially as the number of students in a class grows. Automatic Short Answer Grading, usually abbreviated ASAG, is a solution well suited to this problem. In this thesis, we concentrate on classification-based ASAG with nominal grades such as correct or not correct. We propose a reference-based approach built on a deep learning model, trained on four state-of-the-art ASAG datasets, namely SemEval-2013 (SciEntBank and BEETLE), DT-Grade, and a biology dataset. Our approach is based on the BERT (cased and uncased) and XLNET (cased) models. Our secondary analysis examines how GLUE (General Language Understanding Evaluation) tasks such as question answering, entailment, paraphrase identification, and semantic textual similarity analysis strengthen the ASAG task on the SciEntBank dataset. We show that language models based on transformers such as BERT and XLNET outperform or equal the state-of-the-art feature-based approaches. We further show that the performance of our BERT model increases substantially when it is first fine-tuned on an entailment task such as the GLUE MNLI dataset and then on the ASAG task, compared to fine-tuning on the other GLUE tasks.
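    The reference-based setup described above can be sketched as sequence-pair classification with a pretrained transformer; the snippet below uses the generic bert-base-uncased checkpoint and invented answer texts, and omits the fine-tuning (on GLUE MNLI and then on an ASAG dataset) that produces meaningful grades.

```python
# Sketch of reference-based ASAG as BERT pair classification. The
# classification head here is freshly initialized, so its output is
# meaningless until the model is fine-tuned on an ASAG dataset.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # correct vs. not correct
)

reference = "current is the flow of electric charge"
student = "electric charge moving through the wire is the current"

# BERT reads the pair as one [CLS] reference [SEP] answer [SEP] sequence.
inputs = tokenizer(reference, student, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class probabilities, once fine-tuned
```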