10 research outputs found

    Automated Error Detection for Developing Grammar Proficiency of ESL Learners

    Thanks to natural language processing technologies, computer programs are actively being used not only for holistic scoring, but also for formative evaluation of writing. CyWrite is one such program that is under development. The program is built upon Second Language Acquisition theories and aims to assist ESL learners in higher education by providing them with effective formative feedback to facilitate autonomous learning and improvement of their writing skills. In this study, we focus on CyWrite’s capacity to detect grammatical errors in student writing. We specifically report on (1) computational and pedagogical approaches to the development of the tool in terms of students’ grammatical accuracy, and (2) the performance of our grammatical analyzer. We evaluated the performance of CyWrite on a corpus of essays written by ESL undergraduate students with regard to four types of grammatical errors: quantifiers, subject-verb agreement, articles, and run-on sentences. We compared CyWrite’s performance at detecting these errors to the performance of a well-known commercially available AWE tool, Criterion. Our tool achieved better performance metrics than Criterion, and a deeper analysis of false positives and false negatives shed light on how CyWrite’s performance can be improved.
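
    The abstract does not say which performance metrics were used. A common choice for error detection is precision, recall, and F1, computed from counts of true positives, false positives, and false negatives. The following is a minimal sketch under that assumption; the function name `detection_metrics` and the counts are illustrative and are not taken from the study.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) for an error detector.

    tp: flagged spans that are real errors (true positives)
    fp: flagged spans that are not errors (false positives)
    fn: real errors the tool missed (false negatives)
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for one error type (not results from the study).
precision, recall, f1 = detection_metrics(tp=40, fp=10, fn=40)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

    Computing the metrics per error type, as the study's four categories suggest, makes it possible to see whether a tool trades recall for precision on one category while doing the opposite on another.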

    Detecting Article Errors Based on the Mass Count Distinction

    A Grammar Correction Algorithm – Deep Parsing and Minimal Corrections for a Grammar Checker

    This article presents the central algorithm of an open system for grammar checking, based on deep parsing. The grammatical specification is a context-free grammar with flat feature structures. After a shared-forest analysis where feature agreement constraints are relaxed, error detection globally minimizes the number of corrections, and alternative correct sentences are automatically proposed in an order of plausibility reflecting the number of changes made to the original sentence.
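
    The ranking idea at the end of this abstract (fewer changes means more plausible) can be illustrated with a small sketch. It does not reproduce the article's shared-forest algorithm: it assumes the candidate corrections are already given and simply orders them by token-level edit distance from the original sentence. The function names and example sentences are hypothetical.

```python
def token_edit_distance(a: list[str], b: list[str]) -> int:
    """Levenshtein distance over tokens (insert, delete, substitute cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def rank_corrections(original: str, candidates: list[str]) -> list[str]:
    """Order candidates by how few token changes they need; ties keep input order."""
    orig_tokens = original.split()
    return sorted(candidates,
                  key=lambda c: token_edit_distance(orig_tokens, c.split()))


# Hypothetical candidates for an agreement error.
ranked = rank_corrections(
    "the cats sleeps",
    ["a cat sleeps", "the cat sleeps", "the cats sleep"],
)
print(ranked)
```

    In this toy setting the two single-token fixes rank ahead of the candidate that changes two tokens.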

    Recognizing Syntactic Errors in the Writing of Second Language Learners

    This paper reports on the recognition component of an intelligent tutoring system that is designed to help foreign language speakers learn standard English. The system models the grammar of the learner, with this instantiation of the system tailored to signers of American Sign Language (ASL). We discuss the theoretical motivations for the system, various difficulties that have been encountered in the implementation, as well as the methods we have used to overcome these problems. Our method of capturing ungrammaticalities involves using mal-rules (also called 'error productions'). However, the straightforward addition of some mal-rules causes significant performance problems with the parser. For instance, the ASL population has a strong tendency to drop pronouns and the auxiliary verb 'to be'. Being able to account for these as sentences results in an explosion in the number of possible parses for each sentence. This explosion, left unchecked, greatly hampers the performance of the system. We discuss how this is handled by taking into account expectations from the specific population (some of which are captured in our unique user model). The different representations of lexical items at various points in the acquisition process are modeled by using mal-rules, which obviates the need for multiple lexicons. The grammar is evaluated on its ability to correctly diagnose agreement problems in actual sentences produced by ASL native speakers.

    Detecting grammatical errors with treebank-induced, probabilistic parsers

    Today's grammar checkers often use hand-crafted rule systems that define acceptable language. The development of such rule systems is labour-intensive and has to be repeated for each language. At the same time, grammars automatically induced from syntactically annotated corpora (treebanks) are successfully employed in other applications, for example text understanding and machine translation. At first glance, treebank-induced grammars seem to be unsuitable for grammar checking as they massively over-generate and fail to reject ungrammatical input due to their high robustness. We present three new methods for judging the grammaticality of a sentence with probabilistic, treebank-induced grammars, demonstrating that such grammars can be successfully applied to automatically judge the grammaticality of an input string. Our best-performing method exploits the differences between parse results for grammars trained on grammatical and ungrammatical treebanks. The second approach builds an estimator of the probability of the most likely parse using grammatical training data that has previously been parsed and annotated with parse probabilities. If the estimated probability of an input sentence (whose grammaticality is to be judged by the system) is higher by a certain amount than the actual parse probability, the sentence is flagged as ungrammatical. The third approach extracts discriminative parse tree fragments in the form of CFG rules from parsed grammatical and ungrammatical corpora and trains a binary classifier to distinguish grammatical from ungrammatical sentences. The three approaches are evaluated on a large test set of grammatical and ungrammatical sentences. The ungrammatical test set is generated automatically by inserting common grammatical errors into the British National Corpus. The results are compared to two traditional approaches: one that uses a hand-crafted, discriminative grammar, the XLE ParGram English LFG, and one based on part-of-speech n-grams. In addition, the baseline methods and the new methods are combined in a machine learning-based framework, yielding further improvements.
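
    The decision rule of the second approach can be sketched in a few lines. This is only an illustration of the thresholding step: the estimator and the parser are not implemented, working in log-probabilities is an assumption, and the name `flag_ungrammatical`, the `margin` parameter, and the example values are hypothetical.

```python
def flag_ungrammatical(estimated_logprob: float,
                       actual_logprob: float,
                       margin: float) -> bool:
    """Flag a sentence when the parser's actual best-parse log-probability
    falls short of the estimated one by more than `margin`.

    estimated_logprob: what an estimator trained on grammatical data predicts
    actual_logprob:    log-probability of the most likely parse actually found
    margin:            tolerance, to be tuned on held-out data
    """
    return estimated_logprob - actual_logprob > margin


# Hypothetical values: the parse is far less probable than predicted.
print(flag_ungrammatical(estimated_logprob=-42.0, actual_logprob=-55.5, margin=5.0))
# Hypothetical values: the parse is about as probable as predicted.
print(flag_ungrammatical(estimated_logprob=-42.0, actual_logprob=-43.0, margin=5.0))
```

    The margin controls the trade-off between flagging too many grammatical sentences and missing ungrammatical ones.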

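    Mugarri: bigarren hizkuntzako ikasleen hizkuntz ezagutza eskuratzeko sistema anitzeko ingurunea [Mugarri: a multi-system environment for capturing the language knowledge of second-language learners]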

    265 p. We define the theoretical objective as follows: to represent the knowledge that the student acquires over the course of learning the second language. This representation is built with the help of a psycholinguist and is based on texts written by the student in the second language. We call this knowledge interlanguage. The practical objective is twofold: (1) to support the teacher in tracking the history of the students' learning process, and in diagnosing and correcting possible deviations; (2) to help the student acquire new knowledge of the second language. Three human agents therefore interact in the MUGARRI environment we have designed: the psycholinguist, the teacher, and the student. A computer system has been developed for each of them: HITES, a system for modeling interlanguage structures; IRAKAZI, a support system for analyzing the diagnosis of the student's learning process; and IDAZKIDE, a support system for second-language learning.

    QuickAssist Extensive Reading for Learners of German Using CALL Technologies

    The focus of this dissertation is the development and testing of a CALL tool which assists learners of German with the extensive reading of German texts of their choice. The application provides functionality that enables learners to acquire new vocabulary, to analyse the meaning of complex word forms, and to study a word’s semantic and syntactic features with the help of corpora and online resources. It is also designed to enable instructors to create meaningful exercises to be used in classroom activities focusing on vocabulary acquisition and word formation rules. The detailed description of the software development and implementation is preceded by a review of the relevant literature in the areas of German morphology and word formation, second language acquisition and vocabulary acquisition in particular, studies on the benefits of extensive reading, the role of motivation in second language learning, CALL, and natural language processing technologies. The user study presented at the end of this dissertation shows how a first test group of learners was able to use the application for individual reading projects and presents the results of an evaluation of the software conducted by three German instructors assessing the affordances of the application for students and potential applications for language instructors.

    Grammar and Corpora 2016

    In recent years, the availability of large annotated corpora, together with a new interest in the empirical foundation and validation of linguistic theory and description, has sparked a surge of novel work using corpus methods to study the grammar of natural languages. This volume presents recent developments and advances, firstly, in corpus-oriented grammar research with a special focus on Germanic, Slavic, and Romance languages and, secondly, in corpus linguistic methodology as well as the application of corpus methods to grammar-related fields. The volume results from the sixth international conference Grammar and Corpora (GaC 2016), which took place at the Institute for the German Language (IDS) in Mannheim, Germany, in November 2016.

    Osvojování češtiny frankofonními studenty: automatická analýza chyb v deklinaci [Acquisition of Czech by French-speaking students: automatic analysis of declension errors]

    This work presents CETLEF, a platform for computer-assisted learning of Czech that offers online fill-in-the-blank declension exercises with automatic feedback on errors. CETLEF consists of a relational database and interfaces for authors and learners; building it required defining a formal model of Czech declension. This model contains a detailed classification of the paradigms and rules for the realization of vocalic and consonantal alternations. It enables the morphological annotation of required forms, the didactic presentation of the morphological system of Czech on the learning platform, and automatic error diagnosis. Diagnosis is carried out by comparing an erroneous production with hypothetical forms generated from the stem of the required form. An evaluation of this diagnosis on the productions collected on CETLEF shows that the vast majority of errors can be interpreted with this technique. Institute of the Czech National Corpus, Faculty of Arts.
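
    The diagnosis step described here (generate hypothetical forms from the stem, then compare them with the learner's production) can be sketched as follows. This is a toy illustration and not CETLEF's model: it covers only the singular endings of the feminine paradigm "žena", ignores vocalic and consonantal alternations, and the names `ZENA_SINGULAR` and `diagnose` are hypothetical.

```python
# Simplified singular endings for the Czech paradigm "žena" (illustrative only).
ZENA_SINGULAR = {
    "nominative": "a",
    "genitive": "y",
    "dative": "ě",
    "accusative": "u",
    "vocative": "o",
    "locative": "ě",
    "instrumental": "ou",
}


def diagnose(stem: str, produced: str, required_case: str,
             endings: dict[str, str] = ZENA_SINGULAR) -> str:
    """Compare a learner's form with every form generated from the stem."""
    if produced == stem + endings[required_case]:
        return "correct"
    # Hypothetical forms: the stem combined with each ending of the paradigm.
    matches = [case for case, ending in endings.items()
               if stem + ending == produced]
    if matches:
        return "wrong case: form of " + " or ".join(matches)
    return "uninterpreted"


# The exercise asks for the accusative of "žena"; the learner types "ženy".
print(diagnose("žen", "ženy", "accusative"))
```

    A fuller model would also generate forms from other paradigms, which would let the diagnosis distinguish a wrong case from a wrong paradigm.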

    Grammar and Corpora 2016

    In recent years, the availability of large annotated corpora, together with a new interest in the empirical foundation and validation of linguistic theory and description, has sparked a surge of novel work using corpus methods to study the grammar of natural languages. This volume presents recent developments and advances, firstly, in corpus-oriented grammar research with a special focus on Germanic, Slavic, and Romance languages and, secondly, in corpus linguistic methodology as well as the application of corpus methods to grammar-related fields. The volume results from the sixth international conference Grammar and Corpora (GaC 2016), which took place at the Institute for the German Language (IDS) in Mannheim, Germany, in November 2016.