11,685 research outputs found

    GenERRate: generating errors for use in grammatical error detection

    Get PDF
    This paper explores the issue of automatically generated ungrammatical data and its use in error detection, with a focus on the task of classifying a sentence as grammatical or ungrammatical. We present an error generation tool called GenERRate and show how GenERRate can be used to improve the performance of a classifier on learner data. We describe initial attempts to replicate Cambridge Learner Corpus errors using GenERRate

    Explicit vs. Implicit L2 grammar knowledge in written error correction

    Get PDF
    Error correction is undoubtedly an important part of the process of drafting and producing written texts. The aim of the paper is to analyse the learners’ ability to correct grammatical errors in relation to the type of knowledge they employ in this task. Green and Hecht (1992), in an often quoted study, found a low correlation between L2 learners’ knowledge of explicit grammar rules and their ability to correct errors. They interpret this as suggesting that in error correction, learners rely primarily on their implicit knowledge. However, certain design features of their study might have caused the subjects to simply guess the correct forms, which, in turn, as DeKeyser (2003) suggests, may have led to the overestimation of implicit knowledge. This paper reports the results of an experiment where 150 Polish learners of English were administered a corpus-based error correction task, the design of which, however, differed from that of Green and Hecht (1992). These alterations resulted in finding a much closer link between the subjects’ knowledge of rules and their ability to correct grammatical errors

    Towards a collocation writing assistant for learners of Spanish

    Get PDF
    This paper describes the process followed in creating a tool aimed at helping learners produce collocations in Spanish. First we present the Diccionario de colocaciones del español (DiCE), an online collocation dictionary, which represents the first stage of this process. The following section focuses on the potential user of a collocation learning tool: we examine the usability problems DiCE presents in this respect, and explore the actual learner needs through a learner corpus study of collocation errors. Next, we review how collocation production problems of English language learners can be solved using a variety of electronic tools devised for that language. Finally, taking all the above into account, we present a new tool aimed at assisting learners of Spanish in writing texts, with particular attention being paid to the use of collocations in this language

    Statistical Function Tagging and Grammatical Relations of Myanmar Sentences

    Get PDF
    This paper describes a context free grammar (CFG) based grammatical relations for Myanmar sentences which combine corpus-based function tagging system. Part of the challenge of statistical function tagging for Myanmar sentences comes from the fact that Myanmar has free-phrase-order and a complex morphological system. Function tagging is a pre-processing step to show grammatical relations of Myanmar sentences. In the task of function tagging, which tags the function of Myanmar sentences with correct segmentation, POS (part-of-speech) tagging and chunking information, we use Naive Bayesian theory to disambiguate the possible function tags of a word. We apply context free grammar (CFG) to find out the grammatical relations of the function tags. We also create a functional annotated tagged corpus for Myanmar and propose the grammar rules for Myanmar sentences. Experiments show that our analysis achieves a good result with simple sentences and complex sentences.Comment: 16 pages, 7 figures, 8 tables, AIAA-2011 (India). arXiv admin note: text overlap with arXiv:0912.1820 by other author

    Supporting collocation learning with a digital library

    Get PDF
    Extensive knowledge of collocations is a key factor that distinguishes learners from fluent native speakers. Such knowledge is difficult to acquire simply because there is so much of it. This paper describes a system that exploits the facilities offered by digital libraries to provide a rich collocation-learning environment. The design is based on three processes that have been identified as leading to lexical acquisition: noticing, retrieval and generation. Collocations are automatically identified in input documents using natural language processing techniques and used to enhance the presentation of the documents and also as the basis of exercises, produced under teacher control, that amplify students' collocation knowledge. The system uses a corpus of 1.3 B short phrases drawn from the web, from which 29 M collocations have been automatically identified. It also connects to examples garnered from the live web and the British National Corpus

    An Analysis of Source-Side Grammatical Errors in NMT

    Full text link
    The quality of Neural Machine Translation (NMT) has been shown to significantly degrade when confronted with source-side noise. We present the first large-scale study of state-of-the-art English-to-German NMT on real grammatical noise, by evaluating on several Grammar Correction corpora. We present methods for evaluating NMT robustness without true references, and we use them for extensive analysis of the effects that different grammatical errors have on the NMT output. We also introduce a technique for visualizing the divergence distribution caused by a source-side error, which allows for additional insights.Comment: Accepted and to be presented at BlackboxNLP 201
    corecore