2,108 research outputs found

Automated Detection of Usage Errors in non-native English Writing

Authors: Fujishima Satoru; Ishizaki Shun
Publication date: 26/10/2011

We investigate the use of a novelty detection algorithm for identifying inappropriate word combinations in a raw English corpus. We employ an unsupervised detection algorithm based on one-class support vector machines (OC-SVMs) and extract sentences containing word sequences whose frequency of appearance is significantly low in native English writing. Combined with n-gram language models and document categorization techniques, the OC-SVM classifier assigns given sentences to two groups: sentences containing errors and sentences without errors. Accuracies are 79.30% with the bigram model, 86.63% with the trigram model, and 34.34% with the four-gram model.

EEPIS Repository
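
The idea in the abstract above, flagging word sequences that are rare in native writing, can be illustrated with a much simpler stand-in. The sketch below is not the OC-SVM classifier the authors describe; it only counts n-grams in a reference corpus and reports those that fall below a threshold. The function names, the toy reference sentences, and the `min_count` parameter are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) found in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def train_counts(reference_sentences, n):
    """Count n-grams over a reference corpus (here: toy 'native' sentences)."""
    counts = Counter()
    for sentence in reference_sentences:
        counts.update(ngrams(sentence.lower().split(), n))
    return counts


def flag_sentence(sentence, counts, n, min_count=1):
    """Return the n-grams in `sentence` seen fewer than `min_count` times.

    A plain frequency threshold, used as a simplified substitute for the
    novelty detector in the abstract; `min_count` is a hypothetical knob.
    """
    tokens = sentence.lower().split()
    return [g for g in ngrams(tokens, n) if counts[g] < min_count]


# Toy reference corpus standing in for native English writing.
reference = [
    "she is interested in music",
    "he is interested in art",
    "they are interested in history",
]
bigram_counts = train_counts(reference, 2)

# Bigrams never seen in the reference corpus are reported as suspicious.
suspicious = flag_sentence("she is interested on music", bigram_counts, 2)
print(suspicious)
```

A real system would need a far larger reference corpus and smoothing; with sparse counts, higher-order n-grams are mostly unseen, so a raw threshold flags nearly everything.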

GenERRate: generating errors for use in grammatical error detection

Authors: Andersen Øistein E.; Foster Jennifer
Publication venue: The Association for Computational Linguistics
Publication date: 01/01/2009

This paper explores the issue of automatically generated ungrammatical data and its use in error detection, with a focus on the task of classifying a sentence as grammatical or ungrammatical. We present an error generation tool called GenERRate and show how it can be used to improve the performance of a classifier on learner data. We also describe initial attempts to replicate Cambridge Learner Corpus errors using GenERRate.

CiteSeerX

Irish Universities

DCU Online Research Access Service
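
As a rough illustration of what generating ungrammatical training data can look like, the sketch below corrupts a well-formed sentence with four token-level edits and labels the results. It does not reproduce GenERRate's actual interface or configuration, which the abstract does not describe; the function names, the chosen edits, and the label strings are assumptions.

```python
def delete_token(tokens, index):
    """Drop the token at `index` (e.g. a missing-word error)."""
    return tokens[:index] + tokens[index + 1:]


def insert_token(tokens, index, word):
    """Insert `word` before position `index` (e.g. an extra-word error)."""
    return tokens[:index] + [word] + tokens[index:]


def substitute_token(tokens, index, word):
    """Replace the token at `index` with `word` (e.g. a wrong verb form)."""
    return tokens[:index] + [word] + tokens[index + 1:]


def move_token(tokens, src, dst):
    """Move the token at `src` to position `dst` (e.g. a word-order error)."""
    remaining = tokens[:src] + tokens[src + 1:]
    return remaining[:dst] + [tokens[src]] + remaining[dst:]


def make_labelled_examples(sentence):
    """Pair the original sentence with hand-picked corrupted variants.

    The edit positions are fixed for this toy example; a real generator
    would choose them from an error specification or corpus statistics.
    """
    tokens = sentence.split()
    corrupted = [
        delete_token(tokens, 1),
        insert_token(tokens, 3, "a"),
        substitute_token(tokens, 2, "see"),
        move_token(tokens, 3, 0),
    ]
    examples = [(sentence, "grammatical")]
    examples += [(" ".join(c), "ungrammatical") for c in corrupted]
    return examples


examples = make_labelled_examples("She has seen the film")
for text, label in examples:
    print(label, "|", text)
```

The labelled pairs could then be fed to any sentence classifier; whether such synthetic errors resemble real learner errors is the question the paper itself examines.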

Acquiring Word-Meaning Mappings for Natural Language Interfaces

Author: Thompson C.
Publication venue: AI Access Foundation
Publication date: 22/06/2011

This paper focuses on a system, WOLFIE (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of phrases paired with meaning representations. WOLFIE is part of an integrated system that learns to transform sentences into representations such as logical database queries. Experimental results are presented demonstrating WOLFIE's ability to learn useful lexicons for a database interface in four different natural languages. The usefulness of the lexicons learned by WOLFIE is compared to that of lexicons acquired by a similar system, with results favorable to WOLFIE. A second set of experiments demonstrates WOLFIE's ability to scale to larger and more difficult, albeit artificially generated, corpora. In natural language acquisition, it is difficult to gather the annotated data needed for supervised learning; however, unannotated data is fairly plentiful. Active learning methods attempt to select for annotation and training only the most informative examples, and therefore are potentially very useful in natural language applications. However, most results to date for active learning have only considered standard classification tasks. To reduce annotation effort while maintaining accuracy, we apply active learning to semantic lexicons. We show that active learning can significantly reduce the number of annotated examples required to achieve a given level of performance.

arXiv.org e-Print Archive

Crossref

Annotating article errors in Spanish learner texts: design and evaluation of an annotation scheme

Authors: Ibanez Maria del Pilar Valverde; Ohtani Akira
Publication venue: Department of Linguistics, Faculty of Arts, Chulalongkorn University
Publication date: 01/01/2014

Waseda University Repository

Tagging a Japanese Learner Corpus of English and Comparing Trigrams with Those in a Corpus of British Students' Essays

Author: Kamakura Yoshihito
Publication venue: 愛知大学語学教育研究室
Publication date: 01/07/2012

Aichi University of Education: AUE Repository / 愛知教育大学学術情報リポジトリ

Treebanks gone bad: generating a treebank of ungrammatical English

Author: Foster Jennifer
Publication date: 01/01/2007

This paper describes how a treebank of ungrammatical sentences can be created from a treebank of well-formed sentences. The treebank creation procedure involves the automatic introduction of frequently occurring grammatical errors into the sentences in an existing treebank, and the minimal transformation of the analyses in the treebank so that they describe the newly created ill-formed sentences. Such a treebank can be used to test how well a parser is able to ignore grammatical errors in texts (as people can), and can be used to induce a grammar capable of analysing such sentences. This paper also demonstrates the first of these uses.

DCU Online Research Access Service

From Learners' Corpora to Expert Knowledge Description: Analyzing Prepositions in the NICT JLE (Japanese Learner English) Corpus

Authors: 井佐原均; 竹内和広; 谷村緑
Publication venue: IWLeL 2004 Program Committee
Publication date: 31/03/2005

The present study has two main purposes. One is to show what the NICT JLE corpus analysis tool and commercial tools can do for analysing error-annotated learner corpora. The other is to show how possible L1 transfer effects can be analyzed using learner corpora. We take preposition errors, which often occur within collocations, as examples. As a future direction, we suggest that a standardized, shared collocation dictionary, rather than a word list, is needed for pedagogic purposes, especially for Japanese learners of English. We also propose that an objective method to characterize L1 transfer needs to be developed. We assume that back-translation of each utterance would be one effective way to extract the L1 translation; in other words, a bilingual aligned corpus and machine translation software would be a foundation for developing such a method.

Waseda University Repository

Towards Automatic Error Type Classification of Japanese Language Learners' Writing

Authors: Komachi Mamoru; Matsumoto Yuji; Oyama Hiromi
Publication venue: Department of English, National Chengchi University
Publication date: 01/01/2013

Waseda University Repository