
    JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction

    We present a new parallel corpus, the JHU FLuency-Extended GUG corpus (JFLEG), for developing and evaluating grammatical error correction (GEC). Unlike other corpora, it represents a broad range of language proficiency levels and uses holistic fluency edits not only to correct grammatical errors but also to make the original text more native sounding. We describe the types of corrections made and benchmark four leading GEC systems on this corpus, identifying specific areas in which they do well and how they can improve. JFLEG fulfills the need for a new gold standard to properly assess the current state of GEC. Comment: To appear in EACL 2017 (short papers).
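    Evaluation on JFLEG scores a system's output against several independent human fluency rewrites rather than a single reference (the benchmark's reported metric is GLEU). The snippet below is a minimal sketch of that multi-reference setup, using hypothetical sentences and sacreBLEU's corpus_bleu purely as an illustrative stand-in for the official GLEU scorer.

        # Minimal sketch of multi-reference evaluation in the JFLEG style (hypothetical data).
        # JFLEG's official metric is GLEU; corpus_bleu is used here only to show the
        # multi-reference setup, not as a replacement for GLEU.
        import sacrebleu

        # Each source sentence has several independent fluency-edited references.
        system_outputs = [
            "He goes to school every day .",
            "They discussed about the problem yesterday .",
        ]
        references = [
            # reference set 1 (one entry per system output)
            ["He goes to school every day .", "They discussed the problem yesterday ."],
            # reference set 2
            ["He attends school every day .", "They talked about the problem yesterday ."],
        ]

        score = sacrebleu.corpus_bleu(system_outputs, references)
        print(f"corpus score against multiple fluency references: {score.score:.1f}")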

    Ordinal GAMMs: a new window on human ratings


    Hypothesis Only Baselines in Natural Language Inference

    We propose a hypothesis-only baseline for diagnosing Natural Language Inference (NLI). When an NLI dataset assumes that inference rests purely on the relationship between a context and a hypothesis, assessing entailment while ignoring the provided context should be a degenerate solution. Yet, through experiments on ten distinct NLI datasets, we find that this approach, which we refer to as a hypothesis-only model, significantly outperforms a majority-class baseline on a number of them. Our analysis suggests that statistical irregularities may allow a model to perform NLI in some datasets beyond what should be achievable without access to the context. Comment: Accepted at *SEM 2018 as a long paper. 12 pages.
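    As a concrete illustration of the setup, the sketch below trains a simple classifier on hypotheses alone and compares it to a majority-class baseline; the data, features, and model here are hypothetical toy choices, not the ones used in the paper.

        # Minimal sketch of a hypothesis-only NLI baseline (hypothetical toy data).
        # The classifier never sees the context; a majority-class baseline is the comparison point.
        from sklearn.dummy import DummyClassifier
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        # Toy (hypothesis, label) pairs; real experiments use SNLI/MultiNLI-scale data.
        hypotheses = [
            "A man is sleeping.", "Nobody is outside.", "A dog is running.",
            "The woman is eating.", "No one is swimming.", "A child is playing.",
        ]
        labels = ["entailment", "contradiction", "entailment",
                  "neutral", "contradiction", "neutral"]

        hyp_only = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                                 LogisticRegression(max_iter=1000))
        hyp_only.fit(hypotheses, labels)

        majority = DummyClassifier(strategy="most_frequent").fit(hypotheses, labels)

        # Scored on the training data only to keep the sketch self-contained.
        print("hypothesis-only accuracy:", hyp_only.score(hypotheses, labels))
        print("majority-class accuracy:", majority.score(hypotheses, labels))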

    Grammaticality, Acceptability, and Probability: A Probabilistic View of Linguistic Knowledge

    The question of whether humans represent grammatical knowledge as a binary condition on membership in a set of well-formed sentences, or as a probabilistic property, has been the subject of debate among linguists, psychologists, and cognitive scientists for many decades. Acceptability judgments present a serious problem for both classical binary and probabilistic theories of grammaticality. These judgments are gradient in nature, and so cannot be directly accommodated in a binary formal grammar. However, it is also not possible to simply reduce acceptability to probability. The acceptability of a sentence is not the same as the likelihood of its occurrence, which is, in part, determined by factors like sentence length and lexical frequency. In this paper, we present the results of a set of large-scale experiments using crowd-sourced acceptability judgments that demonstrate gradience to be a pervasive feature in acceptability judgments. We then show how one can predict acceptability judgments on the basis of probability by augmenting probabilistic language models with an acceptability measure. This is a function that normalizes probability values to eliminate the confounding factors of length and lexical frequency. We describe a sequence of modeling experiments with unsupervised language models drawn from state-of-the-art machine learning methods in natural language processing. Several of these models achieve very encouraging levels of accuracy in the acceptability prediction task, as measured by the correlation between the acceptability measure scores and mean human acceptability values. We consider the relevance of these results to the debate on the nature of grammatical competence, and we argue that they support the view that linguistic knowledge can be intrinsically probabilistic.
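    One acceptability measure used in this line of work is SLOR, which subtracts the unigram (lexical frequency) log-probability of a sentence from the language model's log-probability and divides by sentence length. The sketch below is a minimal, hypothetical illustration of that normalization; the input scores are made up and would in practice come from a trained language model and a unigram model estimated on a large corpus.

        # Minimal sketch of a length- and frequency-normalized acceptability measure
        # in the style of SLOR (syntactic log-odds ratio); the inputs below are hypothetical.

        def slor(logprob_model: float, unigram_logprobs: list[float]) -> float:
            """(log P_model(s) - log P_unigram(s)) / |s|, so a sentence is not penalized
            merely for being long or for containing rare words."""
            n = len(unigram_logprobs)
            logprob_unigram = sum(unigram_logprobs)
            return (logprob_model - logprob_unigram) / n

        # Hypothetical scores for a 6-word sentence: model log-probability and
        # per-word unigram log-probabilities.
        print(slor(-28.4, [-7.1, -3.2, -9.8, -4.0, -2.5, -6.3]))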

    BLiMP: The Benchmark of Linguistic Minimal Pairs for English

    We introduce The Benchmark of Linguistic Minimal Pairs (shortened to BLiMP), a challenge set for evaluating what language models (LMs) know about major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each containing 1,000 minimal pairs isolating specific contrasts in syntax, morphology, or semantics. The data are automatically generated according to expert-crafted grammars, and aggregate human agreement with the labels is 96.4%. We use it to evaluate n-gram, LSTM, and Transformer (GPT-2 and Transformer-XL) LMs. We find that state-of-the-art models identify morphological contrasts reliably, but they struggle with semantic restrictions on the distribution of quantifiers and negative polarity items, and with subtle syntactic phenomena such as extraction islands. Comment: To appear in TACL.
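    In this forced-choice setting, an LM is credited when it assigns higher probability to the acceptable member of a minimal pair. The sketch below shows that comparison with GPT-2 via the Hugging Face transformers library; the example pair is a hypothetical BLiMP-style pair, not taken from the benchmark, and the pretrained model must be downloaded.

        # Minimal sketch of forced-choice minimal-pair evaluation with GPT-2
        # (assumes the transformers and torch packages; the pair below is hypothetical).
        import torch
        from transformers import GPT2LMHeadModel, GPT2TokenizerFast

        tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
        model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

        def sentence_logprob(sentence: str) -> float:
            """Total log-probability the LM assigns to the sentence."""
            ids = tokenizer(sentence, return_tensors="pt").input_ids
            with torch.no_grad():
                loss = model(ids, labels=ids).loss  # mean negative log-likelihood per predicted token
            return -loss.item() * (ids.size(1) - 1)

        good = "The cats that the dog chased were hungry."
        bad = "The cats that the dog chased was hungry."

        # The model is credited when the grammatical member gets the higher probability.
        print("correct" if sentence_logprob(good) > sentence_logprob(bad) else "incorrect")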

    Why We Need New Evaluation Metrics for NLG

    The majority of NLG evaluation relies on automatic metrics such as BLEU. In this paper, we motivate the need for novel, system- and data-independent automatic evaluation methods: we investigate a wide range of metrics, including state-of-the-art word-based and novel grammar-based ones, and demonstrate that they only weakly reflect human judgements of system outputs as generated by data-driven, end-to-end NLG. We also show that metric performance is data- and system-specific. Nevertheless, our results also suggest that automatic metrics perform reliably at the system level and can support system development by finding cases where a system performs poorly. Comment: Accepted to EMNLP 2017.
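    A common way to quantify how well an automatic metric reflects human judgements is to correlate metric scores with human ratings over the same outputs. The sketch below does this with sentence-level BLEU (via sacreBLEU) and a Spearman correlation on hypothetical outputs and ratings; it illustrates the general methodology rather than the paper's exact experimental setup.

        # Minimal sketch of correlating an automatic metric with human ratings
        # (hypothetical NLG outputs and scores; assumes the sacrebleu and scipy packages).
        import sacrebleu
        from scipy.stats import spearmanr

        outputs = [
            "there is a cheap coffee shop near the river",
            "coffee shop cheap river near",
            "the venue serves coffee and is located by the river",
        ]
        references = ["there is a cheap coffee shop by the river"] * 3
        human_scores = [4.5, 1.5, 4.0]  # hypothetical mean human ratings

        metric_scores = [
            sacrebleu.sentence_bleu(out, [ref]).score
            for out, ref in zip(outputs, references)
        ]

        rho, p = spearmanr(metric_scores, human_scores)
        print(f"Spearman correlation between BLEU and human ratings: {rho:.2f} (p={p:.2f})")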

    The relationship between metalinguistic knowledge/learning contexts and language proficiency

    This study explores the effect of learning context on learners' oral proficiency, metalinguistic knowledge of Spanish (MKS), and metalinguistic knowledge of English (MKE). It also explores the relationship between MKE and MKS, and between MKS and oral proficiency, across the two learning contexts. The two contexts in question were a traditional semester (TS) that met five days a week, fifty minutes a day, for fifteen weeks, and an intensive summer (IS) program that met five days a week, four hours a day, for four weeks. A COPI (computerized oral proficiency interview) was administered to measure oral proficiency, and two different measures of metalinguistic knowledge were employed to test MKE and MKS. The MKE test was administered as a pre- and posttest, whereas the MKS test was given at the end of the semester. The study found that a) students in the TS group have significantly higher levels of MKS, b) students in the TS group improve their MKE significantly more than the IS group, c) there is a significant relationship between MKS and oral proficiency regardless of group, d) there is a significant relationship between the MKE pretest and MKS at the end of the semester, and e) there is no significant difference in oral proficiency between the two contexts.