1,393 research outputs found
A large annotated corpus for learning natural language inference
Understanding entailment and contradiction is fundamental to understanding
natural language, and inference about entailment and contradiction is a
valuable testing ground for the development of semantic representations.
However, machine learning research in this area has been dramatically limited
by the lack of large-scale resources. To address this, we introduce the
Stanford Natural Language Inference corpus, a new, freely available collection
of labeled sentence pairs, written by humans doing a novel grounded task based
on image captioning. At 570K pairs, it is two orders of magnitude larger than
all other resources of its type. This increase in scale allows lexicalized
classifiers to outperform some sophisticated existing entailment models, and it
allows a neural network-based model to perform competitively on natural
language inference benchmarks for the first time.Comment: To appear at EMNLP 2015. The data will be posted shortly before the
conference (the week of 14 Sep) at http://nlp.stanford.edu/projects/snli
Language (Technology) is Power: A Critical Survey of "Bias" in NLP
We survey 146 papers analyzing "bias" in NLP systems, finding that their
motivations are often vague, inconsistent, and lacking in normative reasoning,
despite the fact that analyzing "bias" is an inherently normative process. We
further find that these papers' proposed quantitative techniques for measuring
or mitigating "bias" are poorly matched to their motivations and do not engage
with the relevant literature outside of NLP. Based on these findings, we
describe the beginnings of a path forward by proposing three recommendations
that should guide work analyzing "bias" in NLP systems. These recommendations
rest on a greater recognition of the relationships between language and social
hierarchies, encouraging researchers and practitioners to articulate their
conceptualizations of "bias"---i.e., what kinds of system behaviors are
harmful, in what ways, to whom, and why, as well as the normative reasoning
underlying these statements---and to center work around the lived experiences
of members of communities affected by NLP systems, while interrogating and
reimagining the power relations between technologists and such communities
- …