Neural Natural Language Inference Models Enhanced with External Knowledge
Modeling natural language inference is a very challenging task. With the
availability of large annotated data, it has recently become feasible to train
complex models such as neural-network-based inference models, which have been
shown to achieve state-of-the-art performance. Even with such relatively large
annotated data, can machines learn all the knowledge needed to perform natural
language inference (NLI)? If not, how can neural-network-based NLI models
benefit from external knowledge, and how should NLI models be built to
leverage it? In this paper, we enrich state-of-the-art neural natural language
inference models with external knowledge and demonstrate that the proposed
models achieve state-of-the-art performance on the SNLI and MultiNLI datasets.
Comment: Accepted by ACL 2018
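One common way to inject such external knowledge is to bias the premise-hypothesis co-attention with word-pair relation features (e.g., WordNet-derived synonymy or antonymy indicators). The sketch below is a minimal illustration of that idea, not this paper's exact architecture; the relation tensor `rel`, its projection weights, and all sizes are hypothetical placeholders.

```python
# Minimal sketch: biasing NLI co-attention with external word-pair relation
# features. Not the paper's architecture; `rel` and `rel_proj` are hypothetical.
import torch
import torch.nn.functional as F

def knowledge_enriched_attention(a, b, rel, rel_proj):
    """a: (len_a, d) premise states, b: (len_b, d) hypothesis states,
    rel: (len_a, len_b, n_rel) external relation features per word pair,
    rel_proj: (n_rel,) learned weights turning relation features into a bias."""
    scores = a @ b.t()                      # standard dot-product co-attention
    scores = scores + rel @ rel_proj        # bias alignment with external knowledge
    attn_ab = F.softmax(scores, dim=1)      # premise words attend over hypothesis
    attn_ba = F.softmax(scores.t(), dim=1)  # and vice versa
    aligned_a = attn_ab @ b                 # knowledge-aware soft alignments
    aligned_b = attn_ba @ a
    return aligned_a, aligned_b

# toy usage with random tensors
a, b = torch.randn(5, 8), torch.randn(7, 8)
rel, w = torch.rand(5, 7, 4), torch.randn(4)
aligned_a, aligned_b = knowledge_enriched_attention(a, b, rel, w)
```

In a full model, the aligned representations would then feed the usual local-inference and composition layers.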
The Sensitivity of Language Models and Humans to Winograd Schema Perturbations
Large-scale pretrained language models are the major driving force behind
recent improvements in performance on the Winograd Schema Challenge, a widely
employed test of common sense reasoning ability. We show, however, with a new
diagnostic dataset, that these models are sensitive to linguistic perturbations
of the Winograd examples that minimally affect human understanding. Our results
highlight interesting differences between humans and language models: language
models are more sensitive to number or gender alternations and synonym
replacements than humans, and humans are more stable and consistent in their
predictions, maintain a much higher absolute performance, and perform better on
non-associative instances than associative ones. Overall, humans are correct
more often than out-of-the-box models, and the models are sometimes right for
the wrong reasons. Finally, we show that fine-tuning on a large, task-specific
dataset can offer a solution to these issues.
Comment: ACL 2020
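As a rough illustration of this kind of sensitivity test, the sketch below scores both candidate resolutions of a Winograd-style sentence with an off-the-shelf pretrained language model and checks whether the preference survives a minimal perturbation (here, a synonym swap). The model choice and example sentences are illustrative and not drawn from the paper's diagnostic dataset.

```python
# Minimal sketch of a Winograd-style perturbation check with a pretrained LM.
# Model and sentences are illustrative, not the paper's dataset.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def lm_score(sentence):
    """Average negative log-likelihood; lower means more plausible to the model."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return loss.item()

def preferred(template, candidates):
    return min(candidates, key=lambda c: lm_score(template.format(c)))

original  = "The trophy doesn't fit in the suitcase because the {} is too big."
perturbed = "The trophy doesn't fit in the suitcase because the {} is too large."  # synonym swap
for t in (original, perturbed):
    print(t, "->", preferred(t, ["trophy", "suitcase"]))
```

A model that resolves the schema for the right reasons should keep the same preference under such minimal edits.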
Slim Embedding Layers for Recurrent Neural Language Models
Recurrent neural language models are the state-of-the-art models for language
modeling. When the vocabulary size is large, the space taken to store the model
parameters becomes the bottleneck for the use of recurrent neural language
models. In this paper, we introduce a simple space-compression method that
randomly shares structured parameters at both the input and output embedding
layers of a recurrent neural language model, significantly reducing the number
of model parameters while still compactly representing the original input and
output embedding layers. The method is easy to implement and tune. Experiments
on several datasets show that the new method achieves similar perplexity and
BLEU scores while using only a tiny fraction of the parameters.
Comment: To appear at AAAI 2018
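The core idea lends itself to a short sketch: each word's embedding is the concatenation of a few sub-vectors, and each (word, sub-vector) slot is mapped by a fixed random assignment to a row of a small shared pool, so only the pool is trained. This is a minimal illustration under those assumptions, not the authors' implementation; the class name and all sizes are made up.

```python
# Minimal sketch of random structured parameter sharing for an input embedding
# layer. Only the small shared pool is trained; the random mapping is fixed.
import torch
import torch.nn as nn

class SlimEmbedding(nn.Module):
    def __init__(self, vocab_size, dim, num_parts=4, pool_size=1000, seed=0):
        super().__init__()
        assert dim % num_parts == 0
        self.sub_dim = dim // num_parts
        # shared pool of sub-vectors: far fewer parameters than vocab_size x dim
        self.pool = nn.Embedding(pool_size, self.sub_dim)
        # fixed random assignment: which shared sub-vector each (word, part) uses
        g = torch.Generator().manual_seed(seed)
        mapping = torch.randint(pool_size, (vocab_size, num_parts), generator=g)
        self.register_buffer("mapping", mapping)

    def forward(self, word_ids):
        sub_ids = self.mapping[word_ids]   # (..., num_parts) pool indices per word
        sub_vecs = self.pool(sub_ids)      # (..., num_parts, sub_dim) shared pieces
        return sub_vecs.flatten(-2)        # (..., dim) concatenated embedding

emb = SlimEmbedding(vocab_size=100000, dim=256)
print(emb(torch.tensor([3, 17, 99999])).shape)   # torch.Size([3, 256])
```

The abstract applies the same sharing to the output embedding layer as well; the sketch shows only the input side.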
A Unified Multilingual Handwriting Recognition System using multigrams sub-lexical units
We address the design of a unified multilingual system for handwriting
recognition. Most multilingual systems rest on specialized models, each trained
on a single language, one of which is selected at test time. While some
recognition systems are based on a unified optical model, dealing with a
unified language model remains a major issue, as traditional language models
are generally trained on corpora composed of large per-language word lexicons.
Here, we bring a solution by considering language models based on sub-lexical
units, called multigrams. Dealing with multigrams strongly reduces the lexicon
size and thus decreases the language model complexity. This makes possible the
design of an end-to-end unified multilingual recognition system in which both a
single optical model and a single language model are trained on all the
languages. We discuss the impact of language unification on each model and show
that our system matches the performance of state-of-the-art methods with a
strong reduction in complexity.
Comment: preprint
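As a rough illustration of why sub-lexical units shrink the lexicon, the sketch below segments words from different languages into variable-length character units drawn from one shared inventory, so a single language model could be trained over a small, language-independent unit vocabulary. The greedy longest-match segmentation and the hard-coded inventory are simplifications; the paper's multigrams are learned units, not a fixed list.

```python
# Minimal sketch of sub-lexical segmentation with one shared unit inventory.
# Greedy longest-match and the hard-coded units are illustrative simplifications.
def segment(word, units, max_len=4):
    """Greedily split a word into the longest known units (single chars as fallback)."""
    pieces, i = [], 0
    while i < len(word):
        for length in range(min(max_len, len(word) - i), 0, -1):
            piece = word[i:i + length]
            if length == 1 or piece in units:
                pieces.append(piece)
                i += length
                break
    return pieces

shared_units = {"tion", "ing", "sch", "ei", "er", "re", "un", "ch", "th"}
for w in ["recognition", "understanding", "rechtschreibung"]:
    print(w, "->", segment(w, shared_units))
```

A single n-gram or neural language model trained over such shared units covers all languages with one small vocabulary instead of one large word lexicon per language.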
