367 research outputs found
Neural Natural Language Inference Models Enhanced with External Knowledge
Modeling natural language inference is a very challenging task. With the
availability of large annotated data, it has recently become feasible to train
complex models such as neural-network-based inference models, which have shown
to achieve the state-of-the-art performance. Although there exist relatively
large annotated data, can machines learn all knowledge needed to perform
natural language inference (NLI) from these data? If not, how can
neural-network-based NLI models benefit from external knowledge and how to
build NLI models to leverage it? In this paper, we enrich the state-of-the-art
neural natural language inference models with external knowledge. We
demonstrate that the proposed models improve neural NLI models to achieve the
state-of-the-art performance on the SNLI and MultiNLI datasets.Comment: Accepted by ACL 201
Modeling Empathy and Distress in Reaction to News Stories
Computational detection and understanding of empathy is an important factor
in advancing human-computer interaction. Yet to date, text-based empathy
prediction has the following major limitations: It underestimates the
psychological complexity of the phenomenon, adheres to a weak notion of ground
truth where empathic states are ascribed by third parties, and lacks a shared
corpus. In contrast, this contribution presents the first publicly available
gold standard for empathy prediction. It is constructed using a novel
annotation methodology which reliably captures empathy assessments by the
writer of a statement using multi-item scales. This is also the first
computational work distinguishing between multiple forms of empathy, empathic
concern, and personal distress, as recognized throughout psychology. Finally,
we present experimental results for three different predictive models, of which
a CNN performs the best.Comment: To appear at EMNLP 201
Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems
Visual modifications to text are often used to obfuscate offensive comments
in social media (e.g., "!d10t") or as a writing style ("1337" in "leet speak"),
among other scenarios. We consider this as a new type of adversarial attack in
NLP, a setting to which humans are very robust, as our experiments with both
simple and more difficult visual input perturbations demonstrate. We then
investigate the impact of visual adversarial attacks on current NLP systems on
character-, word-, and sentence-level tasks, showing that both neural and
non-neural models are, in contrast to humans, extremely sensitive to such
attacks, suffering performance decreases of up to 82\%. We then explore three
shielding methods---visual character embeddings, adversarial training, and
rule-based recovery---which substantially improve the robustness of the models.
However, the shielding methods still fall behind performances achieved in
non-attack scenarios, which demonstrates the difficulty of dealing with visual
attacks.Comment: Accepted as long paper at NAACL-2019; fixed one ungrammatical
sentenc
SRL4ORL: Improving Opinion Role Labeling using Multi-task Learning with Semantic Role Labeling
For over a decade, machine learning has been used to extract
opinion-holder-target structures from text to answer the question "Who
expressed what kind of sentiment towards what?". Recent neural approaches do
not outperform the state-of-the-art feature-based models for Opinion Role
Labeling (ORL). We suspect this is due to the scarcity of labeled training data
and address this issue using different multi-task learning (MTL) techniques
with a related task which has substantially more data, i.e. Semantic Role
Labeling (SRL). We show that two MTL models improve significantly over the
single-task model for labeling of both holders and targets, on the development
and the test sets. We found that the vanilla MTL model which makes predictions
using only shared ORL and SRL features, performs the best. With deeper analysis
we determine what works and what might be done to make further improvements for
ORL.Comment: Published in NAACL 201
Region-Attentive Multimodal Neural Machine Translation
We propose a multimodal neural machine translation (MNMT) method with semantic image regions called region-attentive multimodal neural machine translation (RA-NMT). Existing studies on MNMT have mainly focused on employing global visual features or equally sized grid local visual features extracted by convolutional neural networks (CNNs) to improve translation performance. However, they neglect the effect of semantic information captured inside the visual features. This study utilizes semantic image regions extracted by object detection for MNMT and integrates visual and textual features using two modality-dependent attention mechanisms. The proposed method was implemented and verified on two neural architectures of neural machine translation (NMT): recurrent neural network (RNN) and self-attention network (SAN). Experimental results on different language pairs of Multi30k dataset show that our proposed method improves over baselines and outperforms most of the state-of-the-art MNMT methods. Further analysis demonstrates that the proposed method can achieve better translation performance because of its better visual feature use
An Empirical Analysis of NMT-Derived Interlingual Embeddings and their Use in Parallel Sentence Identification
End-to-end neural machine translation has overtaken statistical machine
translation in terms of translation quality for some language pairs, specially
those with large amounts of parallel data. Besides this palpable improvement,
neural networks provide several new properties. A single system can be trained
to translate between many languages at almost no additional cost other than
training time. Furthermore, internal representations learned by the network
serve as a new semantic representation of words -or sentences- which, unlike
standard word embeddings, are learned in an essentially bilingual or even
multilingual context. In view of these properties, the contribution of the
present work is two-fold. First, we systematically study the NMT context
vectors, i.e. output of the encoder, and their power as an interlingua
representation of a sentence. We assess their quality and effectiveness by
measuring similarities across translations, as well as semantically related and
semantically unrelated sentence pairs. Second, as extrinsic evaluation of the
first point, we identify parallel sentences in comparable corpora, obtaining an
F1=98.2% on data from a shared task when using only NMT context vectors. Using
context vectors jointly with similarity measures F1 reaches 98.9%.Comment: 11 pages, 4 figure
- …