Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations
We propose a novel data augmentation for labeled sentences called contextual
augmentation. We assume an invariance that sentences are natural even if the
words in the sentences are replaced with other words with paradigmatic
relations. We stochastically replace words with other words that are predicted
by a bi-directional language model at the word positions. Words predicted
according to a context are numerous but appropriate for the augmentation of the
original words. Furthermore, we retrofit a language model with a
label-conditional architecture, which allows the model to augment sentences
without breaking label compatibility. In experiments on six text
classification tasks, we demonstrate that the proposed method improves
classifiers based on convolutional or recurrent neural networks.
Comment: NAACL 201
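The stochastic replacement step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `TOY_PREDICTIONS` table is a hypothetical stand-in for the bi-directional language model, and the names `augment` and `replace_prob` are assumptions introduced here. A real model would condition its predictions on the full sentence context and, in the label-conditional variant, on the class label.

```python
import random

# Hypothetical stand-in for a bi-directional language model: for each word it
# lists candidate substitutes with probabilities. These entries are made up
# for illustration; a real model would predict them from the context.
TOY_PREDICTIONS = {
    "fantastic": [("great", 0.5), ("wonderful", 0.3), ("terrific", 0.2)],
    "movie": [("film", 0.7), ("picture", 0.3)],
}


def augment(tokens, replace_prob, rng):
    """Return a copy of tokens where some words are stochastically replaced."""
    out = []
    for tok in tokens:
        candidates = TOY_PREDICTIONS.get(tok)
        # Replace only words that have candidates, and only with probability
        # replace_prob; otherwise keep the original word.
        if candidates and rng.random() < replace_prob:
            words, weights = zip(*candidates)
            out.append(rng.choices(words, weights=weights, k=1)[0])
        else:
            out.append(tok)
    return out


sentence = "the actors are fantastic in this movie".split()
augmented = augment(sentence, replace_prob=0.5, rng=random.Random(0))
print(" ".join(augmented))
```

Sampling from the predicted distribution, rather than always taking the top candidate, is what gives each pass over the training data a different augmented sentence.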
Exploiting Rich Syntactic Information for Semantic Parsing with Graph-to-Sequence Model
Existing neural semantic parsers mainly utilize a sequence encoder, i.e., a
sequential LSTM, to extract word order features while neglecting other valuable
syntactic information such as dependency graphs or constituency trees. In this
paper, we first propose to use the syntactic graph to represent three
types of syntactic information, i.e., word order, dependency and constituency
features. We further employ a graph-to-sequence model to encode the syntactic
graph and decode a logical form. Experimental results on benchmark datasets
show that our model is comparable to the state-of-the-art on Jobs640, ATIS and
Geo880. Experimental results on adversarial examples demonstrate that the
robustness of the model is also improved by encoding more syntactic information.
Comment: EMNLP'1
European Union regulations on algorithmic decision-making and a "right to explanation"
We summarize the potential impact that the European Union's new General Data
Protection Regulation will have on the routine use of machine learning
algorithms. Slated to take effect as law across the EU in 2018, it will
restrict automated individual decision-making (that is, algorithms that make
decisions based on user-level predictors) which "significantly affect" users.
The law will also effectively create a "right to explanation," whereby a user
can ask for an explanation of an algorithmic decision that was made about them.
We argue that while this law will pose large challenges for industry, it
highlights opportunities for computer scientists to take the lead in designing
algorithms and evaluation frameworks which avoid discrimination and enable
explanation.
Comment: presented at 2016 ICML Workshop on Human Interpretability in Machine
Learning (WHI 2016), New York, N
How Much is 131 Million Dollars? Putting Numbers in Perspective with Compositional Descriptions
How much is 131 million US dollars? To help readers put such numbers in
context, we propose a new task of automatically generating short descriptions
known as perspectives, e.g. "$131 million is about the cost to employ everyone
in Texas over a lunch period". First, we collect a dataset of numeric mentions
in news articles, where each mention is labeled with a set of rated
perspectives. We then propose a system to generate these descriptions
consisting of two steps: formula construction and description generation. In
construction, we compose formulas from numeric facts in a knowledge base and
rank the resulting formulas based on familiarity, numeric proximity and
semantic compatibility. In generation, we convert a formula into natural
language using a sequence-to-sequence recurrent neural network. Our system
obtains a 15.2% F1 improvement over a non-compositional baseline at formula
construction and a 12.5 BLEU point improvement over a baseline description
generation
- …
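The formula construction step in the last abstract above can be sketched roughly as follows. This is an assumption-laden illustration, not the paper's system: the `FACTS` table holds placeholder values rather than real knowledge-base entries, units are ignored, and only the numeric-proximity criterion is shown (the abstract also names familiarity and semantic compatibility, which are omitted here). The function names `compose` and `rank_by_proximity` are hypothetical.

```python
import itertools
import math

# Hypothetical toy knowledge base: fact name -> numeric value.
# Placeholder values for illustration only, not data from the paper.
FACTS = {"unit_a": 4.0, "unit_b": 25.0, "unit_c": 1000.0}


def compose(facts, max_terms=2):
    """Yield (names, product) for every combination of up to max_terms facts."""
    for r in range(1, max_terms + 1):
        for combo in itertools.combinations(sorted(facts), r):
            yield combo, math.prod(facts[name] for name in combo)


def rank_by_proximity(target, facts, max_terms=2):
    """Rank composed formulas by how close their value is to target.

    Distance is measured in log space, so being off by a factor of ten
    counts the same in either direction.
    """
    scored = [
        (abs(math.log10(value / target)), names, value)
        for names, value in compose(facts, max_terms)
    ]
    scored.sort()
    return [(names, value) for _, names, value in scored]


best_names, best_value = rank_by_proximity(100.0, FACTS)[0]
print(best_names, best_value)
```

A full system would also have to check that the units of the composed formula match the units of the mention, which this sketch leaves out.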