Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations
We propose a novel data augmentation method for labeled sentences, called contextual augmentation. We assume an invariance: a sentence remains natural even when its words are replaced with other words that stand in paradigmatic relations to them. We stochastically replace words with alternatives predicted by a bi-directional language model at each word position. Words predicted from the context are numerous yet appropriate as augmentations of the original words. Furthermore, we retrofit the language model with a label-conditional architecture, which allows it to augment sentences without breaking label compatibility. Through experiments on six different text classification tasks, we demonstrate that the proposed method improves classifiers based on convolutional or recurrent neural networks.
Comment: NAACL 201
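The core replacement step can be sketched in a few lines. This is a toy illustration, not the authors' implementation: the dictionary `TOY_LM` stands in for a real bi-directional language model, and the label-conditional part is omitted.

```python
import random

# Hypothetical stand-in for a bi-directional LM: maps a (left word, right word)
# context to plausible replacement words for the position in between.
TOY_LM = {
    ("the", "is"): ["movie", "film", "performance"],
    ("is", "."): ["great", "fine", "dull"],
}

def contextual_augment(tokens, lm=TOY_LM, p=0.5, rng=None):
    """Stochastically replace each inner token with a word that the
    (toy) language model predicts for its surrounding context."""
    rng = rng or random.Random(0)
    out = list(tokens)
    for i in range(1, len(tokens) - 1):
        context = (tokens[i - 1], tokens[i + 1])
        candidates = lm.get(context)
        if candidates and rng.random() < p:
            out[i] = rng.choice(candidates)
    return out

print(contextual_augment(["the", "movie", "is", "great", "."]))
```

Because replacements are drawn from words the model deems plausible in that exact context, the augmented sentence tends to stay natural, unlike synonym lookup from a static thesaurus.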
Learning to Select, Track, and Generate for Data-to-Text
We propose a data-to-text generation model with two modules: one for tracking and one for text generation. The tracking module selects and keeps track of salient information and memorizes which records have been mentioned. The generation module generates a summary conditioned on the state of the tracking module. Our model can be seen as simulating a human-like writing process that gradually selects information by determining intermediate variables while writing the summary. In addition, we explore the effectiveness of writer information for generation. Experimental results show that our model outperforms existing models on all evaluation metrics even without writer information; incorporating writer information further improves performance, contributing to both content planning and surface realization.
Comment: ACL 201
Question Dependent Recurrent Entity Network for Question Answering
Question Answering is a task that requires building models capable of providing answers to questions expressed in human language. Full question answering involves some form of reasoning ability. We introduce a neural network architecture for this task: a form of Memory Network that recognizes entities and their relations to answers through a focus attention mechanism. Our model, named the Question Dependent Recurrent Entity Network, extends the Recurrent Entity Network by exploiting aspects of the question during the memorization process. We validate the model on both synthetic and real datasets: the bAbI question answering dataset and the CNN & Daily News reading comprehension dataset. In our experiments, our model improves on the existing Recurrent Entity Network and achieves competitive results on both datasets.
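The key architectural change, making the memory-update gate depend on the question, can be sketched with scalar math. This is an illustrative simplification (plain dot products, no learned weight matrices), not the published model.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def qdren_gate(sentence, key, memory, question):
    """Update gate for one memory slot. The Recurrent Entity Network
    gates on the slot key and current memory; the question-dependent
    variant adds a question term, so slots relevant to the question
    open their gates more readily during memorization."""
    return sigmoid(dot(sentence, key) + dot(sentence, memory) + dot(sentence, question))
```

With a sentence that matches the slot key, a question vector aligned with the sentence raises the gate, while a misaligned question suppresses it, which is the intended "question-dependent" focus.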
Open Vocabulary Learning on Source Code with a Graph-Structured Cache
Machine learning models that take computer program source code as input typically use Natural Language Processing (NLP) techniques. However, a major challenge is that code is written with an open, rapidly changing vocabulary due to, e.g., the coinage of new variable and method names. Most NLP methods are not designed to reason over such a vocabulary. We introduce a Graph-Structured Cache to address this problem; the cache contains a node for each new word the model encounters, with edges connecting each word to its occurrences in the code. We find that combining this graph-structured cache strategy with recent Graph-Neural-Network-based models for supervised learning on code improves the models' performance on a code completion task and a variable naming task (over 100% relative improvement on the latter), at the cost of a moderate increase in computation time.
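The cache's data structure is simple to sketch: one node per out-of-vocabulary word, with edges to the positions where it occurs. The class below is a hypothetical illustration of that bookkeeping only; the graph neural network that consumes these edges is omitted.

```python
from collections import defaultdict

class GraphStructuredCache:
    """Toy sketch: a node for each word outside the fixed vocabulary,
    with edges linking that word to its occurrence sites in the code."""

    def __init__(self, vocabulary):
        self.vocabulary = set(vocabulary)
        self.edges = defaultdict(list)   # word -> list of occurrence node ids

    def observe(self, word, occurrence_id):
        # Only open-vocabulary words (e.g., fresh identifiers) get cache nodes.
        if word not in self.vocabulary:
            self.edges[word].append(occurrence_id)

    def occurrences(self, word):
        return self.edges.get(word, [])

cache = GraphStructuredCache(vocabulary={"int", "return", "if"})
for node_id, token in enumerate(["int", "userCount", "if", "userCount"]):
    cache.observe(token, node_id)
print(cache.occurrences("userCount"))
```

Because each new identifier is a single shared node, every occurrence of `userCount` points at the same cache entry, letting a downstream graph model relate uses of a name it has never seen in training.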