Contextual Augmentation: Data Augmentation by Words with Paradigmatic Relations
We propose a novel data augmentation method for labeled sentences, called contextual augmentation. We assume an invariance: a sentence remains natural even when its words are replaced with other words that stand in paradigmatic relations to them. We stochastically replace words with alternatives predicted by a bi-directional language model at each word position. Words predicted from the context are numerous yet appropriate as augmentations of the original words. Furthermore, we retrofit the language model with a label-conditional architecture, which allows it to augment sentences without breaking label compatibility. Through experiments on six different text classification tasks, we demonstrate that the proposed method improves classifiers based on convolutional or recurrent neural networks.
Comment: NAACL 201
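The core replacement step can be sketched in a few lines. This is a toy illustration, not the authors' implementation: the dictionary `TOY_LM` stands in for a real bi-directional language model, and the label-conditional part is omitted.

```python
import random

# Hypothetical stand-in for a bi-directional LM: maps a (left word, right word)
# context to plausible replacement words for the position in between.
TOY_LM = {
    ("the", "is"): ["movie", "film", "performance"],
    ("is", "."): ["great", "fine", "dull"],
}

def contextual_augment(tokens, lm=TOY_LM, p=0.5, rng=None):
    """Stochastically replace each inner token with a word that the
    (toy) language model predicts for its surrounding context."""
    rng = rng or random.Random(0)
    out = list(tokens)
    for i in range(1, len(tokens) - 1):
        context = (tokens[i - 1], tokens[i + 1])
        candidates = lm.get(context)
        if candidates and rng.random() < p:
            out[i] = rng.choice(candidates)
    return out

print(contextual_augment(["the", "movie", "is", "great", "."]))
```

Because replacements are drawn from words the model deems plausible in that exact context, the augmented sentence tends to stay natural, unlike synonym lookup from a static thesaurus.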
Learning to Select, Track, and Generate for Data-to-Text
We propose a data-to-text generation model with two modules: one for tracking and one for text generation. The tracking module selects and keeps track of salient information and memorizes which records have been mentioned. The generation module generates a summary conditioned on the state of the tracking module. Our model can be seen as simulating a human-like writing process that gradually selects information by determining intermediate variables while writing the summary. In addition, we explore the effectiveness of writer information for generation. Experimental results show that our model outperforms existing models on all evaluation metrics even without writer information; incorporating writer information further improves performance, contributing to both content planning and surface realization.
Comment: ACL 201
Question Dependent Recurrent Entity Network for Question Answering
Question Answering is a task that requires building models capable of providing answers to questions expressed in human language. Full question answering involves some form of reasoning ability. We introduce a neural network architecture for this task: a form of Memory Network that recognizes entities and their relations to answers through a focus attention mechanism. Our model, named the Question Dependent Recurrent Entity Network, extends the Recurrent Entity Network by exploiting aspects of the question during the memorization process. We validate the model on both synthetic and real datasets: the bAbI question answering dataset and the CNN & Daily News reading comprehension dataset. In our experiments, our model improves on the existing Recurrent Entity Network and achieves competitive results on both datasets.
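The key architectural change, making the memory-update gate depend on the question, can be sketched with scalar math. This is an illustrative simplification (plain dot products, no learned weight matrices), not the published model.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def qdren_gate(sentence, key, memory, question):
    """Update gate for one memory slot. The Recurrent Entity Network
    gates on the slot key and current memory; the question-dependent
    variant adds a question term, so slots relevant to the question
    open their gates more readily during memorization."""
    return sigmoid(dot(sentence, key) + dot(sentence, memory) + dot(sentence, question))
```

With a sentence that matches the slot key, a question vector aligned with the sentence raises the gate, while a misaligned question suppresses it, which is the intended "question-dependent" focus.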
Open Vocabulary Learning on Source Code with a Graph-Structured Cache
Machine learning models that take computer program source code as input typically use Natural Language Processing (NLP) techniques. However, a major challenge is that code is written with an open, rapidly changing vocabulary due to, e.g., the coinage of new variable and method names. Most NLP methods are not designed to reason over such a vocabulary. We introduce a Graph-Structured Cache to address this problem; the cache contains a node for each new word the model encounters, with edges connecting each word to its occurrences in the code. We find that combining this graph-structured cache strategy with recent Graph-Neural-Network-based models for supervised learning on code improves the models' performance on a code completion task and a variable naming task (over 100% relative improvement on the latter), at the cost of a moderate increase in computation time.
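The cache's data structure is simple to sketch: one node per out-of-vocabulary word, with edges to the positions where it occurs. The class below is a hypothetical illustration of that bookkeeping only; the graph neural network that consumes these edges is omitted.

```python
from collections import defaultdict

class GraphStructuredCache:
    """Toy sketch: a node for each word outside the fixed vocabulary,
    with edges linking that word to its occurrence sites in the code."""

    def __init__(self, vocabulary):
        self.vocabulary = set(vocabulary)
        self.edges = defaultdict(list)   # word -> list of occurrence node ids

    def observe(self, word, occurrence_id):
        # Only open-vocabulary words (e.g., fresh identifiers) get cache nodes.
        if word not in self.vocabulary:
            self.edges[word].append(occurrence_id)

    def occurrences(self, word):
        return self.edges.get(word, [])

cache = GraphStructuredCache(vocabulary={"int", "return", "if"})
for node_id, token in enumerate(["int", "userCount", "if", "userCount"]):
    cache.observe(token, node_id)
print(cache.occurrences("userCount"))
```

Because each new identifier is a single shared node, every occurrence of `userCount` points at the same cache entry, letting a downstream graph model relate uses of a name it has never seen in training.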