Transductive Auxiliary Task Self-Training for Neural Multi-Task Models
Multi-task learning and self-training are two common ways to improve a
machine learning model's performance in settings with limited training data.
Drawing heavily on ideas from those two approaches, we suggest transductive
auxiliary task self-training: training a multi-task model on (i) a combination
of main and auxiliary task training data, and (ii) test instances with
auxiliary task labels which a single-task version of the model has previously
generated. We perform extensive experiments on 86 combinations of languages and
tasks. Our results show that transductive auxiliary task self-training
improves absolute accuracy by up to 9.56% over the pure multi-task model for
dependency relation tagging and by up to 13.03% for semantic tagging.
Comment: Camera ready version, to appear at DeepLo 2019 (EMNLP workshop)
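The procedure reduces to two training passes. A minimal sketch, assuming hypothetical `train_single` / `train_multi` helpers and a `predict` method (none of these are from the paper's code):

```python
# Transductive auxiliary task self-training, sketched as two steps.
# All helper names here are illustrative assumptions, not the authors' code.

def transductive_aux_self_training(main_train, aux_train, test_inputs,
                                   train_single, train_multi):
    # Step 1: a single-task model labels the test instances with
    # auxiliary-task predictions (the self-training part).
    aux_model = train_single(aux_train)
    pseudo_aux = [(x, aux_model.predict(x)) for x in test_inputs]

    # Step 2: the multi-task model is trained on (i) main and auxiliary
    # training data and (ii) the self-labelled test instances, which is
    # what makes the setup transductive.
    return train_multi(main_train, aux_train + pseudo_aux)
```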
Multi-Task Learning of Keyphrase Boundary Classification
Keyphrase boundary classification (KBC) is the task of detecting keyphrases
in scientific articles and labelling them with respect to predefined types.
Although important in practice, this task is so far underexplored, partly due
to the lack of labelled data. To overcome this, we explore several auxiliary
tasks, including semantic super-sense tagging and identification of multi-word
expressions, and cast the task as a multi-task learning problem with deep
recurrent neural networks. Our multi-task models perform significantly better
than previous state-of-the-art approaches on two scientific KBC datasets,
particularly for long keyphrases.
Comment: ACL 2017
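The multi-task setup described here is the standard shared-encoder pattern. A minimal PyTorch sketch with an assumed BiLSTM encoder and illustrative task names and sizes (one plausible reading of the abstract, not the authors' architecture):

```python
# Multi-task recurrent tagger: one shared BiLSTM, one output head per task.
# Task names, label counts, and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class MultiTaskTagger(nn.Module):
    def __init__(self, vocab_size, n_labels_per_task, emb_dim=100, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True,
                               bidirectional=True)
        # One classification head per task, all sharing the encoder.
        self.heads = nn.ModuleDict({
            task: nn.Linear(2 * hidden, n)
            for task, n in n_labels_per_task.items()
        })

    def forward(self, token_ids, task):
        states, _ = self.encoder(self.embed(token_ids))
        return self.heads[task](states)  # per-token label logits

model = MultiTaskTagger(vocab_size=10_000,
                        n_labels_per_task={"kbc": 5, "supersense": 41})
logits = model(torch.randint(0, 10_000, (2, 7)), task="kbc")  # (2, 7, 5)
```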
Simplified Neural Unsupervised Domain Adaptation
Unsupervised domain adaptation (UDA) is the task of modifying a statistical
model trained on labeled data from a source domain to achieve better
performance on data from a target domain, with access to only unlabeled data in
the target domain. Existing state-of-the-art UDA approaches use neural networks
to learn representations that can predict the values of a subset of important
features called "pivot features." In this work, we show that it is possible to
improve on these methods by jointly training the representation learner with
the task learner, and examine the importance of existing pivot selection
methods.
Comment: To be presented at NAACL 2019
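Joint training here means optimizing the task loss and the pivot-prediction loss through a shared encoder in the same update. A rough sketch of that idea, with all dimensions, names, and the loss weighting as assumptions:

```python
# Joint training of a shared encoder with a task head (labeled source data)
# and a pivot-feature head (unlabeled data from both domains). Illustrative
# sketch only; sizes and weighting are assumptions, not the paper's setup.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(5000, 256), nn.ReLU())  # bag-of-words encoder
task_head = nn.Linear(256, 2)     # source-task labels
pivot_head = nn.Linear(256, 100)  # presence of 100 pivot features

ce = nn.CrossEntropyLoss()
bce = nn.BCEWithLogitsLoss()  # pivots_all: multi-hot float targets
opt = torch.optim.Adam(list(enc.parameters()) + list(task_head.parameters())
                       + list(pivot_head.parameters()))

def joint_step(x_src, y_src, x_all, pivots_all, aux_weight=1.0):
    """One update: task loss on source data + pivot loss on all data."""
    loss = (ce(task_head(enc(x_src)), y_src)
            + aux_weight * bce(pivot_head(enc(x_all)), pivots_all))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```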
Neural Paraphrase Identification of Questions with Noisy Pretraining
We present a solution to the problem of paraphrase identification of
questions. We focus on a recent dataset of question pairs annotated with binary
paraphrase labels and show that a variant of the decomposable attention model
(Parikh et al., 2016) results in accurate performance on this task, while being
far simpler than many competing neural architectures. Furthermore, when the
model is pretrained on a noisy dataset of automatically collected question
paraphrases, it obtains the best reported performance on the dataset
- …
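The decomposable attention model factors sentence-pair comparison into attend, compare, and aggregate steps. A compact sketch of those steps, where `compare` and `aggregate` stand in for small feed-forward networks and all shapes are illustrative assumptions:

```python
# Attend / compare / aggregate, per Parikh et al. (2016), for one
# question pair. `compare` and `aggregate` are assumed callables
# (small MLPs in practice); this is a sketch, not the paper's code.
import torch
import torch.nn.functional as F

def decomposable_attention(a, b, compare, aggregate):
    """a: (la, d) and b: (lb, d) word vectors for the two questions."""
    # Attend: soft-align each word in one question to the other.
    scores = a @ b.T                              # (la, lb)
    b_aligned = F.softmax(scores, dim=1) @ b      # (la, d)
    a_aligned = F.softmax(scores, dim=0).T @ a    # (lb, d)
    # Compare: process each word jointly with its aligned counterpart.
    v1 = compare(torch.cat([a, b_aligned], dim=1))  # (la, h)
    v2 = compare(torch.cat([b, a_aligned], dim=1))  # (lb, h)
    # Aggregate: sum each side, then classify paraphrase / not-paraphrase.
    return aggregate(torch.cat([v1.sum(0), v2.sum(0)]))
```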