Supervised and Unsupervised Transfer Learning for Question Answering
Although transfer learning has been shown to be successful for tasks like
object and speech recognition, its applicability to question answering (QA) has
yet to be well-studied. In this paper, we conduct extensive experiments to
investigate the transferability of knowledge learned from a source QA dataset
to a target dataset using two QA models. The performance of both models on a
TOEFL listening comprehension test (Tseng et al., 2016) and MCTest (Richardson
et al., 2013) is significantly improved via a simple transfer learning
technique from MovieQA (Tapaswi et al., 2016). In particular, one of the models
achieves the state-of-the-art on all target datasets; for the TOEFL listening
comprehension test, it outperforms the previous best model by 7%. Finally, we
show that transfer learning is helpful even in unsupervised scenarios when
correct answers for target QA dataset examples are not available.
Comment: To appear in NAACL HLT 2018 (long paper)
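The supervised transfer setting described above (train on a source dataset, then continue training on a small target dataset) can be illustrated with a minimal sketch. This is not the paper's QA models or data: the logistic-regression learner, the synthetic task in `make_data`, and all hyperparameters are assumptions chosen only to keep the example self-contained.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(weights, data, epochs, lr=0.1):
    """Run SGD for logistic regression, starting from the given weights."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for i, xi in enumerate(x):
                w[i] += lr * (y - p) * xi
    return w

def accuracy(w, data):
    """Fraction of examples whose predicted class matches the label."""
    hits = 0
    for x, y in data:
        predicted = 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x))) >= 0.5 else 0
        hits += int(predicted == y)
    return hits / len(data)

def make_data(n, rng, noise=0.1):
    """Hypothetical toy task: label is 1 when x0 + x1 (plus noise) is positive."""
    data = []
    for _ in range(n):
        x0, x1 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        y = 1 if x0 + x1 + rng.gauss(0, noise) > 0 else 0
        data.append(([x0, x1, 1.0], y))  # last feature acts as a bias term
    return data

rng = random.Random(0)
source = make_data(500, rng)        # large source dataset
target_train = make_data(5, rng)    # very small target training set
target_test = make_data(200, rng)   # held-out target examples

# Step 1: pre-train on the source dataset.
pretrained = train([0.0, 0.0, 0.0], source, epochs=20)
# Step 2: fine-tune the pre-trained weights on the target dataset.
transferred = train(pretrained, target_train, epochs=5)
# Baseline for comparison: train on the target dataset only.
scratch = train([0.0, 0.0, 0.0], target_train, epochs=5)

transfer_acc = accuracy(transferred, target_test)
scratch_acc = accuracy(scratch, target_test)
```

Comparing `transfer_acc` with `scratch_acc` shows whether the pre-trained initialization helps on this toy task; the source and target here share the same labeling rule, which is the most favorable case for transfer.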
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
We present TriviaQA, a challenging reading comprehension dataset containing
over 650K question-answer-evidence triples. TriviaQA includes 95K
question-answer pairs authored by trivia enthusiasts and independently gathered
evidence documents, six per question on average, that provide high quality
distant supervision for answering the questions. We show that, in comparison to
other recently introduced large-scale datasets, TriviaQA (1) has relatively
complex, compositional questions, (2) has considerable syntactic and lexical
variability between questions and corresponding answer-evidence sentences, and
(3) requires more cross-sentence reasoning to find answers. We also present two
baseline algorithms: a feature-based classifier and a state-of-the-art neural
network that performs well on SQuAD reading comprehension. Neither approach
comes close to human performance (23% and 40% vs. 80%), suggesting that
TriviaQA is a challenging testbed that is worth significant future study. Data
and code are available at http://nlp.cs.washington.edu/triviaqa/
Comment: Added references, fixed typos, minor baseline update
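Distant supervision, as used above, means the evidence documents carry no hand-marked answer spans; a common way to obtain training signal is to treat every occurrence of the answer string in a document as a candidate span. The sketch below illustrates that general idea only. The function name `distant_spans`, the alias list, and the example document are hypothetical and not taken from the TriviaQA code.

```python
def distant_spans(answer_aliases, document):
    """Return sorted (start, end) character spans where any alias occurs.

    Matching is case-insensitive. Offsets are computed on the lowercased
    text, which assumes lowercasing preserves length (true for ASCII).
    """
    lowered = document.lower()
    spans = []
    for alias in answer_aliases:
        needle = alias.lower()
        if not needle:
            continue  # skip empty aliases
        start = lowered.find(needle)
        while start != -1:
            spans.append((start, start + len(needle)))
            start = lowered.find(needle, start + 1)
    return sorted(spans)

# Hypothetical question-answer pair and evidence document.
doc = "Paris is the capital of France. Many tourists visit Paris each year."
spans = distant_spans(["Paris"], doc)
matched = [doc[s:e] for s, e in spans]
```

Not every matched span actually answers the question in context, which is why this kind of labeling is noisy and is called distant rather than direct supervision.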
Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus
Much research effort is devoted to semantic role labeling (SRL), which is
crucial for natural language understanding. Supervised approaches have achieved
impressive performance when large-scale corpora are available, as they are for
resource-rich languages such as English. For low-resource languages with no
annotated SRL dataset, however, it remains challenging to obtain competitive
performance. Cross-lingual SRL, which has advanced greatly with the help of
model transfer and annotation projection, is one promising way to address the
problem. In this paper, we propose a novel alternative based on
corpus translation, constructing high-quality training datasets for the target
languages from the source gold-standard SRL annotations. Experimental results
on the Universal Proposition Bank show that the translation-based method is
highly effective, and that the automatic pseudo datasets can significantly
improve target-language SRL performance.
Comment: Accepted at ACL 202
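Building a target-language training corpus from source gold annotations requires carrying role labels across to the translated sentence. The abstract does not describe the exact procedure, so the following is only a generic sketch of label projection through word alignments. The function `project_roles`, the label scheme, the example sentences, and the hand-written alignment are all assumptions made for illustration.

```python
def project_roles(source_labels, alignment, target_length, default="O"):
    """Copy each labeled source token's role to the target tokens aligned to it.

    alignment is a list of (source_index, target_index) pairs; target tokens
    with no aligned labeled source token keep the default label.
    """
    target_labels = [default] * target_length
    for src_idx, tgt_idx in alignment:
        label = source_labels[src_idx]
        if label != default:
            target_labels[tgt_idx] = label
    return target_labels

# Hypothetical English sentence with gold roles for the predicate "seen".
source_tokens = ["I", "have", "seen", "the", "film"]
source_labels = ["A0", "O", "V", "A1", "A1"]

# Hypothetical German translation with a hand-written word alignment.
target_tokens = ["Ich", "habe", "den", "Film", "gesehen"]
alignment = [(0, 0), (1, 1), (2, 4), (3, 2), (4, 3)]

target_labels = project_roles(source_labels, alignment, len(target_tokens))
# target_labels == ["A0", "O", "A1", "A1", "V"]
```

The example is chosen so that word order differs between the two languages: the predicate label moves to the final position of the target sentence along with the verb it is aligned to.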