Active Learning for Visual Question Answering: An Empirical Study
We present an empirical study of active learning for Visual Question
Answering, where a deep VQA model selects informative question-image pairs from
a pool and queries an oracle for answers to maximally improve its performance
under a limited query budget. Drawing analogies from human learning, we explore
cramming (entropy), curiosity-driven (expected model change), and goal-driven
(expected error reduction) active learning approaches, and propose a fast and
effective goal-driven active learning scoring function to pick question-image
pairs for deep VQA models under the Bayesian Neural Network framework. We find
that deep VQA models need large amounts of training data before they can start
asking informative questions. But once they do, all three approaches outperform
the random selection baseline and achieve significant query savings. For the
scenario where the model is allowed to ask generic questions about images but
is evaluated only on specific questions (e.g., questions whose answer is either
yes or no), our proposed goal-driven scoring function performs the best.
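As a rough illustration of the entropy ("cramming") criterion named in this abstract, the sketch below ranks pool items by the entropy of a predicted answer distribution and selects the most uncertain ones under a query budget. The function names, pair ids, and toy probabilities are hypothetical and are not taken from the paper, which applies these ideas to deep VQA models in a Bayesian Neural Network framework.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_by_entropy(pool, budget):
    """Return the ids of the `budget` pool items with the highest predictive entropy.

    `pool` maps a (hypothetical) question-image pair id to the model's
    predicted answer distribution for that pair.
    """
    ranked = sorted(pool, key=lambda k: entropy(pool[k]), reverse=True)
    return ranked[:budget]

# Toy pool: three question-image pairs with assumed answer distributions.
pool = {
    "qi_1": [0.98, 0.01, 0.01],  # model is confident
    "qi_2": [0.34, 0.33, 0.33],  # model is very uncertain
    "qi_3": [0.70, 0.20, 0.10],  # model is moderately uncertain
}
selected = select_by_entropy(pool, budget=2)
print(selected)
```

The selected pairs would then be sent to the oracle for answers. The curiosity-driven and goal-driven criteria would replace `entropy` with a different scoring function.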
Sentence Pair Scoring: Towards Unified Framework for Text Comprehension
We review the task of Sentence Pair Scoring, popular in the literature in
various forms - viewed as Answer Sentence Selection, Semantic Text Scoring,
Next Utterance Ranking, Recognizing Textual Entailment, Paraphrasing or e.g. a
component of Memory Networks.
We argue that all such tasks are similar from the model perspective and
propose new baselines by comparing the performance of common IR metrics and
popular convolutional, recurrent and attention-based neural models across many
Sentence Pair Scoring tasks and datasets. We discuss the problem of evaluating
randomized models, propose a statistically grounded methodology, and attempt to
improve comparisons by releasing new datasets that are much harder than some of
the currently used well explored benchmarks. We introduce a unified open source
software framework with easily pluggable models and tasks, which enables us to
experiment with multi-task reusability of trained sentence models. We set a new
state of the art in performance on the Ubuntu Dialogue dataset.
Comment: submitted as paper to CoNLL 201
The Cascade Transformer: an Application for Efficient Answer Sentence Selection
Large transformer-based language models have been shown to be very effective
in many classification tasks. However, their computational complexity prevents
their use in applications requiring the classification of a large set of
candidates. While previous works have investigated approaches to reduce model
size, relatively little attention has been paid to techniques to improve batch
throughput during inference. In this paper, we introduce the Cascade
Transformer, a simple yet effective technique to adapt transformer-based models
into a cascade of rankers. Each ranker is used to prune a subset of candidates
in a batch, thus dramatically increasing throughput at inference time. Partial
encodings from the transformer model are shared among rerankers, providing
further speed-up. When compared to a state-of-the-art transformer model, our
approach reduces computation by 37% with almost no impact on accuracy, as
measured on two English Question Answering datasets.
Comment: Accepted to ACL 2020 (long
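The pruning idea in this abstract can be sketched in a few lines, under heavy simplifying assumptions: each stage scores the surviving candidates, and every stage except the last drops the lowest-scoring ones before the next, more expensive stage runs. The scorers below are hypothetical word-overlap functions standing in for transformer layers, and the sharing of partial encodings between rankers is not modeled.

```python
def cascade_rank(candidates, scorers, keep_fraction=0.5):
    """Rank candidates with a cascade of scorers, pruning between stages.

    `scorers` is ordered from cheapest to most expensive. After every stage
    except the last, only the top `keep_fraction` of candidates survive.
    """
    survivors = list(candidates)
    for stage, scorer in enumerate(scorers):
        survivors.sort(key=scorer, reverse=True)
        if stage < len(scorers) - 1:
            keep = max(1, int(len(survivors) * keep_fraction))
            survivors = survivors[:keep]
    return survivors

# Hypothetical question and candidate answer sentences.
question = {"who", "wrote", "hamlet"}
candidates = [
    "shakespeare wrote hamlet",
    "hamlet is a play",
    "the weather is nice",
    "who wrote it",
]

def cheap_score(sentence):
    """Stand-in for an early ranker: raw word overlap with the question."""
    return len(set(sentence.split()) & question)

def fine_score(sentence):
    """Stand-in for the final ranker: overlap ignoring the interrogative word."""
    return len(set(sentence.split()) & (question - {"who"}))

ranked = cascade_rank(candidates, [cheap_score, fine_score])
print(ranked)
```

In this toy run the expensive scorer sees only half of the candidates, which is where the throughput gain would come from in a real cascade.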
A Compare-Aggregate Model with Latent Clustering for Answer Selection
In this paper, we propose a novel method for a sentence-level
answer-selection task that is a fundamental problem in natural language
processing. First, we explore the effect of additional information by adopting
a pretrained language model to compute the vector representation of the input
text and by applying transfer learning from a large-scale corpus. Second, we
enhance the compare-aggregate model by proposing a novel latent clustering
method to compute additional information within the target corpus and by
changing the objective function from listwise to pointwise. To evaluate the
performance of the proposed approaches, experiments are performed with the
WikiQA and TREC-QA datasets. The empirical results demonstrate the superiority
of our proposed approach, which achieves state-of-the-art performance for both
datasets.
Comment: 5 pages, Accepted as a conference paper at CIKM 201
Open-Retrieval Conversational Question Answering
Conversational search is one of the ultimate goals of information retrieval.
Recent research approaches conversational search by simplified settings of
response ranking and conversational question answering, where an answer is
either selected from a given candidate set or extracted from a given passage.
These simplifications neglect the fundamental role of retrieval in
conversational search. To address this limitation, we introduce an
open-retrieval conversational question answering (ORConvQA) setting, where we
learn to retrieve evidence from a large collection before extracting answers,
as a further step towards building functional conversational search systems. We
create a dataset, OR-QuAC, to facilitate research on ORConvQA. We build an
end-to-end system for ORConvQA, featuring a retriever, a reranker, and a reader
that are all based on Transformers. Our extensive experiments on OR-QuAC
demonstrate that a learnable retriever is crucial for ORConvQA. We further show
that our system can make a substantial improvement when we enable history
modeling in all system components. Moreover, we show that the reranker
component contributes to the model performance by providing a regularization
effect. Finally, further in-depth analyses are performed to provide new
insights into ORConvQA.
Comment: Accepted to SIGIR'2
Degrees of Freedom in Deep Neural Networks
In this paper, we explore degrees of freedom in deep sigmoidal neural
networks. We show that the degrees of freedom in these models are related to the
expected optimism, which is the expected difference between test error and
training error. We provide an efficient Monte-Carlo method to estimate the
degrees of freedom for multi-class classification methods. We show degrees of
freedom are lower than the parameter count in a simple XOR network. We extend
these results to neural nets trained on synthetic and real data, and
investigate the impact of network architecture and different regularization
choices. The degrees of freedom in deep networks are dramatically smaller than
the number of parameters, in some real datasets several orders of magnitude.
Further, we observe that for a fixed number of parameters, deeper networks have
fewer degrees of freedom, exhibiting a regularization-by-depth
Text-based Question Answering from Information Retrieval and Deep Neural Network Perspectives: A Survey
Text-based Question Answering (QA) is a challenging task which aims at
finding short concrete answers for users' questions. This line of research has
been widely studied with information retrieval techniques and has received
increasing attention in recent years by considering deep neural network
approaches. Deep learning approaches, which are the main focus of this paper,
provide a powerful technique to learn multiple layers of representations and
interaction between questions and texts. In this paper, we provide a
comprehensive overview of different models proposed for the QA task, including
both the traditional information retrieval perspective and the more recent deep
neural network perspective. We also introduce well-known datasets for the task
and present available results from the literature to compare different
techniques.
Question Answering on Knowledge Bases and Text using Universal Schema and Memory Networks
Existing question answering methods infer answers either from a knowledge
base or from raw text. While knowledge base (KB) methods are good at answering
compositional questions, their performance is often affected by the
incompleteness of the KB. By contrast, web text contains millions of facts
that are absent from the KB, albeit in an unstructured form. Universal
schema can support reasoning on the union of both structured KBs and
unstructured text by aligning them in a common embedded space. In this paper we
extend universal schema to natural language question answering, employing
memory networks to attend to the large body of facts in the combination
of text and KB. Our models can be trained in an end-to-end fashion on
question-answer pairs. Evaluation results on the Spades fill-in-the-blank question
answering dataset show that exploiting universal schema for question answering
is better than using either a KB or text alone. This model also outperforms the
current state-of-the-art by 8.5 points. Code and data are available
at https://rajarshd.github.io/TextKBQA
Comment: ACL 2017 (short
Unsupervised Domain Adaptation on Reading Comprehension
Reading comprehension (RC) has been studied in a variety of datasets with the
boosted performance brought by deep neural networks. However, the
generalization capability of these models across different domains remains
unclear. To address this issue, we investigate unsupervised
domain adaptation on RC, wherein a model is trained on a labeled source domain
and applied to a target domain with only unlabeled samples. We first
show that even with the powerful BERT contextual representation, the
performance is still unsatisfactory when the model trained on one dataset is
directly applied to another target dataset. To solve this, we provide a novel
conditional adversarial self-training method (CASe). Specifically, our approach
leverages a BERT model fine-tuned on the source dataset along with the
confidence filtering to generate reliable pseudo-labeled samples in the target
domain for self-training. On the other hand, it further reduces domain
distribution discrepancy through conditional adversarial learning across
domains. Extensive experiments show our approach achieves comparable accuracy
to supervised models on multiple large-scale benchmark datasets.
Comment: 8 pages, 6 figures, 5 tables, Accepted by AAAI 202
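The confidence-filtering step described in this abstract can be illustrated with a minimal sketch: predictions made on unlabeled target-domain samples are kept as pseudo-labels only when the model's confidence clears a threshold. The function name, the threshold value, and the toy predictions are assumptions made for illustration. The paper's actual method uses a fine-tuned BERT model and adds conditional adversarial learning, neither of which is shown here.

```python
def filter_pseudo_labels(predictions, threshold=0.9):
    """Keep (sample, label) pairs whose confidence is at least `threshold`.

    `predictions` is an iterable of (sample, predicted_label, confidence)
    tuples produced by a source-trained model on unlabeled target data.
    """
    return [
        (sample, label)
        for sample, label, confidence in predictions
        if confidence >= threshold
    ]

# Hypothetical model outputs on three unlabeled target-domain questions.
predictions = [
    ("target question 1", "answer span A", 0.97),
    ("target question 2", "answer span B", 0.55),
    ("target question 3", "answer span C", 0.91),
]
pseudo_labeled = filter_pseudo_labels(predictions, threshold=0.9)
print(pseudo_labeled)
```

The retained pairs would then serve as training data for a further round of self-training on the target domain.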
Bi-Directional Neural Machine Translation with Synthetic Parallel Data
Despite impressive progress in high-resource settings, Neural Machine
Translation (NMT) still struggles in low-resource and out-of-domain scenarios,
often failing to match the quality of phrase-based translation. We propose a
novel technique that combines back-translation and multilingual NMT to improve
performance in these difficult cases. Our technique trains a single model for
both directions of a language pair, allowing us to back-translate source or
target monolingual data without requiring an auxiliary model. We then continue
training on the augmented parallel data, enabling a cycle of improvement for a
single model that can incorporate any source, target, or parallel data to
improve both translation directions. As a byproduct, these models can reduce
training and deployment costs significantly compared to uni-directional models.
Extensive experiments show that our technique outperforms standard
back-translation in low-resource scenarios, improves quality on cross-domain
tasks, and effectively reduces costs across the board.
Comment: Accepted at the 2nd Workshop on Neural Machine Translation and
Generation (WNMT 2018