TAPAS: Weakly Supervised Table Parsing via Pre-training
Answering natural language questions over tables is usually seen as a
semantic parsing task. To alleviate the collection cost of full logical forms,
one popular approach focuses on weak supervision consisting of denotations
instead of logical forms. However, training semantic parsers from weak
supervision poses difficulties, and in addition, the generated logical forms
are only used as an intermediate step prior to retrieving the denotation. In
this paper, we present TAPAS, an approach to question answering over tables
without generating logical forms. TAPAS trains from weak supervision, and
predicts the denotation by selecting table cells and optionally applying a
corresponding aggregation operator to such selection. TAPAS extends BERT's
architecture to encode tables as input, initializes from an effective joint
pre-training of text segments and tables crawled from Wikipedia, and is trained
end-to-end. We experiment with three different semantic parsing datasets, and
find that TAPAS outperforms or rivals semantic parsing models by improving
state-of-the-art accuracy on SQA from 55.1 to 67.2 and performing on par with
the state-of-the-art on WIKISQL and WIKITQ, but with a simpler model
architecture. We additionally find that transfer learning from WIKISQL to WIKITQ, which is trivial in our setting, yields 48.7 accuracy, 4.2 points above the state-of-the-art.
Comment: Accepted to ACL 2020
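As an illustration of the cell-selection-plus-aggregation interface described above, here is a minimal sketch using the Hugging Face port of TAPAS. The checkpoint name and the toy table are assumptions for the demo, not details from the paper.

```python
import pandas as pd
from transformers import TapasTokenizer, TapasForQuestionAnswering

# Assumed public checkpoint fine-tuned on WIKITQ (hypothetical choice for this demo).
name = "google/tapas-base-finetuned-wtq"
tokenizer = TapasTokenizer.from_pretrained(name)
model = TapasForQuestionAnswering.from_pretrained(name)

# Tables are passed as string-valued DataFrames; the tokenizer adds the
# row/column index information that TAPAS's extended BERT input expects.
table = pd.DataFrame(
    {"City": ["Paris", "London", "Berlin"],
     "Population": ["2161000", "8982000", "3645000"]}
)
queries = ["What is the total population of Paris and Berlin?"]

inputs = tokenizer(table=table, queries=queries,
                   padding="max_length", return_tensors="pt")
outputs = model(**inputs)

# Instead of emitting a logical form, the model predicts a set of cells plus
# an aggregation operator (NONE, SUM, AVERAGE, COUNT) over that selection.
coords, agg_indices = tokenizer.convert_logits_to_predictions(
    inputs, outputs.logits.detach(), outputs.logits_aggregation.detach()
)
print(coords[0], agg_indices[0])  # selected (row, col) pairs and the operator id
```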
Neural Ranking Models with Weak Supervision
Despite the impressive improvements achieved by unsupervised deep neural
networks in computer vision and NLP tasks, such improvements have not yet been
observed in ranking for information retrieval. The reason may be the complexity
of the ranking problem, as it is not obvious how to learn from queries and
documents when no supervised signal is available. Hence, in this paper, we
propose to train a neural ranking model using weak supervision, where labels
are obtained automatically without human annotators or any external resources
(e.g., click data). To this end, we use the output of an unsupervised ranking
model, such as BM25, as a weak supervision signal. We further train a set of
simple yet effective ranking models based on feed-forward neural networks. We
study their effectiveness under various learning scenarios (point-wise and
pair-wise models) and using different input representations (i.e., from
encoding query-document pairs into dense/sparse vectors to using word embedding
representations). We train our networks on tens of millions of training
instances and evaluate them on two standard collections: a homogeneous news
collection (Robust) and a heterogeneous large-scale web collection (ClueWeb).
Our experiments indicate that employing proper objective functions and letting
the networks learn the input representation based on weakly supervised data
leads to impressive performance, with over 13% and 35% MAP improvements over
the BM25 model on the Robust and the ClueWeb collections, respectively. Our
findings also suggest that supervised neural ranking models can greatly benefit
from pre-training on large amounts of weakly labeled data that can be easily
obtained from unsupervised IR models.
Comment: In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2017)
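To make the pair-wise weak-supervision setup concrete, the sketch below (PyTorch; the layer sizes, feature dimension, and random stand-in features are assumptions, not the paper's exact configuration) trains a small feed-forward ranker to prefer whichever document of a pair the BM25 teacher scored higher, so no human labels are involved.

```python
import torch
import torch.nn as nn

class FeedForwardRanker(nn.Module):
    """Scores a dense query-document representation with a small MLP."""
    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def pairwise_step(model, optimizer, x_hi, x_lo, margin=1.0):
    """One hinge-loss update: x_hi is the document that BM25 (the weak
    teacher) ranked above x_lo for the same query."""
    loss = torch.clamp(margin - model(x_hi) + model(x_lo), min=0.0).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with random features standing in for query-document encodings.
model = FeedForwardRanker(dim=128)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x_hi, x_lo = torch.randn(32, 128), torch.randn(32, 128)
print(pairwise_step(model, opt, x_hi, x_lo))
```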
Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
Generative models for open domain question answering have proven to be
competitive, without resorting to external knowledge. While promising, this
approach requires models with billions of parameters, which are
expensive to train and query. In this paper, we investigate how much these
models can benefit from retrieving text passages, potentially containing
evidence. We obtain state-of-the-art results on the Natural Questions and
TriviaQA open benchmarks. Interestingly, we observe that the performance of
this method significantly improves when increasing the number of retrieved
passages. This is evidence that generative models are good at aggregating and
combining evidence from multiple passages.
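The aggregation claim can be illustrated with a minimal fusion-style sketch using the transformers library: each retrieved passage is encoded together with the question, the per-passage encoder states are concatenated, and a single decoder attends over all of them at once. The t5-small checkpoint, the input template, and the toy passages are assumptions for the demo; this is a simplification, not the paper's released implementation.

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

question = "Who wrote Hamlet?"
passages = [  # stand-ins for retrieved evidence passages
    "Hamlet is a tragedy written by William Shakespeare.",
    "The play was probably written between 1599 and 1601.",
]

# Encode each (question, passage) pair independently.
texts = [f"question: {question} context: {p}" for p in passages]
enc = tok(texts, return_tensors="pt", padding=True, truncation=True)
enc_out = model.encoder(input_ids=enc.input_ids,
                        attention_mask=enc.attention_mask)

# Fuse: flatten the per-passage states into one long sequence so the decoder
# can aggregate evidence across all retrieved passages at once.
n, length, dim = enc_out.last_hidden_state.shape
fused = enc_out.last_hidden_state.reshape(1, n * length, dim)
fused_mask = enc.attention_mask.reshape(1, n * length)

out = model.generate(
    encoder_outputs=BaseModelOutput(last_hidden_state=fused),
    attention_mask=fused_mask,
    max_length=16,
)
print(tok.decode(out[0], skip_special_tokens=True))
```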