Answering Complex Questions by Joining Multi-Document Evidence with Quasi Knowledge Graphs
Direct answering of questions that involve multiple entities and relations is a challenge for text-based QA. This problem is most pronounced when answers can be found only by joining evidence from multiple documents. Curated knowledge graphs (KGs) may yield good answers, but are limited by their inherent incompleteness and potential staleness. This paper presents QUEST, a method that can answer complex questions directly from textual sources on-the-fly, by computing similarity joins over partial results from different documents. Our method is completely unsupervised, avoiding training-data bottlenecks and being able to cope with rapidly evolving ad hoc topics and formulation styles in user questions. QUEST builds a noisy quasi KG with node and edge weights, consisting of dynamically retrieved entity names and relational phrases. It augments this graph with types and semantic alignments, and computes the best answers by an algorithm for Group Steiner Trees. We evaluate QUEST on benchmarks of complex questions, and show that it substantially outperforms state-of-the-art baselines.
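To make the Group Steiner Tree step of the abstract concrete, the sketch below shows the general idea under simplifying assumptions: terminals are grouped by question keyword, and a cheap tree connecting one terminal per group is found with a shortest-path heuristic, so its non-terminal nodes become answer candidates. The toy graph, groups, and heuristic are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the Group Steiner Tree idea behind QUEST (illustrative only).
import itertools
import networkx as nx

def group_steiner_heuristic(G, groups):
    """Connect one terminal from each group via shortest paths (simple approximation)."""
    best_tree, best_cost = None, float("inf")
    for combo in itertools.product(*groups):          # pick one terminal per group
        tree, cost, root = nx.Graph(), 0.0, None
        for t in combo:
            if root is None:
                root = t
                continue
            path = nx.shortest_path(G, root, t, weight="weight")
            for u, v in zip(path, path[1:]):
                if not tree.has_edge(u, v):
                    w = G[u][v]["weight"]
                    tree.add_edge(u, v, weight=w)
                    cost += w
        if cost < best_cost:
            best_tree, best_cost = tree, cost
    return best_tree, best_cost

# Toy quasi KG: nodes are entity/relation phrases, edge weights reflect noise.
G = nx.Graph()
G.add_weighted_edges_from([
    ("Q:director", "Nolan", 0.2), ("Nolan", "Dunkirk", 0.3),
    ("Q:war film", "Dunkirk", 0.4), ("Q:director", "Spielberg", 0.5),
    ("Spielberg", "Dunkirk", 0.9),
])
groups = [{"Q:director"}, {"Q:war film"}]             # one terminal group per question term
tree, cost = group_steiner_heuristic(G, groups)
answers = set(tree.nodes()) - set().union(*groups)    # non-terminal nodes are candidates
print(answers, cost)
```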
Answer Similarity Grouping and Diversification in Question Answering Systems
The rise in popularity of mobile and voice search has led to a shift in IR from document to passage retrieval for non-factoid questions. Various datasets such as MSMarco, as well as efficient retrieval models, have been developed to identify single best answer passages for this task. However, such models do not specifically address questions that could have multiple or alternative answers. In this dissertation, we focus on this new research area that involves studying answer passage relationships and how this could be applied to passage retrieval tasks.
We first create a high-quality dataset for the answer passage similarity task in the context of question answering. Manual annotation of passage pairs is performed to set the similarity labels, from which answer group information is automatically generated. We next investigate different types of representations that could be used to create effective clusters. We experiment with various unsupervised representations and show that distributional representations outperform term-based representations for this task. Next, weak supervision is leveraged to further improve the cluster modeling performance. We use BERT as the underlying model for training and show the relative performance of various weak signals, such as GloVe and term-based language modeling, for this task. To apply these clusters to the answer passage retrieval task for multi-answer questions, we use a modified version of the Maximal Marginal Relevance (MMR) diversification model. We demonstrate that answers retrieved using this model are more diverse, i.e., they cover more answer types with low redundancy while also maximizing relevance, relative to the baselines.

So far, we have used passage clustering as a means to identify answer groups corresponding to a question and applied them in a question answering task. We extend this a step further by looking at related questions within a conversation. For this purpose, we expand the definition of Reciprocal Rank Fusion (RRF) and use it to identify pertinent history passages for such questions. Updated question rewrites generated using these passages are then used to improve the conversational search task. In addition to being the first work that looks at answer relationships, our specific contributions can be summarized as follows: (1) creation of new datasets with passage similarity and answer type information; (2) effective passage similarity clustering models using unsupervised representations and weak supervision methods; (3) applying the passage similarity/clustering information to a diversification framework; (4) identifying good response history candidates using answer passage clustering for the conversational search task.
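Since the abstract builds on MMR diversification and Reciprocal Rank Fusion, the sketch below shows both in their standard textbook form (not the modified versions developed in the dissertation). The passage vectors, query vector, lambda, and the RRF constant are illustrative assumptions.

```python
# Standard MMR diversification and Reciprocal Rank Fusion (illustrative sketch).
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def mmr(query_vec, passage_vecs, k=3, lam=0.7):
    """Greedily pick k passages, trading query relevance against redundancy."""
    candidates, selected = list(range(len(passage_vecs))), []
    while candidates and len(selected) < k:
        def marginal(i):
            relevance = cosine(query_vec, passage_vecs[i])
            redundancy = max((cosine(passage_vecs[i], passage_vecs[j])
                              for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=marginal)
        selected.append(best)
        candidates.remove(best)
    return selected

def rrf(rankings, k=60):
    """Standard Reciprocal Rank Fusion over several ranked lists of passage ids."""
    scores = {}
    for ranking in rankings:
        for rank, pid in enumerate(ranking, start=1):
            scores[pid] = scores.get(pid, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

rng = np.random.default_rng(0)
query, passages = rng.normal(size=16), rng.normal(size=(5, 16))
print(mmr(query, passages, k=3))
print(rrf([["p1", "p2", "p3"], ["p3", "p1", "p4"]]))
```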
Evaluation campaigns and TRECVid
The TREC Video Retrieval Evaluation (TRECVid) is an international benchmarking activity to encourage research in video information retrieval by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. TRECVid completed its fifth annual cycle at the end of 2005, and in 2006 it will involve almost 70 research organizations, universities and other consortia. Throughout its existence, TRECVid has benchmarked both interactive and automatic/manual searching for shots from within a video corpus, automatic detection of a variety of semantic and low-level video features, shot boundary detection, and the detection of story boundaries in broadcast TV news. This paper will give an introduction to information retrieval (IR) evaluation from both a user and a system perspective, highlighting that system evaluation is by far the most prevalent type of evaluation carried out. We also include a summary of TRECVid as an example of a system evaluation benchmarking campaign, and this allows us to discuss whether such campaigns are a good thing or a bad thing. There are arguments for and against these campaigns, and we present some of them in the paper, concluding that on balance they have had a very positive impact on research progress.
Distilling Knowledge from Reader to Retriever for Question Answering
The task of information retrieval is an important component of many natural language processing systems, such as open-domain question answering. While traditional methods were based on hand-crafted features, continuous representations based on neural networks have recently obtained competitive results. A challenge of using such methods is obtaining supervised data to train the retriever model, corresponding to pairs of queries and support documents. In this paper, we propose a technique to learn retriever models for downstream tasks, inspired by knowledge distillation, which does not require annotated pairs of queries and documents. Our approach leverages the attention scores of a reader model, used to solve the task based on retrieved documents, to obtain synthetic labels for the retriever. We evaluate our method on question answering, obtaining state-of-the-art results.
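The distillation objective described in this abstract can be sketched as follows: reader cross-attention, aggregated per retrieved passage, serves as a synthetic relevance target that the retriever's scores are trained to match via KL divergence. The tensor shapes, the mean aggregation, and the temperature are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of reader-to-retriever distillation (illustrative assumptions).
import torch
import torch.nn.functional as F

def distillation_loss(retriever_scores, reader_attention, temperature=1.0):
    """
    retriever_scores: (batch, n_passages) query-passage scores from the retriever.
    reader_attention: (batch, n_passages, n_tokens) cross-attention weights the
                      reader placed on each passage's tokens while answering.
    """
    # Aggregate attention over tokens to get one synthetic score per passage.
    target_scores = reader_attention.mean(dim=-1)                # (batch, n_passages)
    target = F.softmax(target_scores / temperature, dim=-1)      # teacher distribution
    log_pred = F.log_softmax(retriever_scores / temperature, dim=-1)
    return F.kl_div(log_pred, target, reduction="batchmean")

# Toy usage with random tensors standing in for model outputs.
scores = torch.randn(2, 4, requires_grad=True)
attention = torch.rand(2, 4, 128)
loss = distillation_loss(scores, attention)
loss.backward()
print(loss.item())
```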