Design Patterns for Fusion-Based Object Retrieval

Author: C Macdonald
H Fang
M Shokouhi
W Weerkamp
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/08/2017

We address the task of ranking objects (such as people, blogs, or verticals) that, unlike documents, do not have direct term-based representations. To match them against keyword queries, evidence needs to be amassed from documents that are associated with the given object. We present two design patterns, i.e., general reusable retrieval strategies, that encompass most existing approaches. One strategy combines evidence at the term level (early fusion), while the other combines it at the document level (late fusion). We demonstrate the generality of these patterns by applying them to three different object retrieval tasks: expert finding, blog distillation, and vertical ranking. Comment: Proceedings of the 39th European conference on Advances in Information Retrieval (ECIR '17), 201
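The difference between the two patterns named in this abstract can be illustrated with a minimal sketch. It is not the paper's model: the scoring function `score_text`, the helper names, and the toy data are all assumptions made for illustration. Early fusion pools the terms of an object's associated documents into one pseudo-document and scores that once; late fusion scores each document separately and then aggregates the document scores per object (summation is assumed here).

```python
from collections import Counter

def score_text(query_terms, terms):
    """Toy relevance score (an assumption): share of tokens that match a query term."""
    if not terms:
        return 0.0
    counts = Counter(terms)
    return sum(counts[t] for t in query_terms) / len(terms)

def early_fusion(query_terms, docs, assoc):
    """Term level: build one pooled term list per object, then score it once."""
    scores = {}
    for obj, doc_ids in assoc.items():
        pooled = [t for d in doc_ids for t in docs[d]]
        scores[obj] = score_text(query_terms, pooled)
    return scores

def late_fusion(query_terms, docs, assoc):
    """Document level: score every document, then sum scores per object."""
    doc_scores = {d: score_text(query_terms, terms) for d, terms in docs.items()}
    return {obj: sum(doc_scores[d] for d in doc_ids) for obj, doc_ids in assoc.items()}

# Hypothetical data: two people (objects) and the documents associated with them.
docs = {
    "d1": ["fusion", "retrieval", "models"],
    "d2": ["blog", "search"],
    "d3": ["retrieval", "evaluation", "retrieval", "metrics"],
}
assoc = {"alice": ["d1", "d2"], "bob": ["d3"]}
query = ["retrieval"]

early = early_fusion(query, docs, assoc)
late = late_fusion(query, docs, assoc)
```

In this toy setting both strategies rank `bob` above `alice`, but the scores for `alice` differ because pooling dilutes the one matching document with an unrelated one, whereas late fusion keeps the per-document scores separate before summing.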

arXiv.org e-Print Archive

Crossref

Overview of the TREC 2013 federated web search track

Author: Demeester Thomas
Hiemstra D
Nguyen D
Trieschnigg D
Publication date: 01/01/2013

Ghent University Academic Bibliography

Explicit diversification of event aspects for temporal summarization

Author: Macdonald Craig
McCreadie Richard
Ounis Iadh
Santos Rodrygo L.T.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/04/2018

During major events, such as emergencies and disasters, a large volume of information is reported on newswire and social media platforms. Temporal summarization (TS) approaches are used to automatically produce concise overviews of such events by extracting text snippets from related articles over time. Current TS approaches rely on a combination of event relevance and textual novelty for snippet selection. However, for events that span multiple days, textual novelty is often a poor criterion for selecting snippets, since many snippets are textually unique but are semantically redundant or non-informative. In this article, we propose a framework for the diversification of snippets using explicit event aspects, building on recent work in search result diversification. In particular, we first propose two techniques to identify explicit aspects that a user might want to see covered in a summary for different types of event. We then extend a state-of-the-art explicit diversification framework to maximize the coverage of these aspects when selecting summary snippets for unseen events. Through experimentation over the TREC TS 2013, 2014, and 2015 datasets, we show that explicit diversification for temporal summarization significantly outperforms classical novelty-based diversification, as the use of explicit event aspects reduces the number of redundant and off-topic snippets returned, while also increasing summary timeliness.
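The idea of selecting snippets to maximize coverage of explicit aspects can be sketched as a greedy loop. The abstract does not name the framework it extends, so the objective below (a linear mix of relevance and not-yet-covered aspect mass, weighted by `lam`) is an assumption, as are the function name `select_snippets`, its parameters, and the toy data.

```python
def select_snippets(candidates, relevance, coverage, aspect_weight, k, lam=0.5):
    """Greedily pick up to k snippets, trading relevance against new aspect coverage."""
    selected = []
    # uncovered[a]: how much of aspect a the summary built so far still lacks.
    uncovered = {a: 1.0 for a in aspect_weight}
    remaining = list(candidates)
    while remaining and len(selected) < k:
        def gain(s):
            diversity = sum(
                aspect_weight[a] * coverage[s].get(a, 0.0) * uncovered[a]
                for a in aspect_weight
            )
            return (1 - lam) * relevance[s] + lam * diversity
        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
        # Aspects covered by the chosen snippet count for less in later rounds.
        for a in aspect_weight:
            uncovered[a] *= 1.0 - coverage[best].get(a, 0.0)
    return selected

# Hypothetical event with two aspects; s1 and s2 are redundant with each other.
relevance = {"s1": 0.9, "s2": 0.8, "s3": 0.5}
coverage = {
    "s1": {"casualties": 1.0},
    "s2": {"casualties": 1.0},
    "s3": {"response": 1.0},
}
aspect_weight = {"casualties": 0.5, "response": 0.5}

summary = select_snippets(["s1", "s2", "s3"], relevance, coverage, aspect_weight, k=2)
```

With these numbers the loop picks `s1` first and then prefers `s3` over the more relevant `s2`, because `s2` only repeats an aspect the summary already covers.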

Enlighten

Teaching a New Dog Old Tricks: Resurrecting Multilingual Retrieval Using Zero-shot Learning

Author: B Mitra
C Carpineto
C Peters
G Amati
KD Onal
M Braschler
M Johnson
P Yang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/12/2019

While billions of non-English-speaking users rely on search engines every day, the problem of ad-hoc information retrieval is rarely studied for non-English languages. This is primarily due to a lack of data sets that are suitable for training ranking algorithms. In this paper, we tackle the lack of data by leveraging pre-trained multilingual language models to transfer a retrieval system trained on English collections to non-English queries and documents. Our model is evaluated in a zero-shot setting, meaning that we use it to predict relevance scores for query-document pairs in languages never seen during training. Our results show that the proposed approach can significantly outperform unsupervised retrieval techniques for Arabic, Mandarin Chinese, and Spanish. We also show that augmenting the English training collection with some examples from the target language can sometimes improve performance. Comment: ECIR 2020 (short

arXiv.org e-Print Archive

Crossref

Generating Query Suggestions to Support Task-Based Search

Author: Awadallah Ahmed H.
Balog Krisztian
Kelly Diane
Ryen
Verma Manisha
Yilmaz Emine
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 28/08/2017

We address the problem of generating query suggestions to support users in completing their underlying tasks (which motivated them to search in the first place). Given an initial query, these query suggestions should provide coverage of the possible subtasks the user might be looking for. We propose a probabilistic modeling framework that obtains keyphrases from multiple sources and generates query suggestions from these keyphrases. Using the test suites of the TREC Tasks track, we evaluate and analyze each component of our model. Comment: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '17), 201

arXiv.org e-Print Archive

Crossref