CODEC: Complex Document and Entity Collection
CODEC is a document and entity ranking benchmark that focuses on complex research topics. We target essay-style information needs of social science researchers, e.g., "How has the UK's Open Banking Regulation benefited Challenger Banks?". CODEC includes 42 topics developed by researchers and a new focused web corpus with semantic annotations, including entity links. This resource includes expert judgments on 17,509 documents and entities (416.9 per topic) from diverse automatic and interactive manual runs. The manual runs include 387 query reformulations, providing data for query performance prediction and automatic rewriting evaluation.
CODEC includes an analysis of state-of-the-art systems, including dense retrieval and neural re-ranking. The results show that the topics are challenging, with headroom for improvement in both document and entity ranking. Query expansion with entity information shows significant gains on document ranking, demonstrating the resource's value for evaluating and improving entity-oriented search. We also show that the manual query reformulations significantly improve document ranking and entity ranking performance. Overall, CODEC provides challenging research topics to support the development and evaluation of entity-centric search methods.
Contextual Media Retrieval Using Natural Language Queries
The widespread integration of cameras in hand-held and head-worn devices as
well as the ability to share content online enables a large and diverse visual
capture of the world that millions of users build up collectively every day. We
envision these images as well as associated meta information, such as GPS
coordinates and timestamps, to form a collective visual memory that can be
queried while automatically taking the ever-changing context of mobile users
into account. As a first step towards this vision, in this work we present
Xplore-M-Ego: a novel media retrieval system that allows users to query a
dynamic database of images and videos using spatio-temporal natural language
queries. We evaluate our system using a new dataset of real user queries as
well as through a usability study. One key finding is that there is a
considerable amount of inter-user variability, for example in the resolution of
spatial relations in natural language utterances. We show that our retrieval
system can cope with this variability using personalisation through an online
learning-based retrieval formulation. Comment: 8 pages, 9 figures, 1 table
Finding Support Documents with a Logistic Regression Approach
Entity retrieval finds the relevant results for a user’s information needs at a finer unit called “entity”. To retrieve such entities, people usually first locate a small set of support documents that contain answer entities, and then detect the answer entities in this set. In the literature, support documents are viewed as relevant documents, and finding them is treated as a conventional document retrieval problem. In this paper, we argue that finding support documents and finding relevant documents, although they sound similar, have important differences. Further, we propose a logistic regression approach to find support documents. Our experimental results show that the logistic regression method performs significantly better than a baseline system that treats support document finding as a conventional document retrieval problem.
USFD at KBP 2011: Entity Linking, Slot Filling and Temporal Bounding
This paper describes the University of Sheffield's entry in the 2011 TAC KBP
entity linking and slot filling tasks. We chose to participate in the
monolingual entity linking task, the monolingual slot filling task and the
temporal slot filling tasks. We set out to build a framework for
experimentation with knowledge base population. This framework was created, and
applied to multiple KBP tasks. We demonstrated that our proposed framework is
effective and suitable for collaborative development efforts, as well as useful
in a teaching environment. Finally we present results that, while very modest,
provide improvements an order of magnitude greater than our 2010 attempt. Comment: Proc. Text Analysis Conference (2011)
Target Type Identification for Entity-Bearing Queries
Identifying the target types of entity-bearing queries can help improve
retrieval performance as well as the overall search experience. In this work,
we address the problem of automatically detecting the target types of a query
with respect to a type taxonomy. We propose a supervised learning approach with
a rich variety of features. Using a purpose-built test collection, we show that
our approach outperforms existing methods by a remarkable margin. This is an
extended version of the article published with the same title in the
Proceedings of SIGIR'17. Comment: Extended version of SIGIR'17 short paper, 5 pages