Overview of INEX Tweet Contextualization 2013 track
Twitter is increasingly used for online client and audience fishing; this motivated the tweet contextualization task at INEX. The objective is to help a user understand a tweet by providing a short summary (500 words). This summary should be built automatically from local resources such as Wikipedia, by extracting relevant passages and aggregating them into a coherent summary. The task is evaluated on informativeness, which is computed using a variant of Kullback-Leibler divergence and passage pooling. The readability of summaries in context is checked using binary questionnaires on small samples of results. The task has run since 2010, and results show that the only effective systems are those that efficiently combine passage retrieval, sentence segmentation and scoring, named entity recognition, POS analysis, anaphora detection, a content diversity measure, and sentence reordering.
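The informativeness measure mentioned above is described only as a variant of Kullback-Leibler divergence. As a rough illustration, the sketch below computes a plain smoothed KL divergence between the unigram distributions of a reference text and a candidate summary. It is not the track's exact measure: the function name, the whitespace tokenization, and the additive smoothing constant `epsilon` are all assumptions made for this example.

```python
import math
from collections import Counter


def kl_divergence(reference_tokens, summary_tokens, epsilon=1e-6):
    """Smoothed KL(P_reference || P_summary) over unigram counts.

    Assumption: additive smoothing with `epsilon` stands in for whatever
    smoothing the official measure uses. Lower values mean the summary's
    word distribution is closer to the reference.
    """
    ref_counts = Counter(reference_tokens)
    sum_counts = Counter(summary_tokens)
    vocab = set(ref_counts) | set(sum_counts)

    # Normalizers include the smoothing mass added to every vocabulary term.
    ref_total = sum(ref_counts.values()) + epsilon * len(vocab)
    sum_total = sum(sum_counts.values()) + epsilon * len(vocab)

    divergence = 0.0
    for term in vocab:
        p = (ref_counts[term] + epsilon) / ref_total
        q = (sum_counts[term] + epsilon) / sum_total
        divergence += p * math.log(p / q)
    return divergence
```

A summary that reuses the reference vocabulary should score lower (better) than one with no overlapping terms.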
On Type-Aware Entity Retrieval
Today, the practice of returning entities from a knowledge base in response
to search queries has become widespread. One of the distinctive characteristics
of entities is that they are typed, i.e., assigned to some hierarchically
organized type system (type taxonomy). The primary objective of this paper is
to gain a better understanding of how entity type information can be utilized
in entity retrieval. We perform this investigation in an idealized "oracle"
setting, assuming that we know the distribution of target types of the relevant
entities for a given query. We perform a thorough analysis of three main
aspects: (i) the choice of type taxonomy, (ii) the representation of
hierarchical type information, and (iii) the combination of type-based and
term-based similarity in the retrieval model. Using a standard entity search
test collection based on DBpedia, we find that type information proves most
useful when using large type taxonomies that provide very specific types. We
provide further insights on the extensional coverage of entities and on the
utility of target types.
Comment: Proceedings of the 3rd ACM International Conference on the Theory of
Information Retrieval (ICTIR '17), 2017
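Aspect (iii) above, combining type-based and term-based similarity, can be pictured with a simple linear interpolation. The sketch below is one common way to do this and is not necessarily the model used in the paper. The function names, the weight `lam`, and the assumption that term scores are already normalized to [0, 1] are all hypothetical.

```python
def type_score(target_type_dist, entity_types):
    """Probability mass of the query's target types covered by the entity.

    `target_type_dist` maps type name -> probability (the "oracle"
    distribution of target types); `entity_types` is the set of types
    assigned to the entity.
    """
    return sum(prob for t, prob in target_type_dist.items() if t in entity_types)


def combined_score(term_score, target_type_dist, entity_types, lam=0.5):
    """Interpolate a term-based score with the type-based score.

    Assumption: `term_score` is already normalized to [0, 1], so the two
    components are on a comparable scale.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return (1.0 - lam) * term_score + lam * type_score(target_type_dist, entity_types)


def rank_entities(candidates, target_type_dist, lam=0.5):
    """Rank (entity_id, term_score, entity_types) triples by combined score."""
    scored = [
        (combined_score(term, target_type_dist, types, lam), entity_id)
        for entity_id, term, types in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entity_id for _, entity_id in scored]
```

With `lam=0` the ranking is purely term-based; raising `lam` lets an entity with a matching type overtake one with a slightly higher term score.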
Enriching Existing Test Collections with OXPath
Extending TREC-style test collections by incorporating external resources is
a time consuming and challenging task. Making use of freely available web data
requires technical skills to work with APIs or to create a web scraping program
specifically tailored to the task at hand. We present a light-weight
alternative that employs the web data extraction language OXPath to harvest
data to be added to an existing test collection from web resources. We
demonstrate this by creating an extended version of GIRT4 called GIRT4-XT with
additional metadata fields harvested via OXPath from the social sciences portal
Sowiport. This allows the re-use of this collection for other evaluation
purposes like bibliometrics-enhanced retrieval. The demonstrated method can be
applied to a variety of similar scenarios and is not limited to extending
existing collections but can also be used to create completely new ones with
little effort.
Comment: Experimental IR Meets Multilinguality, Multimodality, and Interaction
- 8th International Conference of the CLEF Association, CLEF 2017, Dublin,
Ireland, September 11-14, 2017
Neural Architecture for Question Answering Using a Knowledge Graph and Web Corpus
In Web search, entity-seeking queries often trigger a special Question
Answering (QA) system. It may use a parser to interpret the question to a
structured query, execute that on a knowledge graph (KG), and return direct
entity responses. QA systems based on precise parsing tend to be brittle: minor
syntax variations may dramatically change the response. Moreover, KG coverage
is patchy. At the other extreme, a large corpus may provide broader coverage,
but in an unstructured, unreliable form. We present AQQUCN, a QA system that
gracefully combines KG and corpus evidence. AQQUCN accepts a broad spectrum of
query syntax, from well-formed questions to short 'telegraphic' keyword
sequences. In the face of inherent query ambiguities, AQQUCN aggregates signals
from KGs and large corpora to directly rank KG entities, rather than commit to
one semantic interpretation of the query. AQQUCN models the ideal
interpretation as an unobservable or latent variable. Interpretations and
candidate entity responses are scored as pairs, by combining signals from
multiple convolutional networks that operate collectively on the query, KG and
corpus. On four public query workloads, amounting to over 8,000 queries with
diverse query syntax, we see 5-16% absolute improvement in mean average
precision (MAP), compared to the entity ranking performance of recent systems.
Our system is also competitive at entity set retrieval, almost doubling F1
scores for challenging short queries.
Comment: Accepted to Information Retrieval Journal
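For readers unfamiliar with the metric reported above, mean average precision (MAP) can be computed as in the following sketch. This is the standard textbook definition, not code from the paper. The function names are hypothetical, and each ranking is assumed to contain no duplicate entities.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of relevant items.

    Assumption: `ranked` contains no duplicates. Relevant items that are
    never retrieved contribute zero, because the sum is divided by the
    total number of relevant items.
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)
```

An absolute improvement of 5 points, for instance, would mean moving from a MAP of 0.40 to 0.45 on the same query set.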
Dynamic Time-Dependent Route Planning in Road Networks with User Preferences
There has been tremendous progress in algorithmic methods for computing
driving directions on road networks. Most of that work focuses on
time-independent route planning, where it is assumed that the cost on each arc
is constant per query. In practice, the current traffic situation significantly
influences the travel time on large parts of the road network, and it changes
over the day. One can distinguish between traffic congestion that can be
predicted using historical traffic data, and congestion due to unpredictable
events, e.g., accidents. In this work, we study the dynamic and
time-dependent route planning problem, which takes both prediction (based on
historical data) and live traffic into account. To this end, we propose a
practical algorithm that, while robust to user preferences, is able to
integrate global changes of the time-dependent metric (e.g., due to traffic
updates or user restrictions) faster than previous approaches, while allowing
subsequent queries that enable interactive applications.
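To make the time-dependent setting concrete, the sketch below runs a textbook time-dependent Dijkstra search in which each arc's travel time is a function of the departure time. It is a baseline illustration only and not the algorithm proposed in the paper, which the abstract does not specify in detail. The graph representation and function names are assumptions, and the arcs are assumed to satisfy the FIFO property (leaving later never lets you arrive earlier).

```python
import heapq


def earliest_arrival(graph, source, target, departure_time):
    """Earliest arrival time at `target` when leaving `source` at `departure_time`.

    `graph` maps each node to a list of (neighbor, travel_time_fn) pairs,
    where travel_time_fn(t) returns the non-negative travel time of the
    arc when entered at time t. Assumption: all arcs are FIFO, which is
    what makes a label-setting search correct here.
    Returns None if the target is unreachable.
    """
    best = {source: departure_time}
    queue = [(departure_time, source)]
    while queue:
        time, node = heapq.heappop(queue)
        if node == target:
            return time
        if time > best.get(node, float("inf")):
            continue  # stale queue entry, a better label was already settled
        for neighbor, travel_time in graph.get(node, []):
            arrival = time + travel_time(time)
            if arrival < best.get(neighbor, float("inf")):
                best[neighbor] = arrival
                heapq.heappush(queue, (arrival, neighbor))
    return None
```

A live traffic update in this picture amounts to replacing some `travel_time_fn` entries; the difficulty the abstract addresses is doing so quickly when queries rely on precomputed data rather than on a plain search like this one.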
Target Type Identification for Entity-Bearing Queries
Identifying the target types of entity-bearing queries can help improve
retrieval performance as well as the overall search experience. In this work,
we address the problem of automatically detecting the target types of a query
with respect to a type taxonomy. We propose a supervised learning approach with
a rich variety of features. Using a purpose-built test collection, we show that
our approach outperforms existing methods by a remarkable margin. This is an
extended version of the article published with the same title in the
Proceedings of SIGIR'17.
Comment: Extended version of SIGIR'17 short paper, 5 pages