Efficient Document Re-Ranking for Transformers by Precomputing Term Representations
Deep pretrained transformer networks are effective at various ranking tasks,
such as question answering and ad-hoc document ranking. However, their
computational expense makes them cost-prohibitive in practice. Our proposed
approach, called PreTTR (Precomputing Transformer Term Representations),
considerably reduces the query-time latency of deep transformer networks (up to
a 42x speedup on web document ranking), making these networks more practical to
use in a real-time ranking scenario. Specifically, we precompute part of the
document term representations at indexing time (without a query), and merge
them with the query representation at query time to compute the final ranking
score. Due to the large size of the token representations, we also propose an
effective approach to reduce the storage requirement by training a compression
layer to match attention scores. Our compression technique reduces the storage
required by up to 95% and can be applied without a substantial degradation in
ranking performance.
Comment: Accepted at SIGIR 2020 (long
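The index-time/query-time split described in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the attention layer has no learned weights, the layer counts `SPLIT_LAYERS` and `JOINT_LAYERS` are hypothetical, and the scoring head is an arbitrary stand-in. The compression layer is omitted.

```python
import math

SPLIT_LAYERS = 2  # hypothetical: layers run on query and document separately
JOINT_LAYERS = 1  # hypothetical: layers run on the merged sequence


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Toy self-attention: each output is an attention-weighted average of
    the inputs (no learned projections, for illustration only)."""
    dim = len(vectors[0])
    out = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in vectors]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, vectors))
                    for i in range(dim)])
    return out


def encode(vectors, n_layers):
    for _ in range(n_layers):
        vectors = self_attention(vectors)
    return vectors


def index_document(doc_vectors):
    """Index time: no query is available, so the document is encoded alone."""
    return encode(doc_vectors, SPLIT_LAYERS)


def score(query_vectors, stored_doc_reps):
    """Query time: encode the query alone, then merge it with the stored
    document representations and run the remaining joint layers."""
    query_reps = encode(query_vectors, SPLIT_LAYERS)
    joint = encode(query_reps + stored_doc_reps, JOINT_LAYERS)
    # Toy scoring head (an assumption): sum the first position's features.
    return sum(joint[0])
```

In this sketch the document-side layers run once per document at indexing time, so only the query-side layers and the joint layers remain to be computed per query.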
GeoCLEF 2007: the CLEF 2007 cross-language geographic information retrieval track overview
GeoCLEF ran as a regular track for the second time within the Cross
Language Evaluation Forum (CLEF) 2007. The purpose of GeoCLEF is to test
and evaluate cross-language geographic information retrieval (GIR): retrieval
for topics with a geographic specification. GeoCLEF 2007 consisted of two
subtasks. A search task ran for the third time and a query classification task was
organized for the first time. For the GeoCLEF 2007 search task, twenty-five search
topics were defined by the organizing groups for searching English, German,
Portuguese and Spanish document collections. All topics were translated into
English, Indonesian, Portuguese, Spanish and German. Several topics in 2007
were geographically challenging. Thirteen groups submitted 108 runs. The
groups used a variety of approaches. For the classification task, a query log
from a search engine was provided and the groups needed to identify the
queries with a geographic scope and the geographic components within the
local queries.
DCU search runs at MediaEval 2012: search and hyperlinking task
We describe the runs for our participation in the Search
sub-task of the Search and Hyperlinking Task at MediaEval
2012. Our runs are designed to form a retrieval baseline by using time-based segmentation of audio transcripts, incorporating pause information and a sliding window to define the retrieval segment boundaries, with a standard language modelling information retrieval strategy. Using this baseline system, runs based on transcripts provided by LIUM were better on all evaluation metrics than those using transcripts provided by LIMSI.
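A sliding time window over a timestamped transcript might be sketched as below. This is an assumption-laden illustration, not the system described in the abstract: the function name `sliding_segments`, the `(start_time, token)` input format, and the 60-second window with a 30-second step are all hypothetical. The abstract does not say how pause information is used, so the sketch leaves it out.

```python
def sliding_segments(words, window=60.0, step=30.0):
    """Group (start_time, token) pairs into overlapping time windows.

    `window` and `step` are in seconds; the defaults are illustrative only.
    Returns a list of (window_start, window_end, text) tuples.
    """
    if not words:
        return []
    segments = []
    begin = min(t for t, _ in words)
    last_start = max(t for t, _ in words)
    while begin <= last_start:
        end = begin + window
        # Keep tokens whose start time falls inside the current window.
        tokens = [tok for t, tok in words if begin <= t < end]
        if tokens:
            segments.append((begin, end, " ".join(tokens)))
        begin += step
    return segments
```

Each resulting segment would then be indexed as its own retrieval unit. Overlapping windows mean that a passage cut by one boundary still appears whole in a neighbouring segment.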
Evaluation campaigns and TRECVid
The TREC Video Retrieval Evaluation (TRECVid) is an
international benchmarking activity to encourage research
in video information retrieval by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. TRECVid completed its fifth annual cycle at the end of 2005 and in 2006 TRECVid will involve almost 70 research organizations, universities and other consortia. Throughout its existence, TRECVid has benchmarked both interactive and automatic/manual searching for shots from within a video
corpus, automatic detection of a variety of semantic and
low-level video features, shot boundary detection and the
detection of story boundaries in broadcast TV news. This
paper will give an introduction to information retrieval (IR) evaluation from both a user and a system perspective, highlighting that system evaluation is by far the most prevalent type of evaluation carried out. We also include a summary of TRECVid as an example of a system evaluation benchmarking campaign and this allows us to discuss whether
such campaigns are a good thing or a bad thing. There are
arguments for and against these campaigns and we present
some of them in the paper, concluding that on balance they
have had a very positive impact on research progress.
Symbiosis between the TRECVid benchmark and video libraries at the Netherlands Institute for Sound and Vision
Audiovisual archives are investing in large-scale digitisation of their analogue holdings and, in parallel, ingesting an ever-increasing number of born-digital files into their digital storage facilities. Digitisation opens up new access paradigms and boosts re-use of audiovisual content. Query-log analyses show the shortcomings of manual annotation; archives are therefore complementing these annotations by developing novel search engines that automatically extract information from both the audio and the visual tracks. Over the past few years, the TRECVid benchmark has developed a novel relationship with the Netherlands Institute for Sound and Vision (NISV) which goes beyond the NISV just providing data and use cases to TRECVid. Prototype and demonstrator systems developed as part of TRECVid are set to become a key driver in improving the quality of search engines at the NISV and will ultimately help other audiovisual archives to offer more efficient and more fine-grained access to their collections. This paper reports the experiences of NISV in leveraging the activities of the TRECVid benchmark.
The State-of-the-arts in Focused Search
The continuous influx of text data on the Web requires search engines to improve their ability to retrieve more specific information. The need for results relevant to a user's topic of interest has gone beyond search for domain- or type-specific documents to more focused results (e.g. document fragments or answers to a query). The introduction of XML provides a standard format for data representation, storage, and exchange. It allows focused search to be carried out at different granularities of a structured document with XML markup. This report reviews the state of the art in focused search, particularly techniques for topic-specific document retrieval, passage retrieval, XML retrieval, and entity ranking. It concludes by highlighting open problems.
Context Aware Computing for The Internet of Things: A Survey
As we are moving towards the Internet of Things (IoT), the number of sensors
deployed around the world is growing at a rapid pace. Market research has shown
significant growth in sensor deployments over the past decade and has
predicted a significant increase in the growth rate in the future. These
sensors continuously generate enormous amounts of data. However, in order to
add value to raw sensor data we need to understand it. Collection, modelling,
reasoning, and distribution of context in relation to sensor data play a
critical role in this challenge. Context-aware computing has proven to be
successful in understanding sensor data. In this paper, we survey context
awareness from an IoT perspective. We present the necessary background by
introducing the IoT paradigm and context-aware fundamentals at the beginning.
Then we provide an in-depth analysis of the context life cycle. We evaluate a
subset of projects (50) which represent the majority of research and commercial
solutions proposed in the field of context-aware computing conducted over the
last decade (2001-2011) based on our own taxonomy. Finally, based on our
evaluation, we highlight the lessons to be learnt from the past and some
possible directions for future research. The survey addresses a broad range of
techniques, methods, models, functionalities, systems, applications, and
middleware solutions related to context awareness and IoT. Our goal is not only
to analyse, compare and consolidate past research work but also to appreciate
their findings and discuss their applicability towards the IoT.
Comment: IEEE Communications Surveys & Tutorials Journal, 201