An evaluation resource for geographic information retrieval
In this paper we present an evaluation resource for geographic information retrieval developed within the Cross Language Evaluation
Forum (CLEF). The GeoCLEF track is dedicated to the evaluation of geographic information retrieval systems. The resource
encompasses more than 600,000 documents, 75 topics so far, and more than 100,000 relevance judgments for these topics. Geographic
information retrieval requires an evaluation resource which represents realistic information needs and which is geographically
challenging. Some experimental results and analysis are reported.
GeoCLEF 2007: the CLEF 2007 cross-language geographic information retrieval track overview
GeoCLEF ran as a regular track for the second time within the Cross
Language Evaluation Forum (CLEF) 2007. The purpose of GeoCLEF is to test
and evaluate cross-language geographic information retrieval (GIR): retrieval
for topics with a geographic specification. GeoCLEF 2007 consisted of two
subtasks. A search task ran for the third time and a query classification task was
organized for the first time. For the GeoCLEF 2007 search task, twenty-five search
topics were defined by the organizing groups for searching English, German,
Portuguese and Spanish document collections. All topics were translated into
English, Indonesian, Portuguese, Spanish and German. Several topics in 2007
were geographically challenging. Thirteen groups submitted 108 runs. The
groups used a variety of approaches. For the classification task, a query log
from a search engine was provided and the groups needed to identify the
queries with a geographic scope and the geographic components within the
local queries.
DCU@TRECMed 2012: Using ad-hoc baselines for domain-specific retrieval
This paper describes the first participation of DCU in the TREC Medical Records Track (TRECMed). We performed some initial experiments on the 2011 TRECMed data based on the BM25 retrieval model. Surprisingly, we found that the standard BM25 model with default parameters performs comparably to the best automatic runs submitted to TRECMed 2011 and would have ranked fourth out of 29 participating groups. We expected that some form of domain adaptation would increase performance. However, results on the 2011 data proved otherwise: concept-based query expansion decreased performance, and filtering and reranking by term proximity also decreased performance slightly. We submitted four runs based on the BM25 retrieval model to TRECMed 2012 using standard BM25, standard query expansion, result filtering, and concept-based query expansion. Official results for 2012 confirm that domain-specific knowledge does not increase performance compared to the BM25 baseline as applied by us.
Challenges to evaluation of multilingual geographic information retrieval in GeoCLEF
This is the third year of the evaluation of
geographic information retrieval (GeoCLEF)
within the Cross-Language Evaluation Forum
(CLEF). GeoCLEF 2006 presented topics and
documents in four languages (English,
German, Portuguese and Spanish). After two
years of evaluation we are beginning to
understand the challenges both of geographic
information retrieval from text and of
evaluating the results of geographic
information retrieval. This poster enumerates
some of these challenges to evaluation and
comments on the limitations encountered in the
first two evaluations.
A Reinforcement Learning-driven Translation Model for Search-Oriented Conversational Systems
Search-oriented conversational systems rely on information needs expressed in
natural language (NL). We focus here on the understanding of NL expressions for
building keyword-based queries. We propose a reinforcement-learning-driven
translation model framework able to 1) learn the translation from NL
expressions to queries in a supervised way, and 2) overcome the lack of
large-scale datasets by framing the translation model as a word selection
approach and injecting relevance feedback in the learning process. Experiments
are carried out on two TREC datasets and outline the effectiveness of our
approach.
Comment: This is the author's pre-print version of the work. It is posted here
for your personal use, not for redistribution. Please cite the definitive
version which will be published in Proceedings of the 2018 EMNLP Workshop
SCAI: The 2nd International Workshop on Search-Oriented Conversational AI -
ISBN: 978-1-948087-75-
On the probabilistic logical modelling of quantum and geometrically-inspired IR
Information Retrieval approaches can mostly be classed as probabilistic, geometric or logic-based. Recently, a new unifying framework for IR has emerged that integrates a probabilistic description within a geometric framework, namely vectors in Hilbert spaces. The geometric model leads naturally to a predicate logic over linear subspaces, also known as quantum logic. In this paper we show the relation between this model and classic concepts such as the Generalised Vector Space Model, highlighting similarities and differences. We also show how some fundamental components of quantum-based IR can be modelled in a descriptive way using a well-established tool, i.e. Probabilistic Datalog.