Automatic Ranking of Retrieval Systems in Imperfect Environments

Authors: Can F., Nuray R.
Publication date: 01/01/2003

The empirical investigation of the effectiveness of information retrieval (IR) systems requires a test collection, a set of query topics, and a set of relevance judgments made by human assessors for each query. Previous experiments show that differences in human relevance assessments do not affect the relative performance of retrieval systems. Based on this observation, we propose and evaluate a new approach that replaces the human relevance judgments with an automatic method. The ranking of retrieval systems produced by our methodology correlates positively and significantly with that of human-based evaluations. In the experiments, we assume a Web-like imperfect environment: the indexing information for all documents is available for ranking, but some documents may not be available for retrieval. Such conditions can be due to document deletions or network problems. Our method of simulating imperfect environments can be used for Web search engine assessment and for estimating the effects of network conditions (e.g., network unreliability) on IR system performance.
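
As a rough illustration of how two system rankings can be compared, the sketch below computes Kendall's tau between scores from human judgments and scores from an automatic method. The abstract does not name the correlation measure it uses, so the choice of Kendall's tau is an assumption, and the system names and scores are hypothetical values made up for the example.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score dicts over the same systems.

    No tie correction is applied; tied pairs count as neither
    concordant nor discordant.
    """
    systems = sorted(scores_a)
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        # Positive product: both evaluations order the pair the same way.
        product = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical effectiveness scores for four systems (not from the paper).
human_scores = {"sysA": 0.31, "sysB": 0.27, "sysC": 0.22, "sysD": 0.15}
auto_scores = {"sysA": 0.40, "sysB": 0.42, "sysC": 0.30, "sysD": 0.18}

tau = kendall_tau(human_scores, auto_scores)
print(f"Kendall's tau = {tau:.3f}")
```

In this toy data only one of the six system pairs (sysA, sysB) is ordered differently by the two evaluations, so the rankings agree closely. A significance test would be needed before making the kind of claim the abstract reports.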

Bilkent University Institutional Repository

CODEC: Complex Document and Entity Collection

Authors: Dalton Jeffrey, Fischer Sophie, Gemmell Carlos, MacAvaney Sean, Mackie Iain, Owoicho Paul
Publication venue: Association for Computing Machinery (ACM)
Publication date: 07/07/2022

CODEC is a document and entity ranking benchmark that focuses on complex research topics. We target essay-style information needs of social science researchers, e.g., "How has the UK's Open Banking Regulation benefited Challenger Banks?". CODEC includes 42 topics developed by researchers and a new focused web corpus with semantic annotations including entity links. This resource includes expert judgments on 17,509 documents and entities (416.9 per topic) from diverse automatic and interactive manual runs. The manual runs include 387 query reformulations, providing data for query performance prediction and automatic rewriting evaluation. CODEC includes analysis of state-of-the-art systems, including dense retrieval and neural re-ranking. The results show the topics are challenging with headroom for document and entity ranking improvement. Query expansion with entity information shows significant gains on document ranking, demonstrating the resource's value for evaluating and improving entity-oriented search. We also show that the manual query reformulations significantly improve document ranking and entity ranking performance. Overall, CODEC provides challenging research topics to support the development and evaluation of entity-centric search methods.

Enlighten

The quest for information retrieval on the semantic web

Authors: Castells-Azpilicueta Pablo, Fernández-Sánchez Miriam, Vallet-Weadon David
Publication date: 01/12/2005

Semantic search has been one of the motivations of the Semantic Web since it was envisioned. We propose a model for the exploitation of ontology-based knowledge bases (KBs) to improve search over large document repositories. The retrieval model is based on an adaptation of the classic vector-space model, including an annotation weighting algorithm and a ranking algorithm. Semantic search is combined with keyword-based search to achieve tolerance to KB incompleteness. Our proposal has been tested on corpora of significant size, showing promising results with respect to keyword-based search and providing ground for further analysis and research.

Open Research Online (The Open University)

A survey on the use of relevance feedback for information access systems

Authors: Lalmas Mounia, Ruthven Ian
Publication venue: Cambridge University Press (CUP)
Publication date: 01/06/2003

Users of online search engines often find it difficult to express their need for information in the form of a query. However, if the user can identify examples of the kind of documents they require, then they can employ a technique known as relevance feedback. Relevance feedback covers a range of techniques intended to improve a user's query and facilitate retrieval of information relevant to a user's information need. In this paper we survey relevance feedback techniques. We study both automatic techniques, in which the system modifies the user's query, and interactive techniques, in which the user has control over query modification. We also consider specific interfaces to relevance feedback systems and characteristics of searchers that can affect the use and success of relevance feedback systems.
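
To make the idea of automatic query modification concrete, here is a minimal sketch of Rocchio-style relevance feedback, a classic technique of this kind. The abstract does not say which techniques the survey covers in detail, so Rocchio is chosen here purely for illustration; the term weights, documents, and the `alpha`, `beta`, `gamma` defaults are assumptions.

```python
from collections import defaultdict


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a modified query vector (term -> weight).

    The query moves toward the centroid of the documents the user marked
    relevant and away from the centroid of those marked non-relevant.
    """
    updated = defaultdict(float)
    for term, weight in query.items():
        updated[term] += alpha * weight
    for docs, coeff in ((relevant, beta), (nonrelevant, -gamma)):
        if not docs:
            continue  # no feedback of this kind was given
        for doc in docs:
            for term, weight in doc.items():
                updated[term] += coeff * weight / len(docs)
    # Terms whose weight ends up non-positive are dropped from the query.
    return {term: w for term, w in updated.items() if w > 0}


# Hypothetical term-weight vectors for an ambiguous one-word query.
query = {"jaguar": 1.0}
relevant_docs = [{"jaguar": 1.0, "cat": 1.0}, {"jaguar": 1.0, "habitat": 1.0}]
nonrelevant_docs = [{"jaguar": 1.0, "car": 1.0}]

new_query = rocchio(query, relevant_docs, nonrelevant_docs)
print(new_query)
```

The modified query gains the terms "cat" and "habitat" from the documents marked relevant, while "car" is excluded because its weight becomes negative. An interactive variant would show the candidate terms to the user instead of applying them automatically.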

Crossref

University of Strathclyde Institutional Repository

Applying Machine Translation to Two-Stage Cross-Language Information Retrieval

Authors: Fujii Atsushi, Ishikawa Tetsuya
Publication date: 01/01/2000

Cross-language information retrieval (CLIR), where queries and documents are in different languages, needs a translation of queries and/or documents, so as to standardize both of them into a common representation. For this purpose, the use of machine translation is an effective approach. However, the computational cost of translating large-scale document collections is prohibitive. To resolve this problem, we propose a two-stage CLIR method. First, we translate a given query into the document language and retrieve a limited number of foreign documents. Second, we machine translate only those documents into the user language and re-rank them based on the translation result. We also show the effectiveness of our method by way of experiments using Japanese queries and English technical documents. Comment: 13 pages, 1 Postscript figure.
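
The two stages described above can be sketched as a small pipeline. This is only an outline under strong assumptions: the dictionary lookups stand in for a real machine translation system, term overlap stands in for a real retrieval model, and all names (`two_stage_clir`, `overlap_score`, the toy dictionaries and documents) are hypothetical rather than taken from the paper.

```python
def overlap_score(query_terms, doc_terms):
    """Toy relevance score: number of distinct terms shared by query and document."""
    return len(set(query_terms) & set(doc_terms))


def two_stage_clir(query, docs, translate_query, translate_doc, k=2):
    # Stage 1: translate the query into the document language and keep
    # only the top-k documents, so few documents need full translation.
    query_in_doc_lang = translate_query(query)
    candidates = sorted(
        docs, key=lambda d: overlap_score(query_in_doc_lang, docs[d]), reverse=True
    )[:k]
    # Stage 2: translate just those candidates into the user language
    # and re-rank them against the original query.
    rescored = {d: overlap_score(query, translate_doc(docs[d])) for d in candidates}
    return sorted(candidates, key=lambda d: rescored[d], reverse=True)


# Toy stand-ins for machine translation (romanized Japanese <-> English).
QUERY_DICT = {"kikai": ["machine", "opportunity"], "honyaku": ["translation"]}
DOC_DICT = {"machine": "kikai", "translation": "honyaku",
            "opportunity": "chansu", "business": "bijinesu"}


def translate_query(terms):
    # Every candidate translation is kept, which introduces ambiguity.
    return [t for term in terms for t in QUERY_DICT.get(term, [term])]


def translate_doc(terms):
    return [DOC_DICT.get(term, term) for term in terms]


docs = {
    "d1": ["business", "opportunity", "machine"],
    "d2": ["machine", "translation"],
    "d3": ["image", "retrieval"],
}
ranking = two_stage_clir(["kikai", "honyaku"], docs, translate_query, translate_doc)
print(ranking)
```

In this toy run the ambiguous query translation lets d1 match as well as d2 in the first stage, and the second stage moves d2 ahead once the documents are compared with the original query in the user language. Only the two retained candidates are ever translated.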

arXiv.org e-Print Archive

CiteSeerX

Overview of the ImageCLEFphoto 2008 photographic retrieval task

Authors: Arni T., Clough P., Grubinger M., Sanderson M.
Publication date: 01/01/2008

ImageCLEFphoto 2008 is an ad-hoc photo retrieval task and part of the ImageCLEF evaluation campaign. This task provides both the resources and the framework necessary to perform comparative laboratory-style evaluation of visual information retrieval systems. In 2008, the evaluation task concentrated on promoting diversity within the top 20 results from a multilingual image collection. This new challenge attracted a record number of submissions: a total of 24 participating groups submitting 1,042 system runs. Findings include that the choice of annotation language has an almost negligible effect and that the best runs combine concept-based and content-based retrieval methods.

White Rose Research Online

A study of interface support mechanisms for interactive information retrieval

Authors: Ruthven I., White R.W.
Publication date: 01/05/2006

Advances in search technology have meant that search systems can now offer assistance to users beyond simply retrieving a set of documents. For example, search systems are now capable of inferring user interests by observing their interaction, offering suggestions about what terms could be used in a query, or reorganizing search results to make exploration of retrieved material more effective. When providing new search functionality, system designers must decide how the new functionality should be offered to users. One major choice is between (a) offering automatic features that require little human input but give little human control, or (b) offering interactive features that allow human control over how the feature is used but often give little guidance on how the feature should best be used. This article presents a study in which we empirically investigate the issue of control through an experiment in which participants were asked to interact with three experimental systems that vary the degree of control they had in creating queries, indicating which results are relevant, and making search decisions. We use our findings to discuss why and how the control users want over search decisions can vary depending on the nature of the decisions and the impact of those decisions on the user's search.

Crossref

University of Strathclyde Institutional Repository