    An evaluation resource for geographic information retrieval

    In this paper we present an evaluation resource for geographic information retrieval developed within the Cross Language Evaluation Forum (CLEF). The GeoCLEF track is dedicated to the evaluation of geographic information retrieval systems. The resource encompasses more than 600,000 documents, 75 topics so far, and more than 100,000 relevance judgments for these topics. Geographic information retrieval requires an evaluation resource that represents realistic information needs and is geographically challenging. Some experimental results and analyses are also reported.
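
    CLEF ad-hoc resources of this kind are commonly distributed as topic files plus TREC-style relevance judgments (qrels), with one judged topic/document pair per line. The sketch below, which assumes that standard format and a hypothetical file name, shows one way such judgments can be loaded for evaluation scripts.

```python
from collections import defaultdict

def load_qrels(path):
    """Load TREC/CLEF-style relevance judgments:
    one '<topic> <iteration> <docno> <relevance>' line per judgment."""
    qrels = defaultdict(dict)
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) != 4:
                continue  # skip blank or malformed lines
            topic, _iteration, docno, rel = parts
            qrels[topic][docno] = int(rel)
    return qrels

# Hypothetical usage: count judged-relevant documents per topic.
# for topic, judged in load_qrels("geoclef.qrels").items():
#     print(topic, sum(1 for r in judged.values() if r > 0))
```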

    Evaluating the implicit feedback models for adaptive video retrieval

    Interactive video retrieval systems are becoming popular. On the one hand, these systems try to reduce the effect of the semantic gap, an issue currently being addressed by the multimedia retrieval community. On the other hand, such systems enhance the quality of information seeking for the user by supporting query formulation and reformulation. Interactive systems are very popular in the textual retrieval domain; however, they are relatively unexplored in multimedia retrieval. The main problem in the development of interactive retrieval systems is the evaluation cost. The traditional evaluation methodology, as used in the information retrieval domain, is not applicable. An alternative is to use a user-centred evaluation methodology; however, such schemes are expensive in terms of effort and cost, and are not scalable. This problem is exacerbated by the use of implicit indicators, which are useful and increasingly used in predicting user intentions. In this paper, we explore the effectiveness of a number of interfaces and feedback mechanisms and compare their relative performance using a simulated evaluation methodology. The results show that a search interface combining explicit and implicit features performs comparatively better.
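
    The simulation idea can be made concrete with a generic loop in which a simulated user inspects the top-ranked results, judged-relevant items stand in for implicit click evidence, and the query is adapted from them. The sketch below only illustrates that style of evaluation under those assumptions; it is not the interfaces or feedback models studied in the paper, and `search` and `doc_terms` are hypothetical callables.

```python
def simulate_implicit_feedback(query_terms, search, judged_relevant,
                               doc_terms, iterations=3, k=10):
    """Generic simulated-user loop for implicit feedback evaluation:
    the simulated user views the top-k results of the system under test,
    judged-relevant results stand in for implicit click evidence, and
    their terms drive naive query expansion on the next pass."""
    seen_relevant = set()
    for _ in range(iterations):
        ranking = list(search(query_terms))[:k]
        clicked = [doc for doc in ranking if doc in judged_relevant]
        seen_relevant.update(clicked)
        for doc in clicked:
            # crude expansion: append a few terms from each "clicked" doc
            query_terms = query_terms + list(doc_terms(doc))[:5]
    return seen_relevant  # e.g. feed into recall after the final iteration
```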

    Creating a Dutch testbed to evaluate the retrieval from textual databases

    This paper describes the first large-scale evaluation of information retrieval systems using Dutch documents and queries. We describe in detail the characteristics of the Dutch test data, which is part of the official CLEF multilingual textual database, and give an overview of the experimental results of the companies and research institutions that participated in the first official Dutch CLEF experiments. Judging from these experiments, handling language-specific issues of Dutch, such as its simple morphology and compound nouns, significantly improves the performance of information retrieval systems in many cases. Careful examination of the test collection shows that it can serve as a reliable tool for the future evaluation of information retrieval systems.
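
    Compound splitting is the kind of language-specific handling referred to above: Dutch compounds such as "fietspad" ("fiets" + "pad") are written as single tokens, so splitting them lets queries match their parts. The greedy dictionary-based splitter below is a toy illustration of the idea, not the technique used by any particular participant, and the lexicon entries are hypothetical.

```python
def split_compound(word, lexicon):
    """Toy greedy dictionary-based decompounder: try to split `word`
    into two lexicon entries, preferring the longest head; returns the
    word unsplit if no decomposition is found."""
    for i in range(len(word) - 2, 2, -1):
        head, tail = word[:i], word[i:]
        if head in lexicon and tail in lexicon:
            return [head, tail]
    return [word]

# Hypothetical toy lexicon:
# lexicon = {"fiets", "pad", "voetbal", "stadion"}
# split_compound("fietspad", lexicon)        # -> ['fiets', 'pad']
# split_compound("voetbalstadion", lexicon)  # -> ['voetbal', 'stadion']
```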

    The Simplest Evaluation Measures for XML Information Retrieval that Could Possibly Work

    This paper reviews several evaluation measures developed for evaluating XML information retrieval (IR) systems. We argue that these measures, some of which are currently in use by the INitiative for the Evaluation of XML Retrieval (INEX), are complicated, hard to understand, and hard to explain to users of XML IR systems. To show the value of keeping things simple, we report alternative evaluation results for the official evaluation runs submitted to INEX 2004 using simple metrics, and show their value for INEX.
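
    In the spirit of "keeping things simple", the measure below treats XML element retrieval exactly like document retrieval: set-based precision and recall over the top-ranked elements, with no overlap or specificity weighting. It is an illustrative sketch of that style of metric, not necessarily the measures evaluated in the paper.

```python
def simple_precision_recall(retrieved_elements, relevant_elements, k=10):
    """Deliberately simple element-level measure: set-based precision and
    recall over the top-k retrieved XML elements, treating element IDs
    exactly like document IDs (no overlap or specificity weighting)."""
    top_k = retrieved_elements[:k]
    hits = sum(1 for element in top_k if element in relevant_elements)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant_elements) if relevant_elements else 0.0
    return precision, recall
```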

    A new metric for patent retrieval evaluation

    Patent retrieval is generally considered to be a recall-oriented information retrieval task that is growing in importance. Despite this, precision-based scores such as mean average precision (MAP) remain the primary evaluation measures for patent retrieval. Our study examines different evaluation measures for the recall-oriented patent retrieval task and shows the limitations of the current scores in comparing different IR systems for this task. We introduce PRES, a novel evaluation metric for this type of application that takes account of recall and user search effort. The behaviour of PRES is demonstrated on 48 runs from the CLEF-IP 2009 patent retrieval track. A full analysis of the performance of PRES shows its suitability for measuring the retrieval effectiveness of systems from a recall-focused perspective, taking into account the expected search effort of patent searchers.
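
    For reference, MAP, the precision-oriented baseline criticised above, averages over topics the precision observed at the rank of each relevant document. A minimal sketch of the standard definition:

```python
def average_precision(ranking, relevant):
    """Average precision for one topic: mean of precision-at-rank taken
    at the rank of each relevant document (missed ones contribute 0)."""
    hits, score = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            score += hits / rank
    return score / len(relevant) if relevant else 0.0

def mean_average_precision(rankings, relevant_sets):
    """MAP: the mean of per-topic average precision; both arguments are
    dictionaries keyed by topic id."""
    aps = [average_precision(rankings[t], relevant_sets[t]) for t in rankings]
    return sum(aps) / len(aps) if aps else 0.0
```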

    A proposal for the evaluation of adaptive information retrieval systems using simulated interaction

    The Centre for Next Generation Localisation (CNGL) is involved in building interactive adaptive systems which combine Information Retrieval (IR), Adaptive Hypermedia (AH) and adaptive web techniques and technologies. The complex functionality of these systems, coupled with the variety of potential users, means that the experiments necessary to evaluate such systems are difficult to plan, implement and execute. This evaluation requires both component-level scientific evaluation and user-based evaluation. Automated replication of experiments and simulation of user interaction would be hugely beneficial in the evaluation of adaptive information retrieval systems (AIRS). This paper proposes a methodology for the evaluation of AIRS that leverages simulated interaction. The hybrid approach detailed here combines: (i) user-centred methods for simulating interaction and personalisation; (ii) evaluation metrics that combine Human Computer Interaction (HCI), AH and IR techniques; and (iii) the use of qualitative and quantitative evaluations. The benefits and limitations of evaluations based on user simulations are also discussed.

    PRES: A score metric for evaluating recall-oriented information retrieval applications

    Information retrieval (IR) evaluation scores are generally designed to measure the effectiveness with which relevant documents are identified and retrieved. Many scores have been proposed for this purpose over the years. These have primarily focused on aspects of precision and recall, and while the two are often discussed as equally important, in practice most attention has been given to precision-focused metrics. Even for recall-oriented IR tasks of growing importance, such as patent retrieval, these precision-based scores remain the primary evaluation measures. Our study examines different evaluation measures for a recall-oriented patent retrieval task and demonstrates the limitations of the current scores in comparing different IR systems for this task. We introduce PRES, a novel evaluation metric for this type of application that takes account of recall and the user’s search effort. The behaviour of PRES is demonstrated on 48 runs from the CLEF-IP 2009 patent retrieval track. A full analysis of the performance of PRES shows its suitability for measuring the retrieval effectiveness of systems from a recall-focused perspective, taking into account the user’s expected search effort.
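
    As published by Magdy and Jones (SIGIR 2010), PRES is a normalised-recall-style score over the first N_max results, with relevant documents missing from that window assumed to appear immediately after it. The sketch below follows that published formula; treat it as an approximation and check it against the authors' reference implementation before relying on it.

```python
def pres(ranking, relevant, n_max):
    """PRES sketch following the published formula:
        PRES = 1 - (mean(r_i) - (n + 1) / 2) / N_max
    where r_i are the ranks of the n relevant documents within the first
    N_max results, and relevant documents not retrieved by rank N_max are
    assumed to appear immediately after it."""
    n = len(relevant)
    if n == 0 or n_max <= 0:
        return 0.0
    ranks = [rank for rank, doc in enumerate(ranking[:n_max], start=1)
             if doc in relevant]
    missing = n - len(ranks)
    ranks += [n_max + i for i in range(1, missing + 1)]  # place after N_max
    mean_rank = sum(ranks) / n
    return 1.0 - (mean_rank - (n + 1) / 2) / n_max
```

    With this definition, PRES is 1 when all relevant documents occupy the top ranks and 0 when none are retrieved within N_max, which matches the recall-plus-effort behaviour the abstract describes.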

    The CLAIRE visual analytics system for analysing IR evaluation data

    In this paper, we describe the Combinatorial visuaL Analytics system for Information Retrieval Evaluation (CLAIRE), a Visual Analytics (VA) system for exploring and making sense of the performance of a large number of Information Retrieval (IR) systems, in order to grasp quickly and intuitively which system configurations are preferred, what the different components contribute, and how these components interact.

    Overview of the ImageCLEFphoto 2008 photographic retrieval task

    ImageCLEFphoto 2008 is an ad-hoc photo retrieval task and part of the ImageCLEF evaluation campaign. This task provides both the resources and the framework necessary to perform comparative, laboratory-style evaluation of visual information retrieval systems. In 2008, the evaluation task concentrated on promoting diversity within the top 20 results from a multilingual image collection. This new challenge attracted a record number of submissions: 24 participating groups submitted a total of 1,042 system runs. The findings include that the choice of annotation language has an almost negligible effect and that the best runs combine concept-based and content-based retrieval methods.
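
    Diversity within the top 20 results is typically quantified by how many distinct relevant sub-topics ("clusters") those results cover. The function below is an illustrative cluster-recall-style measure under that assumption; see the track overview for the official measures and their exact definitions.

```python
def cluster_recall_at_k(ranking, doc_to_cluster, k=20):
    """Cluster-recall-style diversity measure: the fraction of distinct
    relevant sub-topics ('clusters') covered by the top-k results.
    `doc_to_cluster` maps each relevant image to its sub-topic; images
    absent from the mapping are treated as non-relevant."""
    all_clusters = set(doc_to_cluster.values())
    covered = {doc_to_cluster[doc] for doc in ranking[:k]
               if doc in doc_to_cluster}
    return len(covered) / len(all_clusters) if all_clusters else 0.0
```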