PRES: A score metric for evaluating recall-oriented information retrieval applications
Information retrieval (IR) evaluation scores are generally
designed to measure the effectiveness with which relevant
documents are identified and retrieved. Many scores have been proposed for this purpose over the years. These have primarily focused on aspects of precision and recall, and while these are often discussed with equal importance, in practice most attention has been given to precision-focused metrics. Even for recall-oriented IR tasks of growing importance, such as patent retrieval, these precision-based scores remain the primary evaluation measures. Our study examines different evaluation measures for a recall-oriented patent retrieval task and demonstrates the limitations of the current scores in comparing different IR systems for this task. We introduce PRES, a novel evaluation metric for this type of application that takes account of recall and the user's search effort. The behaviour of PRES is demonstrated on 48 runs from the CLEF-IP 2009 patent retrieval track. A full analysis of the performance of PRES shows its suitability for measuring the retrieval effectiveness of systems from a recall-focused perspective, taking into account the user's expected search effort.
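The abstract does not reproduce the metric itself; a minimal Python sketch of the published PRES definition could look as follows. It assumes the standard formulation in which relevant documents not retrieved within N_max (the user's maximum search depth) are placed at the ranks immediately after N_max; the function name and example values are illustrative, not from the paper.

```python
def pres(relevant_ranks, n_relevant, n_max):
    """Patent Retrieval Evaluation Score for a single topic.

    relevant_ranks -- 1-based ranks of the relevant documents that were
                      retrieved within the top n_max results.
    n_relevant     -- total number of relevant documents for the topic.
    n_max          -- maximum number of results the user will examine.
    """
    found = sorted(r for r in relevant_ranks if r <= n_max)
    # Assumption from the published definition: relevant documents not
    # retrieved within n_max are placed at ranks n_max+1, n_max+2, ...
    missed = [n_max + i for i in range(1, n_relevant - len(found) + 1)]
    mean_rank = sum(found + missed) / n_relevant
    # 1.0 when all relevant documents occupy the top n_relevant ranks;
    # 0.0 when they all fall just beyond the user's search depth n_max.
    return 1.0 - (mean_rank - (n_relevant + 1) / 2.0) / n_max

print(pres([1, 2, 3], 3, 100))  # perfect ranking -> 1.0
print(pres([40, 90], 3, 100))   # late hits, one miss -> 0.25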
Evaluating epistemic uncertainty under incomplete assessments
This study proposes an extended methodology for laboratory-based Information Retrieval evaluation under incomplete relevance assessments. This new methodology aims to identify potential uncertainty during system comparison that may result from incompleteness. The adoption of this methodology is advantageous because the detection of epistemic uncertainty - the amount of knowledge (or ignorance) we have about the estimate of a system's performance - during the evaluation process can guide researchers when evaluating new systems over existing and future test collections. Across a series of experiments we demonstrate how this methodology can lead towards a finer-grained analysis of systems. In particular, we show through experimentation how the current practice in Information Retrieval evaluation of using a measurement depth larger than the pooling depth increases uncertainty during system comparison.
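The abstract leaves the mechanics of the methodology to the thesis itself. One standard way to make such epistemic uncertainty visible, sketched below as an assumption rather than the author's actual procedure, is to bound a metric like P@k by scoring unjudged documents once as non-relevant and once as relevant; all names and data are illustrative.

```python
def precision_at_k_bounds(ranking, qrels, k):
    """Bound P@k when some of the top-k documents are unjudged.

    ranking -- ordered list of document ids returned by the system.
    qrels   -- dict doc_id -> 0/1 relevance for the judged pool only;
               documents absent from qrels are unjudged.
    """
    top_k = ranking[:k]
    judged_relevant = sum(1 for d in top_k if qrels.get(d) == 1)
    unjudged = sum(1 for d in top_k if d not in qrels)
    lower = judged_relevant / k               # unjudged assumed non-relevant
    upper = (judged_relevant + unjudged) / k  # unjudged assumed relevant
    return lower, upper

# Measuring at depth 10 over a pool judged only to depth 3 leaves a gap:
qrels = {"d1": 1, "d2": 0, "d3": 1}
run = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
print(precision_at_k_bounds(run, qrels, 10))  # (0.2, 0.9)
```

The width of the interval is exactly the ignorance that incompleteness introduces; when two systems' intervals overlap, the comparison is left undecided.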
A laboratory-based method for the evaluation of personalised search
Comparative evaluation of Information Retrieval Systems (IRSs) using publicly available test collections has become an established practice in Information Retrieval (IR). By means of the popular Cranfield evaluation paradigm, IR test collections enable researchers to compare new methods to existing approaches. An important area of IR research where this strategy has not been applied to date is Personalised Information Retrieval (PIR), which has generally relied on user-based evaluations. This paper describes a method that enables the creation of publicly available extended test collections to allow repeatable laboratory-based evaluation of personalised search.
Highly focused document retrieval in aerospace engineering: user interaction design and evaluation
Purpose – This paper seeks to describe the preliminary studies (on both users and data), the design, and the evaluation of the K-Search system for searching legacy documents in aerospace engineering. Real-world reports of jet engine maintenance challenge current indexing practice, while real users' tasks require retrieving the information in the proper context. K-Search is currently in use in Rolls-Royce plc and has evolved to include other tools for knowledge capture and management.
Design/methodology/approach – Semantic Web techniques have been used to automatically extract information from the reports while maintaining the original context, allowing a more focused retrieval than with more traditional techniques. The paper combines semantic search with classical information retrieval to increase search effectiveness. An innovative user interface has been designed to take advantage of this hybrid search technique. The interface is designed to allow a flexible and personal approach to searching legacy data.
Findings – The user evaluation showed that the system is effective and well received by users. It also shows that different people look at the same data in different ways and make different use of the same system depending on their individual needs, influenced by their job profile and personal attitude.
Research limitations/implications – This study focuses on a specific case of an enterprise working in aerospace engineering. Although the findings are likely to be shared with other engineering domains (e.g. mechanical, electronic), the study does not expand the evaluation to different settings.
Originality/value – The study shows how a real context of use can provide new and unexpected challenges to researchers and how effective solutions can then be adopted and used in organizations.
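The abstract describes the hybrid technique only at a high level. The sketch below illustrates one plausible way to blend a classical keyword score with a match over extracted semantic annotations; it is an assumption-laden illustration, not the actual K-Search implementation, and all document data, field names, and the alpha weighting are invented for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented example records: free text plus extracted semantic annotations.
docs = [
    {"text": "oil leak observed near the combustor casing",
     "annotations": {"component": "combustor", "event": "oil leak"}},
    {"text": "vibration reported during ground test of the fan",
     "annotations": {"component": "fan", "event": "vibration"}},
]

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(d["text"] for d in docs)

def hybrid_search(query_text, query_annotations, alpha=0.5):
    """Blend a classical keyword score with a semantic-annotation match."""
    keyword = cosine_similarity(vectorizer.transform([query_text]), doc_matrix)[0]
    ranked = []
    for i, d in enumerate(docs):
        # Fraction of requested annotation constraints this document meets.
        hits = sum(1 for key, val in query_annotations.items()
                   if d["annotations"].get(key) == val)
        semantic = hits / len(query_annotations) if query_annotations else 0.0
        ranked.append((alpha * keyword[i] + (1 - alpha) * semantic, i))
    return sorted(ranked, reverse=True)

print(hybrid_search("oil leak", {"component": "combustor"}))
```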
APPLICATION OF COGNITIVE PRINCIPLES WITHIN AN ONLINE STATISTICAL LEARNING ENVIRONMENT
Three experiments were conducted in order to further investigate optimal learning procedures within an online statistical learning environment. Experiment 1 exposed learners to retrieval practice learning conditions with or without segmentation. Retrieval practice formats included multiple-choice, open-ended, multiple-evaluation, or instruction-only manipulations. Experiment 2 explored the impact of added immediate feedback in conjunction with retrieval practice and segmentation. Experiment 3 further investigated how the benefits of optimal learning procedures transfer to novel situations/examinations. In all experiments a series of metacognitive questions was administered to learners in order to measure their metamemory over the statistical knowledge that was taught. In alignment with our hypotheses and previous research, it was found that retrieval practice (Experiment 1) and retrieval practice with immediate feedback (Experiment 2) tended to boost memory retention. However, the data trend across all experiments suggested that segmentation has little or no impact on statistical learning, a finding contrary to our hypotheses as well as to previous studies. Though the results are somewhat mixed, the benefits associated with retrieval practice and retrieval practice with feedback did not seem to transfer to novel instances. Individuals who learned within an open-ended or multiple-evaluation format tended to have greater insight into their own metacognitive knowledge.
Open Domain Multi-document Summarization: A Comprehensive Study of Model Brittleness under Retrieval
Multi-document summarization (MDS) assumes a set of topic-related documents is provided as input. In practice, this document set is not always available; it would need to be retrieved given an information need, i.e. a question or topic statement, a setting we dub "open-domain" MDS. We study this more challenging setting by formalizing the task and bootstrapping it using existing datasets, retrievers, and summarizers. Via extensive automatic and human evaluation, we determine: (1) state-of-the-art summarizers suffer large reductions in performance when applied to open-domain MDS, (2) additional training in the open-domain setting can reduce this sensitivity to imperfect retrieval, and (3) summarizers are insensitive to the retrieval of duplicate documents and the order of retrieved documents, but highly sensitive to other errors, like the retrieval of irrelevant documents. Based on our results, we provide practical guidelines to enable future work on open-domain MDS, e.g. how to choose the number of retrieved documents to summarize. Our results suggest that new retrieval and summarization methods and annotated resources for training and evaluation are necessary for further progress in the open-domain setting.
Comment: Accepted to EMNLP Findings 202
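As a concrete illustration of the open-domain setting, the sketch below wires a simple retriever to a stand-in summarizer: the document set is retrieved for a topic statement rather than given up front. The mini-corpus, the TF-IDF retriever, and the placeholder summarizer are illustrative assumptions, not the models evaluated in the paper.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented mini-corpus standing in for a large open document collection.
corpus = [
    "Storm damage was reported along the coast after the hurricane.",
    "Rebuilding efforts continue in coastal towns hit by the storm.",
    "The championship game drew a record television audience.",
]

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(corpus)

def retrieve(query, k=2):
    """Return the top-k documents for an information need (topic statement)."""
    scores = cosine_similarity(vectorizer.transform([query]), doc_matrix)[0]
    return [corpus[i] for i in scores.argsort()[::-1][:k]]

def summarize(documents):
    """Placeholder summarizer; a real system would run a trained MDS model."""
    return " ".join(documents)

# Open-domain MDS: the input set is retrieved, not given, and k is the knob
# the paper's guidelines address (how many documents to summarize).
print(summarize(retrieve("hurricane damage and recovery", k=2)))
```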