136,035 research outputs found

    Report on the Second International Workshop on the Evaluation on Collaborative Information Seeking and Retrieval (ECol'2017 @ CHIIR)

    The 2nd workshop on the evaluation of collaborative information retrieval and seeking (ECol) was held in conjunction with the ACM SIGIR Conference on Human Information Interaction & Retrieval (CHIIR) in Oslo, Norway. The workshop focused on discussing the challenges and difficulties of researching and studying collaborative information retrieval and seeking (CIS/CIR). After an introductory, scene-setting overview of developments in CIR/CIS, participants were challenged with devising a range of possible CIR/CIS tasks that could be used for evaluation purposes. Through the brainstorming and discussions, valuable insights regarding the evaluation of CIR/CIS tasks became apparent: for particular tasks, efficiency and/or effectiveness is most important; however, for the majority of tasks, the success and quality of outcomes, along with knowledge sharing and sense-making, were most important, and these latter attributes are much more difficult to measure and evaluate. Thus, the major challenge for CIR/CIS research is to develop methods, measures and methodologies to evaluate these high-order attributes.

    PRES: A score metric for evaluating recall-oriented information retrieval applications

    Information retrieval (IR) evaluation scores are generally designed to measure the effectiveness with which relevant documents are identified and retrieved. Many scores have been proposed for this purpose over the years. These have primarily focused on aspects of precision and recall, and while these are often discussed with equal importance, in practice most attention has been given to precision-focused metrics. Even for recall-oriented IR tasks of growing importance, such as patent retrieval, these precision-based scores remain the primary evaluation measures. Our study examines different evaluation measures for a recall-oriented patent retrieval task and demonstrates the limitations of the current scores in comparing different IR systems for this task. We introduce PRES, a novel evaluation metric for this type of application that takes account of recall and the user’s search effort. The behaviour of PRES is demonstrated on 48 runs from the CLEF-IP 2009 patent retrieval track. A full analysis of the performance of PRES shows its suitability for measuring the retrieval effectiveness of systems from a recall-focused perspective, taking into account the user’s expected search effort.
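The abstract above does not state the PRES formula, so the following is only a rough sketch of how a score combining recall with a user-effort cutoff might be computed. It assumes the commonly cited formulation PRES = 1 - (mean rank of relevant documents - (n + 1) / 2) / N_max, where n is the number of relevant documents and N_max is the maximum number of results the user is willing to check, and it assumes that relevant documents missing from the top N_max are placed just past the cutoff. The names `pres`, `relevant_ranks`, `n_relevant` and `n_max` are hypothetical; the definition should be checked against the paper before use.

```python
def pres(relevant_ranks, n_relevant, n_max):
    """Sketch of a PRES-style score (assumed formulation, not taken from the abstract).

    relevant_ranks: 1-based ranks at which relevant documents were retrieved.
    n_relevant:     total number of relevant documents for the topic (n).
    n_max:          maximum number of results the user is assumed to check (N_max).
    """
    if n_relevant <= 0 or n_max <= 0:
        raise ValueError("n_relevant and n_max must be positive")

    # Only relevant documents inside the user's cutoff count as found.
    found = sorted(r for r in relevant_ranks if 1 <= r <= n_max)
    n_found = len(found)

    # Assumption: the i-th relevant document that was not found is
    # placed at rank n_max + i, i.e. just past the cutoff.
    missing = sum(n_max + i for i in range(n_found + 1, n_relevant + 1))

    mean_rank = (sum(found) + missing) / n_relevant
    best_mean_rank = (n_relevant + 1) / 2  # all relevant docs at ranks 1..n
    return 1.0 - (mean_rank - best_mean_rank) / n_max


if __name__ == "__main__":
    print(pres([1, 2, 3], n_relevant=3, n_max=100))  # ideal ranking
    print(pres([], n_relevant=3, n_max=100))         # nothing found in the cutoff
    print(pres([1, 50], n_relevant=2, n_max=100))    # one late hit
```

Under these assumptions the score is 1 when every relevant document sits at the top of the ranking and 0 when none appears within the cutoff, and a relevant document found late in the list lowers the score even though recall is unchanged.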

    MIR task and evaluation techniques

    Existing tasks in MIREX have traditionally focused on low-level MIR tasks working with flat (usually DSP-only) ground-truth. These evaluation techniques, however, cannot evaluate the increasing number of algorithms that utilize relational data, and they do not currently utilize the state of the art in evaluating ranked or ordered output. This paper summarizes the state of the art in evaluating relational ground-truth. These components are then synthesized into novel evaluation techniques that are applied to 14 concrete music document retrieval tasks, demonstrating how these evaluation techniques can be applied in a practical context.

    Report of ECol Workshop Report on the First International Workshop on the Evaluation on Collaborative Information Seeking and Retrieval (ECol'2015)

    Report of the ECol Workshop @ CIKM 2015. The workshop on the evaluation of collaborative information retrieval and seeking (ECol) was held in conjunction with the 24th Conference on Information and Knowledge Management (CIKM) in Melbourne, Australia. The workshop featured three main elements. First, a keynote on the main dimensions, challenges, and opportunities in collaborative information retrieval and seeking by Chirag Shah. Second, an oral presentation session in which four papers were presented. Third, a discussion based on three seed research questions: (1) In what ways is collaborative search evaluation more challenging than individual interactive information retrieval (IIIR) evaluation? (2) Would it be possible and/or useful to standardise experimental designs and data for collaborative search evaluation? and (3) For evaluating collaborative search, can we leverage ideas from other tasks such as diversified search, subtopic mining and/or e-discovery? The discussion was intense and raised many points and issues, leading to the proposition that a new evaluation track focused on collaborative information retrieval/seeking tasks would be worthwhile.

    Evaluating Focused Retrieval Tasks

    Focused retrieval, identified by question answering, passage retrieval, and XML element retrieval, is becoming increasingly important within the broad task of information retrieval. In this paper, we present a taxonomy of text retrieval tasks based on the structure of the answers required by a task. Of particular importance are the in context tasks of focused retrieval, where not only relevant documents should be retrieved but also relevant information within each document should be correctly identified. Answers containing relevant information could be, for example, best entry points, or non-overlapping passages or elements. Our main research question is: How should the effectiveness of focused retrieval be evaluated? We propose an evaluation framework where different aspects of the in context focused retrieval tasks can be consistently evaluated and compared, and use fidelity tests on simulated runs to show what is measured. Results from our fidelity experiments demonstrate the usefulness of the proposed evaluation framework, and show its ability to measure different aspects and model different evaluation assumptions of focused retrieval.

    Overview of the ImageCLEF 2016 Medical Task

    ImageCLEF is the image retrieval task of the Conference and Labs of the Evaluation Forum (CLEF). ImageCLEF has historically focused on the multimodal and language-independent retrieval of images. Many tasks are related to image classification and the annotation of image data as well. The medical task focused more on image retrieval in the beginning and then on retrieval and classification tasks in subsequent years. In 2016 a main focus was the creation of metadata for a collection of medical images taken from articles of the biomedical scientific literature. In total, 8 teams participated in the four tasks and 69 runs were submitted. No team participated in the caption prediction task, a totally new task. Deep learning has now been used for several of the ImageCLEF tasks and by many of the participants, obtaining very good results. A majority of runs were submitted using deep learning, and this follows general trends in machine learning. In several of the tasks, multimodal approaches clearly led to the best results.

    Highly focused document retrieval in aerospace engineering : user interaction design and evaluation

    Purpose – This paper seeks to describe the preliminary studies (on both users and data), the design and evaluation of the K-Search system for searching legacy documents in aerospace engineering. Real-world reports of jet engine maintenance challenge the current indexing practice, while real users’ tasks require retrieving the information in the proper context. K-Search is currently in use in Rolls-Royce plc and has evolved to include other tools for knowledge capture and management. Design/methodology/approach – Semantic Web techniques have been used to automatically extract information from the reports while maintaining the original context, allowing a more focused retrieval than with more traditional techniques. The paper combines semantic search with classical information retrieval to increase search effectiveness. An innovative user interface has been designed to take advantage of this hybrid search technique. The interface is designed to allow a flexible and personal approach to searching legacy data. Findings – The user evaluation showed that the system is effective and well received by users. It also shows that different people look at the same data in different ways and make different use of the same system depending on their individual needs, influenced by their job profile and personal attitude. Research limitations/implications – This study focuses on a specific case of an enterprise working in aerospace engineering. Although the findings are likely to be shared with other engineering domains (e.g. mechanical, electronic), the study does not expand the evaluation to different settings. Originality/value – The study shows how real context of use can provide new and unexpected challenges to researchers and how effective solutions can then be adopted and used in organizations.

    Characterizing Health-Related Information Needs of Domain Experts (regular paper)

    In the information retrieval literature, understanding the users' intents behind queries is critically important to gain better insight into how to select relevant results. While many studies have investigated how users in general carry out exploratory health searches in digital environments, few have focused on how queries are formulated, specifically by domain-expert users. This study intends to fill this gap by studying 173 health expert queries issued from 3 medical information retrieval tasks within 2 different evaluation campaigns. A statistical analysis has been carried out to study both the variation and the correlation of health-query attributes such as length, clarity and specificity of either clinical or non-clinical queries. The knowledge gained from the study has an immediate impact on the design of future health information seeking systems.

    A Deep Relevance Matching Model for Ad-hoc Retrieval

    In recent years, deep neural networks have led to exciting breakthroughs in speech recognition, computer vision, and natural language processing (NLP) tasks. However, there have been few positive results of deep models on ad-hoc retrieval tasks. This is partially due to the fact that many important characteristics of the ad-hoc retrieval task have not yet been well addressed in deep models. Typically, in existing work using deep models, the ad-hoc retrieval task is formalized as a matching problem between two pieces of text and treated as equivalent to many NLP tasks such as paraphrase identification, question answering and automatic conversation. However, we argue that the ad-hoc retrieval task is mainly about relevance matching while most NLP matching tasks concern semantic matching, and there are some fundamental differences between these two matching tasks. Successful relevance matching requires proper handling of the exact matching signals, query term importance, and diverse matching requirements. In this paper, we propose a novel deep relevance matching model (DRMM) for ad-hoc retrieval. Specifically, our model employs a joint deep architecture at the query term level for relevance matching. By using matching histogram mapping, a feed forward matching network, and a term gating network, we can effectively deal with the three relevance matching factors mentioned above. Experimental results on two representative benchmark collections show that our model can significantly outperform some well-known retrieval models as well as state-of-the-art deep matching models. Comment: CIKM 2016, long paper
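The abstract names matching histogram mapping without describing it. As a minimal sketch of the general idea, and not the authors' implementation, one can compute the cosine similarity between a single query term embedding and every document term embedding, then count how many similarities fall into fixed-width bins over [-1, 1), with a separate final bin reserved for exact matches. The function names, the bin count of 5, the raw-count output and the toy 2-dimensional vectors are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def matching_histogram(query_vec, doc_vecs, n_bins=5):
    """Count-based matching histogram for one query term (illustrative sketch).

    The first n_bins - 1 bins split [-1, 1) into equal widths; the last
    bin holds exact matches (similarity of 1), so that exact-match
    signals stay separate from merely similar terms.
    """
    if n_bins < 2:
        raise ValueError("n_bins must be at least 2")
    hist = [0] * n_bins
    width = 2.0 / (n_bins - 1)
    for d in doc_vecs:
        s = cosine(query_vec, d)
        if s >= 1.0 - 1e-9:
            hist[-1] += 1  # exact-match bin
        else:
            idx = int((s + 1.0) / width)
            hist[max(0, min(idx, n_bins - 2))] += 1
    return hist


if __name__ == "__main__":
    query = [1.0, 0.0]
    doc_terms = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [1.0, 1.0]]
    print(matching_histogram(query, doc_terms))
```

In a full model along the lines the abstract describes, one such histogram per query term would be fed to a feed forward network, and a gating network would weight the per-term scores; those parts are omitted here.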