
    An Axiomatic Analysis of Diversity Evaluation Metrics: Introducing the Rank-Biased Utility Metric

    Many evaluation metrics have been defined to evaluate the effectiveness of ad-hoc retrieval and search result diversification systems. However, it is often unclear which evaluation metric should be used to analyze the performance of retrieval systems for a specific task. Axiomatic analysis is an informative mechanism for understanding the fundamentals of metrics and their suitability for particular scenarios. In this paper, we define a constraint-based axiomatic framework to study the suitability of existing metrics in search result diversification scenarios. The analysis informed the definition of Rank-Biased Utility (RBU) -- an adaptation of the well-known Rank-Biased Precision metric -- that takes into account redundancy and the user effort associated with the inspection of documents in the ranking. Our experiments over standard diversity evaluation campaigns show that the proposed metric captures quality criteria reflected by different metrics, making it suitable in the absence of knowledge about particular features of the scenario under study. Comment: Original version: 10 pages. Preprint of full paper to appear at SIGIR'18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, July 8-12, 2018, Ann Arbor, MI, USA. ACM, New York, NY, USA.
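
    As a rough illustration of the metric family the paper builds on, the sketch below computes Rank-Biased Precision over a ranked list of relevance gains; RBU itself additionally discounts redundancy and models the effort of inspecting documents, which is not reproduced here. The persistence value and the example ranking are illustrative assumptions, not values from the paper.

```python
def rank_biased_precision(gains, p=0.8):
    """Rank-Biased Precision (Moffat & Zobel): expected rate of gain for a
    user who moves from one result to the next with persistence p.
    `gains` holds per-rank relevance values in [0, 1]; p is illustrative."""
    return (1 - p) * sum(g * p ** rank for rank, g in enumerate(gains))

# Example: a ranking with relevant documents at ranks 1, 2 and 5.
print(rank_biased_precision([1, 1, 0, 0, 1], p=0.8))  # ~0.44
```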

    Search as learning (SAL) workshop 2016

    The "Search as Learning" (SAL) workshop is focused on an area within the information retrieval fi

    Validating simulated interaction for retrieval evaluation

    A searcher’s interaction with a retrieval system consists of actions such as query formulation, search result list interaction and document interaction. The simulation of searcher interaction has recently gained momentum in the analysis and evaluation of interactive information retrieval (IIR). However, a key issue that has not yet been adequately addressed is the validity of such IIR simulations and whether they reliably predict the performance obtained by a searcher across the session. The aim of this paper is to determine the validity of the common interaction model (CIM) typically used for simulating multi-query sessions. We focus on search result interactions, i.e., inspecting snippets, examining documents and deciding when to stop examining the results of a single query, or when to stop the whole session. To this end, we run a series of simulations grounded in real-world behavioral data to show how accurate and responsive the model is to the various experimental conditions under which the data were produced. We then validate on a second real-world data set derived under similar experimental conditions. We seek to predict cumulated gain across the session. We find that the interaction model with a query-level stopping strategy based on consecutive non-relevant snippets leads to the highest prediction accuracy and the lowest deviation from the ground truth, around 9 to 15% depending on the experimental conditions. To our knowledge, the present study is the first validation effort for the CIM showing that the model’s acceptance and use within IIR evaluations is justified. We also identify and discuss ways to further improve the CIM and its behavioral parameters for more accurate simulations.
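
    To make the stopping behaviour concrete, here is a minimal sketch of a query-level stopping rule of the kind described above: a simulated searcher scans snippets top-down, reads the documents behind relevant-looking snippets, and abandons the result list after a fixed number of consecutive non-relevant snippets. The function names, patience threshold and gain model are illustrative assumptions rather than the paper's fitted CIM parameters.

```python
def simulate_query_scan(snippet_relevant, doc_gain, patience=3):
    """Scan a single result list and return the gain accumulated before the
    stopping rule fires. `snippet_relevant[i]` says whether snippet i looks
    relevant; `doc_gain[i]` is the gain obtained from reading document i."""
    gain = 0.0
    consecutive_nonrel = 0
    for looks_relevant, g in zip(snippet_relevant, doc_gain):
        if looks_relevant:
            gain += g               # click through and read the document
            consecutive_nonrel = 0  # reset the run of non-relevant snippets
        else:
            consecutive_nonrel += 1
            if consecutive_nonrel >= patience:
                break               # stop examining this query's results
    return gain

# Example: the searcher gives up after three non-relevant snippets in a row.
print(simulate_query_scan([True, False, False, False, True],
                          [1.0, 0.0, 0.0, 0.0, 1.0]))  # 1.0
```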

    Report on the First International Workshop on the Evaluation of Collaborative Information Seeking and Retrieval (ECol'2015)

    Report of the ECol Workshop @ CIKM 2015. The workshop on the evaluation of collaborative information retrieval and seeking (ECol) was held in conjunction with the 24th Conference on Information and Knowledge Management (CIKM) in Melbourne, Australia. The workshop featured three main elements. First, a keynote by Chirag Shah on the main dimensions, challenges, and opportunities in collaborative information retrieval and seeking. Second, an oral presentation session in which four papers were presented. Third, a discussion based on three seed research questions: (1) In what ways is collaborative search evaluation more challenging than individual interactive information retrieval (IIIR) evaluation? (2) Would it be possible and/or useful to standardise experimental designs and data for collaborative search evaluation? and (3) For evaluating collaborative search, can we leverage ideas from other tasks such as diversified search, subtopic mining and/or e-discovery? The discussion was intense and raised many points and issues, leading to the proposition that a new evaluation track focused on collaborative information retrieval/seeking tasks would be worthwhile.

    A topical approach to retrievability bias estimation

    Retrievability is an independent evaluation measure that offers insights into an aspect of retrieval systems that performance and efficiency measures do not capture. Retrievability is often used to calculate the retrievability bias, an indication of how accessible a system makes all the documents in a collection. Generally, computing the retrievability bias of a system requires a colossal number of queries to be issued in order to obtain an accurate estimate of the bias. However, it is often the case that the accuracy of the estimate is not what matters, but rather the relationship between the estimate of bias and performance when tuning a system's parameters. As such, reaching a stable estimate of bias for the system is more important than obtaining very accurate retrievability scores for individual documents. This work explores the idea of using topical subsets of the collection for query generation and bias estimation to form a local estimate of bias that correlates with the global estimate of retrievability bias. By using topical subsets, it would be possible to reduce the volume of queries required to reach an accurate estimate of retrievability bias, reducing the time and resources required to perform a retrievability analysis. Findings suggest that this is a viable approach to estimating retrievability bias and that the number of queries required can be reduced to less than a quarter of what was previously thought necessary.
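
    As context for how such an analysis is typically set up, the sketch below computes cumulative retrievability scores from a set of per-query rankings and summarises their skew with a Gini coefficient, the statistic commonly used to report retrievability bias. The input format, cutoff and helper names are illustrative assumptions, not the paper's exact experimental setup.

```python
import numpy as np

def retrievability_scores(ranked_lists, collection, cutoff=100):
    """r(d): number of issued queries for which document d appears in the
    top `cutoff` results. Documents never retrieved keep a score of zero."""
    scores = {doc_id: 0 for doc_id in collection}
    for ranking in ranked_lists:
        for doc_id in ranking[:cutoff]:
            scores[doc_id] += 1
    return scores

def gini(values):
    """Gini coefficient over retrievability scores: 0 means every document is
    equally retrievable, values near 1 indicate a strongly biased system."""
    x = np.sort(np.asarray(list(values), dtype=float))
    n = x.size
    cumulative = np.cumsum(x)
    return (n + 1 - 2 * cumulative.sum() / cumulative[-1]) / n

# Hypothetical usage: estimate bias from per-query result lists.
# ranked_lists = [run_query(q) for q in generated_queries]
# print(gini(retrievability_scores(ranked_lists, collection_doc_ids).values()))
```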

    Assessing learning outcomes in web searching: A comparison of tasks and query strategies

    Users make frequent use of Web search for learning-related tasks, but little is known about how different Web search interaction strategies affect outcomes for learning-oriented tasks, or what implicit or explicit indicators could reliably be used to assess search-related learning on the Web. We describe a lab-based user study in which we investigated potential indicators of learning in web searching, effective query strategies for learning, and the relationship between search behavior and learning outcomes. Using questionnaires, analysis of written responses to knowledge prompts, and search log data, we found that searchers’ perceived learning outcomes closely matched their actual learning outcomes; that the amount searchers wrote in post-search questionnaire responses was highly correlated with their cognitive learning scores; and that the time searchers spent per document while searching was also highly and consistently correlated with higher-level cognitive learning scores. We also found that of the three query interaction conditions we applied, an intrinsically diverse presentation of results was associated with the highest percentage of users achieving combined factual and conceptual knowledge gains. Our study provides deeper insight into which aspects of search interaction are most effective for supporting superior learning outcomes, and the difficult problem of how learning may be assessed effectively during Web search.

    Two scrolls or one click: a cost model for browsing search results

    Modeling how people interact with search interfaces has been of particular interest and importance to the field of Interactive Information Retrieval. Recently, there has been a move towards developing formal models of the interaction between the user and the system, whether to run a simulation, conduct an economic analysis, measure system performance, or simply to better understand the interactions. In this paper, we present a cost model that characterizes a user examining search results. The model shows under what conditions the interface should be more scroll based or more click based, and provides ways to estimate the number of results per page based on the size of the screen and the various interaction costs. Further extensions could easily be incorporated into the model to capture different types of browsing and other costs.
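
    To illustrate the kind of trade-off such a model captures, the sketch below assigns unit costs to assessing a result, scrolling to the next screenful and clicking to the next page, and returns the total cost of examining the first n results; comparing a large page size (scroll-heavy) with a small one (click-heavy) then shows which design is cheaper for a given set of costs. All cost values and parameter names are illustrative placeholders, not the estimates from the paper.

```python
import math

def browsing_cost(n_results, results_per_page, results_per_screen,
                  cost_assess=1.0, cost_scroll=0.5, cost_click=2.0):
    """Total cost of examining the first n_results: one assessment per result,
    one scroll per additional screenful within a page, one click per
    additional page. Costs are illustrative, not fitted values."""
    pages = math.ceil(n_results / results_per_page)
    scrolls = 0
    for page in range(pages):
        on_page = min(results_per_page, n_results - page * results_per_page)
        scrolls += math.ceil(on_page / results_per_screen) - 1
    clicks = pages - 1
    return n_results * cost_assess + scrolls * cost_scroll + clicks * cost_click

# Scroll-based design (50 results per page) vs. click-based design (10 per page):
print(browsing_cost(30, results_per_page=50, results_per_screen=10))  # 30 + 2*0.5 = 31.0
print(browsing_cost(30, results_per_page=10, results_per_screen=10))  # 30 + 2*2.0 = 34.0
```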