20,583 research outputs found

    Cognitive biases in search : a review and reflection of cognitive biases in Information Retrieval

    Get PDF
    People are susceptible to an array of cognitive biases, which can result in systematic errors and deviations from rational decision making. Over the past decade, an increasing amount of attention has been paid towards investigating how cognitive biases influence information seeking and retrieval behaviours and outcomes. In particular, how such biases may negatively affect decisions because, for example, searchers may seek confirmatory but incorrect infor- mation or anchor on an initial search result even if its incorrect. In this perspectives paper, we aim to: (1) bring together and catalogue the emerging work on cognitive biases in the field of Information Retrieval; and (2) provide a critical review and reflection on these studies and subsequent findings. During our analysis we report on over thirty studies, that empirically examined cognitive biases in search, providing over forty key findings related to different domains (e.g. health, web, socio-political) and different parts of the search process (e.g. querying, assessing, judging, etc.). Our reflection highlights the importance of this research area, and critically discusses the limitations, difficulties and challenges when investigating this phenomena along with presenting open questions and future directions in researching the impact — both positive and negative — of cognitive biases in Information Retrieval

    Active Sampling for Large-scale Information Retrieval Evaluation

    Get PDF
    Evaluation is crucial in Information Retrieval. The development of models, tools and methods has significantly benefited from the availability of reusable test collections formed through a standardized and thoroughly tested methodology, known as the Cranfield paradigm. Constructing these collections requires obtaining relevance judgments for a pool of documents, retrieved by systems participating in an evaluation task; thus involves immense human labor. To alleviate this effort different methods for constructing collections have been proposed in the literature, falling under two broad categories: (a) sampling, and (b) active selection of documents. The former devises a smart sampling strategy by choosing only a subset of documents to be assessed and inferring evaluation measure on the basis of the obtained sample; the sampling distribution is being fixed at the beginning of the process. The latter recognizes that systems contributing documents to be judged vary in quality, and actively selects documents from good systems. The quality of systems is measured every time a new document is being judged. In this paper we seek to solve the problem of large-scale retrieval evaluation combining the two approaches. We devise an active sampling method that avoids the bias of the active selection methods towards good systems, and at the same time reduces the variance of the current sampling approaches by placing a distribution over systems, which varies as judgments become available. We validate the proposed method using TREC data and demonstrate the advantages of this new method compared to past approaches

    Collaborative video searching on a tabletop

    Get PDF
    Almost all system and application design for multimedia systems is based around a single user working in isolation to perform some task yet much of the work for which we use computers to help us, is based on working collaboratively with colleagues. Groupware systems do support user collaboration but typically this is supported through software and users still physically work independently. Tabletop systems, such as the DiamondTouch from MERL, are interface devices which support direct user collaboration on a tabletop. When a tabletop is used as the interface for a multimedia system, such as a video search system, then this kind of direct collaboration raises many questions for system design. In this paper we present a tabletop system for supporting a pair of users in a video search task and we evaluate the system not only in terms of search performance but also in terms of user–user interaction and how different user personalities within each pair of searchers impacts search performance and user interaction. Incorporating the user into the system evaluation as we have done here reveals several interesting results and has important ramifications for the design of a multimedia search system

    Surprisingly Rational: Probability theory plus noise explains biases in judgment

    Full text link
    The systematic biases seen in people's probability judgments are typically taken as evidence that people do not reason about probability using the rules of probability theory, but instead use heuristics which sometimes yield reasonable judgments and sometimes systematic biases. This view has had a major impact in economics, law, medicine, and other fields; indeed, the idea that people cannot reason with probabilities has become a widespread truism. We present a simple alternative to this view, where people reason about probability according to probability theory but are subject to random variation or noise in the reasoning process. In this account the effect of noise is cancelled for some probabilistic expressions: analysing data from two experiments we find that, for these expressions, people's probability judgments are strikingly close to those required by probability theory. For other expressions this account produces systematic deviations in probability estimates. These deviations explain four reliable biases in human probabilistic reasoning (conservatism, subadditivity, conjunction and disjunction fallacies). These results suggest that people's probability judgments embody the rules of probability theory, and that biases in those judgments are due to the effects of random noise.Comment: 64 pages. Final preprint version. In press, Psychological Revie

    Probabilistic biases meet the Bayesian brain

    Get PDF
    Bayesian cognitive science sees the mind as a spectacular probabilistic inference machine. But Judgment and Decision Making research has spent half a century uncovering how dramatically and systematically people depart from rational norms. This paper outlines recent research that opens up the possibility of an unexpected reconciliation. The key hypothesis is that the brain neither represents nor calculates with probabilities; but approximates probabilistic calculations through drawing samples from memory or mental simulation. Sampling models diverge from perfect probabilistic calculations in ways that capture many classic JDM findings, and offers the hope of an integrated explanation of classic heuristics and biases, including availability, representativeness, and anchoring and adjustment

    Radio Oranje: Enhanced Access to a Historical Spoken Word Collection

    Get PDF
    Access to historical audio collections is typically very restricted:\ud content is often only available on physical (analog) media and the\ud metadata is usually limited to keywords, giving access at the level\ud of relatively large fragments, e.g., an entire tape. Many spoken\ud word heritage collections are now being digitized, which allows the\ud introduction of more advanced search technology. This paper presents\ud an approach that supports online access and search for recordings of\ud historical speeches. A demonstrator has been built, based on the\ud so-called Radio Oranje collection, which contains radio speeches by\ud the Dutch Queen Wilhelmina that were broadcast during World War II.\ud The audio has been aligned with its original 1940s manual\ud transcriptions to create a time-stamped index that enables the speeches to be\ud searched at the word level. Results are presented together with\ud related photos from an external database

    Validating simulated interaction for retrieval evaluation

    Get PDF
    A searcher’s interaction with a retrieval system consists of actions such as query formulation, search result list interaction and document interaction. The simulation of searcher interaction has recently gained momentum in the analysis and evaluation of interactive information retrieval (IIR). However, a key issue that has not yet been adequately addressed is the validity of such IIR simulations and whether they reliably predict the performance obtained by a searcher across the session. The aim of this paper is to determine the validity of the common interaction model (CIM) typically used for simulating multi-query sessions. We focus on search result interactions, i.e., inspecting snippets, examining documents and deciding when to stop examining the results of a single query, or when to stop the whole session. To this end, we run a series of simulations grounded by real world behavioral data to show how accurate and responsive the model is to various experimental conditions under which the data were produced. We then validate on a second real world data set derived under similar experimental conditions. We seek to predict cumulated gain across the session. We find that the interaction model with a query-level stopping strategy based on consecutive non-relevant snippets leads to the highest prediction accuracy, and lowest deviation from ground truth, around 9 to 15% depending on the experimental conditions. To our knowledge, the present study is the first validation effort of the CIM that shows that the model’s acceptance and use is justified within IIR evaluations. We also identify and discuss ways to further improve the CIM and its behavioral parameters for more accurate simulations

    The TRECVID 2007 BBC rushes summarization evaluation pilot

    Get PDF
    This paper provides an overview of a pilot evaluation of video summaries using rushes from several BBC dramatic series. It was carried out under the auspices of TRECVID. Twenty-two research teams submitted video summaries of up to 4% duration, of 42 individual rushes video files aimed at compressing out redundant and insignificant material. The output of two baseline systems built on straightforward content reduction techniques was contributed by Carnegie Mellon University as a control. Procedures for developing ground truth lists of important segments from each video were developed at Dublin City University and applied to the BBC video. At NIST each summary was judged by three humans with respect to how much of the ground truth was included, how easy the summary was to understand, and how much repeated material the summary contained. Additional objective measures included: how long it took the system to create the summary, how long it took the assessor to judge it against the ground truth, and what the summary's duration was. Assessor agreement on finding desired segments averaged 78% and results indicate that while it is difficult to exceed the performance of baselines, a few systems did

    Scoping analytical usability evaluation methods: A case study

    Get PDF
    Analytical usability evaluation methods (UEMs) can complement empirical evaluation of systems: for example, they can often be used earlier in design and can provide accounts of why users might experience difficulties, as well as what those difficulties are. However, their properties and value are only partially understood. One way to improve our understanding is by detailed comparisons using a single interface or system as a target for evaluation, but we need to look deeper than simple problem counts: we need to consider what kinds of accounts each UEM offers, and why. Here, we report on a detailed comparison of eight analytical UEMs. These eight methods were applied to it robotic arm interface, and the findings were systematically compared against video data of the arm ill use. The usability issues that were identified could be grouped into five categories: system design, user misconceptions, conceptual fit between user and system, physical issues, and contextual ones. Other possible categories such as User experience did not emerge in this particular study. With the exception of Heuristic Evaluation, which supported a range of insights, each analytical method was found to focus attention on just one or two categories of issues. Two of the three "home-grown" methods (Evaluating Multimodal Usability and Concept-based Analysis of Surface and Structural Misfits) were found to occupy particular niches in the space, whereas the third (Programmable User Modeling) did not. This approach has identified commonalities and contrasts between methods and provided accounts of why a particular method yielded the insights it did. Rather than considering measures such as problem count or thoroughness, this approach has yielded insights into the scope of each method

    Conducting process-product studies : some considerations

    Get PDF
    One of the major factors which deter mines the validity of research findings is undoubtedly the effectiveness of the decision-making process in setting up a design and a general methodology which are both rigorous and compatible with the aims of the study. Indeed, all the decisions made, be they major or minor ones, and related problems, vary from one study to the next depending largely on the nature and purpose of the exercise. The great majority of decisions required in designing a study are made in its early stages, It is usually the case, however, that in the course of a study other decisions would have to be taken. Since most decisions are interrelated, a change in one would precipitate a change in, or a reconsideration of, at least a second decision. Vis-a-vis the above, it is the purpose of this paper to discuss some of the aspects and issues which should be considered in a research study of the relationship between teaching and attainment. Although most of the following arguments and considerations would be valid for such a study at the primary or secondary level of education, these would be more true at the former level.peer-reviewe
    corecore