
    Searching and Stopping: An Analysis of Stopping Rules and Strategies

    Searching naturally involves stopping points, both at a query level (how far down the ranked list should I go?) and at a session level (how many queries should I issue?). Understanding when searchers stop has been of much interest to the community because it is fundamental to how we evaluate search behaviour and performance. Research has shown that searchers find it difficult to formalise stopping criteria, and typically resort to their intuition of what is "good enough". While various heuristics and stopping criteria have been proposed, little work has investigated how well they perform, and whether searchers actually conform to any of these rules. In this paper, we undertake the first large-scale study of stopping rules, investigating how they influence overall session performance, and which rules best match actual stopping behaviour. Our work is focused on stopping at the query level in the context of ad-hoc topic retrieval, where searchers undertake search tasks within a fixed time period. We show that stopping strategies based upon the disgust or frustration point rules, both of which capture a searcher's tolerance to non-relevance, typically (i) result in the best overall performance and (ii) provide the closest approximation to actual searcher behaviour, although a fixed-depth approach also performs remarkably well. Findings from this study have implications regarding how we build measures, and how we conduct simulations of search behaviours.
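
    The query-level rules named in this abstract can be illustrated with a minimal Python sketch. The abstract does not define the rules precisely, so the two non-relevance rules below are assumptions: one stops after a total number of non-relevant results, the other after a run of consecutive non-relevant results. Which of these corresponds to the "frustration point" and which to the "disgust" rule is not stated here, and all function names are hypothetical.

```python
from typing import Sequence


def fixed_depth(rels: Sequence[bool], depth: int) -> int:
    """Examine a fixed number of results, regardless of relevance."""
    return min(depth, len(rels))


def total_nonrelevant(rels: Sequence[bool], tolerance: int) -> int:
    """Assumed rule: stop once `tolerance` non-relevant results have been seen in total."""
    seen = 0
    for rank, rel in enumerate(rels, start=1):
        if not rel:
            seen += 1
            if seen >= tolerance:
                return rank
    return len(rels)  # tolerance never reached: the whole list is examined


def consecutive_nonrelevant(rels: Sequence[bool], tolerance: int) -> int:
    """Assumed rule: stop once `tolerance` non-relevant results appear in a row."""
    run = 0
    for rank, rel in enumerate(rels, start=1):
        run = 0 if rel else run + 1  # a relevant result resets the run
        if run >= tolerance:
            return rank
    return len(rels)


# Hypothetical relevance judgements for one ranked list (True = relevant).
ranking = [True, False, True, False, False, True, False, False, False, True]

depth_stop = fixed_depth(ranking, 5)
total_stop = total_nonrelevant(ranking, 3)
run_stop = consecutive_nonrelevant(ranking, 3)
print(depth_stop, total_stop, run_stop)
```

    Each function returns the rank at which the simulated searcher stops, so the three rules can be compared on the same list. On this example the consecutive-run rule reads deeper than the total-count rule, because relevant results keep resetting its counter.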

    Learning in the Repeated Secretary Problem

    In the classical secretary problem, one attempts to find the maximum of an unknown and unlearnable distribution through sequential search. In many real-world searches, however, distributions are not entirely unknown and can be learned through experience. To investigate learning in such a repeated secretary problem, we conduct a large-scale behavioral experiment in which people search repeatedly from fixed distributions. In contrast to prior investigations that find no evidence for learning in the classical scenario, in the repeated setting we observe substantial learning resulting in near-optimal stopping behavior. We conduct a Bayesian comparison of multiple behavioral models which shows that participants' behavior is best described by a class of threshold-based models that contains the theoretically optimal strategy. Fitting such a threshold-based model to data reveals players' estimated thresholds to be surprisingly close to the optimal thresholds after only a small number of games.
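
    As a rough illustration of the two kinds of strategy this abstract contrasts, the Python sketch below implements the classical cutoff rule (observe the first k candidates, then take the first one that beats them all) and a generic threshold rule for the case where the distribution is known. The threshold values, the uniform distribution, and all function names are illustrative assumptions; they are not the thresholds or models fitted in the study.

```python
import random
from typing import Sequence


def cutoff_rule(values: Sequence[float], k: int) -> int:
    """Classical rule: skip the first k values, then accept the first value
    that exceeds everything seen so far. Returns the chosen index."""
    best_seen = max(values[:k]) if k > 0 else float("-inf")
    for i in range(k, len(values)):
        if values[i] > best_seen:
            return i
    return len(values) - 1  # forced to take the last candidate


def threshold_rule(values: Sequence[float], thresholds: Sequence[float]) -> int:
    """Threshold rule: accept the first value that is the best so far and
    also meets the threshold for its position. Returns the chosen index."""
    best_seen = float("-inf")
    for i, v in enumerate(values):
        if v > best_seen:
            best_seen = v
            if v >= thresholds[i]:
                return i
    return len(values) - 1


def cutoff_success_rate(n: int, k: int, trials: int, seed: int = 0) -> float:
    """Fraction of simulated games in which the cutoff rule picks the maximum."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        values = [rng.random() for _ in range(n)]
        wins += values[cutoff_rule(values, k)] == max(values)
    return wins / trials


game = [0.3, 0.7, 0.5, 0.9, 0.8]
picked_by_cutoff = cutoff_rule(game, 2)
picked_by_threshold = threshold_rule(game, [0.95, 0.9, 0.85, 0.6, 0.0])
rate = cutoff_success_rate(n=20, k=7, trials=20000)
print(picked_by_cutoff, picked_by_threshold, round(rate, 3))
```

    The cutoff rule uses only rank information, which matches the classical setting, while the threshold rule relies on knowing which values count as high, which is what repeated play from a fixed distribution could let a searcher learn.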

    VIP: Incorporating Human Cognitive Biases in a Probabilistic Model of Retweeting

    Information spread in social media depends on a number of factors, including how the site displays information, how users navigate it to find items of interest, users' tastes, and the 'virality' of information, i.e., its propensity to be adopted, or retweeted, upon exposure. Probabilistic models can learn users' tastes from the history of their item adoptions and recommend new items to users. However, current models ignore cognitive biases that are known to affect behavior. Specifically, people pay more attention to items at the top of a list than those in lower positions. As a consequence, items near the top of a user's social media stream have higher visibility, and are more likely to be seen and adopted, than those appearing below. Another bias is due to the item's fitness: some items have a high propensity to spread upon exposure regardless of the interests of adopting users. We propose a probabilistic model that incorporates human cognitive biases and personal relevance in the generative model of information spread. We use the model to predict how messages containing URLs spread on Twitter. Our work shows that models of user behavior that account for cognitive factors can better describe and predict user behavior in social media. Comment: SBP 201

    Online Human-Bot Interactions: Detection, Estimation, and Characterization

    Increasing evidence suggests that a growing amount of social media content is generated by autonomous entities known as social bots. In this work we present a framework to detect such entities on Twitter. We leverage more than a thousand features extracted from public data and meta-data about users: friends, tweet content and sentiment, network patterns, and activity time series. We benchmark the classification framework by using a publicly available dataset of Twitter bots. This training data is enriched by a manually annotated collection of active Twitter users that includes both humans and bots of varying sophistication. Our models yield high accuracy and agreement with each other and can detect bots of different nature. Our estimates suggest that between 9% and 15% of active Twitter accounts are bots. Characterizing ties among accounts, we observe that simple bots tend to interact with bots that exhibit more human-like behaviors. Analysis of content flows reveals retweet and mention strategies adopted by bots to interact with different target groups. Using clustering analysis, we characterize several subclasses of accounts, including spammers, self promoters, and accounts that post content from connected applications. Comment: Accepted paper for ICWSM'17, 10 pages, 8 figures, 1 tabl

    The Relationship Between Risk Attitudes and Heuristics in Search Tasks: A Laboratory Experiment

    Experimental studies of search behavior suggest that individuals stop searching earlier than predicted by the optimal, risk-neutral stopping rule. Such behavior could be generated by two different classes of decision rules: rules that are optimal conditional on utility functions departing from risk neutrality, or heuristics derived from limited cognitive processing capacities and satisficing. To discriminate between these two possibilities, we conduct an experiment that consists of a standard search task as well as a lottery task designed to elicit utility functions. We find that search heuristics are not related to measures of risk aversion, but to measures of loss aversion.
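
    To make the idea of an optimal, risk-neutral stopping rule concrete, the Python sketch below works through a textbook case. It is not the task used in this experiment: it assumes offers drawn independently from a uniform distribution on [0, 1], a constant cost c per additional draw, and no limit on the number of draws. Under those assumptions a risk-neutral searcher accepts the first offer at or above a reservation value r, where the expected gain from one more draw, (1 - r)^2 / 2, equals c, which gives r = 1 - sqrt(2c).

```python
import math


def expected_gain(r: float) -> float:
    """Expected improvement over r from one more uniform[0, 1] draw:
    the integral of (x - r) for x from r to 1, which is (1 - r)^2 / 2."""
    return (1.0 - r) ** 2 / 2.0


def reservation_value(cost: float) -> float:
    """Reservation value for uniform[0, 1] offers and a per-draw cost
    (assumed setup; meaningful for 0 < cost <= 0.5)."""
    return 1.0 - math.sqrt(2.0 * cost)


def stop_index(offers, cost: float) -> int:
    """Index of the first offer the risk-neutral rule accepts (last if none)."""
    r = reservation_value(cost)
    for i, offer in enumerate(offers):
        if offer >= r:
            return i
    return len(offers) - 1


cost = 0.02  # hypothetical search cost
r = reservation_value(cost)
accepted = stop_index([0.55, 0.62, 0.85, 0.97], cost)
print(round(r, 3), accepted)
```

    A searcher who stops earlier than this rule predicts is in effect using a reservation value below r, which is the pattern the abstract seeks to explain.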