6,337 research outputs found

    Time-Sensitive Bayesian Information Aggregation for Crowdsourcing Systems

    Get PDF
    Crowdsourcing systems commonly face the problem of aggregating multiple judgments provided by potentially unreliable workers. In addition, several aspects of the design of efficient crowdsourcing processes, such as defining worker's bonuses, fair prices and time limits of the tasks, involve knowledge of the likely duration of the task at hand. Bringing this together, in this work we introduce a new time--sensitive Bayesian aggregation method that simultaneously estimates a task's duration and obtains reliable aggregations of crowdsourced judgments. Our method, called BCCTime, builds on the key insight that the time taken by a worker to perform a task is an important indicator of the likely quality of the produced judgment. To capture this, BCCTime uses latent variables to represent the uncertainty about the workers' completion time, the tasks' duration and the workers' accuracy. To relate the quality of a judgment to the time a worker spends on a task, our model assumes that each task is completed within a latent time window within which all workers with a propensity to genuinely attempt the labelling task (i.e., no spammers) are expected to submit their judgments. In contrast, workers with a lower propensity to valid labeling, such as spammers, bots or lazy labelers, are assumed to perform tasks considerably faster or slower than the time required by normal workers. Specifically, we use efficient message-passing Bayesian inference to learn approximate posterior probabilities of (i) the confusion matrix of each worker, (ii) the propensity to valid labeling of each worker, (iii) the unbiased duration of each task and (iv) the true label of each task. Using two real-world public datasets for entity linking tasks, we show that BCCTime produces up to 11% more accurate classifications and up to 100% more informative estimates of a task's duration compared to state-of-the-art methods

    Privacy-Friendly Mobility Analytics using Aggregate Location Data

    Get PDF
    Location data can be extremely useful to study commuting patterns and disruptions, as well as to predict real-time traffic volumes. At the same time, however, the fine-grained collection of user locations raises serious privacy concerns, as this can reveal sensitive information about the users, such as, life style, political and religious inclinations, or even identities. In this paper, we study the feasibility of crowd-sourced mobility analytics over aggregate location information: users periodically report their location, using a privacy-preserving aggregation protocol, so that the server can only recover aggregates -- i.e., how many, but not which, users are in a region at a given time. We experiment with real-world mobility datasets obtained from the Transport For London authority and the San Francisco Cabs network, and present a novel methodology based on time series modeling that is geared to forecast traffic volumes in regions of interest and to detect mobility anomalies in them. In the presence of anomalies, we also make enhanced traffic volume predictions by feeding our model with additional information from correlated regions. Finally, we present and evaluate a mobile app prototype, called Mobility Data Donors (MDD), in terms of computation, communication, and energy overhead, demonstrating the real-world deployability of our techniques.Comment: Published at ACM SIGSPATIAL 201

    Whom to Ask? Jury Selection for Decision Making Tasks on Micro-blog Services

    Full text link
    It is universal to see people obtain knowledge on micro-blog services by asking others decision making questions. In this paper, we study the Jury Selection Problem(JSP) by utilizing crowdsourcing for decision making tasks on micro-blog services. Specifically, the problem is to enroll a subset of crowd under a limited budget, whose aggregated wisdom via Majority Voting scheme has the lowest probability of drawing a wrong answer(Jury Error Rate-JER). Due to various individual error-rates of the crowd, the calculation of JER is non-trivial. Firstly, we explicitly state that JER is the probability when the number of wrong jurors is larger than half of the size of a jury. To avoid the exponentially increasing calculation of JER, we propose two efficient algorithms and an effective bounding technique. Furthermore, we study the Jury Selection Problem on two crowdsourcing models, one is for altruistic users(AltrM) and the other is for incentive-requiring users(PayM) who require extra payment when enrolled into a task. For the AltrM model, we prove the monotonicity of JER on individual error rate and propose an efficient exact algorithm for JSP. For the PayM model, we prove the NP-hardness of JSP on PayM and propose an efficient greedy-based heuristic algorithm. Finally, we conduct a series of experiments to investigate the traits of JSP, and validate the efficiency and effectiveness of our proposed algorithms on both synthetic and real micro-blog data.Comment: VLDB201

    Agent-based pedestrian modelling

    Get PDF
    When the focus of interest in geographical systems is at the very fine scale, at the level of streets and buildings for example, movement becomes central to simulations of how spatial activities are used and develop. Recent advances in computing power and the acquisition of fine scale digital data now mean that we are able to attempt to understand and predict such phenomena with the focus in spatial modelling changing to dynamic simulations of the individual and collective behaviour of individual decision-making at such scales. In this Chapter, we develop ideas about how such phenomena can be modelled showing first how randomness and geometry are all important to local movement and how ordered spatial structures emerge from such actions. We focus on developing these ideas for pedestrians showing how random walks constrained by geometry but aided by what agents can see, determine how individuals respond to locational patterns. We illustrate these ideas with three types of example: first for local scale street scenes where congestion and flocking is all important, second for coarser scale shopping centres such as malls where economic preference interferes much more with local geometry, and finally for semi-organised street festivals where management and control by police and related authorities is integral to the way crowds move
    • …
    corecore