    Human-powered Sorts and Joins

    Crowdsourcing markets like Amazon's Mechanical Turk (MTurk) make it possible to task people with small jobs, such as labeling images or looking up phone numbers, via a programmatic interface. MTurk tasks for processing datasets with humans are currently designed with significant reimplementation of common workflows and ad-hoc selection of parameters such as the price to pay per task. We describe how we have integrated crowds into a declarative workflow engine called Qurk to reduce the burden on workflow designers. In this paper, we focus on how to use humans to compare items for sorting and joining data, two of the most common operations in DBMSs. We describe our basic query interface and the user interface of the tasks we post to MTurk. We also propose a number of optimizations, including task batching, replacing pairwise comparisons with numerical ratings, and pre-filtering tables before joining them, which dramatically reduce the overall cost of running sorts and joins on the crowd. In an experiment joining two sets of images, we reduce the overall cost from $67 in a naive implementation to about $3, without substantially affecting accuracy or latency. In an end-to-end experiment, we reduced cost by a factor of 14.5. Comment: VLDB201
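    The rating-based sort optimization mentioned in the abstract can be sketched in a few lines. The sketch below is illustrative only: worker responses are simulated, and all names and parameters (item qualities, the 1-7 rating scale, `workers_per_item`) are invented for illustration, not Qurk's actual interfaces. The point it shows is the cost argument: rating each item independently costs O(n) tasks, whereas pairwise comparisons cost O(n^2).

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical items with a latent quality score (stand-ins for, e.g., images).
items = {f"img{i}": q for i, q in enumerate([0.9, 0.1, 0.5, 0.7, 0.3])}

def crowd_rating(true_quality, noise=0.15):
    """Simulate one worker rating an item on a 1-7 scale (noisy)."""
    score = true_quality + random.gauss(0, noise)
    return max(1, min(7, round(1 + 6 * score)))

def rating_sort(items, workers_per_item=5):
    """O(n) tasks total: each item is rated independently by a few workers,
    then items are ordered by mean rating (no pairwise comparisons)."""
    avg = {name: mean(crowd_rating(q) for _ in range(workers_per_item))
           for name, q in items.items()}
    return sorted(avg, key=avg.get, reverse=True)

order = rating_sort(items)
# n items cost n * workers_per_item tasks, versus roughly n*(n-1)/2 comparison
# tasks (each also replicated across workers) for a full pairwise sort.
```

The tradeoff, as the abstract notes, is that independent ratings are noisier than direct comparisons, so accuracy must be checked against the cheaper task count.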

    Political Regimes, Bureaucracy, and Scientific Productivity

    Can a scientist trust that the government will pay him or her fairly? In the science–government relationship, an incumbent may be better off not providing fair pay to public scientists. We propose a simple game-theoretic model for understanding the trust problem in the relationship between governments and scientists. The model shows that with reliable governments (democracies), bureaucratic contracts (e.g., secure tenure) are not optimal, since they have low-powered incentives (in contrast to high-powered private-sector contracts) and run against scientists' responsiveness to government demands. However, with non-reliable governments (dictatorships), bureaucratic contracts are second-best solutions because they protect scientists against the possibility of government misbehavior (i.e., ex post opportunistic defections, such as canceling research programs overnight). An empirical analysis confirms the predictions: bureaucratic contracts enhance scientific productivity under non-reliable governments (dictatorships) but hamper it under reliable governments (democracies).
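    The logic of the model can be reproduced with a toy numerical sketch. All payoff values below are invented for illustration and are not taken from the paper; the only structure borrowed from the abstract is that a high-powered contract pays off only while the government keeps its promises, while a bureaucratic contract is defection-proof but low-powered.

```python
# Toy payoffs (hypothetical numbers, for illustration only).
def expected_productivity(contract, p_defect):
    """Expected scientific productivity under a contract type, given the
    probability that the government opportunistically defects."""
    if contract == "bureaucratic":
        return 1.0                      # low-powered, but survives defection
    # high-powered: productive only while the government honors the contract
    return 2.0 * (1 - p_defect)

# Reliable governments defect rarely; non-reliable ones defect often
# (the probabilities are illustrative stand-ins for regime reliability).
for regime, p_defect in [("democracy", 0.1), ("dictatorship", 0.7)]:
    best = max(["bureaucratic", "high-powered"],
               key=lambda c: expected_productivity(c, p_defect))
    print(regime, "->", best)
```

With these stand-in numbers the sketch reproduces the abstract's prediction: high-powered contracts dominate when defection is unlikely, and bureaucratic contracts become the second-best choice when it is likely.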

    Optimization techniques for human computation-enabled data processing systems

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 119-124). Crowdsourced labor markets make it possible to recruit large numbers of people to complete small tasks that are difficult to automate on computers. These marketplaces are increasingly widely used, with projections of over $1 billion being transferred between crowd employers and crowd workers by the end of 2012. While crowdsourcing enables forms of computation that artificial intelligence has not yet achieved, it also presents crowd workflow designers with a series of challenges, including describing tasks, pricing tasks, identifying and rewarding worker quality, dealing with incorrect responses, and integrating human computation into traditional programming frameworks. In this dissertation, we explore the systems-building, operator design, and optimization challenges involved in building a crowd-powered workflow management system. We describe a system called Qurk that utilizes techniques from databases, such as declarative workflow definition, high-latency workflow execution, and query optimization, to aid crowd-powered workflow developers. We study how crowdsourcing can enhance the capabilities of traditional databases by evaluating how to implement basic database operators such as sorts and joins on datasets that could not have been processed using traditional computation frameworks. Finally, we explore the symbiotic relationship between the crowd and query optimization, enlisting crowd workers to perform selectivity estimation, a key component in optimizing complex crowd-powered workflows. by Adam Marcus. Ph.D.
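    Crowd-powered selectivity estimation, mentioned at the end of the abstract, amounts to asking workers a predicate on a random sample and extrapolating the matching fraction to the whole table. The sketch below is a minimal illustration under invented assumptions (the predicate is simulated locally; real crowd answers are noisy, and Qurk's sampling and quality control are more involved than this):

```python
import random

random.seed(1)

def crowd_says_match(item):
    """Stand-in for one crowd task: does this item satisfy the predicate?
    Simulated here; in a crowd-powered system this would be a posted task."""
    return item % 10 < 3        # hypothetical: ~30% of items match

def estimate_selectivity(table, sample_size=100):
    """Ask the crowd about a random sample and extrapolate the match rate,
    costing sample_size tasks instead of one task per row."""
    sample = random.sample(table, sample_size)
    matches = sum(crowd_says_match(x) for x in sample)
    return matches / sample_size

table = list(range(10_000))
sel = estimate_selectivity(table)
# sel should land near 0.3 for 100 tasks, versus 10,000 tasks to scan the table
```

The estimated selectivity then feeds the optimizer's choice of operator order, exactly as a sampled histogram would in a conventional DBMS.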

    Demonstration of Qurk: A Query Processor for Human Operators

    Crowdsourcing technologies such as Amazon's Mechanical Turk ("MTurk") service have exploded in popularity in recent years. These services are increasingly used for complex human-reliant data processing tasks, such as labeling a collection of images, combining two sets of images to identify people who appear in both, or extracting sentiment from a corpus of text snippets. There are several challenges in designing a workflow that filters, aggregates, sorts, and joins human-generated data sources. Currently, crowdsourcing-based workflows are hand-built, resulting in increasingly complex programs. Additionally, developers must hand-optimize tradeoffs among monetary cost, accuracy, and time to completion of results. These challenges are well-suited to a declarative query interface that lets developers describe their workflow at a high level and automatically optimizes workflow and tuning parameters. In this demonstration, we will present Qurk, a novel query system that enables human-based processing for relational databases. The audience will interact with the system to build queries and monitor their progress. The audience will also see Qurk from an MTurk user's perspective, and will complete several tasks to better understand how a query is processed.
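    The shape of such a declarative pipeline can be sketched as ordinary operators whose predicates are answered by people. Everything below is illustrative, not Qurk's actual syntax or API: `ask` stands in for posting a task (answered here by a local oracle with made-up ground truth), and `crowd_filter`/`crowd_join` are hypothetical operator names.

```python
def ask(question, *args):
    """Stand-in for posting a crowd task; here answered by a local oracle
    with invented ground truth, purely for illustration."""
    if question == "is this a photo of a male celebrity?":
        (person,) = args
        return person in {"Alan", "Bob"}        # hypothetical answers
    if question == "do these two photos show the same person?":
        a, b = args
        return a == b
    raise ValueError(f"unknown question: {question}")

def crowd_filter(rows, question):
    """Selection operator: one crowd task per row."""
    return [r for r in rows if ask(question, r)]

def crowd_join(left, right, question):
    """Naive pairwise join: |left| * |right| tasks. Batching and
    pre-filtering, as described above, cut this cost dramatically."""
    return [(a, b) for a in left for b in right if ask(question, a, b)]

males = crowd_filter(["Alan", "Bob", "Carol"],
                     "is this a photo of a male celebrity?")
pairs = crowd_join(males, ["Bob", "Carol"],
                   "do these two photos show the same person?")
```

Because the operators are declarative, the system rather than the developer can decide how to batch tasks, how many workers to assign, and in what order to apply the pre-filter before the expensive join.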

    Technology Trends: Working Life with ‘Smart Things’
