6 research outputs found

    Worker Retention, Response Quality, and Diversity in Microtask Crowdsourcing: An Experimental Investigation of the Potential for Priming Effects to Promote Project Goals

    Online microtask crowdsourcing platforms act as efficient resources for delegating small units of work, gathering data, generating ideas, and more. Members of research and business communities have incorporated crowdsourcing into problem-solving processes. When human workers contribute to a crowdsourcing task, they are subject to various stimuli as a result of task design. Inter-task priming effects - through which work is nonconsciously, yet significantly, influenced by exposure to certain stimuli - have been shown to affect microtask crowdsourcing responses in a variety of ways. Instead of simply being wary of the potential for priming effects to skew results, task administrators can apply proven priming procedures to promote project goals. In a series of three experiments conducted on Amazon’s Mechanical Turk, we investigated the effects of proposed priming treatments on worker retention, response quality, and response diversity. In our first two experiments, we studied the effect of initial response freedom on sustained worker participation and response quality. We expected that workers who were granted greater freedom in an initial response would be stimulated to complete more work and deliver higher-quality work than workers constrained in their initial response possibilities. We found no significant relationship between the initial response freedom granted to workers and the amount of optional work they completed. The degree of initial response freedom also did not have a significant impact on subsequent response quality. However, the influence of inter-task effects was evident in response tendencies for different question types. We found evidence that consistency in task structure may play a stronger role in promoting response quality than the proposed priming procedures. In our final experiment, we studied the influence of a group-level priming treatment on response diversity. Instead of varying task structure for different workers, we varied the degree of overlap in question content distributed to different workers in a group. We expected groups of workers exposed to more diverse preliminary question sets to offer greater diversity in response to a subsequent question. Although differences in response diversity were observed, no consistent trend between question content overlap and response diversity emerged. Nevertheless, combining consistent task structure with crowd-level priming procedures - to encourage diversity in inter-task effects across the crowd - offers an exciting path for future study.
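
    As a hedged illustration only (the abstract does not specify its diversity metric, so the measure and all names below are assumptions), response diversity within a worker group could be quantified as the Shannon entropy of the empirical distribution of distinct answers; higher entropy indicates a more varied set of responses.

        # Minimal sketch, assuming diversity is measured as the Shannon entropy
        # of the answer distribution within a group (an assumption, not the
        # paper's published metric).
        from collections import Counter
        from math import log2

        def response_diversity(responses):
            """Shannon entropy (bits) of the empirical response distribution."""
            counts = Counter(responses)
            total = sum(counts.values())
            return -sum((c / total) * log2(c / total) for c in counts.values())

        # Hypothetical example: two worker groups answering the same open question.
        group_a = ["blue", "blue", "blue", "green"]    # low diversity
        group_b = ["blue", "green", "red", "yellow"]   # high diversity
        print(response_diversity(group_a), response_diversity(group_b))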

    The quality of data collected online: An investigation of careless responding in a crowdsourced sample

    Despite recent concerns about data quality, various academic fields increasingly rely on crowdsourced samples. The goal of this study was therefore to systematically assess carelessness in a crowdsourced sample (N = 394) by applying various measures and detection methods. A Latent Profile Analysis revealed that 45.9% of the participants showed some form of careless behavior. Excluding these participants increased the effect size in an experiment included in the survey. Based on our findings, we give several recommendations for easy-to-apply measures to assess data quality.
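
    For context, two easy-to-apply carelessness indices from the broader careless-responding literature are the longstring index (longest run of identical consecutive answers) and intra-individual response variability (the within-person standard deviation). The sketch below illustrates them; the thresholds and function names are assumptions, not the study's exact procedure.

        # Minimal sketch of two common carelessness indices; thresholds are
        # illustrative assumptions and would normally be tuned to the scale.
        import statistics

        def longstring(responses):
            """Length of the longest run of identical consecutive responses."""
            longest = run = 1
            for prev, cur in zip(responses, responses[1:]):
                run = run + 1 if cur == prev else 1
                longest = max(longest, run)
            return longest

        def irv(responses):
            """Intra-individual response variability: SD of one participant's answers."""
            return statistics.pstdev(responses)

        def flag_careless(responses, max_run=8, min_irv=0.5):
            return longstring(responses) >= max_run or irv(responses) < min_irv

        # Hypothetical participant who straight-lined most of a 12-item scale.
        print(flag_careless([4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 2, 4]))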

    Quality Control in Crowdsourcing: A Survey of Quality Attributes, Assessment Techniques and Assurance Actions

    Crowdsourcing enables one to leverage the intelligence and wisdom of potentially large groups of individuals toward solving problems. Common problems approached with crowdsourcing are labeling images, translating or transcribing text, providing opinions or ideas, and similar - all tasks that computers are not good at or where they may even fail altogether. The introduction of humans into computations and/or everyday work, however, also poses critical, novel challenges in terms of quality control, as the crowd is typically composed of people with unknown and very diverse abilities, skills, interests, personal objectives, and technological resources. This survey studies quality in the context of crowdsourcing along several dimensions, so as to define and characterize it and to understand the current state of the art. Specifically, the survey derives a quality model for crowdsourcing tasks, identifies the methods and techniques that can be used to assess the attributes of the model, and describes the actions and strategies that help prevent and mitigate quality problems. An analysis of how these features are supported by the state of the art further identifies open issues and informs an outlook on hot future research directions.
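
    As one concrete example of the assurance actions such surveys catalogue (this snippet is illustrative and not taken from the survey), tasks are often assigned redundantly to several workers and the final label is chosen by majority vote.

        # Minimal sketch of redundant assignment plus majority voting,
        # a common quality assurance action in crowdsourcing pipelines.
        from collections import Counter

        def majority_vote(labels_by_task):
            """Pick the most frequent label per task (ties broken arbitrarily)."""
            return {task: Counter(labels).most_common(1)[0][0]
                    for task, labels in labels_by_task.items()}

        # Hypothetical example: three workers label two images.
        labels = {"img_001": ["cat", "cat", "dog"], "img_002": ["dog", "dog", "dog"]}
        print(majority_vote(labels))  # {'img_001': 'cat', 'img_002': 'dog'}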

    Methods for detecting and mitigating linguistic bias in text corpora

    As the web continues to spread into all aspects of daily life, bias in the form of prejudice and hidden opinions becomes an increasingly challenging problem. A widespread manifestation is bias in text data. To counteract this, the online encyclopedia Wikipedia introduced the Neutral Point of View (NPOV) principle, which prescribes the use of neutral language and the avoidance of one-sided or subjective formulations. While studies have shown that the quality of Wikipedia articles is comparable to that of articles in classical encyclopedias, research also shows that Wikipedia is susceptible to various types of NPOV violations. Identifying bias can be a challenging task, even for humans, and with millions of articles and a declining number of contributors, this task becomes increasingly difficult. If bias is not contained, it can not only lead to polarization and conflict between opinion groups but also negatively affect users in freely forming their own opinions. In addition, bias in texts and in ground-truth data can negatively affect machine learning models trained on such data, which can lead to discriminatory model behavior. In this thesis, we address bias by focusing on three central aspects: biased content in the form of written statements, bias of crowd workers during data annotation, and bias in word embedding representations. We present two approaches for identifying biased statements in text collections such as Wikipedia. Our feature-based approach uses bag-of-words features, including a list of bias words that we compiled by identifying clusters of bias words in the vector space of word embeddings. Our improved neural approach uses gated recurrent neural networks to capture context dependencies and further improve model performance. Our study on crowd worker bias reveals biased behavior of crowd workers who hold extreme opinions on a given topic and shows that this behavior affects the resulting ground-truth labels, which in turn influences the creation of datasets for tasks such as bias identification or sentiment analysis. We present approaches for mitigating worker bias that raise awareness among workers and use the concept of social projection. Finally, we address the problem of bias in word embeddings by focusing on the example of varying sentiment scores for names. We show that bias in the training data is captured by the embeddings and propagated to downstream models. In this context, we present a debiasing approach that reduces the bias effect and has a positive impact on the labels produced by a downstream sentiment classifier.
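
    To make the feature-based approach concrete, the sketch below combines ordinary bag-of-words features with a count of hits from a small bias-word lexicon and trains a linear classifier; the lexicon, training texts, and labels are placeholder assumptions, not the thesis's actual resources.

        # Minimal sketch of a bag-of-words classifier augmented with a
        # bias-word lexicon feature; all data below is illustrative.
        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.linear_model import LogisticRegression
        from scipy.sparse import hstack, csr_matrix

        BIAS_LEXICON = {"obviously", "so-called", "notorious", "brilliant"}  # placeholder lexicon

        def lexicon_hits(texts):
            # One extra feature per text: number of tokens found in the lexicon.
            return csr_matrix([[sum(w in BIAS_LEXICON for w in t.lower().split())]
                               for t in texts])

        texts = ["The so-called expert was obviously wrong.",
                 "The committee published its report in 2019."]
        labels = [1, 0]  # 1 = biased statement, 0 = neutral (toy labels)

        vec = CountVectorizer()
        X = hstack([vec.fit_transform(texts), lexicon_hits(texts)])
        clf = LogisticRegression().fit(X, labels)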

    Novel Methods for Designing Tasks in Crowdsourcing

    Crowdsourcing is becoming more popular as a means for scalable data processing that requires human intelligence. The involvement of groups of people in accomplishing tasks can be an effective success factor for data-driven businesses. Unlike in other technical systems, the quality of the results depends on human factors and on how well crowd workers understand the requirements of the task. Looking at previous studies in this area, we found that one of the main factors affecting workers’ performance is the design of the crowdsourcing tasks. Previous studies of crowdsourcing task design covered a limited set of factors. The main contribution of this research is its focus on some of the less-studied technical factors, such as examining the effect of task ordering and class balance and measuring the consistency of the same task design over time and across different crowdsourcing platforms. Furthermore, this study extends work towards understanding workers’ point of view on task quality and payment by performing a qualitative study with crowd workers and shedding light on some of the ethical issues around payment for crowdsourcing tasks. To achieve our goal, we performed several crowdsourcing experiments on specific platforms and measured the factors that influenced the quality of the overall result.
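
    As a hedged illustration of two of the design knobs studied here, task ordering and class balance, the sketch below builds a task batch with an equal number of items per class and a randomized presentation order; the function and data names are hypothetical, not the thesis's setup.

        # Minimal sketch: sample a class-balanced batch, then shuffle its order.
        import random

        def build_batch(items_by_class, per_class, seed=42):
            rng = random.Random(seed)
            batch = []
            for label, items in items_by_class.items():
                batch.extend(rng.sample(items, per_class))  # balanced selection
            rng.shuffle(batch)                              # randomized ordering
            return batch

        # Hypothetical pool of labeled items.
        pool = {"positive": ["p1", "p2", "p3"], "negative": ["n1", "n2", "n3"]}
        print(build_batch(pool, per_class=2))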

    Human-Centered Machine Learning: Algorithm Design and Human Behavior

    Machine learning is increasingly involved in a large number of important daily decisions and has great potential to reshape various sectors of our modern society. To fully realize this potential, it is important to understand the role that humans play in the design of machine learning algorithms and to investigate the impacts of those algorithms on humans. Toward understanding such interactions between humans and algorithms, this dissertation takes a human-centric perspective and focuses on investigating the interplay between human behavior and algorithm design. Accounting for the roles of humans in algorithm design creates unique challenges. For example, humans might be strategic or exhibit behavioral biases when generating data or responding to algorithms, violating the standard independence assumption in algorithm design. How do we design algorithms that take such human behavior into account? Moreover, humans possess various ethical values; e.g., humans want to be treated fairly and care about privacy. How do we design algorithms that align with human values? My dissertation addresses these challenges by combining theoretical and empirical approaches. From the theoretical perspective, we explore how to design algorithms that account for human behavior and respect human values. In particular, we formulate models of human behavior in the data generation process and design algorithms that can leverage data with human biases. Moreover, we investigate the long-term impacts of algorithmic decisions and design algorithms that mitigate the reinforcement of existing inequalities. From the empirical perspective, we have conducted behavioral experiments to understand human behavior in the context of data generation and information design. We have further developed more realistic human models based on empirical data and studied algorithm design building on the updated behavior models.
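
    As a hedged sketch of what "leveraging data with human biases" can look like in the simplest case (this is a standard weighted-voting idea, not the dissertation's model), each worker is treated as a noisy labeler with an assumed accuracy, and binary votes are aggregated with log-odds weights.

        # Minimal sketch: aggregate binary labels from workers of unequal
        # reliability; accuracies are assumed known here for simplicity.
        from math import log

        def weighted_vote(votes, accuracy):
            """votes: worker -> label in {0, 1}; accuracy: worker -> P(correct)."""
            score = 0.0
            for worker, label in votes.items():
                p = accuracy[worker]
                weight = log(p / (1 - p))          # more reliable workers count more
                score += weight if label == 1 else -weight
            return 1 if score > 0 else 0

        # Hypothetical example: a less reliable worker is outweighed by two reliable ones.
        print(weighted_vote({"w1": 1, "w2": 1, "w3": 0},
                            {"w1": 0.9, "w2": 0.8, "w3": 0.55}))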