
    An introduction to crowdsourcing for language and multimedia technology research

    Language and multimedia technology research often relies on large, manually constructed datasets for training or evaluating algorithms and systems. Constructing these datasets is often expensive, with significant challenges in recruiting personnel to carry out the work. Crowdsourcing methods, using scalable pools of workers available on demand, offer a flexible means of rapid, low-cost construction of many of these datasets to support existing research requirements and potentially promote new research initiatives that would otherwise not be possible.

    Study to gather evidence on the working conditions of platform workers VT/2018/032 Final Report 13 December 2019

    Platform work is a type of work that uses an online platform to intermediate between platform workers, who provide services, and paying clients. Platform work appears to be growing in size and importance. This study explores platform work in the EU28, Norway and Iceland, with a focus on the challenges it presents to working conditions and social protection, and on how countries have responded through top-down actions (e.g. legislation and case law) and bottom-up actions (e.g. collective agreements, actions by platform workers or platforms). This national mapping is accompanied by a comparative assessment of selected EU legal instruments, mostly in the social area. Each instrument is assessed for personal and material scope to determine how it might bear on such challenges. Four broad legal domains relevant to platform work challenges are examined in stand-alone reflection papers. Together, the national mapping and legal analysis support a gap analysis, which aims to indicate where further action on platform work would be useful and what form such action might take.

    Creation of Reliable Relevance Judgments in Information Retrieval Systems Evaluation Experimentation through Crowdsourcing: A Review

    Test collections are used to evaluate information retrieval systems in laboratory-based evaluation experiments. In a classic setting, generating relevance judgments involves human assessors and is a costly and time-consuming task. Researchers and practitioners are still challenged to perform reliable and low-cost evaluation of retrieval systems. Crowdsourcing, as a novel method of data acquisition, is broadly used in many research fields. It has been shown that crowdsourcing is an inexpensive and quick solution as well as a reliable alternative for creating relevance judgments. One crowdsourcing application in IR is judging the relevance of query-document pairs. For a crowdsourcing experiment to succeed, the relevance judgment tasks should be designed precisely, with an emphasis on quality control. This paper explores the different factors that influence the accuracy of relevance judgments produced by workers and how to strengthen the reliability of judgments in crowdsourcing experiments.
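
    To make the quality-control point concrete, here is a minimal illustrative sketch (not taken from the paper) of one common safeguard: collecting redundant labels for each query-document pair from several workers and aggregating them by majority vote, with the agreement ratio serving as a rough reliability signal. All identifiers and data below are hypothetical.

        # Minimal sketch: majority-vote aggregation of redundant crowd labels
        # for query-document pairs, a common quality-control step.
        from collections import Counter, defaultdict

        def majority_vote(judgments):
            """judgments: iterable of (query_id, doc_id, worker_id, label).
            Returns {(query_id, doc_id): (majority_label, agreement_ratio)}."""
            votes = defaultdict(list)
            for query_id, doc_id, _worker_id, label in judgments:
                votes[(query_id, doc_id)].append(label)
            aggregated = {}
            for pair, labels in votes.items():
                label, count = Counter(labels).most_common(1)[0]
                aggregated[pair] = (label, count / len(labels))
            return aggregated

        # Example: three workers judge the same query-document pair.
        raw = [
            ("q1", "d7", "w1", "relevant"),
            ("q1", "d7", "w2", "relevant"),
            ("q1", "d7", "w3", "not_relevant"),
        ]
        print(majority_vote(raw))  # {('q1', 'd7'): ('relevant', 0.666...)}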

    Crowdsourcing Relevance: Two Studies on Assessment

    Crowdsourcing has become an alternative approach for collecting relevance judgments at large scale. In this thesis, we focus on specific aspects related to time, scale, and agreement. First, we address the time factor in gathering relevance labels: we study how much time judges need to assess documents. We conduct a series of four experiments which unexpectedly reveal that introducing time limits leads to benefits in terms of the quality of the results. Furthermore, we discuss strategies aimed at determining the right amount of time to make available to workers for relevance assessment, in order both to guarantee the high quality of the gathered results and to save valuable time and money. Then we explore the application of magnitude estimation, a psychophysical scaling technique for the measurement of sensation, to relevance assessment. We conduct a large-scale user study across 18 TREC topics, collecting more than 50,000 magnitude estimation judgments, which turn out to be overall rank-aligned with the ordinal judgments made by expert relevance assessors. We discuss the benefits, the reliability of the judgments collected, and the competitiveness in terms of assessor cost. We also report preliminary results on the agreement among judges. Often, the results of crowdsourcing experiments are affected by noise that can be ascribed to a lack of agreement among workers. This aspect should be considered, as it can affect the reliability of the gathered relevance labels as well as the overall repeatability of the experiments.
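
    As a concrete illustration of the rank-alignment check mentioned above, the sketch below (an assumption for illustration, not the thesis code) compares per-document medians of hypothetical magnitude-estimation scores with expert ordinal labels using Kendall's tau from SciPy. All data are invented.

        # Minimal sketch: are crowd magnitude estimates rank-aligned with
        # expert ordinal relevance judgments for the documents of one topic?
        from statistics import median
        from scipy.stats import kendalltau

        # Hypothetical raw magnitude estimates per document (each worker picks
        # any positive number proportional to perceived relevance) and expert
        # ordinal labels (0 = non-relevant, 1 = relevant, 2 = highly relevant).
        magnitude_estimates = {
            "d1": [5, 10, 8],
            "d2": [50, 40, 70],
            "d3": [1, 2, 1],
        }
        expert_labels = {"d1": 1, "d2": 2, "d3": 0}

        docs = sorted(magnitude_estimates)
        crowd_scores = [median(magnitude_estimates[d]) for d in docs]
        expert_scores = [expert_labels[d] for d in docs]

        tau, p_value = kendalltau(crowd_scores, expert_scores)
        print(f"Kendall's tau = {tau:.2f} (p = {p_value:.3f})")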

    A checklist to combat cognitive biases in crowdsourcing
