MTurk 101: An Introduction to Amazon Mechanical Turk for Extension Professionals
Amazon Mechanical Turk (MTurk) is an online marketplace for labor recruitment that has become a popular platform for data collection. In particular, MTurk can be a valuable tool for Extension professionals. As an example, MTurk workers can provide feedback, write reviews, or give input on a website design. In this article, we discuss the many uses of MTurk for Extension professionals and provide best practices for its use.
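A minimal sketch of what such a data-collection task looks like in code, using boto3 (the AWS SDK for Python) against the MTurk sandbox; the survey URL, reward, and worker counts below are illustrative assumptions, not values from the article:

```python
# Sketch: posting a website-feedback task to MTurk via boto3.
# HIT parameters and the survey URL are illustrative placeholders.
import boto3

mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    # Sandbox endpoint for testing; remove for the production marketplace.
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# An ExternalQuestion points workers at a survey hosted elsewhere
# (e.g., a form collecting website-design feedback).
question_xml = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.org/website-feedback-survey</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>
"""

hit = mturk.create_hit(
    Title="Give feedback on a website design",
    Description="View a website mockup and answer a short questionnaire.",
    Keywords="survey, feedback, website",
    Reward="0.50",                       # USD per assignment
    MaxAssignments=30,                   # number of distinct workers
    LifetimeInSeconds=3 * 24 * 3600,     # HIT visible for three days
    AssignmentDurationInSeconds=15 * 60,
    Question=question_xml,
)
print("HIT ID:", hit["HIT"]["HITId"])
```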
The (Statistical) Power of Mechanical Turk
In this paper, I argue for the use of Amazon Mechanical Turk (AMT) in language research. AMT is an online marketplace of paid workers who may be used as subjects, which can greatly increase the statistical power of studies quickly and with minimal funding. I will show that, despite some obvious limitations of using remote subjects, properly designed experiments completed on AMT are trustworthy, cheap, and much faster than traditional face-to-face data collection. Moreover, AMT workers can also help with data analysis, which can greatly increase the scope of research that a single researcher may carry out. This paper first presents several reasons for using online subjects, then briefly outlines how to build a survey-type experiment on AMT, and finally reviews several best practices for ensuring reliable data.
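To make the statistical-power argument concrete: the sample sizes that a two-sample design needs are easy to compute, and AMT makes them affordable. A sketch using statsmodels, with conventional (assumed) parameter values rather than figures from the paper:

```python
# Sketch of the power calculation behind the paper's argument:
# how many subjects per group does a two-sample t-test need?
# Effect size, alpha, and power are conventional defaults,
# not values taken from the paper.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.3,   # assumed small-to-medium Cohen's d
    alpha=0.05,        # significance level
    power=0.80,        # desired power
    alternative="two-sided",
)
print(f"Subjects needed per group: {n_per_group:.0f}")  # ~176
```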
Introduction to the special issue on annotated corpora
Annotated corpora are increasingly important for linguistic scholarship, science, and technology. This special issue briefly surveys the development of the field and points to challenges within the current framework of annotation using analytical categories, as well as challenges to the framework itself. It presents three articles: one concerning the evaluation of annotation quality, and two concerning French treebanks, the first dealing with the oldest treebank project for French, the French Treebank, and the second concerning the conversion of French corpora into the cross-lingual framework of Universal Dependencies, thus offering an illustration of the history of treebank development worldwide.
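Treebanks in the Universal Dependencies scheme discussed here are distributed in CoNLL-U format; as a minimal illustration, the `conllu` Python package reads them directly (the file name below is a placeholder):

```python
# Sketch: reading a Universal Dependencies treebank in CoNLL-U format
# with the `conllu` package. The file name is a placeholder.
from conllu import parse

with open("fr_gsd-ud-train.conllu", encoding="utf-8") as f:
    sentences = parse(f.read())

# Each token carries the analytical categories the issue discusses:
# word form, universal POS tag, head index, and dependency relation.
for token in sentences[0]:
    print(token["form"], token["upos"], token["head"], token["deprel"])
```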
Crowdsourcing Emotions in Music Domain
An important source of intelligence for music emotion recognition today comes from user-provided community tags about songs or artists. Recent crowdsourcing approaches, such as harvesting social tags, designing collaborative games and web services, or using Mechanical Turk, are becoming popular in the literature. They provide a cheap, quick, and efficient alternative to professional labeling of songs, which is expensive and does not scale to large datasets. In this paper we discuss the viability of various crowdsourcing instruments, providing examples from published research. We also share our own experience, illustrating the steps we followed using tags collected from Last.fm to create two music mood datasets, which we have made public. While processing the affect tags from Last.fm, we observed that they tend to be biased towards positive emotions; the resulting datasets thus contain more positive songs than negative ones.
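As a rough illustration of the tag-harvesting step the paper describes, Last.fm exposes per-track top tags through its public API; the API key and the mood keyword set below are placeholders, not the authors' actual vocabulary:

```python
# Sketch: harvesting affect tags for one track from the Last.fm API.
# LASTFM_API_KEY is a placeholder; the mood keywords are illustrative,
# not the vocabulary used for the paper's datasets.
import requests

LASTFM_API_KEY = "YOUR_API_KEY"
MOOD_KEYWORDS = {"happy", "sad", "angry", "relaxed"}  # assumed example set

resp = requests.get(
    "https://ws.audioscrobbler.com/2.0/",
    params={
        "method": "track.getTopTags",
        "artist": "Radiohead",
        "track": "Creep",
        "api_key": LASTFM_API_KEY,
        "format": "json",
    },
    timeout=10,
)
tags = resp.json()["toptags"]["tag"]

# Keep only tags that look like mood labels; the positive-emotion bias
# the paper reports would surface in counts over sets like this.
mood_tags = [(t["name"], int(t["count"])) for t in tags
             if t["name"].lower() in MOOD_KEYWORDS]
print(mood_tags)
```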
Descartes: Generating Short Descriptions of Wikipedia Articles
Wikipedia is one of the richest knowledge sources on the Web today. In order to facilitate navigating, searching, and maintaining its content, Wikipedia's guidelines state that all articles should be annotated with a so-called short description indicating the article's topic (e.g., the short description of beer is "Alcoholic drink made from fermented cereal grains"). Nonetheless, a large fraction of articles (ranging from 10.2% in Dutch to 99.7% in Kazakh) have no short description yet, with detrimental effects for millions of Wikipedia users. Motivated by this problem, we introduce the novel task of automatically generating short descriptions for Wikipedia articles and propose Descartes, a multilingual model for tackling it. Descartes integrates three sources of information to generate an article description in a target language: the text of the article in all its language versions, the already-existing descriptions (if any) of the article in other languages, and semantic type information obtained from a knowledge graph. We evaluate a Descartes model trained for handling 25 languages simultaneously, showing that it beats baselines (including a strong translation-based baseline) and performs on par with monolingual models tailored for specific languages. A human evaluation on three languages further shows that the quality of Descartes's descriptions is largely indistinguishable from that of human-written descriptions; e.g., 91.3% of our English descriptions (vs. 92.1% of human-written descriptions) pass the bar for inclusion in Wikipedia, suggesting that Descartes is ready for production, with the potential to support human editors in filling a major gap in today's Wikipedia across languages.
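For context, the short description the paper targets is the same field Wikipedia's public REST API exposes, which makes it easy to check whether a given article already has one; a minimal sketch, independent of the Descartes model itself:

```python
# Sketch: fetching an article's existing short description via
# Wikipedia's public REST API; this is the field Descartes would
# fill in when it is missing.
import requests

def short_description(title: str, lang: str = "en") -> str | None:
    url = f"https://{lang}.wikipedia.org/api/rest_v1/page/summary/{title}"
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return resp.json().get("description")  # None if no short description

print(short_description("Beer"))
# -> "Alcoholic drink made from fermented cereal grains"
```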
Five sources of bias in natural language processing
Recently, there has been increased interest in demographically grounded bias in natural language processing (NLP) applications. Much of the recent work has focused on describing bias and providing an overview of bias in a larger context. Here, we provide a simple, actionable summary of this recent work. We outline five sources where bias can occur in NLP systems: (1) the data, (2) the annotation process, (3) the input representations, (4) the models, and finally (5) the research design (or how we conceptualize our research). We explore each of these bias sources in detail in this article, including examples and links to related work, as well as potential countermeasures.
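As one concrete illustration, a standard first diagnostic for bias entering through the annotation process (source 2) is inter-annotator agreement; a minimal sketch with scikit-learn, using toy labels rather than data from the article:

```python
# Sketch: a first diagnostic for annotation-process bias via
# inter-annotator agreement (Cohen's kappa).
# The label arrays are toy illustrations, not data from the article.
from sklearn.metrics import cohen_kappa_score

annotator_a = ["toxic", "ok", "ok", "toxic", "ok", "toxic"]
annotator_b = ["toxic", "ok", "toxic", "toxic", "ok", "ok"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")  # low agreement can signal unclear
                                      # guidelines or annotator bias
```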
Characterizing the Global Crowd Workforce: A Cross-Country Comparison of Crowdworker Demographics
Micro-task crowdsourcing is an international phenomenon that has emerged during the past decade. This paper sets out to explore the characteristics of the international crowd workforce and provides a cross-national comparison of the crowd workforce in ten countries. We provide an analysis and comparison of demographic characteristics and shed light on the significance of micro-task income for workers in different countries. This study is the first large-scale country-level analysis of the characteristics of workers on the platform Figure Eight (formerly CrowdFlower), one of the two platforms dominating the micro-task market. We find large differences between the characteristics of the crowd workforces of different countries, both regarding demography and regarding the importance of micro-task income for workers. Furthermore, we find that the composition of the workforce in the ten countries was largely stable across samples taken at different points in time.
Developing and validating a methodology for crowdsourcing L2 speech ratings in Amazon Mechanical Turk
Researchers have increasingly turned to Amazon Mechanical Turk (AMT) to crowdsource speech data, predominantly in English. Although AMT and similar platforms are well positioned to enhance the state of the art in L2 research, it is unclear whether crowdsourced L2 speech ratings are reliable, particularly in languages other than English. The present study describes the development and deployment of an AMT task to crowdsource comprehensibility, fluency, and accentedness ratings for L2 Spanish speech samples. Fifty-four AMT workers who were native Spanish speakers from 11 countries participated in the ratings. Intraclass correlation coefficients were used to estimate group-level interrater reliability, and Rasch analyses were undertaken to examine individual differences in rater severity and fit. Excellent reliability was observed for the comprehensibility and fluency ratings, but indices were slightly lower for accentedness, leading to recommendations to improve the task for future data collection.
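The group-level reliability step described here corresponds to a standard intraclass correlation computation; a minimal sketch with the pingouin package, using assumed column names in long format rather than the study's actual variables:

```python
# Sketch: group-level interrater reliability via intraclass correlation
# over crowdsourced ratings in long format. The file and column names
# ("sample", "worker", "comprehensibility") are placeholders, not the
# study's actual variable names.
import pandas as pd
import pingouin as pg

ratings = pd.read_csv("amt_ratings.csv")  # placeholder: one row per
                                          # (speech sample, worker) pair

icc = pg.intraclass_corr(
    data=ratings,
    targets="sample",            # the L2 speech samples being rated
    raters="worker",             # AMT workers
    ratings="comprehensibility", # one of the three rated dimensions
)
print(icc[["Type", "ICC", "CI95%"]])
```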