1,127 research outputs found

An introduction to crowdsourcing for language and multimedia technology research

Author: A. Doan
C. Callison-Burch
C. Rashtchian
G. Paolacci
G. Pickard
J. Ross
L. Ahn von
M. Larson
O. Alonso
R. Snow
S. Novotney
T. Yan
V.C. Rayker
V.S. Sheng
W. Mason
W. Willett
W.S. Lasecki
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Language and multimedia technology research often relies on large manually constructed datasets for training or evaluation of algorithms and systems. Constructing these datasets is often expensive, with significant challenges in recruiting personnel to carry out the work. Crowdsourcing methods using scalable pools of workers available on demand offer a flexible means of rapid, low-cost construction of many of these datasets to support existing research requirements and potentially promote new research initiatives that would otherwise not be possible.

Crossref

Irish Universities

DCU Online Research Access Service

An Open System for Social Computation

Author: Adamo Angela
Aloisi Alessandra
Calzetti Daniela
Cignoni Michele
Cook David O.
Dale Daniel A.
Elmegreen Bruce G.
Elmegreen Debra M.
Gallagher John S., III
Gouliermis Dimitrios A.
Grasha Kathryn
Grebel Eva K.
Herrero Davó Artemio
Hunter Deidre A.
Johnson Kelsey E.
Kim Hwihyun
Lee Janice C.
Nair Preethi
Nota Antonella
Pellerin Anne
Ryon Jenna
Sabbi Elena
Sacchi Elena
Smith Linda J.
Thilker David
Tosi Monica
Ubeda Leonardo
Whitmore Brad
Publication venue: 'IOS Press'
Publication date: 01/01/2014

Part of the power of social computation comes from using the collective intelligence of humans to tame the aggregate uncertainty of (otherwise) low-veracity data obtained from human and automated sources. We have witnessed a surge in development of social computing systems but, ironically, there have been few attempts to generalise across this activity so that creation of the underlying mechanisms themselves can be made more social. We describe a method for achieving this by standardising patterns of social computation via lightweight formal specifications (we call these social artifacts) that can be connected to existing internet architectures via a single model of computation. Upon this framework we build a mechanism for extracting provenance meta-data across social computations.

arXiv.org e-Print Archive

Southampton (e-Prints Soton)

Crossref

Archivio della Ricerca - Università di Pisa

Caltech Authors

An Open System for Social Computation

Author: Moreau Luc
Murray-Rust David
O'Hara Kieron
Robertson David
Publication venue: 'IOS Press'
Publication date: 01/01/2014

Southampton (e-Prints Soton)

Genie: A Generator of Natural Language Semantic Parsers for Virtual Assistant Commands

Author: Alvarez-Melis David
Banarescu Laura
Chen David L
Chu Shumo
Ganitkevitch Juri
Kate Rohit J
Kingma Diederik P
Pasupat Panupong
Quirk Chris
Shetty Jitesh
Steedman Mark
Trakhtenbrot Boris A.
Wang Yushi
Wong Yuk Wah
Xu Xiaojun
Zelle John M
Zettlemoyer Luke S
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 18/04/2019

To understand diverse natural language commands, virtual assistants today are trained with numerous labor-intensive, manually annotated sentences. This paper presents a methodology and the Genie toolkit that can handle new compound commands with significantly less manual effort. We advocate formalizing the capability of virtual assistants with a Virtual Assistant Programming Language (VAPL) and using a neural semantic parser to translate natural language into VAPL code. Genie needs only a small realistic set of input sentences for validating the neural model. Developers write templates to synthesize data; Genie uses crowdsourced paraphrases and data augmentation, along with the synthesized data, to train a semantic parser. We also propose design principles that make VAPL languages amenable to natural language translation. We apply these principles to revise ThingTalk, the language used by the Almond virtual assistant. We use Genie to build the first semantic parser that can support compound virtual assistant commands with unquoted free-form parameters. Genie achieves a 62% accuracy on realistic user inputs. We demonstrate Genie's generality by showing a 19% and 31% improvement over the previous state of the art on a music skill, aggregate functions, and access control. Comment: To appear in PLDI 201

arXiv.org e-Print Archive

Crossref

Bringing Structure into Summaries: Crowdsourcing a Benchmark Corpus of Concept Maps

Author: Falke Tobias
Gurevych Iryna
Publication date: 21/07/2017

Concept maps can be used to concisely represent important information and bring structure into large document collections. Therefore, we study a variant of multi-document summarization that produces summaries in the form of concept maps. However, suitable evaluation datasets for this task are currently missing. To close this gap, we present a newly created corpus of concept maps that summarize heterogeneous collections of web documents on educational topics. It was created using a novel crowdsourcing approach that allows us to efficiently determine important elements in large document collections. We release the corpus along with a baseline system and proposed evaluation protocol to enable further research on this variant of summarization. Comment: Published at EMNLP 201

arXiv.org e-Print Archive

TUbiblio

ENHANCING USERS’ EXPERIENCE WITH SMART MOBILE TECHNOLOGY

Author: Haji Matyassin Haji Mohammad Alimin
Ministry of Education Brunei Darussalam
Publication date: 01/01/2015

The aim of this thesis is to investigate mobile guides for use with smartphones. Mobile guides have been successfully used to provide information, personalisation and navigation for the user. The researcher also wanted to ascertain how and in what ways mobile guides can enhance users' experience. This research involved designing and developing web-based applications to run on smartphones. Four studies were conducted, two of which involved testing of the particular application. The applications tested were a museum mobile guide application and a university mobile guide mapping application. Initial testing examined the prototype work for the ‘Chronology of His Majesty Sultan Haji Hassanal Bolkiah’ application. The results were used to assess the potential of using similar mobile guides in Brunei Darussalam’s museums. The second study involved testing of the ‘Kent LiveMap’ application for use at the University of Kent. Students at the university tested this mapping application, which uses crowdsourcing of information to provide live data. The results were promising and indicate that users' experience was enhanced when using the application. Overall results from testing and using the two applications that were developed as part of this thesis show that mobile guides have the potential to be implemented in Brunei Darussalam’s museums and on campus at the University of Kent. However, modifications to both applications are required to fulfil their potential and take them beyond the prototype stage in order to be fully functioning and commercially viable.

Kent Academic Repository

Modeling, enacting, and integrating custom crowdsourcing processes

Author: Casati Fabio
Daniel Florian
Kucherbaev Pavel
Tranquillini Stefano
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015

Crowdsourcing (CS) is the outsourcing of a unit of work to a crowd of people via an open call for contributions. Thanks to the availability of online CS platforms, such as Amazon Mechanical Turk or CrowdFlower, the practice has experienced tremendous growth over the past few years and demonstrated its viability in a variety of fields, such as data collection and analysis or human computation. Yet it is also increasingly struggling with the inherent limitations of these platforms: each platform has its own logic of how to crowdsource work (e.g., marketplace or contest), there is very little support for structured work (work that requires the coordination of multiple tasks), and it is hard to integrate crowdsourced tasks into state-of-the-art business process management (BPM) or information systems. We attack these three shortcomings by (1) developing a flexible CS platform (we call it Crowd Computer, or CC) that allows one to program custom CS logics for individual and structured tasks, (2) devising a BPMN-based modeling language that allows one to program CC intuitively, (3) equipping the language with a dedicated visual editor, and (4) implementing CC on top of standard BPM technology that can easily be integrated into existing software and processes. We demonstrate the effectiveness of the approach with a case study on the crowd-based mining of mashup model patterns.

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Hierarchical Multi-Label Classification of Online Vaccine Concerns

Author: Dhingra Bhuwan
Stureborg Rickard
Zhu Chloe Qinyu
Publication date: 01/02/2024

Vaccine concerns are an ever-evolving target, and can shift quickly as seen during the COVID-19 pandemic. Identifying longitudinal trends in vaccine concerns and misinformation might inform the healthcare space by helping public health efforts strategically allocate resources or information campaigns. We explore the task of detecting vaccine concerns in online discourse using large language models (LLMs) in a zero-shot setting without the need for expensive training datasets. Since real-time monitoring of online sources requires large-scale inference, we explore cost-accuracy trade-offs of different prompting strategies and offer concrete takeaways that may inform choices in system designs for current applications. An analysis of different prompting strategies reveals that classifying the concerns over multiple passes through the LLM, each consisting of a boolean question about whether the text mentions a vaccine concern or not, works the best. Our results indicate that GPT-4 can strongly outperform crowdworker accuracy when compared to ground truth annotations provided by experts on the recently introduced VaxConcerns dataset, achieving an overall F1 score of 78.7%. Comment: Published in AAAI 2024 Health Intelligence workshop

arXiv.org e-Print Archive