    Using Games to Create Language Resources: Successes and Limitations of the Approach

    One of the more novel approaches to collaboratively creating language resources in recent years is to use online games to collect and validate data. The most significant challenges collaborative systems face are how to train users with the necessary expertise and how to encourage participation on the scale required to produce high-quality data comparable with data produced by “traditional” experts. In this chapter we provide a brief overview of collaborative creation and the different approaches that have been used to create language resources, before analysing games used for this purpose. We discuss some key issues in using a gaming approach, including task design, player motivation and data quality, and compare the costs of each approach in terms of development, distribution and ongoing administration. In conclusion, we summarise the benefits and limitations of using a gaming approach to resource creation and suggest key considerations for evaluating its utility in different research scenarios.

    Work Hard, Play Hard: Collecting Acceptability Annotations through a 3D Game

    Corpus-based studies of acceptability judgements have always stimulated the interest of researchers in both theoretical and computational fields. Some approaches have focused on spontaneous judgements collected through different types of tasks, others on data annotated through crowd-sourcing platforms, and still others have relied on expert-annotated data available from the literature. The release of the CoLA corpus, a large-scale corpus of sentences extracted from linguistic handbooks as examples of acceptable/non-acceptable phenomena in English, has revived interest in the reliability of the judgements of linguistic experts vs. non-experts. Several issues are still open. In this work, we contribute to this debate by presenting a 3D video game that was used to collect acceptability judgements on Italian sentences. We analyse the resulting annotations in terms of agreement among players and by comparing them with experts' acceptability judgements. We also discuss different game settings to assess their impact on participants' motivation and engagement. The final dataset, containing 1,062 sentences selected by majority voting, is released for future research and comparisons.
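
    The released dataset is filtered by majority voting over the players' judgements. As a rough illustration of that aggregation step (the label names and example data below are invented, not taken from the paper), a minimal sketch in Python:

        from collections import Counter

        def majority_vote(judgements):
            """Return the majority label for one sentence, or None on a tie."""
            top = Counter(judgements).most_common(2)
            if len(top) == 1 or top[0][1] > top[1][1]:
                return top[0][0]
            return None  # tie: no majority, so the sentence is excluded

        # Hypothetical per-sentence player judgements
        annotations = {
            "sentence A": ["acceptable", "acceptable", "unacceptable"],
            "sentence B": ["acceptable", "unacceptable"],  # tie -> dropped
        }
        dataset = {s: label for s, js in annotations.items()
                   if (label := majority_vote(js)) is not None}
        print(dataset)  # {'sentence A': 'acceptable'}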

    Optimising crowdsourcing efficiency: Amplifying human computation with validation

    Crowdsourcing has revolutionised the way tasks can be completed, but the process is frequently inefficient, costing practitioners time and money. This research investigates whether crowdsourcing can be optimised with a validation process, as measured by four criteria: quality, cost, noise, and speed. A validation model is described, simulated and tested on real data from an online crowdsourcing game that collects data about human language. Results show that by adding an agreement-validation (or like/upvote) step, fewer annotations are required, noise and collection time are reduced, and quality may be improved.
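
    A simplified simulation of the annotate-then-validate idea (the worker-accuracy and threshold parameters below are invented for illustration; the paper's actual model and criteria are richer):

        import random

        def collect_with_validation(true_label, worker_accuracy=0.8,
                                    agreements_needed=2, seed=0):
            """Simulate annotate-then-validate collection for one item.

            One worker proposes a label; later workers only agree or disagree
            with it (a cheap like/upvote judgement). The item is accepted once
            the proposal gathers `agreements_needed` agreements; a disagreement
            discards it and triggers a fresh proposal.
            """
            rng = random.Random(seed)
            judge = lambda: true_label if rng.random() < worker_accuracy else "wrong"
            total = 0
            while True:
                proposal = judge()  # full annotation
                total += 1
                agreements = 0
                while agreements < agreements_needed:
                    total += 1  # cheap validation judgement
                    if judge() == proposal:
                        agreements += 1
                    else:
                        break  # disagreement: restart with a new proposal
                else:
                    return proposal, total  # accepted label, judgements used

        print(collect_with_validation("noun"))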

    Learning from disagreement: a survey

    Many tasks in Natural Language Processing (NLP) and Computer Vision (CV) offer evidence that humans disagree, from objective tasks such as part-of-speech tagging to more subjective tasks such as classifying an image or deciding whether a proposition follows from certain premises. While most learning in Artificial Intelligence (AI) still relies on the assumption that a single (gold) interpretation exists for each item, a growing body of research aims to develop learning methods that do not rely on this assumption. In this survey, we review the evidence for disagreements on NLP and CV tasks, focusing on tasks for which substantial datasets containing this information have been created. We discuss the most popular approaches to training models from datasets containing multiple judgments potentially in disagreement. We systematically compare these different approaches by training them with each of the available datasets, considering several ways to evaluate the resulting models. Finally, we discuss the results in depth, focusing on four key research questions, and assess how the type of evaluation and the characteristics of a dataset determine the answers to these questions. Our results suggest, first of all, that even if we abandon the assumption of a gold standard, it is still essential to reach a consensus on how to evaluate models, because the relative performance of the various training methods is critically affected by the chosen form of evaluation. Secondly, we observed a strong dataset effect: with substantial datasets providing many judgments by high-quality coders for each item, training directly with soft labels achieved better results than training from aggregated or even gold labels, under both hard and soft evaluation. When those conditions do not hold, leveraging both gold and soft labels generally achieved the best results in the hard evaluation. All datasets and models employed in this paper are freely available as supplementary materials.
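
    Training "directly with soft labels" means fitting the model to the distribution of annotator judgments for each item rather than to a single aggregated label. A minimal PyTorch sketch of such a soft cross-entropy loss (the vote counts below are invented):

        import torch
        import torch.nn.functional as F

        def soft_label_loss(logits, judgment_counts):
            """Cross-entropy against the human label distribution (soft labels).

            logits:          (batch, n_classes) raw model outputs
            judgment_counts: (batch, n_classes) annotator votes per class
            """
            soft_targets = judgment_counts / judgment_counts.sum(dim=1, keepdim=True)
            return -(soft_targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

        # Hypothetical batch: three items, five annotators, two classes
        counts = torch.tensor([[5., 0.], [3., 2.], [1., 4.]])
        logits = torch.randn(3, 2, requires_grad=True)
        soft_label_loss(logits, counts).backward()  # gradients flow as usual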

    Making the Most of Crowd Information: Learning and Evaluation in AI tasks with Disagreements.

    PhD thesis. There is plenty of evidence that humans disagree on the interpretation of many tasks in Natural Language Processing (NLP) and Computer Vision (CV), from objective tasks rooted in linguistics such as part-of-speech tagging to more subjective (observer-dependent) tasks such as classifying an image or deciding whether a proposition follows from a certain premise. While most learning in Artificial Intelligence (AI) still relies on the assumption that a single interpretation, captured by the gold label, exists for each item, a growing body of research in recent years has focused on learning methods that do not rely on this assumption and instead aim to learn ranges of truth amidst disagreement. This PhD research contributes to that field of study. Firstly, we analytically review the evidence for disagreement on NLP and CV tasks, focusing on tasks where substantial datasets with such information have been created; as part of this review, we also discuss the most popular approaches to training models from datasets containing multiple judgments, grouping these methods by how they handle disagreement. Secondly, we make three proposals for learning with disagreement: soft-loss, multi-task learning from gold and crowds, and automatic temperature-scaled soft-loss. Thirdly, we address one gap in this field of study, namely the prevalence of hard metrics for model evaluation even when the gold assumption is shown to be an idealisation, by adopting several existing metrics and proposing novel soft metrics that do not make this assumption, and by analysing the merits and assumptions of all the metrics, hard and soft. Finally, we carry out a systematic investigation of the key proposals in learning with disagreement by training them across several tasks, considering several ways to evaluate the resulting models, and assessing the conditions under which each approach is effective. This is a key contribution of this research, as work on learning with disagreement does not often test proposals across tasks, compare them with a variety of approaches, or evaluate them using both soft and hard metrics. The results obtained suggest, first of all, that it is essential to reach a consensus on how to evaluate models, because the relative performance of the various training methods is critically affected by the chosen form of evaluation. Secondly, we observed a strong dataset effect: with substantial datasets providing many judgments by high-quality coders for each item, training directly with soft labels achieved better results than training from aggregated or even gold labels, under both hard and soft evaluation. When those conditions do not hold, leveraging both gold and soft labels generally achieved the best results in the hard evaluation. All datasets and models employed in this thesis are freely available as supplementary materials.
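
    The abstract does not name the soft metrics, but cross-entropy and Jensen-Shannon divergence between the model's and the annotators' label distributions are typical choices; a sketch under that assumption:

        import torch

        def soft_metrics(pred_probs, human_probs, eps=1e-12):
            """Two soft metrics (lower is better): cross-entropy of the human
            distribution under the model, and Jensen-Shannon divergence.
            Both inputs are (n_items, n_classes) rows of probabilities."""
            ce = -(human_probs * (pred_probs + eps).log()).sum(dim=1).mean()
            kl = lambda p, q: (p * ((p + eps) / (q + eps)).log()).sum(dim=1)
            m = 0.5 * (pred_probs + human_probs)
            js = (0.5 * kl(pred_probs, m) + 0.5 * kl(human_probs, m)).mean()
            return ce.item(), js.item()

        # Hypothetical model vs. annotator distributions for three items
        pred = torch.tensor([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])
        gold = torch.tensor([[1.0, 0.0], [0.6, 0.4], [0.2, 0.8]])
        print(soft_metrics(pred, gold))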

    Anaphora resolution for Arabic machine translation: a case study of nafs

    PhD thesis. In the age of the internet, email, and social media there is an increasing need for processing online information, for example to support education and business. This has led to the rapid development of natural language processing technologies such as computational linguistics, information retrieval, and data mining. As a branch of computational linguistics, anaphora resolution has attracted much interest, reflected in the large number of papers on the topic published in journals such as Computational Linguistics. Mitkov (2002) and Ji et al. (2005) have argued that the overall quality of anaphora resolution systems remains low, despite practical advances in the area, and that major challenges include dealing with real-world knowledge and accurate parsing. This thesis investigates the following research question: can an algorithm be found for the resolution of the anaphor nafs in Arabic text which is accurate to at least 90%, scales linearly with text size, and requires a minimum of knowledge resources? A resolution algorithm intended to satisfy these criteria is proposed, and testing on a corpus of contemporary Arabic shows that it does indeed satisfy them. This research was funded by the Egyptian Government.

    The semantic transparency of English compound nouns

    What is semantic transparency, why is it important, and which factors play a role in its assessment? This work approaches these questions by investigating English compound nouns. The first part of the book gives an overview of semantic transparency in the analysis of compound nouns, discussing its role in models of morphological processing and differentiating it from related notions. After a chapter on the semantic analysis of complex nominals, it closes with a chapter on previous attempts to model semantic transparency. The second part presents new empirical work on semantic transparency, introducing two different sets of statistical models for compound transparency. In particular, two semantic factors were explored: the semantic relations holding between compound constituents, and the role of different readings of the constituents and the whole compound, operationalized in terms of meaning shifts and the distribution of specific readings across constituent families. All semantic annotations used in the book are freely available.
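
    As a purely hypothetical sketch of the kind of statistical model described, one could regress a transparency rating on the semantic relation between the constituents and a meaning-shift indicator (the relation labels, shift coding, and ratings below are invented; the book's actual models and predictors differ):

        import pandas as pd
        import statsmodels.formula.api as smf

        # Invented example data: transparency ratings for six compounds
        data = pd.DataFrame({
            "transparency": [6.5, 4.2, 2.1, 5.6, 6.9, 1.7],
            "relation":     ["FOR", "FOR", "ABOUT", "IN", "FOR", "ABOUT"],
            "head_shifted": [0, 1, 1, 0, 0, 1],  # 1 = head used in a shifted sense
        })

        # Linear model: transparency ~ semantic relation + meaning shift
        model = smf.ols("transparency ~ C(relation) + head_shifted", data=data).fit()
        print(model.params)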
