14 research outputs found
Crowdsourcing for Speech: Economic, Legal and Ethical analysis
With respect to spoken language resource production, crowdsourcing - the process of distributing tasks to an open, unspecified population via the internet - offers a wide range of opportunities: populations with specific skills are potentially instantaneously accessible somewhere on the globe for any spoken language. As is the case for most newly introduced high-tech services, crowdsourcing raises both hopes and doubts, certainties and questions. A general analysis of crowdsourcing for speech processing can be found in (Eskenazi et al., 2013). This article focuses on the ethical, legal and economic issues of crowdsourcing in general (Zittrain, 2008a) and of crowdsourcing services such as Amazon Mechanical Turk (Fort et al., 2011; Adda et al., 2011), a major platform for multilingual language resource (LR) production.
Enquête auprès d'un locuteur du Gisir (Gabon) à Lyon : réflexion sur l'origine de la variation (Fieldwork with a Gisir (Gabon) speaker in Lyon: reflections on the origin of variation) [in French]
Développement de ressources en swahili pour un système de reconnaissance automatique de la parole (Development of Swahili resources for an automatic speech recognition system) [in French]
Developments of Swahili resources for an automatic speech recognition system
Using automatic speech recognition for phonological purposes: Study of Vowel Length in Punu (Bantu B40)
Quality assessment of crowdsourcing transcriptions for African languages
We evaluate the quality of speech transcriptions acquired by crowdsourcing to develop ASR acoustic models (AM) for under-resourced languages. We developed AMs using reference (REF) transcriptions and transcriptions from crowdsourcing (TRK) for Swahili and Amharic. While the Amharic transcription took much longer than the Swahili one to complete, the speech recognition systems developed using REF and TRK transcriptions have very similar word recognition error rates (40.1 vs 39.6 for Amharic and 38.0 vs 38.5 for Swahili). Moreover, the character-level disagreement rates between REF and TRK are only 3.3% and 6.1% for Amharic and Swahili, respectively. We conclude that it is possible to acquire quality transcriptions from the crowd for under-resourced languages using Amazon's Mechanical Turk. Given this potential, we also point out some legal and ethical issues to consider. Index Terms: speech transcription, under-resourced languages, African languages, Amazon's Mechanical Turk
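The character-level disagreement rate reported above can be understood as an edit distance between the two transcriptions, normalized by the reference length. The abstract does not specify the exact formula used, so the following is only an illustrative sketch of one common definition (Levenshtein distance over reference length); the function names and example strings are invented for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of character edits (insertions, deletions,
    substitutions) turning string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def char_disagreement(ref: str, hyp: str) -> float:
    """Character-level disagreement: edit distance as a percentage
    of the reference transcription length (illustrative definition)."""
    return 100.0 * levenshtein(ref, hyp) / len(ref)

# Hypothetical example: one character dropped in a 6-character reference
# gives a disagreement rate of 1/6, i.e. about 16.7%.
rate = char_disagreement("habari", "habri")
```

Under this definition, a 3.3% disagreement over an Amharic corpus means roughly one differing character per 30 reference characters, which is consistent with the paper's conclusion that crowdsourced transcriptions are usable for acoustic model training.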
Analyse des performances de modèles de langage sub-lexicaux pour des langues peu dotées à morphologie riche (Performance analysis of sub-word language modeling for under-resourced languages with rich morphology: case study on Swahili and Amharic) [in French]