
    Natural Language Processing: Emerging Neural Approaches and Applications

    This Special Issue highlights the most recent research carried out in the NLP field and discusses related open issues, with a particular focus both on emerging approaches for language learning, understanding, production, and grounding, acquired interactively or autonomously from data in cognitive and neural systems, and on their potential or actual applications in different domains.

    Automatic Speech Recognition for Low-Resource and Morphologically Complex Languages

    The application of deep neural networks to acoustic modeling for automatic speech recognition (ASR) has produced dramatic decreases in word error rates, enabling the use of this technology in smartphones and personal home assistants for high-resource languages. Developing ASR models of this caliber, however, requires hundreds or thousands of hours of transcribed speech recordings, which presents challenges for most of the world’s languages. In this work, we investigate the applicability of three distinct architectures that have previously been used for ASR in languages with limited training resources. We tested these architectures on publicly available ASR datasets for several typologically and orthographically diverse languages, whose data were produced under a variety of conditions using different speech collection strategies, practices, and equipment. Additionally, we performed data augmentation on this audio, increasing the amount of training data nearly tenfold and synthetically creating a higher-resource training condition. We modified the architectures and their individual components and explored their parameters in search of a best-fit combination of features and modeling schemes for a given language’s morphology. Our results point to the importance of considering language-specific and corpus-specific factors, and of experimenting with multiple approaches, when developing ASR systems for resource-constrained languages.
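    As an illustrative aside, the kind of audio data augmentation mentioned above (synthetically enlarging a small corpus) can be sketched as follows; the speed factors, noise levels, and function names are assumptions for illustration, not the authors' actual pipeline.

    ```python
    # Minimal sketch of audio data augmentation for low-resource ASR:
    # speed perturbation plus additive noise to multiply the training data.
    import numpy as np

    def speed_perturb(waveform: np.ndarray, factor: float) -> np.ndarray:
        """Resample by linear interpolation to change the apparent speaking rate."""
        old_idx = np.arange(len(waveform))
        new_len = int(len(waveform) / factor)
        new_idx = np.linspace(0, len(waveform) - 1, new_len)
        return np.interp(new_idx, old_idx, waveform)

    def add_noise(waveform: np.ndarray, snr_db: float, rng: np.random.Generator) -> np.ndarray:
        """Add white noise at a target signal-to-noise ratio (in dB)."""
        signal_power = np.mean(waveform ** 2)
        noise_power = signal_power / (10 ** (snr_db / 10))
        noise = rng.normal(0.0, np.sqrt(noise_power), size=waveform.shape)
        return waveform + noise

    def augment(waveform: np.ndarray, rng: np.random.Generator) -> list[np.ndarray]:
        """Produce several synthetic variants of one utterance (illustrative factors)."""
        variants = []
        for factor in (0.9, 1.0, 1.1):      # mild speed perturbation
            for snr in (20.0, 10.0):        # two noise levels
                variants.append(add_noise(speed_perturb(waveform, factor), snr, rng))
        return variants
    ```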

    A Temporal Coherence Loss Function for Learning Unsupervised Acoustic Embeddings

    We train neural networks of varying depth with a loss function that constrains the output representations to have a temporal profile resembling that of phonemes. We show that a simple loss function which maximizes the dissimilarity between nearby frames and long-distance frames helps to construct a speech embedding that improves phoneme discriminability, both within and across speakers, even though the loss function uses only within-speaker information. However, with too deep an architecture this loss function leads to overfitting, suggesting the need for more data and/or regularization.
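    The loss described above can be sketched as a contrastive objective over frame embeddings: nearby frames are pulled together and temporally distant frames are pushed apart. The following PyTorch snippet is a minimal illustration under assumed choices of distance, margin, and near/far offsets, not the authors' exact formulation.

    ```python
    # Minimal sketch of a temporal-coherence-style loss over acoustic frame embeddings.
    import torch
    import torch.nn.functional as F

    def temporal_coherence_loss(frames: torch.Tensor,
                                encoder: torch.nn.Module,
                                near_offset: int = 1,
                                far_offset: int = 50,
                                margin: float = 1.0) -> torch.Tensor:
        """frames: (T, feat_dim) acoustic features for one utterance (T > far_offset)."""
        emb = encoder(frames)                          # (T, emb_dim)
        T = emb.size(0)
        anchor = emb[: T - far_offset]
        near = emb[near_offset : T - far_offset + near_offset]   # temporally close frames
        far = emb[far_offset :]                                   # temporally distant frames
        d_near = F.pairwise_distance(anchor, near)     # should be small
        d_far = F.pairwise_distance(anchor, far)       # should be large
        # Hinge-style contrast: penalize cases where distant frames are not at
        # least `margin` farther from the anchor than nearby frames are.
        return F.relu(d_near - d_far + margin).mean()
    ```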

    Using weighted model averaging in distributed multilingual DNNs to improve low resource ASR

    Multilingual Deep Neural Networks (DNNs) have been successfully used to leverage out-of-language data to boost the performance of low-resource ASR. However, the mismatch between auxiliary source languages and the target language can have a negative effect on acoustic modeling for the target language. A key challenge in multilingual DNNs is therefore to exploit acoustic data from multiple donor languages to improve ASR performance while mitigating the problem of language mismatch. In this paper, we propose to employ weighted model averaging in the framework of distributed multilingual DNN training, which allows the target language or similar languages to take higher weights during training and consequently shifts the parameters towards the acoustic space of the target data. Furthermore, we utilize the same strategy in the adaptation phase, where a conventional multilingual DNN is the starting point and retraining is applied using all languages with different weights. Experiments with four languages from the GlobalPhone dataset show that recognition performance improves in both scenarios. The latter, moreover, provides a low-cost and efficient methodology for multilingual DNNs.
    Sahraeian R., Van Compernolle D., "Using weighted model averaging in distributed multilingual DNNs to improve low resource ASR", Procedia Computer Science, vol. 81, pp. 152-158, 2016 (5th International Workshop on Spoken Language Technologies for Under-resourced Languages, SLTU 2016, May 9-12, 2016, Yogyakarta, Indonesia).
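    The core weighted-averaging step can be sketched as below: each language trains its own replica of the network on its data shard, and the replicas are combined with per-language weights that favour the target (or acoustically similar) languages. The weight values and the state-dict representation are illustrative assumptions, not the authors' implementation.

    ```python
    # Minimal sketch of weighted model averaging across per-language DNN replicas.
    import torch

    def weighted_average(replicas: list[dict], weights: list[float]) -> dict:
        """Average per-language parameter dicts (state_dicts) with normalized weights."""
        total = sum(weights)
        weights = [w / total for w in weights]
        averaged = {}
        for name in replicas[0]:
            averaged[name] = sum(w * replica[name] for w, replica in zip(weights, replicas))
        return averaged

    # Hypothetical weights: one replica per donor language plus the target language,
    # with the target language taking the highest weight during averaging.
    language_weights = {"target": 0.5, "donor_1": 0.2, "donor_2": 0.2, "donor_3": 0.1}
    ```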

    Parallel Speech Collection for Under-resourced Language Studies Using the Lig-Aikuma Mobile Device App

    This paper reports on our ongoing efforts to collect speech data in under-resourced or endangered languages of Africa. Data collection is carried out using an improved version of the Android application Aikuma, developed by Steven Bird and colleagues. Features were added to the app in order to facilitate the collection of parallel speech data in line with the requirements of the French-German ANR/DFG BULB (Breaking the Unwritten Language Barrier) project. The resulting app, called Lig-Aikuma, runs on various mobile phones and tablets and offers a range of speech collection modes (recording, respeaking, translation, and elicitation). Lig-Aikuma's improved features include smart generation and handling of speaker metadata as well as respeaking and parallel audio data mapping. It was used for field data collection in Congo-Brazzaville, resulting in a total of over 80 hours of speech. Design issues of the mobile app, as well as the use of Lig-Aikuma during two recording campaigns, are further described in this paper.
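    For illustration only, a minimal sketch of how parallel recordings and speaker metadata collected with such an app might be represented; the field names, example paths, and placeholder language label are hypothetical, not Lig-Aikuma's actual data model.

    ```python
    # Illustrative data model for parallel speech collection: an original recording
    # linked to its respeaking and oral translation, with speaker metadata attached.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Speaker:
        speaker_id: str
        language: str
        region: str = ""

    @dataclass
    class ParallelRecording:
        original_wav: str                        # path to the source recording
        speaker: Speaker
        respeaking_wav: Optional[str] = None     # careful re-utterance of the original
        translation_wav: Optional[str] = None    # oral translation into a pivot language
        elicitation_prompt: Optional[str] = None # prompt used in elicitation mode

    session = ParallelRecording(
        original_wav="recordings/session01_utt001.wav",
        speaker=Speaker(speaker_id="spk01", language="example-language"),
        respeaking_wav="recordings/session01_utt001_respeak.wav",
    )
    ```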