5 research outputs found

How speaker tongue and name source language affect the automatic recognition of spoken names

Author: D'Hoore Bart
Martens Jean-Pierre
Réveil Bert
Publication venue: International Speech Communication Association (ISCA)
Publication date: 01/01/2009

This paper examines, in an experimental set-up, the automatic recognition of person names and geographical names uttered by native and non-native speakers. The main aim was to improve our understanding of how well, and under which circumstances, previously proposed methods of multilingual pronunciation modeling and multilingual acoustic modeling contribute to better name recognition in a cross-lingual context. To allow a meaningful interpretation of the results, we categorized each language according to the amount of exposure a native speaker is expected to have had to it. Having interpreted the results, we also tried to estimate how much further improvement might be attainable with a more advanced pronunciation modeling technique that we plan to develop.

Ghent University Academic Bibliography

Designing a Bilingual Speech Corpus for French and German Language Learners: a Two-Step Process

Author: Andreeva Bistra
Bonneau Anne
Colotte Vincent
Fauth Camille
Fohr Dominique
Jouvet Denis
Jügler Jeanin
Laprie Yves
Mella Odile
Möbius Bernd
Trouvain Jürgen
Zimmerer Frank
Publication venue: HAL CCSD
Publication date: 26/05/2014

We present the design of a corpus of native and non-native speech for the language pair French-German, with a special emphasis on phonetic and prosodic aspects. To our knowledge, no suitable corpus, in terms of size and coverage, is currently available for the target language pair. To select the target L1-L2 interference phenomena, we prepare a small preliminary corpus (corpus1), which is analyzed for coverage and cross-checked jointly by French and German experts. Based on this analysis, target phenomena at the phonetic and phonological level are selected according to their expected degree of deviation from native performance and their frequency of occurrence. Fourteen speakers performed both L2 material (either French or German) and L1 material (either German or French). This allowed us to test the recording duration, the recording material, and the performance of our automatic aligner software. We then built corpus2, taking into account what we had learned from corpus1. The aims are the same, but we adapted the speech material to avoid overly long recording sessions. One hundred speakers will be recorded. The corpus (corpus1 and corpus2) will be prepared as a searchable database, available to the scientific community after completion of the project.

INRIA a CCSD electronic archive server

Constitution d'un Corpus de Français Langue Etrangère destiné aux Apprenants Allemands

Author: Bonneau Anne
Colotte Vincent
Fauth Camille
Fohr Dominique
Jouvet Denis
Laprie Yves
Mella Odile
Trouvain Jürgen
Publication venue: EDP Sciences
Publication date: 01/01/2014

Most language corpora focus on written linguistic phenomena and concern English (see the website "Learner corpora around the world" of the University of Louvain, Belgium). Phonetic research on L2 acquisition is generally oriented toward the study of segmental phenomena, and most studies also have English as the target language. Current models of L2 speech, for example the Speech Learning Model (Flege, 1995) or Best's Perceptual Assimilation Model (Best, 1995), very often neglect prosodic aspects. Our study concerns French as a second language and is part of a larger project, conducted in partnership with a German university, one of whose goals is the development of computer-assisted language learning and in which French and German are the target languages (ANR-DFG project, Agence Nationale de la Recherche and Deutsche Forschungsgemeinschaft, awarded to the Parole team of LORIA UMR 7503, Nancy, France, and to the Computational Linguistics and Phonetics group FR 4.7 of Saarland University, Saarbrücken, Germany). For the German-French pair, few parallel corpora are available. We present here the construction of a corpus of oral productions by native and non-native speakers for the German-French pair. Our corpus aims to bring to light the phonetic and phonological deviations that German speakers produce when learning French. This work is part of a broader project that aims to study the difficulties French speakers encounter when learning German, and vice versa. Accordingly, fifty German speakers will be recruited from university and school (upper secondary level) settings in Germany, and fifty French speakers from the same settings in France. Both populations are to produce the foreign-language corpus (in French for the German speakers and in German for the French speakers) as well as the native-language corpus (in German for the German speakers and in French for the French speakers). The corpora thus obtained should allow us to identify the difficulties that German or French speakers encounter when learning French or German. The control data are twofold, since one can refer both to the learners' productions in their native language (here German) and to those of native speakers (here German-speaking). We present here only the construction of the French corpus.

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

INRIA a CCSD electronic archive server

On using units trained on foreign data for improved multiple accent speech recognition

Author: Arslan
Denis Jouvet
Flege
Goronzy
He
Katarina Bartkova
Mokbel
Strik
Van Compernolle
Publication venue: Elsevier BV

Crossref

Where Is the Locus of Difficulty in Recognizing Foreign-Accented Words? Neighborhood Density and Phonotactic Probability Effects on the Recognition of Foreign-Accented Words by Native English Listeners

Author: Chan Kit Ying
Publication venue: Paleontological Institute at The University of Kansas
Publication date: 01/01/2012

This series of experiments (1) examined whether native listeners experience recognition difficulty with all kinds of foreign-accented words or only with a subset of words that have certain lexical and sub-lexical characteristics, namely neighborhood density and phonotactic probability; (2) identified the locus of foreign-accented word recognition difficulty; and (3) investigated how accent-induced mismatches affect the lexical retrieval process. Experiments 1 and 4 examined the recognition of native-produced and foreign-accented words varying in neighborhood density, using auditory lexical decision and perceptual identification tasks respectively, both of which emphasize the lexical level of processing. Findings from Experiment 1 revealed an increased accent-induced processing cost in reaction times, especially for words with many similar-sounding words, implying that native listeners increase their reliance on top-down lexical knowledge during foreign-accented word recognition. Analysis of perception errors from Experiment 4 found the misperceptions in the foreign-accented condition to be more similar to the target words than those in the native-produced condition. This suggests that accent-induced mismatches tend to activate similar-sounding words as alternative word candidates, which may increase lexical competition for the target word and result in greater processing costs for foreign-accented word recognition at the lexical level. Experiments 2 and 3 examined the sub-lexical processing of foreign-accented words varying in neighborhood density and phonotactic probability respectively, using a same-different matching task, which emphasizes the sub-lexical level of processing. Findings from both experiments revealed no extra processing costs, in either reaction times or accuracy rates, for the foreign-accented stimuli, implying that the sub-lexical processing of foreign-accented words is as good as that of native-produced words. Taken together, the overall recognition difficulty of foreign-accented stimuli, as well as the differentially increased processing difficulty for accented dense words (observed in Experiment 1), stems mainly from the lexical level, owing to the increased lexical competition posed by similar-sounding word candidates.

KU ScholarWorks