Automatic extraction of prosodic features for automatic language identification

Author: ANDRE-OBRECHT R.
FARINAS J.
PELLEGRINO F.
ROUAS J.-L.
Publication venue: GRETSI, Saint Martin d'Hères, France
Publication date: 01/01/2005

The aim of this study is to propose a new approach to Automatic Language Identification, based on rhythmic modelling and fundamental frequency modelling, that does not require any hand-labelled data. We first investigate how prosodic or rhythmic information can be taken into account for Automatic Language Identification. A new automatically extracted unit, the pseudo-syllable, is introduced. Rhythmic and intonational features are then automatically extracted from this unit. Elementary decision modules, one per type of feature, are defined with Gaussian mixture models. These prosodic models are combined with a more classical approach, acoustic modelling of the vowel system. Experiments are conducted on the five European languages of the MULTEXT corpus: English, French, German, Italian and Spanish. The relevance of the rhythmic parameters and the efficiency of each system (rhythmic model, fundamental frequency model and vowel system model) are evaluated. The influence of these approaches on the performance of an automatic language identification system is addressed. We obtain 91% correct identification on 21-second utterances using all the information sources.
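As an illustration of how per-feature Gaussian mixture modules might be combined, the sketch below scores an utterance under each language's mixture for every feature stream and sums the log-likelihoods. The fusion rule (a plain sum), the language labels, the stream names and all numeric parameters are assumptions made for the example; the abstract does not specify how the modules are combined.

```python
import math

def gmm_loglik(x, components):
    """Log-likelihood of a scalar x under a 1-D Gaussian mixture.

    components is a list of (weight, mean, std) tuples.
    """
    terms = []
    for w, mu, sigma in components:
        log_pdf = -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
        terms.append(math.log(w) + log_pdf)
    m = max(terms)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(t - m) for t in terms))

def utterance_score(features, components):
    """Sum of per-observation log-likelihoods (observations treated as independent)."""
    return sum(gmm_loglik(x, components) for x in features)

def identify(streams, models):
    """Pick the language whose models give the highest summed log-likelihood.

    streams: {stream_name: [feature values]}
    models:  {language: {stream_name: components}}
    Summing across streams is an assumed fusion rule, not the paper's.
    """
    scores = {
        lang: sum(utterance_score(streams[s], per_stream[s]) for s in streams)
        for lang, per_stream in models.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical toy models: two languages, two feature streams.
models = {
    "lang_A": {"rhythm": [(0.5, 0.10, 0.02), (0.5, 0.20, 0.02)],
               "f0": [(1.0, 120.0, 10.0)]},
    "lang_B": {"rhythm": [(1.0, 0.30, 0.02)],
               "f0": [(1.0, 180.0, 10.0)]},
}
# Hypothetical per-pseudo-syllable features for one utterance.
streams = {"rhythm": [0.11, 0.19, 0.12], "f0": [118.0, 125.0]}

best, scores = identify(streams, models)
print(best, scores)
```

In practice the mixtures would be trained on features extracted from each pseudo-syllable, and the streams could be weighted rather than summed equally.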

I-Revues

Reconnaissance de la parole dans le cadre de très grands vocabulaires

Author: B. JACOB
R. ANDRE-OBRECHT
Publication venue: 'EDP Sciences'
Publication date: 01/01/1994

This paper describes a new strategy for very large vocabulary speech recognition. The main problem is to narrow lexical access without pruning the correct candidate. We propose to exploit the branching structure of BDLEX and the description of each word as a root plus an inflectional ending. Moreover, we use the notion of phonetic classes to decompose the dictionary into sub-dictionaries. We develop a two-stage recognition algorithm: (1) each dictionary, considered as a sequence of phonetic classes, is modelled by an HMM whose elementary units are these phonetic classes; (2) each word is modelled by a classical HMM whose elementary unit is the pseudo-diphone. For an unknown word utterance, a first recognition pass gives the best dictionary to which it belongs; the Viterbi algorithm, applied to the network of words in that dictionary, then gives the most likely word. Experiments are carried out on a telephone speech database.
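The two-stage lexical access described above can be sketched roughly as follows. This is a simplified illustration, not the authors' system: edit distance stands in for HMM likelihoods and Viterbi decoding, and the phoneme-to-class map, the toy lexicon and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical phoneme-to-class map (V vowel, P plosive, F fricative, N nasal/liquid).
PHONE_CLASS = {"a": "V", "i": "V", "o": "V", "p": "P", "t": "P", "k": "P",
               "s": "F", "f": "F", "m": "N", "l": "N"}

def class_sequence(phones):
    """Map a phoneme sequence to its sequence of phonetic classes."""
    return tuple(PHONE_CLASS[p] for p in phones)

def build_subdictionaries(lexicon):
    """Group words that share the same phonetic-class sequence."""
    subdicts = defaultdict(dict)
    for word, phones in lexicon.items():
        subdicts[class_sequence(phones)][word] = phones
    return subdicts

def edit_distance(a, b):
    """Levenshtein distance between two sequences (stand-in for an HMM score)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def recognize(observed_phones, subdicts):
    """Stage 1: pick the closest sub-dictionary by class sequence.
    Stage 2: pick the closest word inside that sub-dictionary only."""
    obs_classes = class_sequence(observed_phones)
    best_key = min(subdicts, key=lambda k: edit_distance(obs_classes, k))
    candidates = subdicts[best_key]
    best_word = min(candidates, key=lambda w: edit_distance(observed_phones, candidates[w]))
    return best_key, best_word

# Hypothetical toy lexicon: word -> phoneme sequence.
lexicon = {"pas": list("pas"), "tas": list("tas"), "mal": list("mal"),
           "sol": list("sol"), "pil": list("pil")}
subdicts = build_subdictionaries(lexicon)
result = recognize(list("taf"), subdicts)
print(result)
```

The point of the first stage is that the second, finer-grained search only visits words in one sub-dictionary rather than the whole vocabulary.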

EDP Sciences OAI-PMH repository (1.2.0)

Scientific Publications of the University of Toulouse II Le Mirail