13 research outputs found

    Automatic extraction of prosodic features for automatic language identification

    This study proposes a new approach to automatic language identification based on rhythmic modelling and fundamental frequency modelling that requires no hand-labelled data. It first investigates how prosodic and rhythmic information can be taken into account for automatic language identification. A new automatically extracted unit, the pseudo-syllable, is introduced, and rhythmic and intonational features are then extracted automatically from this unit. Elementary decision modules are defined with Gaussian mixture models. These prosodic models are combined with a more classical approach, acoustic modelling of the vowel system. Experiments are conducted on the five European languages of the MULTEXT corpus: English, French, German, Italian and Spanish. The relevance of the rhythmic parameters and the efficiency of each system (rhythmic model, fundamental frequency model and vowel system model) are evaluated, and the influence of these approaches on the performance of the automatic language identification system is addressed. Using all the information sources, 91% correct identification is obtained on 21 s utterances.
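    As a rough illustration of the decision modules described in this abstract, the sketch below scores an utterance against one Gaussian mixture model per language and per feature stream, then picks the language with the highest total log-likelihood. The abstract does not state how the rhythm, fundamental frequency and vowel scores are combined, so the summation used here is an assumption, as are the one-dimensional features, the names `gmm_loglik` and `identify`, and all parameter values.

```python
import math

def gmm_loglik(x, components):
    """Log-likelihood of a scalar x under a 1-D Gaussian mixture.

    components is a list of (weight, mean, variance) tuples.
    """
    terms = [
        math.log(w) - 0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        for w, mu, var in components
    ]
    m = max(terms)
    # log-sum-exp keeps the sum numerically stable
    return m + math.log(sum(math.exp(t - m) for t in terms))

def identify(observations, models):
    """Return (best_language, scores).

    observations: {stream_name: [values]}
    models: {language: {stream_name: components}}
    Fusion by summing log-likelihoods over streams is an assumption.
    """
    scores = {}
    for lang, streams in models.items():
        scores[lang] = sum(
            gmm_loglik(x, streams[name])
            for name, values in observations.items()
            for x in values
        )
    return max(scores, key=scores.get), scores

# Toy, made-up models for two hypothetical languages.
MODELS = {
    "lang_a": {
        "rhythm": [(0.5, 0.10, 0.01), (0.5, 0.30, 0.01)],
        "f0": [(1.0, 120.0, 100.0)],
    },
    "lang_b": {
        "rhythm": [(1.0, 0.60, 0.01)],
        "f0": [(1.0, 200.0, 100.0)],
    },
}

OBSERVATIONS = {"rhythm": [0.12, 0.28], "f0": [125.0]}
BEST, SCORES = identify(OBSERVATIONS, MODELS)
```

    In a real system each stream would carry multi-dimensional feature vectors computed per pseudo-syllable, and the mixtures would be trained on data rather than written by hand.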

    Reconnaissance de la parole dans le cadre de très grands vocabulaires (Speech recognition with very large vocabularies)

    This paper describes a new strategy for very large vocabulary speech recognition. The main problem is to reduce the cost of lexical access without pruning the correct candidate. We propose to exploit the branching structure of BDLEX and its description of each word as a root plus an inflectional ending. Moreover, we use the notion of phonetic classes to decompose the dictionary into sub-dictionaries. We develop a two-stage recognition algorithm: (1) each sub-dictionary, regarded as a sequence of phonetic classes, is modelled by an HMM whose elementary units are these phonetic classes; (2) each word is modelled by a classical HMM whose elementary unit is the pseudo-diphone. For an unknown word utterance, a first recognition pass selects the best sub-dictionary to which it belongs; the Viterbi algorithm, applied to the network of words in that sub-dictionary, then gives the most likely word. Experiments are carried out on a telephone speech database.
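    The two-stage idea in this abstract can be sketched as follows. This is a toy illustration under several assumptions, not the paper's system: observations are phone symbols rather than acoustic frames, word models use plain phones instead of pseudo-diphones, emission and transition probabilities are fixed made-up constants, and the lexicon, the class map and the function names are hypothetical.

```python
import math

# Hypothetical phone-to-class map and lexicon (not taken from BDLEX).
PHONE_CLASS = {"p": "stop", "t": "stop", "a": "vowel", "i": "vowel",
               "s": "fric", "m": "nasal"}
LEXICON = {"pat": ["p", "a", "t"], "tip": ["t", "i", "p"],
           "sam": ["s", "a", "m"], "mis": ["m", "i", "s"]}

def viterbi_score(obs, states, p_match=0.8, p_other=0.05):
    """Best-path log score of obs through a left-to-right HMM.

    Each state either loops or advances (probability 0.5 each, an assumption).
    A state emits its own label with p_match and any other symbol with p_other.
    The path must start in the first state and end in the last one.
    """
    neg_inf = float("-inf")
    log_half = math.log(0.5)

    def emit(state, symbol):
        return math.log(p_match if state == symbol else p_other)

    v = [neg_inf] * len(states)
    v[0] = emit(states[0], obs[0])
    for symbol in obs[1:]:
        new = [neg_inf] * len(states)
        for j, state in enumerate(states):
            prev = v[j] if j == 0 else max(v[j], v[j - 1])
            if prev > neg_inf:
                new[j] = prev + log_half + emit(state, symbol)
        v = new
    return v[-1]

def build_subdicts(lexicon):
    """Group words by their sequence of phonetic classes."""
    subdicts = {}
    for word, phones in lexicon.items():
        key = tuple(PHONE_CLASS[p] for p in phones)
        subdicts.setdefault(key, []).append(word)
    return subdicts

def recognize(obs, lexicon):
    """Stage 1 picks a sub-dictionary, stage 2 picks a word inside it."""
    subdicts = build_subdicts(lexicon)
    class_obs = [PHONE_CLASS[o] for o in obs]
    best_key = max(subdicts, key=lambda k: viterbi_score(class_obs, k))
    best_word = max(subdicts[best_key],
                    key=lambda w: viterbi_score(obs, lexicon[w]))
    return best_key, best_word

RESULT = recognize(["t", "i", "i", "p"], LEXICON)
```

    The point of the first stage is that only the words of the selected sub-dictionary are scored in detail, so the expensive word-level search runs over a small part of the vocabulary.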