
2 research outputs found

Spectre de rythme et sources multiples : au cœur des contenus ethnomusicologiques et sonores

Author: Le Coz Maxime
Publication date: 10/07/2014

This thesis develops methods for automatically extracting information from sound recordings. The recordings we analyse come from the archives of the Musée de l’Homme in Paris: thousands of hours of music and interviews recorded from 1900 to the present day. We propose two types of analysis, both designed to work on music as well as speech. The first extracts the rhythm of a recording from the distribution of the stable regions of the signal, using a “rhythm spectrum”. The second tracks the most prominent frequencies and groups them into source-related clusters in order to detect whether several people or instruments are present. Among other uses, these analyses can recover the structure of a song from the number of sources, or use the rhythm present in speech to determine whether a person is speaking, narrating, reciting or chanting.

Toulouse Capitole Publications

Automatic Detection of the Prosodic Structures of Speech Utterances

Authors: Bartkova Katarina, Jouvet Denis
Publication venue: Springer Verlag
Publication date: 01/09/2013

This paper presents an automatic approach for detecting the prosodic structures of speech utterances. The algorithm relies on a hierarchical representation of the prosodic organization of the utterances. The approach is applied to a corpus of French radio broadcast news, and also to radio and TV shows, which contain more spontaneous speech. The algorithm detects prosodic boundaries whether or not they are followed by a pause. Detection of the prosodic boundaries and structures integrates little linguistic knowledge and relies mainly on the amplitude of the F0 slopes and the inversion of the slopes, as described in [1], as well as on phone durations. The automatic prosodic segmentation is then compared with a manual segmentation made by an expert phonetician. Finally, the results provide insight into the most frequently used prosodic structures in the broadcast speech style as well as in a more spontaneous speech style.

INRIA a CCSD electronic archive server