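Textual and phonetic information retrieval for checking the automatic labelling of programmes in a television stream (original title: Recherche d'information textuelle et phonétique pour le contrôle de l'étiquetage automatique d'émissions dans un flux télévisuel)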

Author: Guinaudeau Camille
Publication venue: HAL CCSD
Publication date: 01/05/2009

National audience. In 2007, Naturel (Naturel, 2007) proposed a system that automatically associates a label, that is, a title, with programmes obtained by segmenting a TV stream. However, this system cannot verify whether the label-programme associations are correct. In this article, we propose to check this labelling using the textual and phonetic transcriptions of the audio track contained in the stream. We show that information retrieval methods make it possible to associate each programme with a description taken from a TV programme guide; this description is then compared with the programme's original label. The proposed technique makes it possible to check slightly more than 45% of the programmes studied and to reduce the number of errors in the original labelling by 3.5%.
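The matching step described in this abstract (pairing a programme's transcript with a programme-guide description, then comparing the result with the original label) can be illustrated with a standard information retrieval baseline. The sketch below uses TF-IDF weighting and cosine similarity; this is an assumption chosen for illustration, not necessarily the method used in the paper, and the function names, the toy guide, and the toy transcript are all hypothetical.

```python
import math
from collections import Counter


def idf_table(docs):
    """Smoothed inverse document frequency for every term seen in docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def vectorize(tokens, idf):
    """Sparse TF-IDF vector; terms unknown to the guide are ignored."""
    tf = Counter(tokens)
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_description(transcript, guide):
    """Return (title, score) of the guide entry most similar to the transcript."""
    idf = idf_table(list(guide.values()))
    query = vectorize(transcript, idf)
    scores = {title: cosine(query, vectorize(tokens, idf)) for title, tokens in guide.items()}
    title = max(scores, key=scores.get)
    return title, scores[title]


# Hypothetical toy data: tokenized guide descriptions and one transcript.
guide = {
    "Evening News": "national international news politics weather".split(),
    "Cooking Show": "chef recipe kitchen dessert cooking".split(),
}
transcript = "today the chef prepares a dessert recipe".split()

title, score = best_description(transcript, guide)
print(title, round(score, 3))
```

A label would then be treated as confirmed when the retrieved title agrees with it and as suspect otherwise; where to set a minimum score below which a programme is left unchecked is a design choice this sketch does not address.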

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

HAL-Rennes 1

Music Synchronization, Audio Matching, Pattern Detection, and User Interfaces for a Digital Music Library System

Author: Kriesel Verena
Publication venue: Universitäts- und Landesbibliothek Bonn

Over the last two decades, growing efforts to digitize our cultural heritage could be observed. Most of these digitization initiatives pursue one or both of the following goals: to conserve the documents, especially those threatened by decay, and to provide remote access on a large scale. These trends are observable for music documents as well, and several digital music libraries now exist. An important characteristic of these music libraries is an inherent multimodality resulting from the large variety of available digital music representations, such as scanned scores, symbolic scores, audio recordings, and videos. In addition, for each piece of music there exists not just one document of each type, but many.

Considering and exploiting this multimodality and multiplicity, the DFG-funded digital library initiative PROBADO MUSIC aimed at developing a novel user-friendly interface for content-based retrieval, document access, navigation, and browsing in large music collections. Implementing such a front end requires the multimodal linking and indexing of the music documents during preprocessing. As the music collections considered can be very large, automated or at least semi-automated calculation of these structures is advisable. The field of music information retrieval (MIR) is particularly concerned with the development of suitable procedures, and it was the goal of PROBADO MUSIC to include existing and newly developed MIR techniques to realize the envisioned digital music library system. In this context, the present thesis discusses the following three MIR tasks: music synchronization, audio matching, and pattern detection. We identify particular issues in these fields and provide algorithmic solutions as well as prototypical implementations.

In music synchronization, for each position in one representation of a piece of music, the corresponding position in another representation is calculated. This thesis focuses on the task of aligning scanned score pages of orchestral music with audio recordings. Here, a previously unconsidered piece of information is the textual specification of transposing instruments provided in the score. Our evaluations show that neglecting such information can result in a measurable loss of synchronization accuracy. Therefore, we propose an OCR-based approach for detecting and interpreting the transposition information in orchestral scores.

For a given audio snippet, audio matching methods automatically calculate all musically similar excerpts within a collection of audio recordings. In this context, subsequence dynamic time warping (SSDTW) is a well-established approach, as it allows for local and global tempo variations between the query and the retrieved matches. Moving to real-life digital music libraries with larger audio collections, however, the quadratic runtime of SSDTW results in untenable response times. To improve the response time, this thesis introduces a novel index-based approach to SSDTW-based audio matching. We combine the idea of inverted file lists introduced by Kurth and Müller (Efficient index-based audio matching, 2008) with the shingling techniques often used in the audio identification scenario.

In pattern detection, all repeating patterns within one piece of music are determined. Usually, pattern detection operates on symbolic score documents and is often used in the context of computer-aided motivic analysis. Envisioned as a new feature of the PROBADO MUSIC system, this thesis proposes a string-based approach to pattern detection and a novel interactive front end for result visualization and analysis.
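To make the audio matching idea concrete, here is a minimal sketch of plain subsequence dynamic time warping, the baseline whose quadratic cost the abstract refers to. It is not the index-based method proposed in the thesis. As simplifying assumptions, the sequences are one-dimensional numbers and the local cost is the absolute difference; a real system would compare audio feature vectors instead. The function name and the toy data are hypothetical.

```python
def subsequence_dtw(query, target):
    """Find the target subsequence that best matches the whole query.

    Returns (cost, start, end) with 0-based, inclusive target indices.
    Time and memory are O(len(query) * len(target)).
    """
    n, m = len(query), len(target)
    if n == 0 or m == 0:
        raise ValueError("query and target must be non-empty")
    inf = float("inf")

    # D[i][j]: minimal cost of aligning query[:i] with a target
    # subsequence that ends at target[j - 1].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0] = [0.0] * (m + 1)  # the match may start anywhere at no cost

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(query[i - 1] - target[j - 1])
            D[i][j] = local + min(D[i - 1][j - 1], D[i - 1][j], D[i][j - 1])

    # The match may also end anywhere: take the cheapest cell of the last row.
    end = min(range(1, m + 1), key=lambda j: D[n][j])

    # Backtrack to row 0 to recover where the match starts.
    i, j = n, end
    while True:
        _, pi, pj = min(
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        )
        if pi == 0:
            start = j - 1
            break
        i, j = pi, pj

    return D[n][end], start, end - 1


# Toy example: the query appears in the target with one stretched element.
cost, start, end = subsequence_dtw([1, 2, 3], [0, 0, 1, 2, 2, 3, 0])
print(cost, start, end)  # 0.0 2 5
```

The nested loop fills every cell of the cost matrix, which is where the quadratic runtime comes from; an index-based approach aims to avoid doing this for every recording in a large collection.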

bonndoc – Der Publikationsserver der Universität Bonn

Variability Tolerant Audio Motif Discovery

Authors: Bimbot Frédéric, Gravier Guillaume, Muscariello Armando
Publication venue: HAL CCSD
Publication date: 08/01/2009

International audience. Mining repeating patterns is useful for inferring structure in streams and for multimedia indexing, as it allows even large archives to be summarized by small sets of recurrent items. Techniques for discovering such patterns must handle large data sets and tolerate a certain amount of variability among instances of the same underlying pattern (such as spectral variability and temporal distortion). In this paper, early approaches and experiments are described for the retrieval of such variable patterns in audio, a task that we call audio motif discovery, by analogy with its counterpart in biology. The algorithm is based on a combination of ARGOS, used to segment the data and organize the search for motifs, and a novel technique based on segmental dynamic time warping to detect similarities in the audio data. Moreover, precision-recall measures are defined for evaluation purposes, and preliminary experiments on the word discovery case are discussed.
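The abstract does not say how its precision-recall measures are defined. As a purely illustrative assumption, the sketch below counts a discovered occurrence as correct when its time interval overlaps a reference occurrence; the overlap criterion, the function names, and the toy intervals are hypothetical and are not taken from the paper.

```python
def overlaps(a, b):
    """True if half-open time intervals a = (start, end) and b intersect."""
    return a[0] < b[1] and b[0] < a[1]


def precision_recall(found, reference):
    """Precision and recall of discovered intervals against reference intervals."""
    correct = sum(any(overlaps(f, r) for r in reference) for f in found)
    retrieved = sum(any(overlaps(r, f) for f in found) for r in reference)
    precision = correct / len(found) if found else 0.0
    recall = retrieved / len(reference) if reference else 0.0
    return precision, recall


# Toy example in seconds: two of three discoveries hit a reference
# occurrence, and two of four reference occurrences are retrieved.
found = [(0.0, 1.0), (5.0, 6.0), (9.0, 9.5)]
reference = [(0.2, 1.1), (5.1, 5.9), (7.0, 8.0), (12.0, 13.0)]
precision, recall = precision_recall(found, reference)
print(precision, recall)
```

A stricter criterion, for example requiring a minimum overlap ratio, would lower both figures; the choice depends on how much boundary error the application tolerates.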

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

HAL-Rennes 1