A regularity-constrained Viterbi algorithm and its application to the structural segmentation of songs

Author: Bimbot Frédéric
Sargent Gabriel
Vincent Emmanuel
Publication venue: HAL CCSD
Publication date: 24/10/2011

This paper presents a general approach for the structural segmentation of songs. It is formalized as a cost optimization problem that combines properties of the musical content and a prior regularity assumption on the segment length. A versatile implementation of this approach is proposed by means of a Viterbi algorithm, and the design of the costs is discussed. We then present two systems derived from this approach, based on acoustic and symbolic features respectively. The advantages of the regularity constraint are evaluated on a database of 100 popular songs by showing a significant improvement of the segmentation performance in terms of F-measure.
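The cost-optimization idea in this abstract can be illustrated with a small dynamic program. This is only a sketch under stated assumptions, not the authors' system: the content cost (`heterogeneity`, the number of labels in a segment that differ from its most common label), the regularity penalty (relative deviation from a hypothetical `target_len`), and the weight `lam` are all invented here for illustration, since the abstract does not give the actual cost definitions.

```python
from collections import Counter


def heterogeneity(labels, start, end):
    """Assumed content cost: labels in [start, end) that differ from the majority label."""
    segment = labels[start:end]
    return len(segment) - Counter(segment).most_common(1)[0][1]


def segment_with_regularity(labels, target_len, lam):
    """Viterbi-style search for boundaries minimizing content cost plus a length penalty.

    best[e] holds the lowest total cost of any segmentation of labels[:e];
    back[e] holds the start of the last segment in that segmentation.
    """
    n = len(labels)
    best = [float("inf")] * (n + 1)
    back = [-1] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(end):
            length = end - start
            # Assumed regularity penalty: relative deviation from the target length.
            penalty = lam * abs(length - target_len) / target_len
            cost = best[start] + heterogeneity(labels, start, end) + penalty
            if cost < best[end]:
                best[end] = cost
                back[end] = start
    # Backtrack from the end of the sequence to recover the boundaries.
    boundaries = [n]
    while boundaries[-1] != 0:
        boundaries.append(back[boundaries[-1]])
    return boundaries[::-1], best[n]


labels = list("AAAABBBBAAAACCCC")
boundaries, total_cost = segment_with_regularity(labels, target_len=4, lam=1.0)
print(boundaries, total_cost)
```

Raising `lam` pushes the solution toward segments of the target length even when the content cost alone would prefer other boundaries, which is the trade-off the abstract describes.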

INRIA a CCSD electronic archive server

Supplementary material to the article: Estimating the structural segmentation of popular music pieces under regularity constraints

Author: Bimbot Frédéric
Sargent Gabriel
Vincent Emmanuel
Publication venue: HAL CCSD
Publication date: 23/09/2016

This document gathers descriptions of the structural segmentation systems considered in the IEEE/ACM TASLP paper by the same authors

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

A music structure inference algorithm based on symbolic data analysis

Author: Bimbot Frédéric
Raczynski Stanislaw
Sagayama Shigeki
Sargent Gabriel
Vincent Emmanuel
Publication venue: HAL CCSD
Publication date: 24/10/2011

The present document describes a music structure inference algorithm submitted to the MIREX 2011 evaluation campaign (structural segmentation task). It consists of three stages: symbolic feature extraction, structural segment boundary estimation, and structural segment clustering. We consider as inputs chord estimations from the system of Ueda et al., expressed at the 2-beat scale. Beats and downbeats are estimated by the system of Davies et al. The structural segmentation step uses a regularity-constrained Viterbi approach. It assumes that the structure of pop songs is generally based on a few typical segments, whose sizes are called structural pulsation periods. The segments are then clustered according to their similarity, through the minimization of an adaptive model selection criterion.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Music Boundary Detection using Convolutional Neural Networks: A comparative analysis of combined input features

Author: Beltran Jose R.
Diaz-Guerra David
Hernandez-Olivan Carlos
Publication venue: Universidad Internacional de La Rioja
Publication date: 01/01/2021

The analysis of the structure of musical pieces is a task that remains a challenge for Artificial Intelligence, especially in the field of Deep Learning. It requires prior identification of the structural boundaries of the music pieces. This structural boundary analysis has recently been studied with unsupervised methods and end-to-end techniques such as Convolutional Neural Networks (CNN) using Mel-Scaled Log-magnitude Spectrogram features (MLS), Self-Similarity Matrices (SSM) or Self-Similarity Lag Matrices (SSLM) as inputs and trained with human annotations. The published studies, whether unsupervised or end-to-end, perform pre-processing in different ways, using different distance metrics and audio characteristics, so a generalized pre-processing method to compute model inputs is missing. The objective of this work is to establish a general method of pre-processing these inputs by comparing the inputs calculated from different pooling strategies, distance metrics and audio characteristics, also taking into account the computing time to obtain them. We also establish the most effective combination of inputs to be delivered to the CNN in order to establish the most efficient way to extract the limits of the structure of the music pieces. With an adequate combination of input matrices and pooling strategies we obtain a measurement accuracy (F1) of 0.411 that outperforms the current one obtained under the same conditions.
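For readers unfamiliar with how an F1 value is obtained for boundary detection, the following is a minimal sketch of a hit-rate style computation. It assumes a tolerance window of 0.5 seconds and a simple greedy one-to-one matching over sorted boundary times; neither the tolerance nor the matching procedure is stated in the abstract, and the function name `boundary_f_measure` is hypothetical.

```python
def boundary_f_measure(reference, estimated, tolerance=0.5):
    """Return (precision, recall, F1) for boundary times given in seconds.

    An estimated boundary counts as a hit when it lies within `tolerance`
    of a reference boundary that has not already been matched.
    """
    ref = sorted(reference)
    est = sorted(estimated)
    i = j = hits = 0
    while i < len(ref) and j < len(est):
        if abs(ref[i] - est[j]) <= tolerance:
            hits += 1          # match this pair and move past both
            i += 1
            j += 1
        elif est[j] < ref[i]:
            j += 1             # estimate too early to match any later reference
        else:
            i += 1             # reference too early to match any later estimate
    precision = hits / len(est) if est else 0.0
    recall = hits / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


precision, recall, f1 = boundary_f_measure(
    reference=[10.0, 20.0, 30.0],
    estimated=[10.2, 25.0, 30.4, 40.0],
)
print(precision, recall, f1)
```

In this toy case two of the four estimates fall within the window of a reference boundary, so precision is 2/4 and recall is 2/3.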

arXiv.org e-Print Archive

Repositorio Universidad de Zaragoza

Re-UNIR

Design of Soft Viterbi Algorithm Decoder Enhanced With Non-Transmittable Codewords for Storage Media

Author: Hassan Kilavo
Michael Kisangiri
Mrutu Salehe I.
Publication date: 01/02/2017

The Viterbi Algorithm Decoder Enhanced with Non-transmittable Codewords (NTCs) is one of the best decoding algorithms and effectively improves forward error correction performance. However, the Viterbi decoder enhanced with NTCs is not yet designed to work in storage media devices. Currently, the Reed Solomon (RS) Algorithm is the dominant algorithm used for correcting errors in storage media. Recent studies show, however, that data reliability in storage media remains low while the demand for storage media increases drastically. This study proposes a design of the Soft Viterbi Algorithm decoder enhanced with Non-transmittable Codewords (SVAD-NTCs) to be used in storage media for error correction. Matlab simulation was used in this design in order to investigate the behavior and effectiveness of SVAD-NTCs in correcting errors in data retrieved from storage media. Sample data of one million bits were randomly generated, Additive White Gaussian Noise (AWGN) was used as the data distortion model, and Binary Phase-Shift Keying (BPSK) was applied as the simulation modulation. Results show that SVAD-NTCs performance improves as the number of NTCs increases, but beyond 6 NTCs there is no significant change, and that the SVAD-NTCs design drastically reduces the total residual errors from 216,878 with Reed Solomon to 23,900.
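The simulation setup named in this abstract (random bits, BPSK modulation, AWGN distortion) can be sketched as follows. This covers only the uncoded channel model with hard decisions; it does not implement the SVAD-NTCs decoder or Reed Solomon coding, and the function names, the seed, and the Eb/N0 values are assumptions chosen for illustration (the study itself used Matlab).

```python
import math
import random


def simulate_bpsk_awgn(n_bits, ebn0_db, seed=0):
    """Return the bit error rate of uncoded BPSK over AWGN with hard decisions."""
    rng = random.Random(seed)
    ebn0 = 10 ** (ebn0_db / 10)
    # For unit-energy symbols, the noise standard deviation is sqrt(N0 / 2).
    sigma = math.sqrt(1 / (2 * ebn0))
    errors = 0
    for _ in range(n_bits):
        bit = rng.getrandbits(1)
        symbol = 1.0 if bit else -1.0              # BPSK mapping: 0 -> -1, 1 -> +1
        received = symbol + rng.gauss(0.0, sigma)  # add Gaussian noise
        decided = 1 if received > 0 else 0         # hard decision at threshold 0
        errors += decided != bit
    return errors / n_bits


def theoretical_ber(ebn0_db):
    """Closed-form bit error rate of uncoded BPSK over AWGN."""
    return 0.5 * math.erfc(math.sqrt(10 ** (ebn0_db / 10)))


measured = simulate_bpsk_awgn(n_bits=20000, ebn0_db=0.0)
print(measured, theoretical_ber(0.0))
```

Comparing the measured rate with the closed-form value is a quick sanity check on the channel model before any decoder is placed behind it.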

arXiv.org e-Print Archive

NM-AIST Repository

Semiotic Description of Music Structure: an Introduction to the Quaero/Metiss Structural Annotations

Author: Bimbot Frédéric
Deruty Emmanuel
Guichaoua Corentin
Sargent Gabriel
Vincent Emmanuel
Publication venue: HAL CCSD
Publication date: 26/01/2014

Interest has been steadily growing in semantic audio and music information retrieval for the description of music structure, i.e., the global organization of music pieces in terms of large-scale structural units. This article presents a detailed methodology for the semiotic description of music structure, based on concepts and criteria which are formulated as generically as possible. We sum up the essential principles and practices developed during an annotation effort deployed by our research group (Metiss) on audio data in the context of the Quaero project, which has led to the public release of over 380 annotations of pop songs from three different data sets. The paper also includes a few case studies and a concise statistical overview of the annotated data.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Self-Similarity-Based and Novelty-based loss for music structure analysis

Author: Peeters Geoffroy
Publication date: 05/09/2023

Music Structure Analysis (MSA) is the task of identifying the musical segments that compose a music track and possibly labeling them based on their similarity. In this paper we propose a supervised approach for the task of music boundary detection. In our approach we simultaneously learn features and convolution kernels. For this we jointly optimize two losses: a loss based on the Self-Similarity-Matrix (SSM) obtained with the learned features, denoted by SSM-loss, and a loss based on the novelty score obtained by applying the learned kernels to the estimated SSM, denoted by novelty-loss. We also demonstrate that relative feature learning, through self-attention, is beneficial for the task of MSA. Finally, we compare the performance of our approach to previously proposed approaches on the standard RWC-Pop dataset and on various subsets of SALAMI.
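The two quantities this abstract builds its losses on, a self-similarity matrix and a novelty score obtained by sliding a kernel along its diagonal, can be sketched in a few lines. The sketch below swaps in a fixed checkerboard kernel and cosine similarity on hand-made features; the paper instead learns both the features and the kernels, so this shows the classical, non-learned computation only. The function names and the `half_width` parameter are assumptions.

```python
import math


def cosine_ssm(features):
    """Return the self-similarity matrix of a list of feature vectors."""
    def cosine(u, v):
        norm_u = math.sqrt(sum(x * x for x in u))
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_u == 0 or norm_v == 0:
            return 0.0
        return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)

    return [[cosine(u, v) for v in features] for u in features]


def checkerboard_novelty(ssm, half_width):
    """Correlate a fixed checkerboard kernel along the main diagonal of the SSM.

    Frames where the kernel does not fit entirely inside the matrix get 0.
    """
    n = len(ssm)
    novelty = [0.0] * n
    for t in range(half_width, n - half_width + 1):
        total = 0.0
        for a in range(-half_width, half_width):
            for b in range(-half_width, half_width):
                # +1 inside the past/past and future/future blocks, -1 across them.
                sign = 1.0 if (a < 0) == (b < 0) else -1.0
                total += sign * ssm[t + a][t + b]
        novelty[t] = total
    return novelty


# Toy input: four frames of one "timbre" followed by four frames of another.
features = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
novelty = checkerboard_novelty(cosine_ssm(features), half_width=2)
print(novelty)
```

The novelty curve peaks at the frame where the toy features change, which is the behaviour a boundary detector looks for.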

arXiv.org e-Print Archive

Methodological and musicological investigation of the System & Contrast model for musical form description

Author: Bimbot Frédéric
Deruty Emmanuel
Van Wymeersch Brigitte
Publication venue: HAL CCSD
Publication date: 01/01/2013

The semiotic description of music structure aims at representing the high-level organization of music pieces in a concise, generic and reproducible way as a low-rate stream of arbitrary symbols from a limited alphabet, which results in a sequence of "semiotic units". In this context, the purpose of the System & Contrast model is to address the internal organization of the semiotic units. In this report, the System & Contrast model is approached from different angles in relation to varied disciplines: cognitive psychology, music analysis and information theory. After establishing a number of links between the System & Contrast model and other approaches to music structure, the model is illustrated on studio-based popular music pieces, as well as on music from the classical Viennese period.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

A Cross-Cultural Analysis of Music Structure

Author: Tian Mi
Publication venue: Queen Mary University of London
Publication date: 07/07/2017

Music signal analysis is a research field concerning the extraction of meaningful information from musical audio signals. This thesis analyses music signals from the note level to the song level in a bottom-up manner and situates the research in two Music Information Retrieval (MIR) problems: audio onset detection (AOD) and music structural segmentation (MSS). Most MIR tools are developed for and evaluated on Western music with specific musical knowledge encoded. This thesis approaches the investigated tasks from a cross-cultural perspective by developing audio features and algorithms applicable to both Western and non-Western genres. Two Chinese Jingju databases are collected to facilitate the AOD and MSS tasks respectively. New features and algorithms for AOD are presented, relying on fusion techniques. We show that fusion can significantly improve the performance of the constituent baseline AOD algorithms. A large-scale parameter analysis is carried out to identify the relations between system configurations and the musical properties of different music types. Novel audio features are developed to summarise music timbre, harmony and rhythm for its structural description. The new features serve as effective alternatives to commonly used ones, showing comparable performance on existing datasets and surpassing them on the Jingju dataset. A new segmentation algorithm is presented which effectively captures the structural characteristics of Jingju. By evaluating the presented audio features and different segmentation algorithms incorporating different structural principles for the investigated music types, this thesis also identifies the underlying relations between audio features, segmentation methods and music genres in the scenario of music structural analysis.

Funding: China Scholarship Council; EPSRC C4DM Travel Funding; EPSRC Fusing Semantic and Audio Technologies for Intelligent Music Production and Consumption (EP/L019981/1); EPSRC Platform Grant on Digital Music (EP/K009559/1); European Research Council project CompMusic; International Society for Music Information Retrieval Student Grant; QMUL Postgraduate Research Fund; QMUL-BUPT Joint Programme Funding; Women in Music Information Retrieval Grant

Queen Mary Research Online

Proceedings of the 7th Sound and Music Computing Conference

Author: Emilia Gómez
Perfecto Herrera
Rafael Ramirez
Publication venue: SMC Network
Publication date: 25/07/2010

Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010

ZENODO