Learning to rank music tracks using triplet loss
Most music streaming services rely on automatic recommendation algorithms to
exploit their large music catalogs. These algorithms aim to retrieve a ranked
list of music tracks based on their similarity to a target music track. In
this work, we propose a method for direct recommendation based on the audio
content, without explicitly tagging the music tracks. To this end, we propose
several strategies for mining triplets from ranked lists. We train a
Convolutional Neural Network to learn the similarity via a triplet loss. These
strategies are compared and validated in a large-scale experiment against an
auto-tagging-based approach. The results highlight the effectiveness of our
system, especially when combined with an Auto-pooling layer.
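As a rough illustration of the training signal, a triplet loss penalizes an anchor track whose embedding is not closer to a positive than to a negative by at least a margin. The NumPy sketch below is a minimal stand-in, not the paper's method: the `mine_triplets_from_ranked_list` helper is hypothetical (the paper's actual mining strategies are not detailed here), and it simply treats the top-ranked tracks of a ranked list as positives and the rest as negatives.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on squared L2 distances: the positive
    must be closer to the anchor than the negative by at least `margin`."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

def mine_triplets_from_ranked_list(query, ranked, top_k=2):
    """Hypothetical toy mining strategy: tracks ranked in the top-k act
    as positives for the query, all remaining tracks as negatives."""
    triplets = []
    for p in ranked[:top_k]:
        for n in ranked[top_k:]:
            triplets.append((query, p, n))
    return triplets
```

With a ranked list of five tracks and `top_k=2`, this mining yields 2 x 3 = 6 triplets per query; real strategies would subsample or weight these pairs.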
Self-Similarity-Based and Novelty-Based Loss for Music Structure Analysis
Music Structure Analysis (MSA) is the task of identifying the musical
segments that compose a music track and, possibly, labeling them based on
their similarity. In this paper, we propose a supervised approach for the task
of music boundary detection. In our approach, we simultaneously learn features
and convolution kernels. To do so, we jointly optimize two losses: (i) a loss
based on the Self-Similarity Matrix (SSM) obtained with the learned features,
denoted SSM-loss, and (ii) a loss based on the novelty score obtained by
applying the learned kernels to the estimated SSM, denoted novelty-loss. We
also demonstrate that relative feature learning, through self-attention, is
beneficial for the task of MSA. Finally, we compare the performance of our
approach to previously proposed approaches on the standard RWC-Pop dataset and
on various subsets of SALAMI.
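To make the SSM/novelty pipeline concrete: the paper learns both the features and the kernels, but the underlying mechanics can be sketched with fixed ingredients, i.e. cosine-similarity features and the classic Foote checkerboard kernel slid along the SSM diagonal. The sketch below is only an assumption-level illustration of how novelty peaks relate to segment boundaries, not the proposed learned losses.

```python
import numpy as np

def self_similarity_matrix(features):
    """Cosine self-similarity between all pairs of feature frames
    (rows of `features`)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    normed = features / np.maximum(norms, 1e-8)
    return normed @ normed.T

def checkerboard_kernel(half_size):
    """Foote-style checkerboard kernel: +1 on the two within-segment
    quadrants, -1 on the cross-segment quadrants (fixed here; the
    paper learns its kernels instead)."""
    sign = np.sign(np.arange(-half_size, half_size + 1))
    return np.outer(sign, sign)

def novelty_curve(ssm, half_size=4):
    """Correlate the kernel with patches along the SSM diagonal;
    peaks of the resulting curve indicate candidate boundaries."""
    kernel = checkerboard_kernel(half_size)
    n = ssm.shape[0]
    nov = np.zeros(n)
    for i in range(half_size, n - half_size):
        patch = ssm[i - half_size:i + half_size + 1,
                    i - half_size:i + half_size + 1]
        nov[i] = np.sum(patch * kernel)
    return nov
```

On a toy track made of two homogeneous segments, the novelty curve is flat inside each segment and peaks at the transition between them.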
Notes from the ISMIR 2012 late-breaking session on evaluation in music information retrieval
During the last day of the ISMIR 2012 conference there were two events related to Music IR evaluation. A panel took place during the morning to discuss several issues concerning the various evaluation initiatives with the general audience at ISMIR. A late-breaking session during the afternoon kept the discussion alive among a group of researchers who wanted to dig deeper into these issues. This extended abstract reports the main topics covered during this short session and the general thoughts that emerged.
Towards a (better) Definition of Annotated MIR Corpora
Today, annotated MIR corpora are provided by various research labs or companies, each one using its own annotation methodology, concept definitions, and formats. This is not an issue as such. However, the lack of descriptions of the methodology used (how the corpus was actually annotated, and by whom) and of the annotated concepts (i.e., what is actually described) is a problem with respect to the sustainability, usability, and sharing of the corpora. Experience shows that it is essential to define precisely how annotations are supplied and described. We propose here a survey and consolidation report on the nature of the annotated corpora used and shared in MIR, with proposals for the axes along which corpora can be described so as to enable effective comparison, and a discussion of the inherent influence this has on the tasks performed using them.
Degradation-Invariant Music Indexing
For music indexing that is robust to sound degradations and scalable to large
music catalogs, this scientific report presents an approach based on audio
descriptors that are relevant to the music content and invariant to sound
transformations (e.g., noise addition, distortion, lossy coding, pitch/time
transformations, or filtering). To achieve this, one of the key points of the
proposed method is the definition of high-dimensional audio prints, which are
intrinsically (by design) robust to some sound degradations. The high
dimensionality of this first representation is then exploited to learn a
linear projection to a significantly smaller subspace, which further reduces
the sensitivity to sound degradations using a series of discriminant analyses.
Finally, by anchoring the analysis times on local maxima of a selected onset
function, an approximate hashing is performed to provide better tolerance to
bit corruption and, at the same time, to ease the scaling of the method.
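The report's actual print design and hashing scheme are not given here, but the general idea of hashing high-dimensional descriptors while tolerating small perturbations can be illustrated with a standard locality-sensitive stand-in: one bit per random hyperplane, taken as the sign of the projection. Similar prints then agree on most bits, so a few corrupted bits still permit retrieval. Everything below (dimensions, plane count) is an illustrative assumption.

```python
import numpy as np

def random_projection_hash(print_vec, planes):
    """LSH-style sign hash: one bit per random hyperplane. Nearby
    print vectors agree on most bits, giving tolerance to small
    perturbations of the descriptor (a stand-in for the report's
    approximate hashing, whose exact design is not specified here)."""
    return (planes @ print_vec > 0).astype(np.uint8)

# Example setup (assumed sizes): 128-dimensional prints, 32-bit hashes.
rng = np.random.default_rng(0)
planes = rng.normal(size=(32, 128))
```

Note that the hash is invariant to positive scaling of the print, since only the sign of each projection matters; this is one simple way descriptor-level robustness carries over to the hash.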
Blind estimation of audio effects using an auto-encoder approach and differentiable signal processing
Blind Estimation of Audio Effects (BE-AFX) aims at estimating the Audio
Effects (AFXs) applied to an original, unprocessed audio sample solely from
the processed audio sample. To train such a system, traditional approaches
optimize a loss between ground-truth and estimated AFX parameters, which
requires knowing the exact implementation of the AFXs used in the process. In
this work, we propose an alternative solution that eliminates the need to
know this implementation. Instead, we introduce an auto-encoder approach that
optimizes an audio quality metric. We explore, suggest, and compare various
implementations of commonly used mastering AFXs, using differentiable signal
processing or neural approximations. Our findings demonstrate that our
auto-encoder approach yields superior estimates of the audio quality produced
by a chain of AFXs, compared to the traditional parameter-based approach,
even though the latter provides a more accurate parameter estimation.
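The core idea, optimizing in the audio domain through a differentiable effect rather than comparing parameters, can be sketched with a deliberately trivial stand-in: a single gain "effect" whose parameter is recovered by gradient descent on an audio-domain MSE. This is a minimal assumption-level illustration, not the paper's mastering AFX chain or encoder network.

```python
import numpy as np

def apply_gain(x, g):
    """Toy differentiable 'effect': a simple gain (a stand-in for the
    differentiable mastering AFXs explored in the paper)."""
    return g * x

def estimate_gain(dry, wet, lr=0.1, steps=200):
    """Blindly estimate the gain by minimizing an audio-domain MSE
    between re-processed dry audio and the observed wet audio; the
    true parameter is never compared directly (the auto-encoder idea)."""
    g = 1.0
    for _ in range(steps):
        err = apply_gain(dry, g) - wet     # audio-domain residual
        grad = 2.0 * np.mean(err * dry)    # d/dg of the MSE
        g -= lr * grad
    return g
```

Because the loss is computed on audio, the same recipe works even when the effect that produced `wet` is a black box, as long as a differentiable approximation of it is available for re-processing.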