The Skipping Behavior of Users of Music Streaming Services and its Relation to Musical Structure
The behavior of users of music streaming services is investigated from the
point of view of the temporal dimension of individual songs; specifically, the
main object of the analysis is the point in time within a song at which users
stop listening and start streaming another song ("skip"). The main contribution
of this study is the identification of a correlation between the temporal
distribution of skipping events and the musical structure of songs. It is also shown
that this distribution is not only specific to individual songs but also
independent of the cohort of users and, under stationary conditions, of the date of
observation. Finally, user behavioral data is used to train a predictor of the
musical structure of a song solely from its acoustic content; it is shown that
the use of such data, available in large quantities to music streaming
services, yields significant improvements in accuracy over the customary
fashion of training this class of algorithms, in which only smaller amounts of
hand-labeled data are available.
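As a rough illustration of the first analysis step described above, the sketch below computes a per-song distribution of skip positions normalized by track duration. All function and variable names, as well as the example values, are assumptions for illustration; the paper's own data pipeline is not described here.

import numpy as np

def skip_position_histogram(skip_times_s, track_duration_s, n_bins=100):
    """Histogram of skip events over normalized song position (0 = start, 1 = end).

    skip_times_s: times (in seconds) at which users skipped the track.
    track_duration_s: total duration of the track in seconds.
    """
    positions = np.asarray(skip_times_s) / track_duration_s
    positions = positions[(positions >= 0.0) & (positions <= 1.0)]
    hist, edges = np.histogram(positions, bins=n_bins, range=(0.0, 1.0))
    # Normalize counts to a probability distribution over song position.
    hist = hist / max(hist.sum(), 1)
    return hist, edges

# Example: skips clustered shortly after an assumed section boundary at 60 s of a 180 s track.
hist, edges = skip_position_histogram([61.2, 62.0, 63.5, 150.0], track_duration_s=180.0)
print(hist.argmax(), edges[hist.argmax()])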
Gabor frames and deep scattering networks in audio processing
This paper introduces Gabor scattering, a feature extractor based on Gabor
frames and Mallat's scattering transform. Using a simple signal model for
audio signals, specific properties of Gabor scattering are studied. It is shown
that for each layer, specific invariances to certain signal characteristics
occur. Furthermore, deformation stability of the coefficient vector generated
by the feature extractor is derived by using a decoupling technique which
exploits the contractivity of general scattering networks. Deformations are
introduced as changes in spectral shape and frequency modulation. The
theoretical results are illustrated by numerical examples and experiments.
Numerical evidence from evaluation on a synthetic and a "real" data set shows
that the invariances encoded by the Gabor scattering transform lead to higher
performance than using the Gabor transform alone, especially when few
training samples are available.
Comment: 26 pages, 8 figures, 4 tables. Repository for reproducibility:
https://gitlab.com/hararticles/gs-gt . Keywords: machine learning; scattering
transform; Gabor transform; deep learning; time-frequency analysis; CNN.
Accepted and published after peer revision.
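The following is a loose, simplified sketch of the two-layer structure that a Gabor-frame-based scattering feature extractor has: first-layer Gabor (STFT) magnitudes, then a second transform applied to each frequency channel's temporal envelope. It is not the authors' implementation (see the linked repository for that); the window sizes, the use of scipy's STFT, and the time-averaging are illustrative assumptions.

import numpy as np
from scipy.signal import stft

def gabor_scattering_like(x, fs, nperseg1=1024, nperseg2=16):
    """Simplified two-layer 'scattering' built from STFT (Gabor) magnitudes.

    Layer 1: magnitude of a Gabor/STFT transform of the signal.
    Layer 2: for each layer-1 frequency channel, the magnitude STFT of its
    temporal envelope, capturing slower modulations (e.g. tremolo).
    """
    _, _, Z1 = stft(x, fs=fs, nperseg=nperseg1)
    S1 = np.abs(Z1)                              # first-layer coefficients (freq x time)

    layer2 = []
    frame_rate = fs / (nperseg1 // 2)            # rate of the layer-1 envelope samples
    for channel in S1:                           # each frequency channel's envelope over time
        _, _, Z2 = stft(channel, fs=frame_rate, nperseg=nperseg2)
        layer2.append(np.abs(Z2).mean(axis=-1))  # time-averaging gives (approximate) invariance
    S2 = np.stack(layer2)                        # second-layer coefficients

    return S1.mean(axis=-1), S2                  # time-averaged features per layer

# Example: a 440 Hz tone with slow amplitude modulation, 1 s at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
x = (1 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 440 * t)
f1, f2 = gabor_scattering_like(x, fs)
print(f1.shape, f2.shape)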
AAM: a dataset of Artificial Audio Multitracks for diverse music information retrieval tasks
We present a new dataset of 3000 artificial music tracks with rich annotations, based on real instrument samples and generated by algorithmic composition with respect to music theory. Our collection provides ground truth onset information and has several advantages compared to many available datasets. It can be used to compare and optimize algorithms for various music information retrieval tasks such as music segmentation, instrument recognition, source separation, onset detection, key and chord recognition, or tempo estimation. As the audio is perfectly aligned to the original MIDI files, all annotations (onsets, pitches, instruments, keys, tempos, chords, beats, and segment boundaries) are absolutely precise. Because of that, specific scenarios can be addressed, for instance, detection of segment boundaries with instrument and key change only, or onset detection only in tracks with drums and slow tempo. This allows for the exhaustive evaluation and identification of individual weak points of algorithms. In contrast to datasets with commercial music, all audio tracks are freely available, allowing for the extraction of one's own audio features. All music pieces are stored as single-instrument audio tracks and a mix track, so that different augmentations and DSP effects can be applied to extend training sets and create individual mixes, e.g., for deep neural networks. In three case studies, we show how different algorithms and neural network models can be analyzed and compared for music segmentation, instrument recognition, and onset detection. In the future, the dataset can easily be extended to meet specific demands on the composition process.
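As a small usage example for the onset-detection scenario mentioned above, ground-truth onsets can be compared against a detector's output with the mir_eval toolkit. The onset values below are made up for illustration, and mir_eval is simply one common choice, not necessarily what the dataset authors used.

import numpy as np
import mir_eval

# Ground-truth onset times in seconds (in practice taken from the dataset's
# MIDI-aligned annotations; the values here are made up).
reference_onsets = np.array([0.50, 1.02, 1.55, 2.08])
# Onsets produced by some onset detector under evaluation (also made up).
estimated_onsets = np.array([0.51, 1.00, 1.70, 2.07, 2.60])

# Standard onset evaluation with a +/- 50 ms tolerance window.
f, p, r = mir_eval.onset.f_measure(reference_onsets, estimated_onsets, window=0.05)
print(f"F-measure: {f:.3f}  Precision: {p:.3f}  Recall: {r:.3f}")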
Final Research Report on Auto-Tagging of Music
The deliverable D4.7 concerns the work achieved by IRCAM until M36 for the “auto-tagging of music”. The deliverable is a research report. The software libraries resulting from the research have been integrated into Fincons/HearDis! Music Library Manager or are used by TU Berlin. The final software libraries are described in D4.5.
The research work on auto-tagging has concentrated on four aspects:
1) Further improving IRCAM’s machine-learning system ircamclass. This has been done by developing the new MASSS audio features and by integrating audio augmentation and audio segmentation into ircamclass. The system was then applied to train the HearDis! “soft” features (Vocals-1, Vocals-2, Pop-Appeal, Intensity, Instrumentation, Timbre, Genre, Style). This is described in Part 3.
2) Developing two sets of “hard” features (i.e., features related to musical or musicological concepts) as specified by HearDis! (for integration into the Fincons/HearDis! Music Library Manager) and TU Berlin (as input for the prediction model of the GMBI attributes). Such features are either derived from previously estimated higher-level concepts (such as structure, key, or chord succession) or obtained by developing new signal processing algorithms (such as HPSS or main melody estimation); a generic HPSS sketch is given after this list. This is described in Part 4.
3) Developing audio features to characterize the audio quality of a music track. The goal is to describe the quality of the audio independently of its apparent encoding. This is then used to estimate audio degradation or music decade. This is to be used to ensure that playlists contain tracks with similar audio quality. This is described in Part 5.
4) Developing innovative algorithms to extract specific audio features to improve music mixes. So far, innovative techniques (based on various Blind Audio Source Separation algorithms and Convolutional Neural Networks) have been developed for singing voice separation, singing voice segmentation, music structure boundary estimation, and DJ cue-region estimation. This is described in Part 6.
EC/H2020/688122/EU/Artist-to-Business-to-Business-to-Consumer Audio Branding System/ABC DJ
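For illustration of the HPSS step referenced in item 2 above, the following minimal sketch uses librosa's generic median-filtering HPSS; the report's own implementation is IRCAM's and may differ, and the file path below is a placeholder.

import librosa

# Load a track (the path is a placeholder); sr=None keeps the native sampling rate.
y, sr = librosa.load("some_track.wav", sr=None)

# Split the signal into harmonic and percussive components via median-filtering HPSS.
y_harmonic, y_percussive = librosa.effects.hpss(y)

# The two components can then feed separate feature extractors,
# e.g. chroma from the harmonic part and onset strength from the percussive part.
chroma = librosa.feature.chroma_cqt(y=y_harmonic, sr=sr)
onset_env = librosa.onset.onset_strength(y=y_percussive, sr=sr)
print(chroma.shape, onset_env.shape)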
Musical analysis using artificial intelligence (Análisis musical mediante inteligencia artificial)
Neural networks are a very powerful tool for classifying, processing, and generating new data. In music, these networks have been used to compose new melodies, harmonize pieces, etc., but only a few studies have taken the importance of musical analysis into account. In this project, two neural network models have been developed that identify the transitions between the different parts of the structure of musical pieces and the differences between those transitions in order to label them. To this end, the parts of the formal structure of musical pieces have been labeled with a neural network, and the transitions in the musical structure have been detected using deep learning and machine learning techniques with PyTorch. The results obtained are similar to the state of the art taken as a reference for developing this software. This project consists of a first introductory chapter; the second chapter explains the aspects of neural network theory used in this project; the third chapter presents the structure labeling case; the fourth chapter studies the transition detection case; and the fifth chapter compares the results obtained with the state of the art. The sixth chapter presents the conclusions and future lines of work.
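As an illustrative companion to the abstract above, the following minimal PyTorch sketch shows one possible frame-wise transition classifier over mel-spectrogram patches. The architecture, input sizes, and training step are assumptions and do not reproduce the project's actual models.

import torch
import torch.nn as nn

class TransitionCNN(nn.Module):
    """Minimal CNN that labels a spectrogram patch as 'transition' or 'no transition'.

    Purely illustrative; the models developed in the project are not described
    in enough detail here to reproduce them.
    """
    def __init__(self, n_mels=128, n_frames=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * (n_mels // 4) * (n_frames // 4), 2)

    def forward(self, x):                    # x: (batch, 1, n_mels, n_frames)
        h = self.features(x)
        return self.classifier(h.flatten(1))

# One training step on random data, just to show the intended usage.
model = TransitionCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(8, 1, 128, 64)               # batch of mel-spectrogram patches
y = torch.randint(0, 2, (8,))                # 1 = patch contains a section transition
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
print(loss.item())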