Deep neural networks with voice entry estimation heuristics for voice separation in symbolic music representations

Authors: de Valk R., Weyde T.

In this study we explore the use of deep feedforward neural networks for voice separation in symbolic music representations. We experiment with different network architectures, varying the number and size of the hidden layers, and with dropout. We integrate into the models two voice entry estimation heuristics that estimate the entry points of the individual voices in the polyphonic fabric. These heuristics serve to reduce error propagation at the beginning of a piece, which, as we have shown in previous work, can seriously hamper model performance. The models are evaluated on the 48 fugues from Johann Sebastian Bach’s The Well-Tempered Clavier and his 30 inventions, a dataset that we curated and make publicly available. We find that a model with two hidden layers yields the best results. Using more layers does not lead to a significant performance improvement. Furthermore, we find that our voice entry estimation heuristics are highly effective in reducing error propagation, improving performance significantly. Our best-performing model significantly outperforms our previous models and, depending on the evaluation metric, performs close to or better than the reported state of the art.

City Research Online

Automatic Transcription of Polyphonic Vocal Music

Authors: Benetos E, McLeod A, Schramm R, Steedman M
Publication venue: 'MDPI AG'
Publication date: 01/12/2017

This paper presents a method for automatic music transcription applied to audio recordings of a cappella performances with multiple singers. We propose a system for multi-pitch detection and voice assignment that integrates an acoustic and a music language model. The acoustic model performs spectrogram decomposition, extending probabilistic latent component analysis (PLCA) using a six-dimensional dictionary with pre-extracted log-spectral templates. The music language model performs voice separation and assignment using hidden Markov models that apply musicological assumptions. By integrating the two models, the system is able to detect multiple concurrent pitches in polyphonic vocal music and assign each detected pitch to a specific voice type such as soprano, alto, tenor or bass (SATB). We compare our system against multiple baselines, achieving state-of-the-art results for both multi-pitch detection and voice assignment on a dataset of Bach chorales and another of barbershop quartets. We also present an additional evaluation of our system using varied pitch tolerance levels to investigate its performance at 20-cent pitch resolution.

Directory of Open Access Journals

Evaluating Automatic Polyphonic Music Transcription

Authors: McLeod Andrew, Steedman Mark
Publication date: 23/09/2018

Automatic Music Transcription (AMT) is an important task in music information retrieval. Prior work has focused on multiple fundamental frequency estimation (multi-pitch detection), the conversion of an audio signal into a time-frequency representation such as a MIDI file. It is less common to annotate this output with musical features such as voicing information, metrical structure, and harmonic information, though these are important aspects of a complete transcription. Evaluation of these features is most often performed separately and independently of multi-pitch detection; however, these features are not independent. We therefore introduce MV2H, a quantitative, automatic, joint evaluation metric based on musicological principles, and show its effectiveness through the use of specific examples. The metric is modularised in such a way that it can still be used with partially performed annotation, for example when the transcription process has been applied to some transduced format such as MIDI (which may itself be the result of multi-pitch detection). The code for the evaluation metric described here is available at https://www.github.com/apmcleod/MV2H

ZENODO

Design of Pattern Matching Systems: Pattern, Algorithm, and Scanner

Author: Wang Hao
Publication date: 28/04/2015

Pattern matching is at the core of many computational problems, e.g., search engines, data mining, network security and information retrieval. In this dissertation, we target the more complex patterns of regular expressions and time series, and propose a general modular structure, named character class with constraint repetition (CCR), as the building block for the pattern matching algorithm. An exact matching algorithm named MIN-MAX is developed to support overlapped matching of CCR-based regexps, and an approximate matching algorithm named Elastic Matching Algorithm is designed to support overlapped matching of CCR-based time series, e.g., music melodies. Both algorithms are parallelized to run on FPGA to achieve high performance, and the FPGA-based scanners are designed as a modular architecture which is parameterizable and can be reconfigured by simple memory writes, balancing performance against deployment time.

Adaptive prototype-based dissimilarity learning

Author: Zhu Xibin
Publication venue: Universitätsbibliothek Bielefeld
Publication date: 01/01/2015

Zhu X. Adaptive prototype-based dissimilarity learning. Bielefeld: Universitätsbibliothek Bielefeld; 2015.

In this thesis we focus on prototype-based learning techniques, namely three unsupervised techniques: generative topographic mapping (GTM), neural gas (NG) and affinity propagation (AP), and two supervised techniques: generalized learning vector quantization (GLVQ) and robust soft learning vector quantization (RSLVQ). We extend their abilities with respect to the following central aspects:

• Applicability on dissimilarity data: Due to the increased complexity of data, in many cases data are only available in the form of (dis)similarities which describe the relations between objects. Classical methods cannot directly deal with this kind of data. For unsupervised methods this problem has been studied; here we transfer the same idea to the two supervised prototype-based techniques such that they can directly deal with dissimilarities without an explicit embedding into a vector space.
• Quadratic complexity issue: When dealing with dissimilarity data, the need for the full dissimilarity matrix makes the complexity quadratic, which is infeasible for large data sets. In this thesis we investigate two linear approximation techniques, Nyström approximation and patch processing, and integrate them into unsupervised and supervised prototype-based techniques.
• Reliability of prototype-based classifiers: In practical applications, a reliability measure is beneficial for evaluating the classification quality expected by the end users. Here we adopt concepts from conformal prediction (CP), which provides a point-wise confidence measure of the prediction, and we combine those with supervised prototype-based techniques.
• Model complexity: By means of the confidence values provided by CP, the model complexity can be automatically adjusted by adding new prototypes to cover low-confidence regions of the data space.
• Extendability to semi-supervised problems: Besides its ability to evaluate a classifier, conformal prediction can also be considered as a classifier. This opens a way for supervised techniques to be easily extended to semi-supervised settings by means of a self-training approach.

Publications at Bielefeld University

Proceedings of the 19th Sound and Music Computing Conference

Authors: Michon Romain, Orlarey Yann, Pottier Laurent
Publication venue: SMC Network
Publication date: 12/07/2022

Proceedings of the 19th Sound and Music Computing Conference - June 5-12, 2022 - Saint-Étienne (France). https://smc22.grame.f

HAL-UJM

INRIA a CCSD electronic archive server

Proceedings full papers ISG*ISARC2012 : joint conference of the 8th World Conference of the International Society for Gerontechnology (ISG) and the 29th International Symposium on Automation and Robotics in Construction (ISARC), June 26-29, Eindhoven, The Netherlands

Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2012