4,244 research outputs found
Score-Informed Source Separation for Musical Audio Recordings [An overview]
(c) 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works
A general framework for online audio source separation
We consider the problem of online audio source separation. Existing
algorithms adopt either a sliding block approach or a stochastic gradient
approach, which is faster but less accurate. Also, they rely either on spatial
cues or on spectral cues and cannot separate certain mixtures. In this paper,
we design a general online audio source separation framework that combines both
approaches and both types of cues. The model parameters are estimated in the
Maximum Likelihood (ML) sense using a Generalised Expectation Maximisation
(GEM) algorithm with multiplicative updates. The separation performance is
evaluated as a function of the block size and the step size and compared to
that of an offline algorithm.Comment: International conference on Latente Variable Analysis and Signal
Separation (2012
ACCOUNTING FOR PHASE CANCELLATIONS IN NON-NEGATIVE MATRIX FACTORIZATION USING WEIGHTED DISTANCES
(c)2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Published in: Proc IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014), Florence, Italy, 5-9 May 2014
Tensor Decompositions for Signal Processing Applications From Two-way to Multiway Component Analysis
The widespread use of multi-sensor technology and the emergence of big
datasets has highlighted the limitations of standard flat-view matrix models
and the necessity to move towards more versatile data analysis tools. We show
that higher-order tensors (i.e., multiway arrays) enable such a fundamental
paradigm shift towards models that are essentially polynomial and whose
uniqueness, unlike the matrix methods, is guaranteed under verymild and natural
conditions. Benefiting fromthe power ofmultilinear algebra as theirmathematical
backbone, data analysis techniques using tensor decompositions are shown to
have great flexibility in the choice of constraints that match data properties,
and to find more general latent components in the data than matrix-based
methods. A comprehensive introduction to tensor decompositions is provided from
a signal processing perspective, starting from the algebraic foundations, via
basic Canonical Polyadic and Tucker models, through to advanced cause-effect
and multi-view data analysis schemes. We show that tensor decompositions enable
natural generalizations of some commonly used signal processing paradigms, such
as canonical correlation and subspace techniques, signal separation, linear
regression, feature extraction and classification. We also cover computational
aspects, and point out how ideas from compressed sensing and scientific
computing may be used for addressing the otherwise unmanageable storage and
manipulation problems associated with big datasets. The concepts are supported
by illustrative real world case studies illuminating the benefits of the tensor
framework, as efficient and promising tools for modern signal processing, data
analysis and machine learning applications; these benefits also extend to
vector/matrix data through tensorization. Keywords: ICA, NMF, CPD, Tucker
decomposition, HOSVD, tensor networks, Tensor Train
Score-Informed Source Separation for Music Signals
In recent years, the processing of audio recordings by exploiting additional musical knowledge has turned out to be a promising research direction. In particular, additional note information as specified by a musical score or a MIDI file has been employed to support various audio processing tasks such as source separation, audio parameterization, performance analysis, or instrument equalization. In this contribution, we provide an overview of approaches for score-informed source separation and illustrate their potential by discussing innovative applications and interfaces. Additionally, to illustrate some basic principles behind these approaches, we demonstrate how score information can be integrated into the well-known non-negative matrix factorization (NMF) framework. Finally, we compare this approach to advanced methods based on parametric models
Final Research Report on Auto-Tagging of Music
The deliverable D4.7 concerns the work achieved by IRCAM until M36 for the “auto-tagging of music”. The deliverable is a research report. The software libraries resulting from the research have been integrated into Fincons/HearDis! Music Library Manager or are used by TU Berlin. The final software libraries are described in D4.5.
The research work on auto-tagging has concentrated on four aspects:
1) Further improving IRCAM’s machine-learning system ircamclass. This has been done by developing the new MASSS audio features, including audio augmentation and audio segmentation into ircamclass. The system has then been applied to train HearDis! “soft” features (Vocals-1, Vocals-2, Pop-Appeal, Intensity, Instrumentation, Timbre, Genre, Style). This is described in Part 3.
2) Developing two sets of “hard” features (i.e. related to musical or musicological concepts) as specified by HearDis! (for integration into Fincons/HearDis! Music Library Manager) and TU Berlin (as input for the prediction model of the GMBI attributes). Such features are either derived from previously estimated higher-level concepts (such as structure, key or succession of chords) or by developing new signal processing algorithm (such as HPSS) or main melody estimation. This is described in Part 4.
3) Developing audio features to characterize the audio quality of a music track. The goal is to describe the quality of the audio independently of its apparent encoding. This is then used to estimate audio degradation or music decade. This is to be used to ensure that playlists contain tracks with similar audio quality. This is described in Part 5.
4) Developing innovative algorithms to extract specific audio features to improve music mixes. So far, innovative techniques (based on various Blind Audio Source Separation algorithms and Convolutional Neural Network) have been developed for singing voice separation, singing voice segmentation, music structure boundaries estimation, and DJ cue-region estimation. This is described in Part 6.EC/H2020/688122/EU/Artist-to-Business-to-Business-to-Consumer Audio Branding System/ABC D
- …