Shift-Invariant Kernel Additive Modelling for Audio Source Separation
A major goal in blind source separation is to model the inherent characteristics
of the sources in order to identify and separate them. While most state-of-the-art approaches
are supervised methods trained on large datasets, interest in non-data-driven
approaches such as Kernel Additive Modelling (KAM) remains high due to their
interpretability and adaptability. KAM separates a given source by applying
robust statistics to the time-frequency bins selected by a
source-specific kernel function, commonly the K-NN function. This choice
assumes that the source of interest repeats in both time and frequency. In
practice, this assumption does not always hold. Therefore, we introduce a
shift-invariant kernel function capable of identifying similar spectral content
even under frequency shifts. This way, we can considerably increase the amount
of suitable sound material available to the robust statistics. Although this
increases separation performance, a basic formulation is
computationally expensive. Therefore, we additionally present acceleration
techniques that lower the overall computational complexity.
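The K-NN kernel and robust statistic described in this abstract can be illustrated with a minimal sketch. It assumes that the kernel selects, for each time frame, the k most similar frames by Euclidean distance and that the robust statistic is the median. The function name `knn_median_estimate`, the distance measure and the final clipping step are illustrative assumptions, not the authors' implementation, and the shift-invariant kernel itself is not shown.

```python
import numpy as np


def knn_median_estimate(mag, k=5):
    """Estimate a repeating source from a magnitude spectrogram.

    mag: non-negative array of shape (n_freq, n_frames).
    For every frame, a K-NN kernel (hypothetical choice: Euclidean
    distance between frames) selects the k most similar frames,
    and the median over them is taken per frequency bin.
    """
    n_freq, n_frames = mag.shape
    k = min(k, n_frames)

    # Pairwise squared Euclidean distances between frames.
    sq = np.sum(mag ** 2, axis=0)
    dist = sq[:, None] + sq[None, :] - 2.0 * (mag.T @ mag)

    # Indices of the k nearest frames for each frame: (n_frames, k).
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # mag[:, neighbours] has shape (n_freq, n_frames, k);
    # the median over the last axis is the robust statistic.
    estimate = np.median(mag[:, neighbours], axis=2)

    # Assumption: the estimate should not exceed the mixture magnitude.
    return np.minimum(estimate, mag)


# Usage: a constant background plus one outlier bin in frame 3.
background = np.tile(np.array([1.0, 2.0, 3.0, 4.0])[:, None], (1, 8))
mixture = background.copy()
mixture[2, 3] += 10.0
recovered = knn_median_estimate(mixture, k=5)
```

Because the median ignores the single outlier among the selected frames, `recovered` matches the background in this toy case. A kernel of this kind only finds neighbours whose content recurs at the same frequencies, which is the limitation the shift-invariant kernel is meant to address.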
Signal Processing And Graph Theory Techniques For Sound Source Separation.
PhD Thesis
Source separation aims to identify and separate the sources from a
given mixture. In music source separation, the sources are typically
musical instruments and the given mixture, a recorded track. When
there is little or no prior information about the sources or recording
conditions, a major goal becomes to target the inherent characteristics
of the sources to help with their differentiation and separation. This
thesis is concerned with methods for doing so, introducing novel approaches
based on signal processing and graph theory techniques.
Kernel Additive Modelling (KAM) is a popular music source separation
framework as it is flexible, computationally efficient and requires
no training data. The main idea behind KAM is that one can
use the inherent repetitions of musical signals to reconstruct a source
by defining a proximity kernel. KAM employs robust statistics for
the separation, whose success ultimately depends on the kernel's ability
to identify similar instances of a source in the presence of other
overlaying sources. In existing KAM approaches, the kernel design is
rather rudimentary and its simplicity is limiting. In this thesis we investigate
the current kernel and propose novel extensions boosting its
performance without losing interpretability, flexibility or efficiency.
We then explore the inherent graph structure in KAM, leading to
the first unsupervised method to optimise the sole parameter in the
framework. Following this perspective, we further investigate graph
representations, introducing visibility graphs to magnitude spectra.
We present a novel visibility graph-based representation with valuable
properties for audio. Finally, we propose the first method to
compute visibility graphs on-line, broadening the relevance of this
thesis to generic time series analysis.
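For readers unfamiliar with visibility graphs, the following is a minimal sketch of the standard natural visibility graph for a one-dimensional series such as a magnitude spectrum. It is not the novel representation or the on-line method proposed in the thesis, and the function name `natural_visibility_edges` is a hypothetical one chosen for illustration. Two samples are connected when the straight line between them passes strictly above every sample in between.

```python
def natural_visibility_edges(y):
    """Return the edges (a, b), a < b, of the natural visibility graph.

    Samples a and b see each other if every intermediate sample c lies
    strictly below the line joining (a, y[a]) and (b, y[b]). This is
    equivalent to the slope from a to b exceeding the slope from a to
    every intermediate sample, so a running maximum slope suffices.
    Runs in O(n^2) time (offline, not the on-line method of the thesis).
    """
    n = len(y)
    edges = []
    for a in range(n - 1):
        max_slope = float("-inf")
        for b in range(a + 1, n):
            slope = (y[b] - y[a]) / (b - a)
            if slope > max_slope:
                # Nothing between a and b blocks the line of sight.
                edges.append((a, b))
                max_slope = slope
    return edges


# Usage on a short toy series (values are illustrative only).
toy_edges = natural_visibility_edges([1, 3, 2, 4, 1])
```

In the toy series the peaks at indices 1 and 3 see each other over the dip at index 2, while index 0 is hidden from everything beyond its neighbour at index 1.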