3 research outputs found

    Shift-Invariant Kernel Additive Modelling for Audio Source Separation

    A major goal in blind source separation is to model the inherent characteristics of the sources in order to identify and separate them. While most state-of-the-art approaches are supervised methods trained on large datasets, interest in non-data-driven approaches such as Kernel Additive Modelling (KAM) remains high due to their interpretability and adaptability. KAM separates a given source by applying robust statistics to the time-frequency bins selected by a source-specific kernel function, commonly the K-NN function. This choice assumes that the source of interest repeats in both time and frequency. In practice, this assumption does not always hold. Therefore, we introduce a shift-invariant kernel function capable of identifying similar spectral content even under frequency shifts. In this way, we can considerably increase the amount of suitable sound material available to the robust statistics. Although this increases separation performance, a basic formulation is computationally expensive. Therefore, we additionally present acceleration techniques that lower the overall computational complexity.
    Comment: Feedback is welcome
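    The basic KAM idea summarised in this abstract (a K-NN kernel selects similar time-frequency content, and a robust statistic is applied to the selection) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `knn_median_estimate`, the use of squared Euclidean distance between whole time frames, and the choice of the median as the robust statistic are all assumptions made for the example. It does not include the shift-invariant kernel or the acceleration techniques the abstract refers to.

```python
import numpy as np


def knn_median_estimate(mag, k=5):
    """Hypothetical sketch of a K-NN kernel with a median (KAM-style).

    mag: non-negative magnitude spectrogram, shape (n_freq, n_frames).
    Returns an estimate of the repeating source with the same shape.
    """
    frames = mag.T                      # one row per time frame
    k = min(k, frames.shape[0])

    # Pairwise squared Euclidean distances between time frames.
    sq = np.sum(frames ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (frames @ frames.T)

    # Kernel (assumed): for each frame, the indices of its k nearest frames.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Robust statistic (assumed): bin-wise median over the selected frames.
    # frames[neighbours] has shape (n_frames, k, n_freq).
    estimate = np.median(frames[neighbours], axis=1)

    # The estimated source magnitude should not exceed the mixture.
    estimate = np.minimum(estimate, frames)
    return estimate.T
```

    Because the median ignores values that appear in only a minority of the selected frames, content that does not repeat (for example a short overlaid note) is suppressed in the estimate, while the repeating content is kept.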

    Signal Processing And Graph Theory Techniques For Sound Source Separation.

    PhD Thesis
    Source separation aims to identify and separate the sources from a given mixture. In music source separation, the sources are typically musical instruments and the given mixture is a recorded track. When there is little or no prior information about the sources or recording conditions, a major goal becomes to target the inherent characteristics of the sources to help with their differentiation and separation. This thesis is concerned with methods for doing so, introducing novel approaches based on signal processing and graph theory techniques. Kernel Additive Modelling (KAM) is a popular music source separation framework, as it is flexible, computationally efficient and requires no training data. The main idea behind KAM is that one can use the inherent repetitions of musical signals to reconstruct a source by defining a proximity kernel. KAM employs robust statistics for the separation, whose success ultimately depends on the kernel's ability to identify similar instances of a source in the presence of other overlapping sources. In existing KAM approaches, the kernel design is rather rudimentary and its simplicity is limiting. In this thesis we investigate the current kernel and propose novel extensions that boost its performance without losing interpretability, flexibility or efficiency. We then explore the inherent graph structure in KAM, leading to the first unsupervised method to optimise the sole parameter in the framework. Following this perspective, we further investigate graph representations, introducing visibility graphs to magnitude spectra. We present a novel visibility graph-based representation with valuable properties for audio. Finally, we propose the first method to compute visibility graphs on-line, broadening the relevance of this thesis to generic time series analysis.