Search CORE

93,037 research outputs found

Improving Source Separation via Multi-Speaker Representations

Author: Van hamme Hugo
Zegers Jeroen
Publication venue: 'International Speech Communication Association'
Publication date: 01/01/2017
Field of study

Lately there have been novel developments in deep learning towards solving the cocktail party problem. Initial results are very promising and allow for more research in the domain. One technique that has not yet been explored in the neural network approach to this task is speaker adaptation. Intuitively, information on the speakers that we are trying to separate seems fundamentally important for the speaker separation task. However, retrieving this speaker information is challenging since the speaker identities are not known a priori and multiple speakers are simultaneously active. There is thus some sort of chicken and egg problem. To tackle this, source signals and i-vectors are estimated alternately. We show that blind multi-speaker adaptation improves the results of the network and that (in our case) the network is not capable of adequately retrieving this useful speaker information itself

arXiv.org e-Print Archive

Lirias

Crossref

Self-Supervised Blind Source Separation via Multi-Encoder Autoencoders

Author: Lee Joonnyong
Webster Matthew B.
Publication venue
Publication date: 31/08/2023
Field of study

The task of blind source separation (BSS) involves separating sources from a mixture without prior knowledge of the sources or the mixing system. This is a challenging problem that often requires making restrictive assumptions about both the mixing system and the sources. In this paper, we propose a novel method for addressing BSS of non-linear mixtures by leveraging the natural feature subspace specialization ability of multi-encoder autoencoders with fully self-supervised learning without strong priors. During the training phase, our method unmixes the input into the separate encoding spaces of the multi-encoder network and then remixes these representations within the decoder for a reconstruction of the input. Then to perform source inference, we introduce a novel encoding masking technique whereby masking out all but one of the encodings enables the decoder to estimate a source signal. To this end, we also introduce a so-called pathway separation loss that encourages sparsity between the unmixed encoding spaces throughout the decoder's layers and a so-called zero reconstruction loss on the decoder for coherent source estimations. In order to carefully evaluate our method, we conduct experiments on a toy dataset and with real-world biosignal recordings from a polysomnography sleep study for extracting respiration.Comment: 17 pages, 8 figures, submitted to Information Science

arXiv.org e-Print Archive

Multi-modal dictionary learning for image separation with application in art investigation

Author: Cornelis Bruno
Daubechies Ingrid
Deligiannis Nikos
Mota Joao F. C.
Rodrigues Miguel R. D.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/07/2016
Field of study

In support of art investigation, we propose a new source separation method that unmixes a single X-ray scan acquired from double-sided paintings. In this problem, the X-ray signals to be separated have similar morphological characteristics, which brings previous source separation methods to their limits. Our solution is to use photographs taken from the front and back-side of the panel to drive the separation process. The crux of our approach relies on the coupling of the two imaging modalities (photographs and X-rays) using a novel coupled dictionary learning framework able to capture both common and disparate features across the modalities using parsimonious representations; the common component models features shared by the multi-modal images, whereas the innovation component captures modality-specific information. As such, our model enables the formulation of appropriately regularized convex optimization procedures that lead to the accurate separation of the X-rays. Our dictionary learning framework can be tailored both to a single- and a multi-scale framework, with the latter leading to a significant performance improvement. Moreover, to improve further on the visual quality of the separated images, we propose to train coupled dictionaries that ignore certain parts of the painting corresponding to craquelure. Experimentation on synthetic and real data - taken from digital acquisition of the Ghent Altarpiece (1432) - confirms the superiority of our method against the state-of-the-art morphological component analysis technique that uses either fixed or trained dictionaries to perform image separation.Comment: submitted to IEEE Transactions on Images Processin

arXiv.org e-Print Archive

Heriot Watt Pure

UCL Discovery