Improving Source Separation via Multi-Speaker Representations
Lately there have been novel developments in deep learning towards solving
the cocktail party problem. Initial results are very promising and allow for
more research in the domain. One technique that has not yet been explored in
the neural network approach to this task is speaker adaptation. Intuitively,
information on the speakers that we are trying to separate seems fundamentally
important for the speaker separation task. However, retrieving this speaker
information is challenging since the speaker identities are not known a priori
and multiple speakers are simultaneously active. This creates a
chicken-and-egg problem. To tackle this, source signals and i-vectors are
estimated alternately. We show that blind multi-speaker adaptation improves the
results of the network and that (in our case) the network is not capable of
adequately retrieving this useful speaker information itself.
Self-Supervised Blind Source Separation via Multi-Encoder Autoencoders
The task of blind source separation (BSS) involves separating sources from a
mixture without prior knowledge of the sources or the mixing system. This is a
challenging problem that often requires making restrictive assumptions about
both the mixing system and the sources. In this paper, we propose a novel
method for addressing BSS of non-linear mixtures by leveraging the natural
feature subspace specialization ability of multi-encoder autoencoders with
fully self-supervised learning without strong priors. During the training
phase, our method unmixes the input into the separate encoding spaces of the
multi-encoder network and then remixes these representations within the decoder
for a reconstruction of the input. Then, to perform source inference, we
introduce a novel encoding masking technique whereby masking out all but one of
the encodings enables the decoder to estimate a source signal. To this end, we
also introduce a so-called pathway separation loss that encourages sparsity
between the unmixed encoding spaces throughout the decoder's layers and a
so-called zero reconstruction loss on the decoder for coherent source
estimations. In order to carefully evaluate our method, we conduct experiments
on a toy dataset and with real-world biosignal recordings from a
polysomnography sleep study for extracting respiration.
Comment: 17 pages, 8 figures, submitted to Information Science
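The encoding-masking step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name, layer sizes, and random untrained weights are all hypothetical, and the model is linear purely for readability, whereas the paper targets non-linear mixtures and trains with additional losses that are omitted here.

```python
import random


def linear(weights, x):
    """Apply a weight matrix (a list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class MultiEncoderAutoencoder:
    """Toy, untrained, linear stand-in (hypothetical; not the paper's network)."""

    def __init__(self, n_in, n_code, n_encoders, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

        # One encoder per assumed source; one shared decoder over all encodings.
        self.encoders = [rand(n_code, n_in) for _ in range(n_encoders)]
        self.decoder = rand(n_in, n_code * n_encoders)

    def encode(self, x):
        # "Unmix": every encoder produces its own encoding of the same input.
        return [linear(w, x) for w in self.encoders]

    def decode(self, codes, keep=None):
        # keep=None: remix all encodings (reconstruction, as during training).
        # keep=k: zero out every encoding except the k-th (source inference).
        masked = [
            c if keep is None or i == keep else [0.0] * len(c)
            for i, c in enumerate(codes)
        ]
        flat = [v for c in masked for v in c]
        return linear(self.decoder, flat)


model = MultiEncoderAutoencoder(n_in=4, n_code=2, n_encoders=2)
x = [1.0, -0.5, 0.25, 2.0]
codes = model.encode(x)
full = model.decode(codes)              # reconstruction from all encodings
source_0 = model.decode(codes, keep=0)  # estimate from encoding 0 only
source_1 = model.decode(codes, keep=1)  # estimate from encoding 1 only
```

In this linear toy the per-encoding estimates sum exactly to the full reconstruction. That property comes from the linear decoder assumed here and should not be read as a claim about the non-linear decoder in the paper.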
Multi-modal dictionary learning for image separation with application in art investigation
In support of art investigation, we propose a new source separation method
that unmixes a single X-ray scan acquired from double-sided paintings. In this
problem, the X-ray signals to be separated have similar morphological
characteristics, which brings previous source separation methods to their
limits. Our solution is to use photographs taken from the front and back sides
of the panel to drive the separation process. The crux of our approach relies
on the coupling of the two imaging modalities (photographs and X-rays) using a
novel coupled dictionary learning framework able to capture both common and
disparate features across the modalities using parsimonious representations;
the common component models features shared by the multi-modal images, whereas
the innovation component captures modality-specific information. As such, our
model enables the formulation of appropriately regularized convex optimization
procedures that lead to the accurate separation of the X-rays. Our dictionary
learning framework can be tailored both to a single- and a multi-scale
framework, with the latter leading to a significant performance improvement.
Moreover, to improve further on the visual quality of the separated images, we
propose to train coupled dictionaries that ignore certain parts of the painting
corresponding to craquelure. Experimentation on synthetic and real data - taken
from digital acquisition of the Ghent Altarpiece (1432) - confirms the
superiority of our method against the state-of-the-art morphological component
analysis technique that uses either fixed or trained dictionaries to perform
image separation.
Comment: submitted to IEEE Transactions on Image Processing