Weakly Supervised Audio Source Separation via Spectrum Energy Preserved Wasserstein Learning
Separating audio mixtures into individual instrument tracks is a
long-standing challenge. We introduce a novel weakly supervised audio source
separation approach based on deep adversarial learning. Specifically, our loss
function adopts the Wasserstein distance, which directly measures, for each
individual source, the distance between the distribution of the separated
sources and that of the real sources. Moreover, a global regularization term is
added to enforce spectrum energy preservation regardless of the separation. Unlike
state-of-the-art weakly supervised models, which often involve deliberately
devised constraints or careful model selection, our approach needs little prior
model specification on the data and can be learned straightforwardly in an
end-to-end fashion. We show that the proposed method performs competitively on
a public benchmark against state-of-the-art weakly supervised methods.
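To make the two ingredients of the loss concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation: the abstract describes a deep adversarial model, whereas this sketch swaps in the closed-form one-dimensional Wasserstein-1 distance between equal-size samples as a stand-in for a learned critic. The function names, the squared-error form of the energy penalty, and the `weight` parameter are all assumptions made for illustration.

```python
def wasserstein_1d(sample_a, sample_b):
    """Empirical 1-D Wasserstein-1 distance between two equal-size samples.

    For equal-size samples this is the mean absolute difference of the
    sorted values. Toy stand-in for an adversarial critic (assumption).
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must have the same size")
    pairs = zip(sorted(sample_a), sorted(sample_b))
    return sum(abs(x - y) for x, y in pairs) / len(sample_a)


def energy_preservation_penalty(mixture, separated):
    """Mean squared gap, per bin, between the mixture and the summed sources.

    Hypothetical form of the global regularizer: it is zero exactly when
    the separated sources add back up to the mixture in every bin.
    """
    total = 0.0
    for i, mix_bin in enumerate(mixture):
        summed = sum(source[i] for source in separated)
        total += (mix_bin - summed) ** 2
    return total / len(mixture)


def separation_loss(mixture, separated, real, weight=1.0):
    """Per-source distribution distance plus the weighted energy penalty."""
    distance = sum(wasserstein_1d(s, r) for s, r in zip(separated, real))
    return distance + weight * energy_preservation_penalty(mixture, separated)
```

Because the distance term compares sorted values, it only looks at the distribution of each source and needs no paired ground truth, which is the sense in which the supervision is weak. The penalty term is what ties the separated sources back to the specific mixture.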
An Adaptive Strategy for Sensory Processing
Recognizing objects and detecting associations among them is essential for the survival of organisms. The ability to perform these tasks derives from the representations of objects obtained by processing information along sensory pathways. Our current understanding of sensory processing rests on two sets of foundational theories: the Efficient Coding Hypothesis and the hierarchical assembly of object representations. These theories suggest that sensory processing aims to identify independent features of the environment and to progressively represent objects as comprehensive combinations of these features. Separately, the two sets of theories have successfully explained the detection of associations and perceptual invariance, respectively; however, reconciling them in one unified theory has remained challenging. The Efficient Coding Hypothesis deems independent features essential for detecting associations, but to achieve consistent representations, multiple comprehensive structures corresponding to the same object must be hierarchically assembled, ignoring independence among such structures.
Here we propose an alternative framework for sensory processing in which the system, instead of finding the truly independent components of the environment, aims to represent objects based on their most informative structures. Using theoretical arguments, we show that following such a strategy allows the system to represent sensory cues efficiently without necessarily acquiring knowledge about the statistical properties of all possible inputs. Through mathematical simulations, we find that the framework can describe the known characteristics of early sensory processing stages and permits the consistent input representations observed at later stages of processing. We also demonstrate that the framework can be implemented in a biologically plausible neuronal circuit and can explain aspects of experience and of learning from corrupted inputs. Thus, this framework provides a novel perspective and a unified description of sensory processing in its entirety.