On the Informed Source Separation Approach for Interactive Remixing in Stereo
Informed source separation (ISS) has become a popular trend in the audio signal processing community over the past few years. Its purpose is to decompose a mixture signal into its constituent parts at the desired or the best possible quality level given some metadata. In this paper we present a comparison between two ISS systems and relate the ISS approach in various configurations with conventional coding of separate tracks for interactive remixing in stereo. The compared systems are Underdetermined Source Signal Recovery (USSR) and Enhanced Audio Object Separation (EAOS). The latter forms a part of MPEG's Spatial Audio Object Coding technology. The performance is evaluated using objective difference grades computed with PEMO-Q. The results suggest that USSR performs perceptually better than EAOS and has a lower computational complexity.
Efficient coding of spectrotemporal binaural sounds leads to emergence of the auditory space representation
To date a number of studies have shown that receptive field shapes of early
sensory neurons can be reproduced by optimizing coding efficiency of natural
stimulus ensembles. A still unresolved question is whether the efficient coding
hypothesis explains formation of neurons which explicitly represent
environmental features of different functional importance. This paper proposes
that the spatial selectivity of higher auditory neurons emerges as a direct
consequence of learning efficient codes for natural binaural sounds. Firstly,
it is demonstrated that a linear efficient coding transform, Independent
Component Analysis (ICA), trained on spectrograms of naturalistic simulated
binaural sounds extracts spatial information present in the signal. A simple
hierarchical ICA extension allowing for decoding of sound position is proposed.
Furthermore, it is shown that units revealing spatial selectivity can be
learned from a binaural recording of a natural auditory scene. In both cases a
relatively small subpopulation of learned spectrogram features suffices to
perform accurate sound localization. Representation of the auditory space is
therefore learned in a purely unsupervised way by maximizing the coding
efficiency and without any task-specific constraints. These results imply that
efficient coding is a useful strategy for learning structures which allow for
making behaviorally vital inferences about the environment.
Comment: 22 pages, 9 figures
Informed Separation of Spatial Images of Stereo Music Recordings Using Second-Order Statistics
In this work we address a reverse audio engineering problem, i.e. the separation of stereo tracks of professionally produced music recordings. More precisely, we apply a spatial filtering approach with a quadratic constraint using an explicit source-image-mixture model. The model parameters are "learned" from a given set of original stereo tracks, reduced in size and used afterwards to demix the desired tracks at the best possible quality from a preexisting mixture. Our approach entails a side-information rate of 10 kbps per source or channel and has a low computational complexity. The results obtained for the SiSEC 2013 dataset are intended to be used as reference for comparison with unpublished approaches.