Acoustic Space Learning for Sound Source Separation and Localization on Binaural Manifolds
In this paper we address the problems of modeling the acoustic space
generated by a full-spectrum sound source and of using the learned model for
the localization and separation of multiple sources that simultaneously emit
sparse-spectrum sounds. We lay theoretical and methodological grounds in order
to introduce the binaural manifold paradigm. We perform an in-depth study of
the latent low-dimensional structure of the high-dimensional interaural
spectral data, based on a corpus recorded with a human-like audiomotor robot
head. A non-linear dimensionality reduction technique is used to show that
these data lie on a two-dimensional (2D) smooth manifold parameterized by the
motor states of the listener, or equivalently, the sound source directions. We
propose a probabilistic piecewise affine mapping model (PPAM) specifically
designed to deal with high-dimensional data exhibiting an intrinsic piecewise
linear structure. We derive a closed-form expectation-maximization (EM)
procedure for estimating the model parameters, followed by Bayes inversion for
obtaining the full posterior density function of a sound source direction. We
extend this solution to deal with missing data and redundancy in real-world
spectrograms, and hence to 2D localization of natural sound sources such as
speech. We further generalize the model to the challenging case of multiple
sound sources and we propose a variational EM framework. The associated
algorithm, referred to as variational EM for source separation and localization
(VESSL), yields a Bayesian estimation of the 2D locations and time-frequency
masks of all the sources. Comparisons of the proposed approach with several
existing methods reveal that the combination of acoustic-space learning with
Bayesian inference enables our method to outperform state-of-the-art methods.
Comment: 19 pages, 9 figures, 3 tables
Kolmogorov turbulence, Anderson localization and KAM integrability
The conditions for emergence of Kolmogorov turbulence, and related weak wave
turbulence, in finite size systems are analyzed by analytical methods and
numerical simulations of simple models. An analogy between Kolmogorov energy
flow from large to small spatial scales and conductivity in disordered solid
state systems is proposed. It is argued that Anderson localization can stop
such an energy flow. The effects of nonlinear wave interactions on such a
localization are analyzed. The results obtained for finite size system models
show the existence of an effective chaos border between the
Kolmogorov-Arnold-Moser (KAM) integrability at weak nonlinearity, when energy
does not flow to small scales, and a developed chaos regime emerging above this
border with the Kolmogorov turbulent energy flow from large to small scales.
Comment: 8 pages, 6 figs, EPJB style
Effects of feedback, mobility and index of difficulty on deictic spatial audio target acquisition in the horizontal plane
We present the results of an empirical study investigating the effect of feedback, mobility and index of difficulty on a deictic spatial audio target acquisition task in the horizontal plane in front of a user. With audio feedback, spatial audio display elements are found to enable usable deictic interaction that can be described using Fitts's law. Feedback does not affect perceived workload or preferred walking speed compared to interaction without feedback. Mobility is found to degrade interaction speed and accuracy by 20%. Participants were able to perform deictic spatial audio target acquisition when mobile while walking at 73% of their preferred walking speed. The proposed feedback design is examined in detail and the effects of variable target widths are quantified. Deictic interaction with a spatial audio display is found to be a feasible solution for future interface designs.
2D to 3D ambience upmixing based on perceptual band allocation
3D multichannel audio systems employ additional elevated loudspeakers in order to provide listeners with a vertical dimension to their auditory experience. Listening tests were conducted to evaluate the feasibility of a novel vertical upmixing technique called “perceptual band allocation (PBA),” which is based on a psychoacoustic principle of vertical sound localization, the “pitch height” effect. The practical feasibility of the method was investigated using 4-channel ambience signals recorded in a reverberant concert hall using the Hamasaki-Square microphone technique. Results showed that the PBA-upmixed 3D stimuli were significantly stronger than or similar to 9-channel 3D stimuli in 3D listener-envelopment (LEV), depending on the sound source and the crossover frequency of PBA. They also produced significantly greater 3D LEV than the 7-channel 3D stimuli. For the preference tests, the PBA stimuli were significantly preferred over the original 9-channel stimuli.