Localization of Sound Sources in a Room with One Microphone
Estimation of the location of sound sources is usually done using microphone
arrays. Such settings provide an environment where we know the differences
between the received signals at different microphones in terms of phase
or attenuation, which enables localization of the sound sources. In our
solution we exploit the properties of the room transfer function in order to
localize a sound source inside a room with only one microphone. The shape of
the room and the position of the microphone are assumed to be known. Design
guidelines and limitations of the sensing matrix are given. The implementation
is based on sparsity in terms of the voxels in a room that are occupied by a
source. What is especially interesting about our solution is that we provide
localization of the sound sources not only in the horizontal plane, but in
terms of the 3D coordinates inside the room.
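The voxel-sparsity idea above can be illustrated with a minimal sketch. Everything here is a toy surrogate: each column of the sensing matrix A stands in for the room transfer function of one candidate voxel (here random vectors, not actual acoustic responses), and a single active source makes the occupancy vector 1-sparse, so one matched-filtering step of orthogonal matching pursuit recovers its index.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_voxels = 256, 64            # mic samples, candidate 3D voxels
A = rng.standard_normal((n_samples, n_voxels))
A /= np.linalg.norm(A, axis=0)           # unit-norm columns (surrogate RTFs)

true_voxel = 17
y = 2.5 * A[:, true_voxel]               # single-microphone observation

# 1-sparse recovery: correlate the observation with every voxel's response
est_voxel = int(np.argmax(np.abs(A.T @ y)))
print(est_voxel)  # recovers the occupied voxel
```

With real room transfer functions the columns are far more correlated than random vectors, which is exactly why the abstract emphasizes design guidelines and limitations of the sensing matrix.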
Towards End-to-End Acoustic Localization using Deep Learning: from Audio Signal to Source Position Coordinates
This paper presents a novel approach for indoor acoustic source localization
using microphone arrays, based on a Convolutional Neural Network (CNN). The
proposed solution is, to the best of our knowledge, the first published work in
which a CNN is designed to directly estimate the three-dimensional position
of an acoustic source from the raw audio signal, avoiding the use of
hand-crafted audio features. Given the limited amount of available
localization data, we propose a two-step training strategy. We first train our
network on semi-synthetic data, generated from close-talk speech recordings,
in which we simulate the time delays and distortion suffered by the signal as
it propagates from the source to the microphone array. We then fine-tune this
network on a small amount of real data. Our experimental results show that
this strategy produces networks that significantly improve on existing
localization methods based on SRP-PHAT strategies. In addition, our
experiments show that our CNN method is more robust to varying speaker gender
and different window sizes than the other methods.
Comment: 18 pages, 3 figures, 8 tables
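The semi-synthetic data generation described in the first training step can be sketched as follows. The geometry, sample rate, and use of integer-sample delays with 1/r attenuation are illustrative assumptions, not the paper's exact simulation: a clean close-talk recording (here white noise as a stand-in) is delayed and attenuated per microphone to yield a (signal, position) training pair.

```python
import numpy as np

fs, c = 16000, 343.0                       # sample rate (Hz), speed of sound (m/s)
mics = np.array([[0.0, 0, 0], [0.1, 0, 0], [0.2, 0, 0]])  # 3-mic linear array (m)
src = np.array([2.0, 1.0, 1.5])            # ground-truth 3D source position (m)

rng = np.random.default_rng(1)
clean = rng.standard_normal(4096)          # stand-in for a close-talk recording

dists = np.linalg.norm(mics - src, axis=1)
delays = np.round(dists / c * fs).astype(int)   # integer-sample propagation delays

array_sig = np.zeros((len(mics), len(clean) + delays.max()))
for m, (d, r) in enumerate(zip(delays, dists)):
    array_sig[m, d:d + len(clean)] = clean / r  # delay + spherical attenuation

print(delays)  # farther microphones receive the signal later
```

A real pipeline would also add reverberation and noise (the "distortion" the abstract mentions), typically via simulated room impulse responses.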
Near-field acoustic holography for high-frequency weak sound sources under low signal-to-noise ratio
The mechanical noise in a ship's cabin is so loud that the leakage of high-pressure fluid is not easily noticed. To address this, a near-field acoustic holography method for high-frequency weak sound sources under low signal-to-noise ratio is proposed. The method uses empirical mode decomposition to weight the time-domain sampled signals of each array element, and then uses planar equivalent-source near-field acoustic holography combined with compressive sensing to find the acoustic pressure distribution on the holographic surface. Simulations and experiments show that the method is feasible under low signal-to-noise ratio, and that its results are better than those of the Fourier-transform-based method and the traditional boundary element method. These results suggest that the method can be usefully applied in engineering practice.
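The equivalent-source step at the core of such methods can be sketched in isolation (the EMD weighting and compressive-sensing refinements are omitted, and the geometry and frequency are our own choices): the measured hologram pressure is modeled as p = G q, where G holds free-field monopole Green's functions from an equivalent-source plane to the array, and the source strengths q are found by Tikhonov-regularized least squares.

```python
import numpy as np

k = 2 * np.pi * 2000 / 343          # wavenumber at 2 kHz, c = 343 m/s

def green(src_pts, fld_pts):
    """Free-field monopole Green's function matrix e^{-jkr} / (4*pi*r)."""
    r = np.linalg.norm(fld_pts[:, None, :] - src_pts[None, :, :], axis=-1)
    return np.exp(-1j * k * r) / (4 * np.pi * r)

xs = np.linspace(-0.2, 0.2, 5)
grid = np.array([[x, y, 0.0] for x in xs for y in xs])   # equivalent-source plane
array_pts = grid + [0.0, 0.0, 0.05]                      # hologram plane, 5 cm away

rng = np.random.default_rng(2)
q_true = rng.standard_normal(len(grid)) + 1j * rng.standard_normal(len(grid))

G = green(grid, array_pts)
p = G @ q_true                       # "measured" hologram pressure (noise-free)

# Tikhonov-regularized normal equations: (G^H G + lam I) q = G^H p
q_hat = np.linalg.solve(G.conj().T @ G + 1e-8 * np.eye(len(grid)),
                        G.conj().T @ p)
err = np.linalg.norm(G @ q_hat - p) / np.linalg.norm(p)
print(err)                           # small relative residual on the hologram
```

In practice the regularization parameter is chosen against the noise level (e.g. by L-curve or cross-validation), which is where the low-SNR weighting of the proposed method matters.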
Efficient coding of spectrotemporal binaural sounds leads to emergence of the auditory space representation
To date, a number of studies have shown that the receptive field shapes of
early sensory neurons can be reproduced by optimizing the coding efficiency of
natural stimulus ensembles. A still unresolved question is whether the
efficient coding hypothesis explains the formation of neurons which explicitly
represent environmental features of different functional importance. This
paper proposes that the spatial selectivity of higher auditory neurons emerges
as a direct consequence of learning efficient codes for natural binaural
sounds. Firstly, it is demonstrated that a linear efficient coding transform,
Independent Component Analysis (ICA), trained on spectrograms of naturalistic
simulated binaural sounds extracts spatial information present in the signal.
A simple hierarchical ICA extension allowing for decoding of sound position is
proposed. Furthermore, it is shown that units revealing spatial selectivity
can be learned from a binaural recording of a natural auditory scene. In both
cases a relatively small subpopulation of learned spectrogram features
suffices to perform accurate sound localization. Representation of the
auditory space is therefore learned in a purely unsupervised way, by
maximizing coding efficiency and without any task-specific constraints. These
results imply that efficient coding is a useful strategy for learning
structures which allow for making behaviorally vital inferences about the
environment.
Comment: 22 pages, 9 figures
Pushing towards the Limit of Sampling Rate: Adaptive Chasing Sampling
Measurement samples are often taken in various monitoring applications. To
reduce the sensing cost, it is desirable to achieve better sensing quality
while using fewer samples. The Compressive Sensing (CS) technique finds its
role when the signal to be sampled meets certain sparsity requirements. In
this paper we investigate the possibility of, and basic techniques for,
further reducing the number of samples required by conventional CS theory
through learning-based non-uniform adaptive sampling.
Based on a typical signal sensing application, we illustrate and evaluate the
performance of two of our algorithms, Individual Chasing and Centroid Chasing,
on signals with different distribution features. Our proposed learning-based
adaptive sampling schemes complement existing efforts in the CS field and do
not depend on any specific signal reconstruction technique. Compared to
conventional sparse sampling methods, the simulation results demonstrate that
our algorithms require fewer samples for accurate signal reconstruction and
achieve smaller signal reconstruction error under the same noise conditions.
Comment: 9 pages, IEEE MASS 201
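A hedged illustration of non-uniform adaptive sampling in the spirit of "Centroid Chasing" (the paper's exact algorithm is not reproduced here; the test signal, budgets, and shrink factor are our own): spend a coarse budget uniformly, then place the remaining samples in a shrinking window around the energy centroid of the latest samples.

```python
import numpy as np

def f(x):                        # "unknown" signal: a single narrow bump at 0.62
    return np.exp(-((x - 0.62) ** 2) / (2 * 0.01 ** 2))

xs = np.linspace(0, 1, 17)       # coarse uniform pass: 17 samples
ys = f(xs)
center, width = xs[np.argmax(ys)], 1 / 16

for _ in range(3):               # three refinement rounds, 8 samples each
    grid = np.linspace(center - width, center + width, 8)
    vals = f(grid)
    xs, ys = np.append(xs, grid), np.append(ys, vals)
    center = np.sum(grid * vals) / np.sum(vals)   # chase the energy centroid
    width /= 4                   # shrink the search window

print(abs(center - 0.62))        # estimate converges toward the true peak
```

With 41 total samples the adaptive scheme pins the peak far more precisely than 41 uniform samples (spacing 0.025) could, which is the kind of gain the abstract claims over conventional sparse sampling.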
Deep Sound Field Reconstruction in Real Rooms: Introducing the ISOBEL Sound Field Dataset
Knowledge of loudspeaker responses is useful in a number of applications where
a sound system is located inside a room that alters the listening experience
depending on the position within the room. Sound fields for sources located in
reverberant rooms can be acquired through labor-intensive measurements of
impulse response functions covering the room, or alternatively by means of
reconstruction methods, which can potentially require significantly fewer
measurements. This paper extends evaluations of sound field reconstruction at
low frequencies by introducing a dataset with measurements from four real
rooms. The ISOBEL Sound Field dataset is publicly available and aims to bridge
the gap between synthetic and real-world sound fields in rectangular rooms.
Moreover, the paper advances on a recent deep learning-based method for sound
field reconstruction using a very low number of microphones, and proposes an
approach for modeling both the magnitude and phase response in a U-Net-like
neural network architecture. The complex-valued sound field reconstruction
demonstrates that the estimated room transfer functions are accurate enough to
allow for personalized sound zones with contrast ratios comparable to those of
ideal room transfer functions, using 15 microphones below 150 Hz.
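The sound-zone evaluation hinted at above can be sketched with the classic acoustic-contrast formulation (all sizes and the random transfer functions are illustrative, not the paper's data): given transfer functions from L loudspeakers to points in a bright and a dark zone at one frequency, the contrast-maximizing loudspeaker weights are the dominant eigenvector of (G_d^H G_d + lam I)^{-1} G_b^H G_b.

```python
import numpy as np

rng = np.random.default_rng(4)
L, Mb, Md = 8, 12, 12            # loudspeakers, bright / dark control points
G_b = rng.standard_normal((Mb, L)) + 1j * rng.standard_normal((Mb, L))
G_d = rng.standard_normal((Md, L)) + 1j * rng.standard_normal((Md, L))

A = G_b.conj().T @ G_b                       # bright-zone energy matrix
B = G_d.conj().T @ G_d + 1e-6 * np.eye(L)    # dark-zone energy (regularized)

vals, vecs = np.linalg.eig(np.linalg.solve(B, A))
w = vecs[:, np.argmax(vals.real)]            # contrast-maximizing weights

contrast_db = 10 * np.log10(
    np.linalg.norm(G_b @ w) ** 2 / np.linalg.norm(G_d @ w) ** 2)
print(contrast_db)               # positive: bright zone louder than dark zone
```

Replacing the random matrices with room transfer functions reconstructed from a few microphones, and comparing the resulting contrast against that obtained from measured transfer functions, is essentially the evaluation the abstract describes.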