853 research outputs found

    Localization of Sound Sources in a Room with One Microphone

    Get PDF
    Estimation of the location of sound sources is usually done using microphone arrays. Such settings provide an environment where we know the difference between the received signals among different microphones in the terms of phase or attenuation, which enables localization of the sound sources. In our solution we exploit the properties of the room transfer function in order to localize a sound source inside a room with only one microphone. The shape of the room and the position of the microphone are assumed to be known. The design guidelines and limitations of the sensing matrix are given. Implementation is based on the sparsity in the terms of voxels in a room that are occupied by a source. What is especially interesting about our solution is that we provide localization of the sound sources not only in the horizontal plane, but in the terms of the 3D coordinates inside the room

    Towards End-to-End Acoustic Localization using Deep Learning: from Audio Signal to Source Position Coordinates

    Full text link
    This paper presents a novel approach for indoor acoustic source localization using microphone arrays and based on a Convolutional Neural Network (CNN). The proposed solution is, to the best of our knowledge, the first published work in which the CNN is designed to directly estimate the three dimensional position of an acoustic source, using the raw audio signal as the input information avoiding the use of hand crafted audio features. Given the limited amount of available localization data, we propose in this paper a training strategy based on two steps. We first train our network using semi-synthetic data, generated from close talk speech recordings, and where we simulate the time delays and distortion suffered in the signal that propagates from the source to the array of microphones. We then fine tune this network using a small amount of real data. Our experimental results show that this strategy is able to produce networks that significantly improve existing localization methods based on \textit{SRP-PHAT} strategies. In addition, our experiments show that our CNN method exhibits better resistance against varying gender of the speaker and different window sizes compared with the other methods.Comment: 18 pages, 3 figures, 8 table

    Near-field acoustic holography for high-frequency weak sound sources under low signal-to-noise ratio

    Get PDF
    The mechanical noise in the cabin of the ship is so large that the leakage of high-pressure fluid is not easily noticed. In view of this situation, a near-field acoustic holography for high-frequency weak sound source under low signal-to-noise ratio is proposed. The method uses the empirical mode decomposition method to add weights to the time-domain sampling signals of each array element, and then uses the plane equivalent source near-field acoustic holography combined with compressive sensing to find the holographic surface acoustic pressure distribution. The simulation and experiment show that this method has certain feasibility under low signal-to-noise ratio, and the results are better than the method based on Fourier transform and the traditional boundary element method. It is of positive significance to apply it to engineering practice

    Efficient coding of spectrotemporal binaural sounds leads to emergence of the auditory space representation

    Full text link
    To date a number of studies have shown that receptive field shapes of early sensory neurons can be reproduced by optimizing coding efficiency of natural stimulus ensembles. A still unresolved question is whether the efficient coding hypothesis explains formation of neurons which explicitly represent environmental features of different functional importance. This paper proposes that the spatial selectivity of higher auditory neurons emerges as a direct consequence of learning efficient codes for natural binaural sounds. Firstly, it is demonstrated that a linear efficient coding transform - Independent Component Analysis (ICA) trained on spectrograms of naturalistic simulated binaural sounds extracts spatial information present in the signal. A simple hierarchical ICA extension allowing for decoding of sound position is proposed. Furthermore, it is shown that units revealing spatial selectivity can be learned from a binaural recording of a natural auditory scene. In both cases a relatively small subpopulation of learned spectrogram features suffices to perform accurate sound localization. Representation of the auditory space is therefore learned in a purely unsupervised way by maximizing the coding efficiency and without any task-specific constraints. This results imply that efficient coding is a useful strategy for learning structures which allow for making behaviorally vital inferences about the environment.Comment: 22 pages, 9 figure

    Pushing towards the Limit of Sampling Rate: Adaptive Chasing Sampling

    Full text link
    Measurement samples are often taken in various monitoring applications. To reduce the sensing cost, it is desirable to achieve better sensing quality while using fewer samples. Compressive Sensing (CS) technique finds its role when the signal to be sampled meets certain sparsity requirements. In this paper we investigate the possibility and basic techniques that could further reduce the number of samples involved in conventional CS theory by exploiting learning-based non-uniform adaptive sampling. Based on a typical signal sensing application, we illustrate and evaluate the performance of two of our algorithms, Individual Chasing and Centroid Chasing, for signals of different distribution features. Our proposed learning-based adaptive sampling schemes complement existing efforts in CS fields and do not depend on any specific signal reconstruction technique. Compared to conventional sparse sampling methods, the simulation results demonstrate that our algorithms allow 46%46\% less number of samples for accurate signal reconstruction and achieve up to 57%57\% smaller signal reconstruction error under the same noise condition.Comment: 9 pages, IEEE MASS 201

    Deep Sound Field Reconstruction in Real Rooms:Introducing the ISOBEL Sound Field Dataset

    Get PDF
    Knowledge of loudspeaker responses are useful in a number of applications, where a sound system is located inside a room that alters the listening experience depending on position within the room. Acquisition of sound fields for sound sources located in reverberant rooms can be achieved through labor intensive measurements of impulse response functions covering the room, or alternatively by means of reconstruction methods which can potentially require significantly fewer measurements. This paper extends evaluations of sound field reconstruction at low frequencies by introducing a dataset with measurements from four real rooms. The ISOBEL Sound Field dataset is publicly available, and aims to bridge the gap between synthetic and real-world sound fields in rectangular rooms. Moreover, the paper advances on a recent deep learning-based method for sound field reconstruction using a very low number of microphones, and proposes an approach for modeling both magnitude and phase response in a U-Net-like neural network architecture. The complex-valued sound field reconstruction demonstrates that the estimated room transfer functions are of high enough accuracy to allow for personalized sound zones with contrast ratios comparable to ideal room transfer functions using 15 microphones below 150 Hz
    • …
    corecore