1,468 research outputs found

    Effects of virtual acoustics on dynamic auditory distance perception

    Get PDF
    Sound propagation encompasses various acoustic phenomena including reverberation. Current virtual acoustic methods, ranging from parametric filters to physically-accurate solvers, can simulate reverberation with varying degrees of fidelity. We investigate the effects of reverberant sounds generated using different propagation algorithms on acoustic distance perception, i.e., how faraway humans perceive a sound source. In particular, we evaluate two classes of methods for real-time sound propagation in dynamic scenes based on parametric filters and ray tracing. Our study shows that the more accurate method shows less distance compression as compared to the approximate, filter-based method. This suggests that accurate reverberation in VR results in a better reproduction of acoustic distances. We also quantify the levels of distance compression introduced by different propagation methods in a virtual environment.Comment: 8 Pages, 7 figure

    Towards End-to-End Acoustic Localization using Deep Learning: from Audio Signal to Source Position Coordinates

    Full text link
    This paper presents a novel approach for indoor acoustic source localization using microphone arrays and based on a Convolutional Neural Network (CNN). The proposed solution is, to the best of our knowledge, the first published work in which the CNN is designed to directly estimate the three dimensional position of an acoustic source, using the raw audio signal as the input information avoiding the use of hand crafted audio features. Given the limited amount of available localization data, we propose in this paper a training strategy based on two steps. We first train our network using semi-synthetic data, generated from close talk speech recordings, and where we simulate the time delays and distortion suffered in the signal that propagates from the source to the array of microphones. We then fine tune this network using a small amount of real data. Our experimental results show that this strategy is able to produce networks that significantly improve existing localization methods based on \textit{SRP-PHAT} strategies. In addition, our experiments show that our CNN method exhibits better resistance against varying gender of the speaker and different window sizes compared with the other methods.Comment: 18 pages, 3 figures, 8 table

    Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments

    Get PDF
    We address the problem of online localization and tracking of multiple moving speakers in reverberant environments. The paper has the following contributions. We use the direct-path relative transfer function (DP-RTF), an inter-channel feature that encodes acoustic information robust against reverberation, and we propose an online algorithm well suited for estimating DP-RTFs associated with moving audio sources. Another crucial ingredient of the proposed method is its ability to properly assign DP-RTFs to audio-source directions. Towards this goal, we adopt a maximum-likelihood formulation and we propose to use an exponentiated gradient (EG) to efficiently update source-direction estimates starting from their currently available values. The problem of multiple speaker tracking is computationally intractable because the number of possible associations between observed source directions and physical speakers grows exponentially with time. We adopt a Bayesian framework and we propose a variational approximation of the posterior filtering distribution associated with multiple speaker tracking, as well as an efficient variational expectation-maximization (VEM) solver. The proposed online localization and tracking method is thoroughly evaluated using two datasets that contain recordings performed in real environments.Comment: IEEE Journal of Selected Topics in Signal Processing, 201

    Auditory environmental context affects visual distance perception

    Get PDF
    In this article, we show that visual distance perception (VDP) is influenced by the auditory environmental context through reverberation-related cues. We performed two VDP experiments in two dark rooms with extremely different reverberation times: an anechoic chamber and a reverberant room. Subjects assigned to the reverberant room perceived the targets farther than subjects assigned to the anechoic chamber. Also, we found a positive correlation between the maximum perceived distance and the auditorily perceived room size. We next performed a second experiment in which the same subjects of Experiment 1 were interchanged between rooms. We found that subjects preserved the responses from the previous experiment provided they were compatible with the present perception of the environment; if not, perceived distance was biased towards the auditorily perceived boundaries of the room. Results of both experiments show that the auditory environment can influence VDP, presumably through reverberation cues related to the perception of room size.Fil: Etchemendy, Pablo Esteban. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología. Laboratorio de Acústica y Percepción Sonora; ArgentinaFil: Abregú, Ezequiel Lucas. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología. Laboratorio de Acústica y Percepción Sonora; ArgentinaFil: Calcagno, Esteban. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología. Laboratorio de Acústica y Percepción Sonora; ArgentinaFil: Eguia, Manuel Camilo. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología. Laboratorio de Acústica y Percepción Sonora; ArgentinaFil: Vechiatti, Nilda. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología. Laboratorio de Acústica y Percepción Sonora; ArgentinaFil: Iasi, Federico. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología. Laboratorio de Acústica y Percepción Sonora; ArgentinaFil: Vergara, Ramiro Oscar. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología. Laboratorio de Acústica y Percepción Sonora; Argentin

    A Geometric Approach to Sound Source Localization from Time-Delay Estimates

    Get PDF
    This paper addresses the problem of sound-source localization from time-delay estimates using arbitrarily-shaped non-coplanar microphone arrays. A novel geometric formulation is proposed, together with a thorough algebraic analysis and a global optimization solver. The proposed model is thoroughly described and evaluated. The geometric analysis, stemming from the direct acoustic propagation model, leads to necessary and sufficient conditions for a set of time delays to correspond to a unique position in the source space. Such sets of time delays are referred to as feasible sets. We formally prove that every feasible set corresponds to exactly one position in the source space, whose value can be recovered using a closed-form localization mapping. Therefore we seek for the optimal feasible set of time delays given, as input, the received microphone signals. This time delay estimation problem is naturally cast into a programming task, constrained by the feasibility conditions derived from the geometric analysis. A global branch-and-bound optimization technique is proposed to solve the problem at hand, hence estimating the best set of feasible time delays and, subsequently, localizing the sound source. Extensive experiments with both simulated and real data are reported; we compare our methodology to four state-of-the-art techniques. This comparison clearly shows that the proposed method combined with the branch-and-bound algorithm outperforms existing methods. These in-depth geometric understanding, practical algorithms, and encouraging results, open several opportunities for future work.Comment: 13 pages, 2 figures, 3 table, journa
    corecore