1,825 research outputs found

    Effects of reverberation conditions and physical versus virtual source placement on localization in virtual sound environments

    Sound field synthesis systems vary in the number and arrangement of loudspeakers and in the methods used to generate virtual sound environments for studying human hearing perception. While previous work has evaluated how accurately these systems physically reproduce room acoustic conditions, less is known about subjective perception of those conditions, such as how well such systems preserve source localization. This work quantifies the accuracy and precision of perceived localization from a multi-channel sound field synthesis system at Boys Town National Research Hospital, which used 24 physical loudspeakers and vector-based amplitude panning to generate sound fields. Short bursts of broadband speech-shaped noise were presented from source locations (either coinciding with a physical loudspeaker location or panned between loudspeakers) under free-field and modeled reverberant-room conditions. Listeners used an HTC Vive remote laser tracking system to point to the perceived source location. Results show that the system synthesizes source locations accurately for both physical and panned sources, in both azimuth and elevation. Panned sources, though, are localized less precisely than physical sources. Reverberant condition also affects both the accuracy and precision of localization in the azimuthal plane, with dry conditions producing greater accuracy and better precision. In elevation, only accuracy (not precision) was affected by reverberant condition, with reverberant cases producing results closer to the target than dry cases. An interaction between reverberant condition and elevation, though, indicates that dry conditions yield better elevation localization than reverberant ones at an elevation close to head height, whereas at higher elevations subjects localized dry sources below the target height and placed reverberant ones more accurately.
    Other laboratories with sound field synthesis systems are encouraged to gather similar data on the accuracy and precision of localization in azimuth and elevation, so that results from studies using these systems can be better interpreted in light of each system's ability to reproduce source locations accurately and precisely. [Work supported by NIH GM109023.] Advisor: Lily M. Wan
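
    As a rough illustration of the vector-based amplitude panning mentioned in the abstract above, the sketch below computes gains for a single loudspeaker pair in the horizontal plane. It is a simplified 2-D case under stated assumptions: the function name `vbap_pair_gains`, the angle convention, and the constant-power normalization are illustrative choices, not details of the 24-loudspeaker system described in the abstract.

```python
import math

def vbap_pair_gains(source_deg, spk1_deg, spk2_deg):
    """Gains (g1, g2) that pan a virtual source between two loudspeakers.

    Hypothetical helper: angles are azimuths in degrees in the horizontal
    plane. Solves g1*l1 + g2*l2 = p for unit vectors l1, l2 (loudspeakers)
    and p (source), then scales so that g1**2 + g2**2 == 1.
    """
    def unit(deg):
        rad = math.radians(deg)
        return math.cos(rad), math.sin(rad)

    p = unit(source_deg)
    l1 = unit(spk1_deg)
    l2 = unit(spk2_deg)

    # Determinant of the 2x2 matrix whose columns are l1 and l2.
    det = l1[0] * l2[1] - l2[0] * l1[1]
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker directions must not be collinear")

    # Cramer's rule for the two unknown gains.
    g1 = (p[0] * l2[1] - l2[0] * p[1]) / det
    g2 = (l1[0] * p[1] - p[0] * l1[1]) / det

    # Constant-power normalization (an assumed convention).
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm
```

    A source that coincides with one loudspeaker gets all of the gain on that loudspeaker, which corresponds to the "physical" source case; any direction in between yields two nonzero gains, the "panned" case.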

    Regression and Classification for Direction-of-Arrival Estimation with Convolutional Recurrent Neural Networks

    We present a novel learning-based approach to estimate the direction-of-arrival (DOA) of a sound source using a convolutional recurrent neural network (CRNN) trained via regression on synthetic data and Cartesian labels. We also describe an improved method to generate synthetic data to train the neural network using state-of-the-art sound propagation algorithms that model specular as well as diffuse reflections of sound. We compare our model against three other CRNNs trained using different formulations of the same problem: classification on categorical labels, and regression on spherical coordinate labels. In practice, our model achieves up to a 43% decrease in angular error over prior methods. The use of diffuse reflection results in 34% and 41% reductions in angular prediction error on the LOCATA and SOFA datasets, respectively, over prior methods based on image-source methods. Our method yields an additional 3% error reduction over prior schemes that use classification-based networks, and we use 36% fewer network parameters.
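
    To make the distinction between spherical and Cartesian labels concrete, the following sketch converts an (azimuth, elevation) pair to a unit vector and measures the angular error between two directions. The function names and the axis convention are assumptions for illustration; the abstract does not specify the authors' conventions or say why they chose Cartesian labels.

```python
import math

def sph_to_cart(azimuth_deg, elevation_deg):
    """Unit vector for a direction given as azimuth and elevation in degrees.

    Assumed convention: x points to azimuth 0, y to azimuth 90, z is up.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))

def angular_error_deg(pred, target):
    """Angle in degrees between two 3-D direction vectors.

    `pred` need not be unit length, as a regression output rarely is.
    """
    dot = sum(a * b for a, b in zip(pred, target))
    norm = math.sqrt(sum(a * a for a in pred)) * math.sqrt(sum(b * b for b in target))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))
```

    One property this exposes: azimuths of 179 and -179 degrees differ by 358 as raw numbers, yet their unit vectors are only 2 degrees apart, so an error computed on Cartesian vectors has no wraparound discontinuity.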

    PSD Estimation of Multiple Sound Sources in a Reverberant Room Using a Spherical Microphone Array

    We propose an efficient method to estimate source power spectral densities (PSDs) in a multi-source reverberant environment using a spherical microphone array. The proposed method utilizes the spatial correlation between the spherical harmonics (SH) coefficients of a sound field to estimate source PSDs. The use of the spatial cross-correlation of the SH coefficients allows us to employ the method in an environment with a higher number of sources compared to conventional methods. Furthermore, the orthogonality property of the SH basis functions saves the effort of designing the specific beampatterns required by a conventional beamformer-based method. We evaluate the performance of the algorithm with different numbers of sources in practical reverberant and non-reverberant rooms. We also demonstrate an application of the method by separating source signals using a conventional beamformer and a Wiener post-filter designed from the estimated PSDs. Comment: Accepted for WASPAA 201
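
    The abstract above mentions a Wiener post-filter designed from the estimated PSDs but does not give its form. The sketch below shows the textbook per-frequency Wiener gain (target PSD divided by total PSD) as one plausible reading. The function name `wiener_gains`, its arguments, and the optional noise term are hypothetical and not taken from the paper.

```python
def wiener_gains(source_psds, target, noise_psd=None, floor=1e-12):
    """Per-frequency Wiener gain for one target source.

    source_psds: list of per-source PSD lists, one value per frequency bin
                 (assumed to be already estimated by some other method).
    target:      index of the source to keep.
    noise_psd:   optional per-bin noise PSD; treated as zero when omitted.
    floor:       lower bound on the denominator to avoid division by zero.
    """
    n_bins = len(source_psds[0])
    noise = noise_psd if noise_psd is not None else [0.0] * n_bins
    gains = []
    for k in range(n_bins):
        total = sum(psd[k] for psd in source_psds) + noise[k]
        gains.append(source_psds[target][k] / max(total, floor))
    return gains
```

    Each gain lies between 0 and 1 and would multiply the beamformer output in the corresponding frequency bin, attenuating bins where the other sources dominate.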

    Sound Source Localization in a Multipath Environment Using Convolutional Neural Networks

    The propagation of sound in a shallow water environment is characterized by boundary reflections from the sea surface and sea floor. These reflections result in multiple (indirect) sound propagation paths, which can degrade the performance of passive sound source localization methods. This paper proposes the use of convolutional neural networks (CNNs) for the localization of sources of broadband acoustic radiated noise (such as motor vessels) in shallow water multipath environments. It is shown that CNNs operating on cepstrogram and generalized cross-correlogram inputs are able to more reliably estimate the instantaneous range and bearing of transiting motor vessels when the source localization performance of conventional passive ranging methods is degraded. The ensuing improvement in source localization performance is demonstrated using real data collected during an at-sea experiment. Comment: 5 pages, 5 figures, Final draft of paper submitted to 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 15-20 April 2018 in Calgary, Alberta, Canada. arXiv admin note: text overlap with arXiv:1612.0350
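
    A generalized cross-correlogram is built from generalized cross-correlations between sensor pairs computed over successive time frames. As a minimal sketch of one such correlation, the code below estimates the delay between two signals using the phase transform (PHAT) weighting. The weighting choice and the helper names are assumptions, since the abstract does not specify them, and the naive DFT is used only to keep the example dependency-free; a real implementation would use an FFT.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustrative only)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out

def gcc_phat_delay(x1, x2):
    """Delay of x2 relative to x1 in samples; positive means x2 lags x1."""
    # Zero-pad both signals so the circular correlation does not wrap.
    n = len(x1) + len(x2)
    spec1 = dft(list(x1) + [0.0] * (n - len(x1)))
    spec2 = dft(list(x2) + [0.0] * (n - len(x2)))

    # Cross-spectrum, then PHAT weighting: keep the phase, drop the magnitude.
    cross = [b * a.conjugate() for a, b in zip(spec1, spec2)]
    whitened = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]

    # Back to the lag domain; the peak location is the delay estimate.
    corr = [v.real for v in dft(whitened, inverse=True)]
    peak = max(range(n), key=lambda i: corr[i])
    return peak if peak < n // 2 else peak - n
```

    In a multipath setting the correlation would show additional peaks for the reflected paths, which is part of why a learned model operating on the whole correlogram can be attractive compared with picking a single peak.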

    Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments

    We address the problem of online localization and tracking of multiple moving speakers in reverberant environments. The paper has the following contributions. We use the direct-path relative transfer function (DP-RTF), an inter-channel feature that encodes acoustic information robust against reverberation, and we propose an online algorithm well suited for estimating DP-RTFs associated with moving audio sources. Another crucial ingredient of the proposed method is its ability to properly assign DP-RTFs to audio-source directions. Towards this goal, we adopt a maximum-likelihood formulation and we propose to use an exponentiated gradient (EG) to efficiently update source-direction estimates starting from their currently available values. The problem of multiple speaker tracking is computationally intractable because the number of possible associations between observed source directions and physical speakers grows exponentially with time. We adopt a Bayesian framework and we propose a variational approximation of the posterior filtering distribution associated with multiple speaker tracking, as well as an efficient variational expectation-maximization (VEM) solver. The proposed online localization and tracking method is thoroughly evaluated using two datasets that contain recordings performed in real environments. Comment: IEEE Journal of Selected Topics in Signal Processing, 201
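
    The exponentiated gradient update referred to above can be sketched generically as a multiplicative step followed by renormalization, which keeps a weight vector on the probability simplex. This is the standard form of the update, not the paper's exact objective: the function name `eg_step`, the learning rate, and the interpretation of the weights (for example, as weights over candidate directions) are assumptions.

```python
import math

def eg_step(weights, grad, lr=0.1):
    """One exponentiated-gradient step for minimizing a loss.

    weights: current nonnegative weights that sum to 1.
    grad:    gradient of the loss with respect to each weight.
    lr:      step size (an assumed hyperparameter).
    """
    # Multiplicative update: weights with larger gradient shrink faster.
    unnormalized = [w * math.exp(-lr * g) for w, g in zip(weights, grad)]
    # Renormalize so the result is again a probability vector.
    total = sum(unnormalized)
    return [u / total for u in unnormalized]
```

    Because each step starts from the current weights and only rescales them, the update suits online settings where estimates are refined from their previously available values rather than recomputed from scratch.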