1,272 research outputs found

    A Geometric Approach to Sound Source Localization from Time-Delay Estimates

    This paper addresses the problem of sound-source localization from time-delay estimates using arbitrarily-shaped non-coplanar microphone arrays. A novel geometric formulation is proposed, together with a thorough algebraic analysis and a global optimization solver. The proposed model is thoroughly described and evaluated. The geometric analysis, stemming from the direct acoustic propagation model, leads to necessary and sufficient conditions for a set of time delays to correspond to a unique position in the source space. Such sets of time delays are referred to as feasible sets. We formally prove that every feasible set corresponds to exactly one position in the source space, whose value can be recovered using a closed-form localization mapping. Therefore we seek the optimal feasible set of time delays given, as input, the received microphone signals. This time delay estimation problem is naturally cast into a programming task, constrained by the feasibility conditions derived from the geometric analysis. A global branch-and-bound optimization technique is proposed to solve the problem at hand, hence estimating the best set of feasible time delays and, subsequently, localizing the sound source. Extensive experiments with both simulated and real data are reported; we compare our methodology to four state-of-the-art techniques. This comparison clearly shows that the proposed method combined with the branch-and-bound algorithm outperforms existing methods. This in-depth geometric understanding, together with practical algorithms and encouraging results, opens several opportunities for future work. Comment: 13 pages, 2 figures, 3 tables, journal
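
    The direct propagation model mentioned in the abstract above can be illustrated with a minimal sketch, which is not the paper's solver: given a candidate source position and microphone coordinates, it computes the pairwise time delays. The function name `time_delays`, the sign convention, the speed of sound and the array geometry are all assumptions made for illustration.

```python
import math
from itertools import combinations

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at about 20 degrees C


def time_delays(source, mics, c=SPEED_OF_SOUND):
    """Pairwise delays tau[(i, j)] = (|s - m_j| - |s - m_i|) / c, in seconds.

    The sign convention (positive when the wavefront reaches mic i first)
    is an assumption made for this sketch.
    """
    dists = [math.dist(source, m) for m in mics]
    return {
        (i, j): (dists[j] - dists[i]) / c
        for i, j in combinations(range(len(mics)), 2)
    }


# Hypothetical non-coplanar four-microphone array (coordinates in metres).
mics = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.0, 0.0, 0.1)]
tau = time_delays((1.0, 2.0, 0.5), mics)
```

    Delays generated from a single source position are mutually consistent, for example `tau[(0, 1)] + tau[(1, 2)]` equals `tau[(0, 2)]` up to rounding. The paper's necessary and sufficient feasibility conditions are not reproduced here.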

    End-to-End Magnitude Least Squares Binaural Rendering of Spherical Microphone Array Signals

    Spherical microphone array (SMA) recordings are particularly suited for dynamic binaural rendering as the microphone signals can be decomposed into a spherical harmonic (SH) representation that can be freely rotated to match the head orientation of the listener. The rendering of such SMA recordings is a non-trivial task as the SH signals are impaired due to truncation of the SH decomposition order, spatial aliasing and the gain limitation of the employed radial filters. The perceptually most relevant consequence of this is an alteration of the magnitude transfer function at high frequencies. Previously, the magnitude least squares (MagLS) renderer for binaural rendering of SH signals was proposed to mitigate these effects under the assumption of ideal order-truncated plane waves, i.e., disregarding the influence of spatial aliasing as well as of non-ideal radial filters. Based on the MagLS renderer, we present a binaural rendering method for SMA recordings that integrates a comprehensive SMA model into the magnitude least squares objective. We evaluate the proposed end-to-end renderer by analyzing the reproduced binaural magnitude response. Our results suggest that the method significantly improves the high-frequency rendering mainly due to the inherent binaural diffuse-field equalization, while it achieves a slight improvement in the low- and mid-frequency range, where the error of the conventional method is already small. A reference implementation of the method accompanies this paper.

    Multiple source localization using spherical microphone arrays

    Direction-of-Arrival (DOA) estimation is a fundamental task in acoustic signal processing and is used in source separation, localization, tracking, environment mapping, speech enhancement and dereverberation. In applications such as hearing aids, robot audition, teleconferencing and meeting diarization, multiple simultaneously active sources are often present. Therefore, DOA estimation that is robust to Multi-Source (MS) scenarios is of particular importance. In the past decade, interest in Spherical Microphone Arrays (SMAs) has grown rapidly due to their ability to analyse the sound field with equal resolution in all directions. Such symmetry makes SMAs suitable for applications in robot audition, where a variety of talker heights and positions is expected. Acoustic signal processing for SMAs is often formulated in the Spherical Harmonic Domain (SHD), which describes the sound field in a form that is independent of the geometry of the SMA. DOA estimation methods for real-world scenarios address one or more performance-degrading factors such as noise, reverberation and multi-source activity, or tackle problems such as source counting and reducing computational complexity. This thesis addresses various problems in MS DOA estimation for speech sources, each of which focuses on one or more performance-degrading factors. Firstly, a narrowband DOA estimator is proposed utilizing high-order spatial information in two computationally efficient ways. Secondly, an autonomous source counting technique is proposed which uses density-based clustering in an evolutionary framework. Thirdly, a confidence metric for the validity of the Single Source (SS) assumption in a Time-Frequency (TF) bin is proposed. It is based on an MS assumption in a short time interval where the number and the TF bin of active sources are adaptively estimated. Finally, two analytical narrowband MS DOA estimators are proposed based on an MS assumption in a TF bin.
The proposed methods are evaluated using simulations and real recordings. Each proposed technique outperforms comparative baseline methods and performs at least as accurately as the state of the art. Open Access

    A Novel Method for Obtaining Diffuse Field Measurements for Microphone Calibration

    We propose a straightforward and cost-effective method to perform diffuse soundfield measurements for calibrating the magnitude response of a microphone array. Typically, such calibration is performed in a diffuse soundfield created in reverberation chambers, an expensive and time-consuming process. A method is proposed for obtaining diffuse field measurements in untreated environments. First, a closed-form expression for the spatial correlation of a wideband signal in a diffuse field is derived. Next, we describe a practical procedure for obtaining the diffuse field response of a microphone array in the presence of a non-diffuse soundfield by introducing random perturbations in the microphone location. The experimental spatial correlation data are compared with the theoretical model, confirming that it is possible to obtain diffuse field measurements in untreated environments with relatively few loudspeakers. A 30-second test signal played from 4-8 loudspeakers is shown to be sufficient to obtain a diffuse field measurement using the proposed method. An Eigenmike is then successfully calibrated at two different geographical locations. Comment: Accepted to appear in IEEE ICASSP 202
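
    For orientation only, the sketch below evaluates the classic narrowband spatial coherence between two omnidirectional sensors in an ideal spherically isotropic (diffuse) field, sin(kd)/(kd). This is the textbook single-frequency result, not the wideband closed-form expression derived in the paper; the function name `diffuse_coherence` and the default speed of sound are assumptions.

```python
import math


def diffuse_coherence(freq_hz, spacing_m, c=343.0):
    """Narrowband diffuse-field coherence sin(kd)/(kd) for two omni sensors.

    k = 2*pi*f/c is the wavenumber and d the sensor spacing in metres.
    The speed of sound c (m/s) is an assumed default.
    """
    kd = 2.0 * math.pi * freq_hz * spacing_m / c
    if kd == 0.0:
        return 1.0  # limit of sin(x)/x as x -> 0
    return math.sin(kd) / kd


# Coherence is close to 1 at low frequencies and first reaches zero
# where the spacing equals half a wavelength (kd = pi).
low = diffuse_coherence(100.0, 0.05)
first_zero = diffuse_coherence(343.0 / (2 * 0.05), 0.05)
```

    Comparing measured coherence against a model of this kind is one common way to judge how diffuse a soundfield is; the paper's own comparison uses its wideband model.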

    Proceedings of the EAA Spatial Audio Signal Processing symposium: SASP 2019

    International audience

    Recording, Analysis and Playback of Spatial Sound Field using Novel Design Methods of Transducer Arrays

    Nowadays, a growing interest in the recording and reproduction of spatial audio has been observed. With virtual and augmented reality technologies spreading fast thanks to the entertainment and video game industries, professional opportunities in the field of engineering are also evolving. However, although many microphone arrays are reaching the market, most of them are not optimized for engineering or diagnostic use and remain mainly confined to voice and music recordings. In this thesis, the design of two new systems for recording and analysing the spatial distribution of sound energy, employing arrays of transducers and cameras, is discussed. Both acoustic and visual spatial information is recorded and combined to produce static and dynamic colour maps, using specially designed software and employing Ambisonics and Spatial PCM Sampling (SPS), two common spatial audio formats, for signal processing. The first solution consists of a microphone array made of 32 capsules and a circular array of eight cameras, optimized for low frequencies. The size of the array is designed according to the frequency range of interest for automotive Noise, Vibration & Harshness (NVH) applications. The second system is an underwater probe with four hydrophones and a panoramic camera, with which it is possible to monitor the effects of underwater noise produced by human activities on marine species. Finite Element Method (FEM) simulations have been used to calculate the array response, thus deriving the filtering matrix and performing a theoretical evaluation of the spatial performance. Field tests of the proposed solutions are presented in comparison with the current state-of-the-art equipment. The faithful reproduction of the spatial sound field is of equal interest.
Hence, a method to play back panoramic video with spatial audio is presented, making use of Virtual Reality (VR) technology, spatial audio, individualized Head Related Transfer Functions (HRTFs) and personalized headphone equalization. The work in its entirety presents a complete methodology for recording, analysing and reproducing the spatial information of soundscapes.

    Acceleration Techniques for Sparse Recovery Based Plane-wave Decomposition of a Sound Field

    Plane-wave decomposition by sparse recovery is a reliable and accurate technique that can be used for source localization, beamforming, etc. In this work, we introduce techniques to accelerate the plane-wave decomposition by sparse recovery. The method consists of two main algorithms, which are spherical Fourier transformation (SFT) and sparse recovery. Of the two algorithms, the sparse recovery is the more computationally intensive. We implement the SFT on an FPGA and the sparse recovery on a multithreaded computing platform. The multithreaded computing platform can then be fully utilized for the sparse recovery. On the other hand, implementing the SFT on an FPGA helps to flexibly integrate the microphones and improve the portability of the microphone array. For implementing the SFT on an FPGA, we develop a scalable FPGA design model that enables the quick design of the SFT architecture on FPGAs. The model considers the number of microphones, the number of SFT channels and the cost of the FPGA and provides the design of a resource-optimized and cost-effective FPGA architecture as the output. Then we investigate the performance of the sparse recovery algorithm executed on various multithreaded computing platforms (i.e., chip-multiprocessor, multiprocessor, GPU, manycore). Finally, we investigate the influence of modifying the dictionary size on the computational performance and the accuracy of the sparse recovery algorithms. We introduce novel sparse-recovery techniques which use non-uniform dictionaries to improve the performance of the sparse recovery on a parallel architecture.

    Eigenbeamforming array systems for sound source localization
