292 research outputs found
Surround by Sound: A Review of Spatial Audio Recording and Reproduction
In this article, a systematic overview of various recording and reproduction techniques for spatial audio is presented. While binaural recording and rendering are designed to resemble the human two-ear auditory system and reproduce sounds specifically for a listener’s two ears, soundfield recording and reproduction using a large number of microphones and loudspeakers replicate an acoustic scene within a region. These two fundamentally different types of techniques are discussed in the paper. A recent popular area, multi-zone reproduction, is also briefly reviewed. The paper concludes with a discussion of the current state of the field and open problems.
The authors acknowledge National Natural Science Foundation of China (NSFC) No. 61671380 and Australian Research Council Discovery Scheme DE 150100363.
A Geometric Approach to Sound Source Localization from Time-Delay Estimates
This paper addresses the problem of sound-source localization from time-delay
estimates using arbitrarily-shaped non-coplanar microphone arrays. A novel
geometric formulation is proposed, together with a thorough algebraic analysis
and a global optimization solver. The proposed model is thoroughly described
and evaluated. The geometric analysis, stemming from the direct acoustic
propagation model, leads to necessary and sufficient conditions for a set of
time delays to correspond to a unique position in the source space. Such sets
of time delays are referred to as feasible sets. We formally prove that every
feasible set corresponds to exactly one position in the source space, whose
value can be recovered using a closed-form localization mapping. Therefore, we
seek the optimal feasible set of time delays given, as input, the received
microphone signals. This time delay estimation problem is naturally cast into a
programming task, constrained by the feasibility conditions derived from the
geometric analysis. A global branch-and-bound optimization technique is
proposed to solve the problem at hand, hence estimating the best set of
feasible time delays and, subsequently, localizing the sound source. Extensive
experiments with both simulated and real data are reported; we compare our
methodology to four state-of-the-art techniques. This comparison clearly shows
that the proposed method combined with the branch-and-bound algorithm
outperforms existing methods. This in-depth geometric understanding, practical
algorithms, and encouraging results open several opportunities for future
work.
Comment: 13 pages, 2 figures, 3 tables, journa
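The direct propagation model mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's formulation or its branch-and-bound solver: the function names, the microphone layout, the source position, and the speed of sound are all assumptions chosen for illustration. The sketch computes pairwise time delays for a known source position and checks one simple consequence of the direct-path model, namely that delays add up around any triple of microphones. The paper's full necessary and sufficient feasibility conditions are not reproduced here.

```python
import math
from itertools import combinations

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def time_delays(source, mics, c=SPEED_OF_SOUND):
    """Return {(i, j): tau_ij} with tau_ij = (|s - m_j| - |s - m_i|) / c."""
    dists = [math.dist(source, m) for m in mics]
    return {(i, j): (dists[j] - dists[i]) / c
            for i, j in combinations(range(len(mics)), 2)}


def is_cycle_consistent(delays, n_mics, tol=1e-9):
    """Check tau_ij + tau_jk == tau_ik for every microphone triple i < j < k."""
    for i, j, k in combinations(range(n_mics), 3):
        if abs(delays[(i, j)] + delays[(j, k)] - delays[(i, k)]) > tol:
            return False
    return True


# Hypothetical non-coplanar array (metres) and source position.
mics = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
source = (2.0, 1.5, 0.5)

delays = time_delays(source, mics)
consistent = is_cycle_consistent(delays, len(mics))

# Corrupting one delay by 1 ms breaks the consistency check.
corrupted = dict(delays)
corrupted[(0, 1)] += 1e-3
corrupted_consistent = is_cycle_consistent(corrupted, len(mics))
```

Delays generated from a real source position pass the check, while an independently estimated set of delays generally does not, which is one way to see why restricting the search to feasible sets is useful.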
Direction of Arrival with One Microphone, a few LEGOs, and Non-Negative Matrix Factorization
Conventional approaches to sound source localization require at least two
microphones. It is known, however, that people with unilateral hearing loss can
also localize sounds. Monaural localization is possible thanks to the
scattering by the head, though it hinges on learning the spectra of the various
sources. We take inspiration from this human ability to propose algorithms for
accurate sound source localization using a single microphone embedded in an
arbitrary scattering structure. The structure modifies the frequency response
of the microphone in a direction-dependent way, giving each direction a
signature. While knowing those signatures is sufficient to localize sources of
white noise, localizing speech is much more challenging: it is an ill-posed
inverse problem which we regularize by prior knowledge in the form of learned
non-negative dictionaries. We demonstrate a monaural speech localization
algorithm based on non-negative matrix factorization that does not depend on
sophisticated, designed scatterers. In fact, we show experimental results with
ad hoc scatterers made of LEGO bricks. Even with these rudimentary structures
we can accurately localize arbitrary speakers; that is, we do not need to learn
the dictionary for the particular speaker to be localized. Finally, we discuss
multi-source localization and the related limitations of our approach.
Comment: This article has been accepted for publication in IEEE/ACM
Transactions on Audio, Speech, and Language Processing (TASLP)
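The white-noise case described in this abstract can be sketched in a few lines. A white-noise source has a flat spectrum, so the spectrum observed at the microphone is proportional to the signature of the source direction, and matching against the known signatures is enough. The signature values, the three candidate directions, and the function names below are hypothetical, and this is not the NMF-based speech algorithm from the paper, which needs learned dictionaries to handle non-flat source spectra.

```python
import math


def cosine_similarity(a, b):
    """Scale-invariant similarity between two magnitude spectra."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def localize_white_noise(observed, signatures):
    """Return the direction whose signature best matches the observed spectrum."""
    return max(signatures,
               key=lambda d: cosine_similarity(observed, signatures[d]))


# Hypothetical direction-dependent magnitude responses in four frequency bands,
# keyed by direction in degrees.
signatures = {
    0: [1.0, 0.8, 0.3, 0.1],
    90: [0.2, 0.9, 1.0, 0.4],
    180: [0.5, 0.1, 0.6, 1.0],
}

# Flat-spectrum source of unknown gain (2.5 here) arriving from 90 degrees.
observed = [2.5 * g for g in signatures[90]]
estimate = localize_white_noise(observed, signatures)
```

Cosine similarity is used so that the unknown source gain does not affect the match. For speech, the source spectrum is no longer flat, so the observation is no longer a scaled copy of a signature, which is why the abstract describes that case as an ill-posed inverse problem.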
Proceedings of the EAA Spatial Audio Signal Processing symposium: SASP 2019
Compensating first reflections in non-anechoic head-related transfer function measurements
Personalized Head-Related Transfer Functions (HRTFs) are needed as part of the binaural sound individualization process in order to provide a high-quality immersive experience for a specific user. Signal processing methods for performing HRTF measurements in non-anechoic conditions are of high interest because they avoid the complex and inconvenient access to anechoic facilities. Non-anechoic HRTF measurements capture the effect of room reflections, which should be correctly identified and eliminated to obtain HRTF estimates comparable to ones acquired in an anechoic setup. This paper proposes a sub-band frequency-dependent processing method for reflection suppression in non-anechoic HRTF signals. Array processing techniques based on Plane Wave Decomposition (PWD) are adopted as an essential part of the solution for low frequency ranges, whereas the higher frequencies are easily handled by means of time-crop windowing methods. The formulation of the model, extraction of parameters and evaluation of the method are described in detail. In addition, a validation case study is presented showing the suppression of reflections from an HRTF measured in a real system. The results confirm that the method yields processed HRTFs comparable to those acquired in anechoic conditions.
This work has received funding from the Spanish Ministry of Science, Innovation and Universities, through projects RTI2018-097045-B-C21 and RTI2018-097045-B-C22, and Generalitat Valenciana under the AICO/2020/154 project grant.
López Monfort, JJ.; Gutierrez-Parera, P.; Cobos, M. (2022). Compensating first reflections in non-anechoic head-related transfer function measurements. Applied Acoustics. 188:1-13. https://doi.org/10.1016/j.apacoust.2021.108523
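The time-crop windowing mentioned for the higher frequencies can be sketched as follows. This is an illustrative example only, not the sub-band method proposed in the paper: the path lengths, sampling rate, fade length, synthetic impulse response, and function names are assumptions. The idea is that the first reflection travels a longer path than the direct sound, so it arrives later and can be cut off by truncating the measured impulse response just before it, with a short fade-out to avoid an abrupt edge.

```python
import math


def reflection_delay_samples(direct_m, reflected_m, fs, c=343.0):
    """Whole samples between direct sound and first reflection (rounded down)."""
    return int(math.floor((reflected_m - direct_m) / c * fs))


def time_crop(ir, keep_len, fade_len):
    """Keep the first keep_len samples, tapering the last fade_len of them
    with a half-cosine fade-out; everything from keep_len onwards is zeroed."""
    out = []
    for n, x in enumerate(ir):
        if n < keep_len - fade_len:
            w = 1.0
        elif n < keep_len:
            k = n - (keep_len - fade_len)  # position inside the fade, 0-based
            w = 0.5 * (1.0 + math.cos(math.pi * (k + 1) / (fade_len + 1)))
        else:
            w = 0.0
        out.append(w * x)
    return out


fs = 48000                      # assumed sampling rate in Hz
delay = reflection_delay_samples(direct_m=1.5, reflected_m=3.5, fs=fs)

# Synthetic impulse response: direct sound at sample 10, one reflection later.
direct_idx = 10
ir = [0.0] * 512
ir[direct_idx] = 1.0
ir[direct_idx + delay] = 0.6

cropped = time_crop(ir, keep_len=direct_idx + delay, fade_len=32)
```

With these assumed distances the reflection-free window is only a few milliseconds long. A window of duration T limits frequency resolution to roughly 1/T, which is consistent with the abstract reserving time cropping for the higher frequencies and using a different technique for the low range.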