Multiple source direction of arrival estimation using subspace pseudointensity vectors
The recently proposed subspace pseudointensity method for direction of
arrival estimation is applied in the context of Tasks 1 and 2 of the LOCATA
Challenge using the Eigenmike recordings. Specific implementation details are
described and results reported for the development dataset, for which the
ground truth source directions are available. For both single and multiple
source scenarios, the average absolute error angle is about 9 degrees.
Comment: In Proceedings of the LOCATA Challenge Workshop, a satellite event
of IWAENC 2018 (arXiv:1811.08482)
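As an illustration of the error metric quoted above, the angle between an estimated and a ground-truth direction can be computed from their unit vectors. This is a minimal sketch, not the challenge's evaluation code: the function names are hypothetical, and it assumes azimuth and elevation in degrees with elevation measured from the horizontal plane.

```python
import math

def sph_to_unit(az_deg, el_deg):
    """Convert azimuth/elevation in degrees to a Cartesian unit vector.
    Assumption: elevation is measured from the horizontal plane."""
    az = math.radians(az_deg)
    el = math.radians(el_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))

def angular_error_deg(az_est, el_est, az_ref, el_ref):
    """Absolute angle in degrees between estimated and reference directions."""
    u = sph_to_unit(az_est, el_est)
    v = sph_to_unit(az_ref, el_ref)
    dot = sum(a * b for a, b in zip(u, v))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))

def mean_angular_error_deg(estimates, references):
    """Average absolute error angle over paired (azimuth, elevation) tuples."""
    errors = [angular_error_deg(e[0], e[1], r[0], r[1])
              for e, r in zip(estimates, references)]
    return sum(errors) / len(errors)
```

Working with unit vectors rather than differencing the angles directly avoids wrap-around problems at the azimuth boundary.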
Augmented Intensity Vectors for Direction of Arrival Estimation in the Spherical Harmonic Domain
Pseudointensity vectors (PIVs) provide a means of direction of arrival (DOA) estimation for spherical microphone arrays using only the zeroth and the first-order spherical harmonics. An augmented intensity vector (AIV) is proposed which improves the accuracy of PIVs by exploiting higher order spherical harmonics. We compared DOA estimation using our proposed AIVs against PIVs, steered response power (SRP) and subspace methods where the number of sources, their angular separation, the reverberation time of the room and the sensor noise level are varied. The results show that the proposed approach outperforms the baseline methods and performs at least as accurately as the state-of-the-art method with strong robustness to reverberation, sensor noise, and number of sources. In the single and multiple source scenarios tested, which include realistic levels of reverberation and noise, the proposed method had an average error of 1.5° and 2°, respectively
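The baseline PIV idea can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation and not the AIV method. It assumes B-format-like inputs: an omnidirectional signal `w` (zeroth order) and three dipole signals `x`, `y`, `z` derived from the first-order harmonics, normalized so that a plane wave from unit direction u gives x = u_x w, y = u_y w, z = u_z w. Sign and scaling conventions differ between formulations, and the function name `piv_doa` is hypothetical.

```python
import math

def piv_doa(w, x, y, z):
    """Estimate a unit DOA vector from first-order signals.

    w: complex coefficients of the omnidirectional (zeroth-order) signal,
       e.g. STFT bins collected over time and frequency.
    x, y, z: complex coefficients of the three dipole signals, same length.
    Assumption: dipoles are in phase with w for a wave arriving from +x/+y/+z.
    """
    # Sum the pseudointensity Re{conj(w) * dipole} over all bins.
    ix = sum((wk.conjugate() * xk).real for wk, xk in zip(w, x))
    iy = sum((wk.conjugate() * yk).real for wk, yk in zip(w, y))
    iz = sum((wk.conjugate() * zk).real for wk, zk in zip(w, z))
    norm = math.sqrt(ix * ix + iy * iy + iz * iz)
    if norm == 0.0:
        raise ValueError("pseudointensity vector is zero; DOA undefined")
    return (ix / norm, iy / norm, iz / norm)
```

Averaging over bins before normalizing weights each bin by its energy, which is one simple way to reduce the influence of low-energy, noise-dominated bins.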
Online DOA estimation using real eigenbeam ESPRIT with propagation vector matching
The Eigenbeam estimation of signal parameters via rotational invariance technique (EB-ESPRIT) [1] is a method to estimate multiple directions-of-arrival (DOAs) of sound sources from a spherical microphone array recording in the spherical harmonics domain (SHD). The method first constructs a signal subspace from the SHD signal and then makes use of the fact that, for plane-wave sources, the signal subspace is spanned by the (complex conjugate) spherical harmonic vectors at the source directions. The DOAs are then estimated from the signal subspace using recurrence relations of spherical harmonics.
In recent publications, the singularity and ambiguity problems of the original EB-ESPRIT have been solved by jointly combining several types of recurrence relations. The state-of-the-art EB-ESPRIT, denoted as DOA-vector EB-ESPRIT, is based on three recurrence relations [2,3]. This EB-ESPRIT variant can estimate the source DOAs with significantly higher accuracy compared to the other EB-ESPRIT variants [3]. However, a permutation problem arises, which can be solved by using, for example, a joint diagonalization method [3].
For parametric spatial audio signal processing purposes in the short-time Fourier transform (STFT) domain, DOA estimates are usually needed per time-frame and frequency bin. In principle, one can use the DOA-vector EB-ESPRIT method to estimate the source DOAs per time-frequency bin in an online manner. However, due to the eigendecomposition of the PSD matrix and the joint diagonalization procedure, the computational cost might be too large for many real-time applications.
In this work, we propose a computationally more efficient version of the DOA-vector EB-ESPRIT based on real spherical harmonics recurrence relations.
First, we separate the real and imaginary parts of the real SHD signal in the STFT domain and then construct a real signal subspace thereof, which can be recursively estimated using the deflated projection approximation subspace tracking (PASTd) [4] method. For the case of one source per time-frequency bin, the joint diagonalization is not necessary and we can simplify the EB-ESPRIT equations. For the case of two sources, the plane-wave propagation vectors can directly be estimated from the signal subspace eigenvectors by employing properties of the propagation vectors. This method can be seen as a higher order ambisonics extension of the robust B-format DOA estimation in [5]. The proposed method for estimating two DOAs can be summarized as follows:
1. Separate real and imaginary parts of the real SHD signal in the STFT domain.
2. Recursively estimate the signal subspace eigenvectors using PASTd.
3. Estimate the two plane-wave propagation vectors from the signal subspace eigenvectors by using that they span the same subspace and by using properties of the propagation vectors (subspace-propagation vector matching).
4. Estimate the DOAs by using three types of real spherical harmonics recurrence relations.
Alternatively, one can estimate the DOAs analogously to the complex DOA-vector EB-ESPRIT using the joint diagonalization method proposed in [3].
For the evaluation, we simulate SHD signals up to third order with one and two speech sources in reverberant and noisy environments. For the one-source scenarios, we compare the real DOA-vector EB-ESPRIT with subspace estimation based on singular value decomposition (SVD) against PASTd.
For the two-source scenarios, we compare the real DOA-vector EB-ESPRIT with joint diagonalization against subspace-propagation vector matching and the robust B-format DOA estimation method.
We analyze the angular distributions of the DOA estimates and find that DOA estimation using PASTd for the signal subspace estimation is slightly less accurate than the SVD-based method but computationally much more efficient. For the estimation of two DOAs, the EB-ESPRIT based methods outperform the robust B-format estimation method when higher SHD orders are considered. The joint diagonalization method is more accurate than the subspace-propagation vector matching method. However, the latter is computationally more efficient.
References:
[1] H. Teutsch and W. Kellermann, "Detection and localization of multiple wideband acoustic sources based on wavefield decomposition using spherical apertures," in Proc. IEEE Intl. Conf. Acoust., Speech Signal Proc. (ICASSP), Mar. 2008, pp. 5276-5279.
[2] B. Jo and J. W. Choi, "Nonsingular EB-ESPRIT for the localization of early reflections in a room," J. Acoust. Soc. Am., vol. 144, no. 3, p. 1882, Sep. 2018.
[3] A. Herzog and E. A. P. Habets, "Eigenbeam-ESPRIT for DOA-vector estimation," IEEE Signal Process. Lett., vol. 26, no. 4, pp. 572-576, April 2019.
[4] B. Yang, "Projection Approximation Subspace Tracking," IEEE Trans. Sig. Proc., vol. 43, no. 1, Jan. 1995.
[5] O. Thiergart and E. A. P. Habets, "Robust direction-of-arrival estimation of two simultaneous plane waves from a B-format signal," IEEE 27th Conv. of Electrical and Electronics Engineers in Israel, Nov. 2012
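The recursive subspace estimation in step 2 can be illustrated with a generic PASTd update for real-valued snapshots, following the general form of the algorithm in [4]. This is a sketch under stated assumptions, not the authors' code: the function name `pastd_update`, the list-based data layout, and the forgetting factor default of 0.97 are all hypothetical choices.

```python
def pastd_update(W, d, x, beta=0.97):
    """One PASTd step for a real-valued snapshot.

    W: list of r eigenvector estimates (each a list of floats), updated in place.
    d: list of r exponentially weighted power (eigenvalue) estimates.
    x: new data snapshot (list of floats).
    beta: forgetting factor in (0, 1]; 0.97 is an assumed default.
    """
    x = list(x)
    for i in range(len(W)):
        w = W[i]
        # Project the (deflated) snapshot onto the current eigenvector estimate.
        y = sum(wk * xk for wk, xk in zip(w, x))
        # Update the power estimate of this component.
        d[i] = beta * d[i] + y * y
        # Approximation error and RLS-style correction of the eigenvector.
        e = [xk - wk * y for wk, xk in zip(w, x)]
        W[i] = [wk + ek * (y / d[i]) for wk, ek in zip(w, e)]
        # Deflation: remove this component before tracking the next one.
        x = [xk - wk * y for wk, xk in zip(W[i], x)]
    return W, d
```

Each update costs on the order of the snapshot length times the subspace rank, which is why tracking of this kind is cheaper than recomputing an SVD or eigendecomposition per time-frequency bin. The initial values of `W` and `d` (for example unit vectors and ones) are a further assumption left to the caller.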
Blind identification of Ambisonic reduced room impulse response
The recently proposed Generalized Time-domain Velocity Vector (GTVV) is a
generalization of the relative room impulse response in the spherical harmonic (aka
Ambisonic) domain that allows for blind estimation of early-echo parameters:
the directions and relative delays of individual reflections. However, the
derived closed-form expression of the GTVV requires a few assumptions to hold, the most
important being that the impulse response of the reference signal needs to be a
minimum-phase filter. In practice, the reference is obtained by spatial
filtering towards the Direction-of-Arrival of the source, and the
aforementioned condition is bounded by the performance of the applied
beamformer (and thus, by the Ambisonic array order). In the present work, we
propose circumventing this problem by properly modelling the GTVV time series,
which makes it possible not only to relax the initial assumptions, but also to extract
the information therein in a more consistent and efficient manner, entering the
realm of blind system identification. Experiments using measured room impulse
responses confirm the effectiveness of the proposed approach.
Spherical microphone array processing for acoustic parameter estimation and signal enhancement
In many distant speech acquisition scenarios, such as hands-free telephony or teleconferencing, the desired speech signal is corrupted by noise and reverberation. This degrades both the speech quality and intelligibility, making communication difficult or even impossible. Speech enhancement techniques seek to mitigate these effects and extract the desired speech signal.
This objective is commonly achieved through the use of microphone arrays, which take advantage of the spatial properties of the sound field in order to reduce noise and reverberation. Spherical microphone arrays, where the microphones are arranged in a spherical configuration, usually mounted on a rigid baffle, are able to analyze the sound field in three dimensions; the captured sound field can then be efficiently described in the spherical harmonic domain (SHD).
In this thesis, a number of novel spherical array processing algorithms are proposed, based in the SHD. In order to comprehensively evaluate these algorithms under a variety of conditions, a method is developed for simulating the acoustic impulse responses between a sound source and microphones positioned on a rigid spherical array placed in a reverberant environment.
The performance of speech enhancement algorithms can often be improved by taking advantage of additional a priori information, obtained by estimating various acoustic parameters. Methods for estimating two such parameters, the direction of arrival (DOA) of a source (static or moving) and the signal-to-diffuse energy ratio, are introduced.
Finally, the signals received by a microphone array can be filtered and summed by a beamformer. A tradeoff beamformer is proposed, which achieves a balance between speech distortion and noise reduction. The beamformer weights depend on the noise statistics, which cannot be directly observed and must be estimated. An estimation algorithm is developed for this purpose, exploiting the DOA estimates previously obtained to differentiate between desired and interfering coherent sources.
Linear prediction based dereverberation for spherical microphone arrays
Dereverberation is an important preprocessing step in many speech systems, both for human and machine listening. In many situations, including robot audition, the sound sources of interest can be incident from any direction. In such circumstances, a spherical microphone array allows direction of arrival estimation that is free of spatial aliasing, and direction-independent beam patterns can be formed. This contribution formulates the Weighted Prediction Error algorithm in the spherical harmonic domain and compares the performance to a space domain implementation. Simulation results demonstrate that performing dereverberation in the spherical harmonic domain allows many more microphones to be used without increasing the computational cost. The benefit of using many microphones is particularly apparent at low signal to noise ratios, where for the conditions tested up to 71% improvement in speech-to-reverberation modulation ratio was achieved
Microphone array signal processing for robot audition
Robot audition for humanoid robots interacting naturally with humans in an unconstrained real-world environment is a hitherto unsolved challenge. The recorded microphone signals are usually distorted by background and interfering noise sources (speakers) as well as room reverberation. In addition, the movements of a robot and its actuators cause ego-noise which degrades the recorded signals significantly. The movement of the robot body and its head also complicates the detection and tracking of the desired, possibly moving, sound sources of interest. This paper presents an overview of the concepts in microphone array processing for robot audition and some recent achievements
Spatial sound intensity vectors in spherical harmonic domain
Sound intensity is a fundamental quantity describing acoustic wave fields and it contains both energy and directivity information. It is used in a variety of applications such as source localization, reproduction, and power measurement. Until now, intensity has been defined at a point in space; however, given that sound propagates over space, knowing its spatial distribution could be more powerful. This paper formulates spatial sound intensity vectors in the spherical harmonic domain such that the vectors contain energy and directivity information over continuous spatial regions. These representations are derived with finite sets of closed form coefficients enabling ease of implementation