5,116 research outputs found

    Online DOA estimation using real eigenbeam ESPRIT with propagation vector matching

    The eigenbeam estimation of signal parameters via rotational invariance technique (EB-ESPRIT) [1] is a method to estimate multiple directions-of-arrival (DOAs) of sound sources from a spherical microphone array recording in the spherical harmonics domain (SHD). The method first constructs a signal subspace from the SHD signal and then makes use of the fact that, for plane-wave sources, the signal subspace is spanned by the (complex conjugate) spherical harmonic vectors at the source directions. The DOAs are then estimated from the signal subspace using recurrence relations of spherical harmonics.

    In recent publications, the singularity and ambiguity problems of the original EB-ESPRIT have been solved by combining several types of recurrence relations. The state-of-the-art EB-ESPRIT, denoted as DOA-vector EB-ESPRIT, is based on three recurrence relations [2,3]. This EB-ESPRIT variant can estimate the source DOAs with significantly higher accuracy than the other EB-ESPRIT variants [3]. However, a permutation problem arises, which can be solved by using, for example, a joint diagonalization method [3].

    For parametric spatial audio signal processing in the short-time Fourier transform (STFT) domain, DOA estimates are usually needed per time frame and frequency bin. In principle, one can use the DOA-vector EB-ESPRIT method to estimate the source DOAs per time-frequency bin in an online manner. However, due to the eigendecomposition of the PSD matrix and the joint diagonalization procedure, the computational cost might be too large for many real-time applications.

    In this work, we propose a computationally more efficient version of the DOA-vector EB-ESPRIT based on real spherical harmonics recurrence relations. First, we separate the real and imaginary parts of the real SHD signal in the STFT domain and then construct a real signal subspace from them, which can be recursively estimated using the deflated projection approximation subspace tracking (PASTd) [4] method. For the case of one source per time-frequency bin, the joint diagonalization is not necessary and we can simplify the EB-ESPRIT equations. For the case of two sources, the plane-wave propagation vectors can be estimated directly from the signal subspace eigenvectors by employing properties of the propagation vectors. This method can be seen as a higher-order ambisonics extension of the robust B-format DOA estimation in [5]. The proposed method for estimating two DOAs can be summarized as follows:

    1. Separate real and imaginary parts of the real SHD signal in the STFT domain.
    2. Recursively estimate the signal subspace eigenvectors using PASTd.
    3. Estimate the two plane-wave propagation vectors from the signal subspace eigenvectors by using that they span the same subspace and by using properties of the propagation vectors (subspace-propagation vector matching).
    4. Estimate the DOAs by using three types of real spherical harmonics recurrence relations.

    Alternatively, one can estimate the DOAs analogously to the complex DOA-vector EB-ESPRIT using the joint diagonalization method proposed in [3].

    For the evaluation, we simulate SHD signals up to third order with one and two speech sources in reverberant and noisy environments. For the one-source scenarios, we compare the real DOA-vector EB-ESPRIT with subspace estimation based on singular value decomposition (SVD) against PASTd. For the two-source scenarios, we compare the real DOA-vector EB-ESPRIT with joint diagonalization against subspace-propagation vector matching and the robust B-format DOA estimation method.

    We analyze the angular distributions of the DOA estimates and find that DOA estimation using PASTd for the signal subspace estimation is slightly less accurate than the SVD-based method but computationally much more efficient. For the estimation of two DOAs, the EB-ESPRIT-based methods outperform the robust B-format estimation method when higher SHD orders are considered. The joint diagonalization method is more accurate than the subspace-propagation vector matching method. However, the latter is computationally more efficient.

    References:
    [1] H. Teutsch and W. Kellermann, “Detection and localization of multiple wideband acoustic sources based on wavefield decomposition using spherical apertures,” in Proc. IEEE Intl. Conf. Acoust., Speech Signal Proc. (ICASSP), Mar. 2008, pp. 5276–5279.
    [2] B. Jo and J. W. Choi, “Nonsingular EB-ESPRIT for the localization of early reflections in a room,” J. Acoust. Soc. Am., vol. 144, no. 3, p. 1882, Sep. 2018.
    [3] A. Herzog and E. A. P. Habets, “Eigenbeam-ESPRIT for DOA-vector estimation,” IEEE Signal Process. Lett., vol. 26, no. 4, pp. 572–576, Apr. 2019.
    [4] B. Yang, “Projection approximation subspace tracking,” IEEE Trans. Sig. Proc., vol. 43, no. 1, Jan. 1995.
    [5] O. Thiergart and E. A. P. Habets, “Robust direction-of-arrival estimation of two simultaneous plane waves from a B-format signal,” IEEE 27th Conv. of Electrical and Electronics Engineers in Israel, Nov. 2012.
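
    The recursive subspace estimation step named in the abstract (PASTd [4]) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a generic real-valued snapshot vector, and the names `pastd_update` and `demo`, the forgetting factor `beta`, and the synthetic one-source data are illustrative assumptions. The EB-ESPRIT recurrence relations themselves are not covered.

```python
import random

def pastd_update(x, W, d, beta=0.97):
    """One PASTd step for a real-valued snapshot x (generic sketch).

    W: list of r basis vectors (each a list of length n), updated in place.
    d: list of r exponentially weighted power estimates, updated in place.
    beta: forgetting factor (assumed value, not taken from the abstract).
    """
    x = list(x)  # work on a copy, since deflation modifies the snapshot
    for i, w in enumerate(W):
        y = sum(wk * xk for wk, xk in zip(w, x))   # projection onto w_i
        d[i] = beta * d[i] + y * y                 # power estimate
        gain = y / d[i]
        for k in range(len(w)):
            w[k] += (x[k] - w[k] * y) * gain       # rank-one correction
        for k in range(len(w)):
            x[k] -= w[k] * y                       # deflate for next component
    return W, d

def demo(seed=0, steps=2000):
    """Track one dominant direction in synthetic data; return |cosine| to truth."""
    rng = random.Random(seed)
    a = [0.5, 0.5, 0.5, 0.5]            # hypothetical unit-norm propagation vector
    W = [[1.0, 0.0, 0.0, 0.0]]          # initial basis vector
    d = [1.0]
    for _ in range(steps):
        s = rng.gauss(0.0, 1.0)         # source amplitude
        x = [ak * s + rng.gauss(0.0, 0.05) for ak in a]
        pastd_update(x, W, d)
    w = W[0]
    norm = sum(v * v for v in w) ** 0.5
    return abs(sum(wk * ak for wk, ak in zip(w, a))) / norm

if __name__ == "__main__":
    print("alignment with true direction:", demo())
```

    Each update costs on the order of n·r operations, which is consistent with the efficiency argument the abstract makes against a per-bin eigendecomposition.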

    Improving acoustic vehicle classification by information fusion

    We present an information fusion approach for ground vehicle classification based on the emitted acoustic signal. Many acoustic factors can contribute to the classification accuracy of working ground vehicles. Classification relying on a single feature set may lose some useful information if its underlying sound production model is not comprehensive. To improve classification accuracy, we consider an information fusion diagram, in which various aspects of an acoustic signature are taken into account and emphasized separately by two different feature extraction methods. The first set of features aims to represent internal sound production, and a number of harmonic components are extracted to characterize the factors related to the vehicle’s resonance. The second set of features is extracted based on a computationally effective discriminatory analysis, and a group of key frequency components are selected by mutual information, accounting for the sound production from the vehicle’s exterior parts. In correspondence with this structure, we further put forward a modified Bayesian fusion algorithm, which takes advantage of matching each specific feature set with its favored classifier. To assess the proposed approach, experiments are carried out based on a data set containing acoustic signals from different types of vehicles. Results indicate that the fusion approach can effectively increase classification accuracy compared to that achieved using each individual feature set alone. The Bayesian-based decision-level fusion is found to perform better than a feature-level fusion approach.
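
    Selecting frequency components by mutual information, as the abstract describes, can be sketched as follows. This is an illustration only: it assumes each frequency component has already been discretized (for example into low/high energy), and the function names and the top-k rule are hypothetical, not taken from the paper.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information in bits between two discrete sequences of equal length."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # sum over observed pairs of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def select_top_k(features, labels, k):
    """Return indices of the k feature columns most informative about the labels.

    features: list of columns, each a list of discretized values per recording.
    """
    scores = [mutual_information(col, labels) for col in features]
    return sorted(range(len(features)), key=lambda i: scores[i], reverse=True)[:k]

if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    features = [[0, 1, 0, 1],   # independent of the label
                [0, 0, 1, 1]]   # fully determines the label
    print(select_top_k(features, labels, k=1))
```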

    Regression and Classification for Direction-of-Arrival Estimation with Convolutional Recurrent Neural Networks

    We present a novel learning-based approach to estimate the direction-of-arrival (DOA) of a sound source using a convolutional recurrent neural network (CRNN) trained via regression on synthetic data and Cartesian labels. We also describe an improved method to generate synthetic data to train the neural network using state-of-the-art sound propagation algorithms that model specular as well as diffuse reflections of sound. We compare our model against three other CRNNs trained using different formulations of the same problem: classification on categorical labels, and regression on spherical coordinate labels. In practice, our model achieves a decrease of up to 43% in angular error over prior methods. The use of diffuse reflection results in 34% and 41% reductions in angular prediction error on the LOCATA and SOFA datasets, respectively, over prior methods based on image-source methods. Our method results in an additional 3% error reduction over prior schemes that use classification-based networks, and we use 36% fewer network parameters.
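
    The distinction between Cartesian and spherical labels can be made concrete with a small sketch. The angle convention (azimuth in the horizontal plane, elevation measured from it, both in radians) and the function names are assumptions for illustration, since the abstract does not state them.

```python
import math

def sph_to_cart(azimuth, elevation):
    """Unit vector for a direction given in radians (assumed convention)."""
    ce = math.cos(elevation)
    return (ce * math.cos(azimuth), ce * math.sin(azimuth), math.sin(elevation))

def angular_error_deg(u, v):
    """Angle in degrees between two non-zero 3-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    c = max(-1.0, min(1.0, dot / (nu * nv)))   # guard against rounding
    return math.degrees(math.acos(c))

if __name__ == "__main__":
    # Azimuths of +179 and -179 degrees differ by 358 as raw numbers,
    # but the corresponding directions are only 2 degrees apart.
    u = sph_to_cart(math.radians(179.0), 0.0)
    v = sph_to_cart(math.radians(-179.0), 0.0)
    print(angular_error_deg(u, v))
```

    A regression loss on raw azimuth values would penalize the pair in the example heavily even though the directions nearly coincide, whereas a loss on Cartesian vectors has no such wrap-around.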

    Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments

    We address the problem of online localization and tracking of multiple moving speakers in reverberant environments. The paper has the following contributions. We use the direct-path relative transfer function (DP-RTF), an inter-channel feature that encodes acoustic information robust against reverberation, and we propose an online algorithm well suited for estimating DP-RTFs associated with moving audio sources. Another crucial ingredient of the proposed method is its ability to properly assign DP-RTFs to audio-source directions. Towards this goal, we adopt a maximum-likelihood formulation and we propose to use an exponentiated gradient (EG) to efficiently update source-direction estimates starting from their currently available values. The problem of multiple speaker tracking is computationally intractable because the number of possible associations between observed source directions and physical speakers grows exponentially with time. We adopt a Bayesian framework and we propose a variational approximation of the posterior filtering distribution associated with multiple speaker tracking, as well as an efficient variational expectation-maximization (VEM) solver. The proposed online localization and tracking method is thoroughly evaluated using two datasets that contain recordings performed in real environments. Comment: IEEE Journal of Selected Topics in Signal Processing, 201
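
    A generic exponentiated-gradient step can be sketched as below. It assumes that the quantity being updated is a vector of non-negative weights summing to one (for example over a hypothetical grid of candidate directions) and that a gradient of the negative log-likelihood is available. The abstract states neither, so the function name, the step size `eta`, and the example gradient are illustrative only.

```python
import math

def eg_update(weights, grad, eta=0.1):
    """One exponentiated-gradient step on the probability simplex.

    weights: current non-negative weights summing to one.
    grad: gradient of the loss with respect to each weight.
    eta: step size (assumed value).
    """
    exponents = [-eta * g for g in grad]
    shift = max(exponents)                     # subtract max for numerical stability
    unnorm = [w * math.exp(e - shift) for w, e in zip(weights, exponents)]
    total = sum(unnorm)
    return [u / total for u in unnorm]         # renormalize onto the simplex

if __name__ == "__main__":
    w = [0.25, 0.25, 0.25, 0.25]
    g = [1.0, 0.0, 0.0, 0.0]                   # loss increases with the first weight
    print(eg_update(w, g))
```

    Because the update is multiplicative and followed by normalization, the weights stay non-negative and sum to one without an explicit projection, and each step starts from the currently available values.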