32 research outputs found

    Covariance Blocking and Whitening Method for Successive Relative Transfer Function Vector Estimation in Multi-Speaker Scenarios

    This paper addresses the challenge of estimating the relative transfer function (RTF) vectors of multiple speakers in a noisy and reverberant environment. More specifically, we consider a scenario where two speakers become active successively. In this scenario, the RTF vector of the first speaker can be estimated in a straightforward way, and the main challenge lies in estimating the RTF vector of the second speaker during segments where both speakers are simultaneously active. To estimate the RTF vector of the second speaker, the so-called blind oblique projection (BOP) method determines the oblique projection operator that optimally blocks the second speaker. Instead of blocking the second speaker, in this paper we propose a covariance blocking and whitening (CBW) method, which first blocks the first speaker and applies whitening using the estimated noise covariance matrix, and then estimates the RTF vector of the second speaker based on a singular value decomposition. When using the estimated RTF vectors of both speakers in a linearly constrained minimum variance beamformer, simulation results using real-world recordings for multiple speaker positions demonstrate that the proposed CBW method outperforms the conventional BOP and covariance whitening methods in terms of signal-to-interferer-and-noise ratio improvement. Comment: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, Oct 22-25, 202
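
The abstract above feeds the two estimated RTF vectors into a linearly constrained minimum variance (LCMV) beamformer. As a rough illustration of that final step only (not of the CBW estimator itself), the textbook closed form w = R^{-1} C (C^H R^{-1} C)^{-1} g can be sketched for a single frequency bin as follows. The covariance matrix `R`, the RTF vectors `a1` and `a2`, and the helper names are made-up illustration values and hypothetical names, not taken from the paper.

```python
# Sketch: textbook LCMV weights for one frequency bin, pure Python.
# All numbers are invented for illustration; they are not data from the paper.

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting.
    A is n x n, B is n x k; entries may be complex."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]  # augmented matrix [A | B]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

def lcmv_weights(R, C, g):
    """Return w = R^{-1} C (C^H R^{-1} C)^{-1} g.
    R: n x n covariance, C: n x k (columns are RTF vectors), g: k desired responses."""
    n, k = len(C), len(C[0])
    RiC = solve(R, C)  # R^{-1} C, size n x k
    G = [[sum(C[m][i].conjugate() * RiC[m][j] for m in range(n))
          for j in range(k)] for i in range(k)]  # C^H R^{-1} C, size k x k
    lam = solve(G, [[gi] for gi in g])  # (C^H R^{-1} C)^{-1} g
    return [sum(RiC[m][j] * lam[j][0] for j in range(k)) for m in range(n)]

def response(w, a):
    """Beamformer response w^H a to a source with RTF vector a."""
    return sum(wm.conjugate() * am for wm, am in zip(w, a))

# Hypothetical 3-microphone example (reference microphone is the first entry).
a1 = [1 + 0j, 0.8 - 0.3j, 0.5 + 0.4j]    # assumed RTF vector, speaker 1
a2 = [1 + 0j, -0.6 + 0.5j, 0.2 - 0.7j]   # assumed RTF vector, speaker 2
R = [[2.0 + 0j, 0.3 + 0.1j, 0.1 + 0j],
     [0.3 - 0.1j, 1.5 + 0j, 0.2j],
     [0.1 + 0j, -0.2j, 1.8 + 0j]]        # assumed Hermitian noise covariance

C = [[a1[m], a2[m]] for m in range(3)]   # constraint matrix with columns a1, a2
w = lcmv_weights(R, C, [1.0, 0.0])       # keep speaker 1, null speaker 2

print(abs(response(w, a1)), abs(response(w, a2)))
```

With real-valued desired responses, the weights pass the first speaker undistorted at the reference microphone and place a null on the second; swapping the entries of `g` would extract the second speaker instead.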

    Acoustic Echo Estimation using the model-based approach with Application to Spatial Map Construction in Robotics

    Overcoming DoF Limitation in Robust Beamforming: A Penalized Inequality-Constrained Approach

    A well-known challenge in beamforming is how to optimally utilize the degrees of freedom (DoF) of the array to design a robust beamformer, especially when the array DoF is smaller than the number of sources in the environment. In this paper, we leverage the tool of constrained convex optimization and propose a penalized inequality-constrained minimum variance (P-ICMV) beamformer to address this challenge. Specifically, we propose a beamformer with a well-targeted objective function and inequality constraints to achieve the design goals. The constraints on interferences penalize the maximum gain of the beamformer in any interfering direction. This can efficiently mitigate the total interference power regardless of whether the number of interfering sources is less than the array DoF or not. Multiple robust constraints on the target protection and interference suppression can be introduced to increase the robustness of the beamformer against steering vector mismatch. By integrating noise reduction, interference suppression, and target protection, the proposed formulation can efficiently obtain a robust beamformer design while optimally trading off the various design goals. When the array DoF is smaller than the number of interferences, the proposed formulation can effectively allocate the limited DoF to all of the sources to obtain the best overall interference suppression. To numerically solve this problem, we formulate the P-ICMV beamformer design as a convex second-order cone program (SOCP) and propose a low-complexity iterative algorithm based on the alternating direction method of multipliers (ADMM). Three applications are simulated to demonstrate the effectiveness of the proposed beamformer. Comment: submitted to IEEE Transactions on Signal Processing
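
The idea of penalizing the maximum gain toward the interferers, rather than forcing hard nulls, can be written generically as a min-max program. The following is only a sketch of that general idea: the symbols (noise covariance R_n, steering vectors a(.), penalty weight mu, target tolerance c, slack variable epsilon) are assumptions chosen for illustration, and the paper's exact P-ICMV formulation may differ.

```latex
\begin{aligned}
\min_{\mathbf{w},\,\epsilon}\quad & \mathbf{w}^{H}\mathbf{R}_{n}\mathbf{w} + \mu\,\epsilon \\
\text{s.t.}\quad & \bigl|\mathbf{w}^{H}\mathbf{a}(\theta_{0}) - 1\bigr| \le c, \\
& \bigl|\mathbf{w}^{H}\mathbf{a}(\phi_{k})\bigr| \le \epsilon, \qquad k = 1,\dots,K .
\end{aligned}
```

Because every constraint is an inequality, the problem remains feasible even when the number of interferers K exceeds the number of microphones, which is where equality-constrained designs run out of degrees of freedom. The objective is a convex quadratic and each constraint bounds the modulus of an affine function of w, so a problem of this shape can be cast as a second-order cone program.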

    Broadband adaptive beamforming with low complexity and frequency invariant response

    This thesis proposes different methods to reduce the computational complexity and increase the adaptation rate of adaptive broadband beamformers. This is demonstrated using the generalised sidelobe canceller (GSC) structure as an example. The GSC is an alternative implementation of the linearly constrained minimum variance beamformer, which can utilise well-known adaptive filtering algorithms, such as the least mean square (LMS) or the recursive least squares (RLS) algorithm, to perform unconstrained adaptive optimisation. A direct DFT implementation, by which broadband signals are decomposed into frequency bins and processed by independent narrowband beamforming algorithms, is thought to be computationally optimum. However, this setup fails to converge to the time-domain minimum mean square error (MMSE) if signal components are not aligned to frequency bins, resulting in a large worst-case error. To mitigate this problem of the so-called independent frequency bin (IFB) processor, overlap-save based GSC beamforming structures have been explored. These systems address the minimisation of the time-domain MMSE with a significant reduction in computational complexity compared to time-domain implementations, and show better convergence behaviour than the IFB beamformer. By studying the effects that the blocking matrix has on the adaptive process for the overlap-save beamformer, several modifications are carried out to enhance both the simplicity of the algorithm and its convergence speed. These modifications result in a GSC beamformer with a significantly lower computational complexity than the time-domain approach while offering similar convergence characteristics. In certain applications, especially in the area of acoustics, there is a need to maintain constant resolution across a wide operating spectrum that may extend across several octaves. Attaining a constant beamwidth is difficult, particularly if uniformly spaced linear sensor arrays are employed for beamforming, since the beamwidth is inversely proportional to both the array aperture and the frequency. A scaled aperture arrangement is introduced for the subband-based GSC beamformer to achieve near-uniform resolution across a wide spectrum, whereby an octave-invariant design is achieved. This structure can also be operated in conjunction with adaptive beamforming algorithms. Frequency-dependent tapering of the sensor signals is proposed in combination with the overlap-save GSC structure in order to achieve an overall frequency-invariant characteristic. An adaptive version is proposed for the frequency-invariant overlap-save GSC beamformer. Broadband adaptive beamforming algorithms based on the family of least mean squares (LMS) algorithms are known to exhibit slow convergence if the input signal is correlated. To improve the convergence of the GSC when based on LMS-type algorithms, we propose the use of a broadband eigenvalue decomposition (BEVD) to decorrelate the input of the adaptive algorithm in the spatial dimension, for which an increase in convergence speed can be demonstrated over other decorrelating measures, such as the Karhunen-Loeve transform. In order to address the remaining temporal correlation after BEVD processing, this approach is combined with subband decomposition through the use of oversampled filter banks. The resulting spatially and temporally decorrelated GSC beamformer provides further enhanced convergence speed over spatial or temporal decorrelation methods on their own.
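
The reasoning behind a scaled aperture can be checked with a back-of-the-envelope calculation. The sketch below uses the common approximation that the broadside half-power beamwidth of a uniformly weighted line array is about 0.886 times wavelength over aperture (in radians, for apertures several wavelengths long). The speed of sound, base frequency, and base aperture are assumed values for illustration, and this is not the thesis's actual design.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def approx_beamwidth_deg(frequency_hz, aperture_m):
    """Approximate broadside half-power beamwidth in degrees of a uniformly
    weighted line array: 0.886 * wavelength / aperture (radians)."""
    wavelength = SPEED_OF_SOUND / frequency_hz
    return math.degrees(0.886 * wavelength / aperture_m)

base_frequency = 1000.0  # Hz, assumed lowest octave
base_aperture = 2.0      # m, assumed aperture used at the lowest octave
octaves = [base_frequency * 2 ** k for k in range(4)]  # 1, 2, 4, 8 kHz

# Fixed aperture: the beam narrows by half with every octave.
fixed = [approx_beamwidth_deg(f, base_aperture) for f in octaves]

# Scaled aperture: halve the aperture each octave, so wavelength / aperture
# (and therefore the beamwidth) stays the same.
scaled = [approx_beamwidth_deg(f, base_aperture * base_frequency / f)
          for f in octaves]

for f, bw_fixed, bw_scaled in zip(octaves, fixed, scaled):
    print(f"{f:7.0f} Hz  fixed: {bw_fixed:5.2f} deg  scaled: {bw_scaled:5.2f} deg")
```

Halving the aperture each time the frequency doubles keeps the ratio of wavelength to aperture fixed, which is why an octave-wise scaled arrangement can hold the beamwidth approximately constant from one octave to the next.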

    Robust acoustic beamforming in the presence of channel propagation uncertainties

    Beamforming is a popular multichannel signal processing technique used in conjunction with microphone arrays to spatially filter a sound field. Conventional optimal beamformers assume that the propagation channels between each source and microphone pair are a deterministic function of the source and microphone geometry. However, in real acoustic environments, there are several mechanisms that give rise to unpredictable variations in the phase and amplitude of the propagation channels. In the presence of these uncertainties, the performance of beamformers degrades. Robust beamformers are designed to reduce this performance degradation. However, robust beamformers rely on tuning parameters that are not closely related to the array geometry. By modeling the uncertainty in the acoustic channels explicitly, we can derive more accurate expressions for the source-microphone channel variability. As such, we are able to derive beamformers that are well suited to acoustic applications in realistic environments. Through experiments we validate the acoustic channel models, and through simulations we show the performance gains of the associated robust beamformer. Furthermore, by modeling the speech short-time Fourier transform coefficients, we are able to design a beamformer framework in the power domain. By utilising spectral subtraction, we are able to see performance benefits over ideal conventional beamformers. Including the channel uncertainty models in the weight design improves robustness.
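
For readers unfamiliar with spectral subtraction in the power domain, the basic operation can be sketched as below. This is the generic textbook form with a spectral floor, not the thesis's beamformer framework; the function name, the `floor` parameter, and the sample values are hypothetical.

```python
def power_spectral_subtraction(noisy_power, noise_power, floor=0.01):
    """Subtract a noise power estimate from noisy power values, bin by bin.
    Results are clamped to floor * noisy power so they never go negative."""
    return [max(py - pn, floor * py)
            for py, pn in zip(noisy_power, noise_power)]

# Hypothetical per-bin power values for one short-time frame.
noisy = [4.0, 1.0, 0.5]
noise = [1.0, 1.0, 1.0]
clean_estimate = power_spectral_subtraction(noisy, noise)
print(clean_estimate)
```

The floor is a common practical safeguard: wherever the noise estimate exceeds the observed power, the output falls back to a small fraction of the noisy power instead of a negative value.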

    Real-time Microphone Array Processing for Sound-field Analysis and Perceptually Motivated Reproduction

    This thesis details real-time implementations of sound-field analysis and perceptually motivated reproduction methods for visualisation and auralisation purposes. For the former, various methods for visualising the relative distribution of sound energy from one point in space are investigated and contrasted, including a novel reformulation of the cross-pattern coherence (CroPaC) algorithm, which integrates a new side-lobe suppression technique. For auralisation applications, listening tests were conducted to compare ambisonics reproduction with a novel headphone formulation of the directional audio coding (DirAC) method. The results indicate that the side-lobe suppressed CroPaC method offers greater spatial selectivity in reverberant conditions compared with other popular approaches, and that the new DirAC formulation yields higher perceived spatial accuracy when compared to the ambisonics method.

    Multimodal methods for blind source separation of audio sources

    The focus of this thesis is enhancing the performance of frequency domain convolutive blind source separation (FDCBSS) techniques when applied to the problem of separating audio sources recorded in a room environment. This challenging application is termed the cocktail party problem, and the ultimate aim would be to build a machine which matches the ability of a human being to solve this task. Human beings exploit both their eyes and their ears in solving this task and hence adopt a multimodal approach, i.e. they exploit both audio and video modalities. New multimodal methods for blind source separation of audio sources are therefore proposed in this work as a step towards realizing such a machine. The geometry of the room environment is initially exploited to improve the separation performance of an FDCBSS algorithm. The positions of the human speakers are monitored by video cameras and this information is incorporated within the FDCBSS algorithm in the form of constraints added to the underlying cross-power spectral density matrix-based cost function which measures separation performance. [Continues.