
    Block-Online Multi-Channel Speech Enhancement Using DNN-Supported Relative Transfer Function Estimates

    This work addresses the problem of block-online processing for multi-channel speech enhancement. Such processing is vital in scenarios with moving speakers and/or when very short utterances are processed, e.g., in voice assistant scenarios. We consider several variants of a system that performs beamforming supported by DNN-based voice activity detection (VAD) followed by post-filtering. The speaker is targeted by estimating relative transfer functions between microphones. Each block of the input signals is processed independently in order to make the method applicable in highly dynamic environments. Owing to the short length of the processed block, the statistics required by the beamformer are estimated less precisely. The influence of this inaccuracy is studied and compared to the processing regime in which recordings are treated as one block (batch processing). The proposed method is evaluated experimentally on the large CHiME-4 datasets and on another dataset featuring a moving target speaker. The experiments are evaluated in terms of objective and perceptual criteria (such as signal-to-interference ratio (SIR) and perceptual evaluation of speech quality (PESQ), respectively). Moreover, the word error rate (WER) achieved by a baseline automatic speech recognition system, for which the enhancement method serves as a front-end solution, is evaluated. The results indicate that the proposed method is robust with respect to the short length of the processed block. Significant improvements in terms of the criteria and WER are observed even for a block length of 250 ms. Comment: 10 pages, 8 figures, 4 tables. Modified version of the article accepted for publication in IET Signal Processing journal. Original results unchanged, additional experiments presented, refined discussion and conclusion.
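
As a rough illustration of beamforming driven by a relative transfer function (RTF), the sketch below computes MVDR weights for a single frequency bin. The abstract does not say which beamformer the authors use, so MVDR is an assumption here, as are the function name `mvdr_weights`, the diagonal-loading parameter, and the convention that the RTF is normalised to a reference microphone.

```python
import numpy as np

def mvdr_weights(noise_cov, rtf, diag_load=1e-6):
    """MVDR weights for one frequency bin (illustrative sketch, not the paper's code).

    noise_cov : (M, M) Hermitian noise covariance, e.g. averaged over the
                frames of one block that a VAD marked as noise-only.
    rtf       : (M,) relative transfer function, normalised so that the
                reference microphone entry equals 1.
    """
    m = noise_cov.shape[0]
    # Diagonal loading (an assumption): short blocks give poorly conditioned
    # covariance estimates, so a small scaled identity is added before solving.
    loaded = noise_cov + diag_load * np.trace(noise_cov).real / m * np.eye(m)
    # w = Phi^{-1} h / (h^H Phi^{-1} h)
    num = np.linalg.solve(loaded, rtf)
    return num / (rtf.conj() @ num)
```

The enhanced bin is then `np.vdot(w, x)` for a microphone vector `x`. By construction the weights satisfy the distortionless constraint, so the response toward the RTF equals 1 regardless of how noisy the covariance estimate is; only the amount of noise suppression degrades as the block gets shorter.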

    A robust sequential hypothesis testing method for brake squeal localisation

    This contribution deals with the in situ detection and localisation of brake squeal in an automobile. As brake squeal is emitted from regions known a priori, i.e., near the wheels, the localisation is treated as a hypothesis testing problem. Distributed microphone arrays, situated under the automobile, are used to capture the directional properties of the sound field generated by a squealing brake. The spatial characteristics of the sampled sound field are then used to formulate the hypothesis tests. However, in contrast to standard hypothesis testing approaches of this kind, the propagation environment is complex and time-varying. Coupled with inaccuracies in the knowledge of the sensor and source positions as well as sensor gain mismatches, this makes modelling the sound field difficult, and standard approaches fail in this case. A previously proposed approach implicitly tried to account for such incomplete system knowledge and was based on ad hoc likelihood formulations. The current paper builds upon this approach and proposes a second approach, based on more solid theoretical foundations, that can systematically account for the model uncertainties. Results from tests in a real setting show that the proposed approach is more consistent than the prior state of the art. In both approaches, the tasks of detection and localisation are decoupled for complexity reasons. The localisation (hypothesis testing) is subject to a prior detection of brake squeal and identification of the squeal frequencies. The approaches used for the detection and identification of squeal frequencies are also presented. The paper also briefly addresses some practical issues related to array design and placement. (C) 2019 Author(s)

    Robust Near-Field Adaptive Beamforming with Distance Discrimination

    This paper proposes a robust near-field adaptive beamformer for microphone array applications in small rooms. Robustness against location errors is crucial for near-field adaptive beamforming because near-field signal locations, especially the radial distances, are difficult to estimate. A near-field regionally constrained adaptive beamformer is proposed to design a set of linear constraints by filtering on a low-rank subspace of the near-field signal over a spatial region and frequency band, such that the beamformer response over the designed spatial-temporal region can be accurately controlled by a small number of linear constraint vectors. The proposed constraint design method is a systematic approach which guarantees real arithmetic implementation and direct time-domain algorithms for broadband beamforming. It improves the robustness against large errors in distance and directions of arrival, and achieves good distance discrimination simultaneously. We show with a nine-element uniform linear array that the proposed near-field adaptive beamformer is robust against distance errors as large as ±32% of the presumed radial distance and angle errors up to ±20°. It can suppress a far-field interfering signal with the same angle of incidence as a near-field target by more than 20 dB with no loss of the array gain at the near-field target. The significant distance discrimination of the proposed near-field beamformer also helps to improve the dereverberation gain and reduce the desired signal cancellation in reverberant environments.
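
Distance discrimination is possible because a near-field source produces a curved (spherical) wavefront whose array response depends on radial distance as well as angle. The sketch below shows the standard spherical-wave steering vector for a linear array; it is a textbook model, not the paper's constraint design, and the function name, the broadside angle convention, and the speed of sound of 343 m/s are assumptions.

```python
import numpy as np

def near_field_steering(mic_x, r, theta, freq, c=343.0):
    """Spherical-wave steering vector for microphones on the x-axis (sketch).

    mic_x : (M,) microphone x-coordinates in metres, array centre at 0.
    r     : source distance from the array centre in metres.
    theta : source angle in radians, measured from broadside (the y-axis).
    freq  : frequency in Hz.
    The response is referenced to the array centre, so a microphone placed
    exactly at x = 0 has response 1.
    """
    mic_x = np.asarray(mic_x, dtype=float)
    src = r * np.array([np.sin(theta), np.cos(theta)])
    mics = np.stack([mic_x, np.zeros_like(mic_x)], axis=1)
    d = np.linalg.norm(src - mics, axis=1)   # source-to-microphone distances
    gain = r / d                             # spherical spreading relative to centre
    phase = -2.0 * np.pi * freq * (d - r) / c
    return gain * np.exp(1j * phase)
```

As `r` grows, the vector converges to the far-field plane-wave response `exp(2j*pi*freq*mic_x*sin(theta)/c)`, which is why a near-field target and a far-field interferer at the same angle can still be told apart.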

    Implementation and evaluation of a low complexity microphone array for speaker recognition

    Includes bibliographical references (leaves 83-86). This thesis discusses the application of a microphone array employing a noise-canceling beamforming technique for improving the robustness of speaker recognition systems in a diffuse noise field.

    Enhancements to the Generalized Sidelobe Canceller for Audio Beamforming in an Immersive Environment

    The Generalized Sidelobe Canceller is an adaptive algorithm for optimally estimating the parameters for beamforming, the signal processing technique of combining data from an array of sensors to improve SNR at a point in space. This work focuses on the algorithm’s application to widely separated microphone arrays with irregular distributions used for human voice capture. Methods are presented for improving the performance of the algorithm’s blocking matrix, a stage that creates a noise reference for elimination, by proposing a stochastic model for amplitude correction and enhanced use of cross-correlation for phase correction and time-difference-of-arrival estimation via a correlation coefficient threshold. This correlation technique is also applied to a multilateration algorithm for an efficient method of explicit target tracking. In addition, the underlying microphone array geometry is studied, and parameters and guidelines for its evaluation are proposed. Finally, an analysis of the stability of the system is performed with respect to its adaptation parameters.
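
The idea of gating a time-difference-of-arrival (TDOA) estimate on a correlation coefficient can be sketched as follows. This is a minimal illustration rather than the thesis's algorithm: the function name, the default threshold of 0.5, and the requirement that both signals have equal length with `max_lag` smaller than that length are all assumptions.

```python
import numpy as np

def tdoa_with_threshold(x, y, fs, max_lag, threshold=0.5):
    """Estimate how much y lags x, keeping the estimate only if it is reliable.

    Returns (tdoa_seconds, coeff). tdoa_seconds is None when the peak
    normalised cross-correlation coefficient is below `threshold`.
    """
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    norm = np.sqrt(np.sum(x ** 2) * np.sum(y ** 2))
    if norm == 0.0:
        return None, 0.0
    # Index `center` of the full correlation corresponds to zero lag;
    # a positive lag means y is a delayed copy of x.
    full = np.correlate(y, x, mode="full")
    center = len(x) - 1
    window = full[center - max_lag:center + max_lag + 1] / norm
    best = int(np.argmax(window))
    coeff = float(window[best])
    if coeff < threshold:
        return None, coeff  # too weakly correlated to trust the peak
    return (best - max_lag) / fs, coeff
```

Rejecting low-coefficient frames keeps spurious peaks from noise-only or heavily reverberant segments out of any downstream phase correction or multilateration step.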

    Improved change prediction for combined beamforming and echo cancellation with application to a generalized sidelobe canceler

    Adaptive beamforming and echo cancellation are often necessary in hands-free situations in order to enhance the communication quality. Unfortunately, the combination of both algorithms leads to problems. Performing echo cancellation before the beamformer (AEC-first) leads to a high complexity. In the other case (BF-first), the echo reduction is drastically decreased due to the changes of the beamformer, which have to be tracked by the echo canceler. Recently, the authors presented the directed change prediction algorithm with directed recovery, which predicts the effective impulse response after the next beamformer change and therefore makes it possible to maintain the low complexity of the BF-first structure and to guarantee a robust echo cancellation. However, the algorithm assumes that the acoustical environment changes only slowly, which can be problematic in typical time-variant scenarios. In this paper, an improved change prediction is presented, which uses adaptive shadow filters to reduce the convergence time of the change prediction. For this enhanced algorithm, it is shown how it can be applied to more advanced beamformer structures like the generalized sidelobe canceler and how the information provided by the improved change prediction can also be used to enhance the performance of the overall interference cancellation.