1,586 research outputs found

    Blind MultiChannel Identification and Equalization for Dereverberation and Noise Reduction based on Convolutive Transfer Function

    This paper addresses the problems of blind channel identification and multichannel equalization for speech dereverberation and noise reduction. The time-domain cross-relation method is not suitable for blind room impulse response identification, due to the near-common zeros of the long impulse responses. We extend the cross-relation method to the short-time Fourier transform (STFT) domain, in which the time-domain impulse responses are approximately represented by convolutive transfer functions (CTFs) with far fewer coefficients. The CTFs suffer from the common zeros caused by the oversampled STFT. We propose to identify the CTFs based on the STFT with oversampled signals and critically sampled CTFs, which is a good compromise between the frequency aliasing of the signals and the common-zeros problem of the CTFs. In addition, a normalization of the CTFs is proposed to remove the gain ambiguity across sub-bands. In the STFT domain, the identified CTFs are used for multichannel equalization, in which the sparsity of speech signals is exploited. We propose to perform inverse filtering by minimizing the $\ell_1$-norm of the source signal, with the relaxed $\ell_2$-norm fitting error between the microphone signals and the convolution of the estimated source signal and the CTFs used as a constraint. This method is advantageous in that the noise can be reduced by relaxing the $\ell_2$-norm to a tolerance corresponding to the noise power, and the tolerance can be set automatically. The experiments confirm the efficiency of the proposed method even under conditions with high reverberation levels and intense noise. Comment: 13 pages, 5 figures, 5 tables
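
The cross-relation idea that the abstract builds on can be illustrated with a small time-domain sketch. For two microphones observing the same source through channels h1 and h2, the noise-free signals satisfy x1 * h2 = x2 * h1 (with * denoting convolution). The filter values, signal length, and helper name `convolve` below are illustrative assumptions, not taken from the paper, and the sketch does not implement the paper's STFT-domain CTF method.

```python
import random

def convolve(x, h):
    """Full linear convolution of two sequences (plain Python)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

random.seed(0)
source = [random.gauss(0.0, 1.0) for _ in range(200)]  # unknown source signal

# Hypothetical short impulse responses for two microphones (assumed values).
h1 = [1.0, 0.5, -0.2, 0.1]
h2 = [0.8, -0.3, 0.4, 0.05]

# Noise-free microphone signals.
x1 = convolve(source, h1)
x2 = convolve(source, h2)

# Cross relation: x1 * h2 and x2 * h1 both equal source * h1 * h2.
lhs = convolve(x1, h2)
rhs = convolve(x2, h1)
cross_relation_error = max(abs(a - b) for a, b in zip(lhs, rhs))
```

Blind identification methods search for the filter pair that makes this error vanish without knowing the source. That search becomes ill-conditioned when the channels share (near-)common zeros, which is the difficulty the abstract points to.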

    Change prediction for low complexity combined beamforming and acoustic echo cancellation

    Time-variant beamforming (BF) and acoustic echo cancellation (AEC) are two techniques that are frequently employed to improve the quality of hands-free speech communication. However, combining the two is challenging, as it results in either high computational complexity or insufficient tracking. We propose a new method, which we call change prediction (ChaP), to improve the performance of the low-complexity beamformer-first (BF-first) structure. ChaP gathers information on several BF changes to predict the effective impulse response seen by the AEC after the next BF change. To account for uncertain data and convergence states in the predictions, reliability measures are introduced to improve ChaP in realistic scenarios.

    Single- and multi-microphone speech dereverberation using spectral enhancement

    In speech communication systems, such as voice-controlled systems, hands-free mobile telephones, and hearing aids, the received microphone signals are degraded by room reverberation, background noise, and other interferences. This signal degradation may lead to total unintelligibility of the speech and decreases the performance of automatic speech recognition systems. In the context of this work reverberation is the process of multi-path propagation of an acoustic sound from its source to one or more microphones. The received microphone signal generally consists of a direct sound, reflections that arrive shortly after the direct sound (commonly called early reverberation), and reflections that arrive after the early reverberation (commonly called late reverberation). Reverberant speech can be described as sounding distant with noticeable echo and colouration. These detrimental perceptual effects are primarily caused by late reverberation, and generally increase with increasing distance between the source and microphone. Conversely, early reverberations tend to improve the intelligibility of speech. In combination with the direct sound it is sometimes referred to as the early speech component. Reduction of the detrimental effects of reflections is evidently of considerable practical importance, and is the focus of this dissertation. More specifically the dissertation deals with dereverberation techniques, i.e., signal processing techniques to reduce the detrimental effects of reflections. In the dissertation, novel single- and multimicrophone speech dereverberation algorithms are developed that aim at the suppression of late reverberation, i.e., at estimation of the early speech component. This is done via so-called spectral enhancement techniques that require a specific measure of the late reverberant signal. 
This measure, called spectral variance, can be estimated directly from the received (possibly noisy) reverberant signal(s) using a statistical reverberation model and a limited amount of a priori knowledge about the acoustic channel(s) between the source and the microphone(s). In our work an existing single-channel statistical reverberation model serves as a starting point. The model is characterized by one parameter that depends on the acoustic characteristics of the environment. We show that the spectral variance estimator that is based on this model, can only be used when the source-microphone distance is larger than the so-called critical distance. This is, crudely speaking, the distance where the direct sound power is equal to the total reflective power. A generalization of the statistical reverberation model in which the direct sound is incorporated is developed. This model requires one additional parameter that is related to the ratio between the direct sound energy and the sound energy of all reflections. The generalized model is used to derive a novel spectral variance estimator. When the novel estimator is used for dereverberation rather than the existing estimator, and the source-microphone distance is smaller than the critical distance, the dereverberation performance is significantly increased. Single-microphone systems only exploit the temporal and spectral diversity of the received signal. Reverberation, of course, also induces spatial diversity. To additionally exploit this diversity, multiple microphones must be used, and their outputs must be combined by a suitable spatial processor such as the so-called delay and sum beamformer. It is not a priori evident whether spectral enhancement is best done before or after the spatial processor. For this reason we investigate both possibilities, as well as a merge of the spatial processor and the spectral enhancement technique. 
An advantage of the latter option is that the spectral variance estimator can be further improved. Our experiments show that the use of multiple microphones affords a significant improvement of the perceptual speech quality. The applicability of the theory developed in this dissertation is demonstrated using a hands-free communication system. Since hands-free systems are often used in noisy and reverberant environments, the received microphone signal contains not only the desired signal but also interferences such as room reverberation caused by the desired source, background noise, and a far-end echo signal that results from sound produced by the loudspeaker. Usually an acoustic echo canceller is used to cancel the far-end echo. Additionally, a post-processor is used to suppress background noise and residual echo, i.e., echo which could not be cancelled by the echo canceller. In this work a novel structure and post-processor for an acoustic echo canceller are developed. The post-processor suppresses late reverberation caused by the desired source, residual echo, and background noise. The late reverberation and late residual echo are estimated using the generalized statistical reverberation model. Experimental results demonstrate the benefits of the proposed system for suppressing late reverberation, residual echo and background noise. The proposed structure and post-processor have a low computational complexity and a highly modular structure, can be seamlessly integrated into existing hands-free communication systems, and afford a significant increase in listening comfort and speech intelligibility.
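
As a rough illustration of how a single-parameter statistical reverberation model can yield a late-reverberation spectral variance, the sketch below assumes an exponentially decaying model with decay rate delta = 3 ln(10) / T60. The late reverberant variance in a frame is then approximated by an attenuated, delayed copy of the reverberant signal's variance. This is a commonly used form of such estimators and is an assumption here, not the dissertation's generalized estimator; the function names, the spectral-subtraction gain, and the gain floor are hypothetical choices for illustration.

```python
import math

def decay_rate(t60):
    """Decay rate (1/s) of an exponential model whose energy drops 60 dB in t60 seconds."""
    return 3.0 * math.log(10.0) / t60

def late_reverb_variance(psd_frames, t60, hop, fs, n_early):
    """Late reverberant variance per frame for one frequency bin.

    psd_frames: smoothed power of the reverberant signal, one value per frame
    hop:        frame hop in samples; fs: sampling rate in Hz
    n_early:    number of frames regarded as the early speech component
    """
    delta = decay_rate(t60)
    # Energy attenuation accumulated over the early period (n_early * hop / fs seconds).
    attenuation = math.exp(-2.0 * delta * n_early * hop / fs)
    return [
        attenuation * psd_frames[l - n_early] if l >= n_early else 0.0
        for l in range(len(psd_frames))
    ]

def suppression_gain(psd, late, floor=0.1):
    """Generic power spectral-subtraction gain with a floor (illustrative choice)."""
    if psd <= 0.0:
        return floor
    return max(1.0 - late / psd, floor)

# Toy usage: one bin, 16 kHz sampling, 256-sample hop, T60 of 0.5 s (all assumed).
psd = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]
late = late_reverb_variance(psd, t60=0.5, hop=256, fs=16000, n_early=3)
gains = [suppression_gain(p, v) for p, v in zip(psd, late)]
```

Because this model ignores the direct sound, the abstract's remark applies: an estimator of this kind is only appropriate beyond the critical distance, which motivates the generalized model with an additional direct-to-reverberant parameter.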

    Spatial Noise-Field Control With Online Secondary Path Modeling: A Wave-Domain Approach

    Due to strong interchannel interference in multichannel active noise control (ANC), there are fundamental problems associated with the filter adaptation, and online secondary path modeling remains a major challenge. This paper proposes a wave-domain adaptation algorithm for multichannel ANC with online secondary path modeling to cancel tonal noise over an extended region of a two-dimensional plane in a reverberant room. The design is based on exploiting the diagonal-dominance property of the secondary path in the wave domain. The proposed wave-domain secondary path model is applicable to both concentric and nonconcentric circular loudspeaker and microphone array placements, and is also robust against array positioning errors. Normalized least mean squares-type algorithms are adopted for adaptive feedback control. Computational complexity is analyzed and compared with that of conventional time-domain and frequency-domain multichannel ANCs. Through simulation-based verification in comparison with existing methods, the proposed algorithm demonstrates more efficient adaptation with low-level auxiliary noise. DP14010341
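
For readers unfamiliar with normalized least mean squares (NLMS) adaptation, the following single-channel sketch identifies an unknown FIR path from white auxiliary noise, which is the basic mechanism behind secondary path modeling. It is a generic time-domain NLMS, not the paper's wave-domain multichannel algorithm; the path coefficients, step size, and function name `nlms_identify` are assumptions made for illustration.

```python
import random

def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Estimate an FIR filter mapping x to d with the NLMS update."""
    w = [0.0] * num_taps    # filter estimate
    buf = [0.0] * num_taps  # most recent input samples, newest first
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # filter output
        e = dn - y                                  # a priori error
        norm = sum(b * b for b in buf) + eps        # input power normalization
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
    return w

random.seed(1)
true_path = [0.6, -0.3, 0.1]  # hypothetical secondary path (assumed values)
aux = [random.gauss(0.0, 1.0) for _ in range(2000)]  # auxiliary white noise

# Noise-free observation of the auxiliary noise after the path.
obs = [
    sum(true_path[k] * aux[n - k] for k in range(len(true_path)) if n - k >= 0)
    for n in range(len(aux))
]

estimate = nlms_identify(aux, obs, num_taps=len(true_path))
```

In a real ANC system the observation also contains the residual primary noise, so the auxiliary noise level trades modeling accuracy against the noise added to the controlled region.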

    Wireless recording of the calls of Rousettus aegyptiacus and their reproduction using electrostatic transducers

    Bats are capable of imaging their surroundings in great detail using echolocation. To apply similar methods to human engineering systems requires the capability to measure and recreate the signals used, and to understand the processing applied to returning echoes. In this work, the emitted and reflected echolocation signals of Rousettus aegyptiacus are recorded while the bat is in flight, using a wireless sensor mounted on the bat. The sensor is designed to replicate the acoustic gain control which bats are known to use, applying a gain to returning echoes that is dependent on the incurred time delay. Employing this technique allows emitted and reflected echolocation calls, which have a wide dynamic range, to be recorded. The recorded echoes demonstrate the complexity of environment reconstruction using echolocation. The sensor is also used to make accurate recordings of the emitted calls, and these calls are recreated in the laboratory using custom-built wideband electrostatic transducers, allied with a spectral equalization technique. This technique is further demonstrated by recreating multi-harmonic bioinspired FM chirps. The ability to record and accurately synthesize echolocation calls enables the exploitation of biological signals in human engineering systems for sonar, materials characterization and imaging

    Infinite non-causality in active cancellation of random noise

    Active cancellation of broadband random noise requires the detection of the incoming noise with some time advance. In a duct, for example, this advance must be larger than the delays in the secondary path from the control source to the error sensor. In this paper it is shown that, in some cases, the advance required for perfect noise cancellation is theoretically infinite because the inverse of the secondary path, which is required for control, can include an infinite non-causal response. This is shown to be the result of two mechanisms. In the single-channel case (one control source and one error sensor), it can arise because of strong echoes in the control path. In the multi-channel case, it can arise even in free field simply because of an unfortunate placement of sensors and actuators. In the present paper, optimal feedforward control is derived through analytical and numerical computations, in the time and frequency domains. It is shown that, in practice, the advance required for significant noise attenuation can be much larger than the secondary path delays. Practical rules are also suggested in order to prevent infinite non-causality from appearing.
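
A toy example may help show why a strong echo can force an infinitely non-causal inverse. Assume a two-tap secondary path S(z) = 1 + a z^-1 in which the echo is stronger than the direct sound (|a| > 1). Its stable inverse is purely anti-causal, with coefficients g[-n] = (-1)^(n-1) a^(-n) for n = 1, 2, ..., so any finite time advance leaves a residual. The two-tap path and the value of `a` are illustrative assumptions and are not taken from the paper.

```python
def convolve(x, h):
    """Full linear convolution of two sequences (plain Python)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def anticausal_inverse(a, n_advance):
    """Coefficients g[-1], ..., g[-n_advance] of the stable inverse of 1 + a z^-1, |a| > 1."""
    return [(-1) ** (n - 1) * a ** (-n) for n in range(1, n_advance + 1)]

def residual(a, n_advance):
    """Largest deviation of (truncated inverse * path) from a unit impulse at time 0."""
    g_time = anticausal_inverse(a, n_advance)[::-1]  # time order: -n_advance .. -1
    y = convolve(g_time, [1.0, a])                   # covers times -n_advance .. 0
    ideal = [0.0] * (len(y) - 1) + [1.0]             # unit impulse at time 0
    return max(abs(u - v) for u, v in zip(y, ideal))

a = 1.25  # echo 25% stronger than the direct path (assumed value)
res_short = residual(a, 10)  # advance of 10 samples
res_long = residual(a, 50)   # advance of 50 samples
```

The residual decays only as a^(-N) with the advance N, so an echo just slightly stronger than the direct path (a close to 1) calls for a very long advance, far beyond the one-sample delay of this path.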

    In Car Audio

    This chapter presents implementations of advanced in-car audio applications. The system is composed of three main applications addressing the in-car listening and communication experience. Starting from a high-level description of the algorithms, several implementations at different levels of hardware abstraction are presented, along with empirical results on both the design process and the performance achieved.