2,573 research outputs found

    Underdetermined-order recursive least-squares adaptive filtering: The concept and algorithms


    Linear and nonlinear adaptive filtering and their applications to speech intelligibility enhancement


    Speech Enhancement using Fiber Acoustic Sensor

    With the development of IoT (Internet of Things) services and devices, voice commands are becoming an increasingly important tool for human-computer interaction. However, the audio signal recorded by a conventional omnidirectional microphone is easily corrupted by environmental noise such as interfering speech. Although conventional beamforming techniques can point the main lobe of the beam pattern at the desired speaker, they require several omnidirectional microphones to form a microphone array, which occupies a large space on an IoT device. Many researchers are therefore working to invent a small microphone that can create a directional beam pattern. Recently, researchers took inspiration from the way spiders sense acoustic waves and invented a new small acoustic sensor made of spider silk. This acoustic sensor has a frequency-independent dipole beam pattern for wideband audio signals. Using this fiber acoustic sensor, two compact microphone arrays and corresponding speech enhancement systems can be constructed. The first microphone array consists of one omnidirectional microphone collocated with one fiber acoustic sensor, and the second consists of two collocated fiber acoustic sensors with orthogonal dipole beam patterns. Using the first microphone array, a first-order adaptive beamformer is designed in this thesis to reduce the effects of interfering speech and to separate speech sources. In this design, an adaptive first-order beam pattern is formed by means of the normalized least mean squares method. In a scenario where the desired speech and the interfering speech are present at the same time, this adaptive beamformer can point the null of the beam pattern at the undesired speaker to reduce speech interference. To verify this idea, numerical simulations are conducted under an ideal condition (clean speech without reverberation) and a realistic scenario (clean speech corrupted by white noise and reverberation).
The results show that this design improves speech quality significantly in the ideal case. Under white noise and reverberation, an improvement is still achieved, but on a much smaller scale. Using the second collocated microphone array, a speech enhancement system is proposed that enables the collocated fiber acoustic sensors to capture speech from any direction. This system has three main parts. The first part performs DOA (direction of arrival) estimation supported by a machine learning method. Here the inter-channel acoustic intensity difference is used to compute raw DOA estimates in the presence of white noise and reverberation. After the raw DOA estimates are obtained, the machine learning method (a wrapped Gaussian mixture model) is used to produce a more accurate DOA estimate. The proposed method is robust to both white noise and reverberation, has low computational complexity, and solves the phase ambiguity problem (0 and π are identical). In the second part, the orthogonality of the dipoles of the two collocated fiber acoustic sensors (one is sin θ and the other is cos θ), together with the DOA (θ) estimated by the wrapped Gaussian mixture model, is used to generate a steerable dipole beam pattern that points the main lobe at the speaker. In the third part, a noise reduction procedure is applied to the output signal of the steerable beamformer. The proposed method is based on a time-frequency mask, which filters out the time-frequency bins of white noise and keeps those of the speech signal. To verify the effectiveness of the designed system, numerical simulations are conducted in the presence of both white noise and reverberation. The results show that the proposed DOA estimation method is robust to both white noise and reverberation, which implies that this type of microphone array can obtain precise spatial information about the speaker.
Meanwhile, the audio quality of the output signal of this system is improved by at least 50%.
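
The steerable dipole described in this abstract can be illustrated with a minimal sketch. It assumes two ideal, noise-free collocated sensors whose responses are exactly cos θ and sin θ, and a single far-field source; the function names `simulate_sensors` and `steer_dipole` are hypothetical and are not taken from the thesis. Weighting the two channels by cos θ0 and sin θ0 gives a combined response of cos(θ − θ0), so the main lobe points at θ0 and the null sits 90° away.

```python
import math

def simulate_sensors(signal, theta_src):
    """Ideal outputs of two collocated dipoles (cos and sin patterns)
    for one source at angle theta_src (radians). Assumption: no noise,
    no reverberation."""
    x_cos = [v * math.cos(theta_src) for v in signal]
    x_sin = [v * math.sin(theta_src) for v in signal]
    return x_cos, x_sin

def steer_dipole(x_cos, x_sin, theta0):
    """Combine the two channels so the dipole main lobe points at theta0.
    The resulting pattern is cos(theta - theta0)."""
    c, s = math.cos(theta0), math.sin(theta0)
    return [c * a + s * b for a, b in zip(x_cos, x_sin)]

# Usage: a source at 60 degrees, steered on target and onto the null.
signal = [1.0, -0.5, 0.25, 0.0, 0.75]
theta_src = math.pi / 3
x_cos, x_sin = simulate_sensors(signal, theta_src)

y_on = steer_dipole(x_cos, x_sin, theta_src)                  # main lobe at source
y_null = steer_dipole(x_cos, x_sin, theta_src + math.pi / 2)  # null at source
```

In this idealized setting `y_on` reproduces the source signal and `y_null` is numerically zero. Steering to θ0 + π only flips the sign of the output, which is one way to see the 0/π ambiguity mentioned in the abstract.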

    Using a low-bit rate speech enhancement variable post-filter as a speech recognition system pre-filter to improve robustness to GSM speech

    Includes bibliographical references. The performance of speech recognition systems degrades when they are used to recognize speech that has been transmitted through GSM (Global System for Mobile Communications) voice communication channels (GSM speech). This degradation is mainly due to the effects of GSM speech coding and GSM channel noise on speech signals transmitted through the network. The poor recognition of GSM channel speech limits the use of speech recognition applications over GSM networks. If speech recognition technology is to be used without restriction over GSM networks, the recognition accuracy of GSM channel speech has to be improved. Different channel normalization techniques have been developed in an attempt to improve the recognition accuracy of voice-channel-modified speech in general (not specifically GSM channel speech). These techniques can be classified into three broad categories: model modification, signal pre-processing, and feature processing. In this work, as a contribution toward improving the robustness of speech recognition systems to GSM speech, the use of a low-bit-rate speech enhancement post-filter as a speech recognition system pre-filter is proposed. This filter is to be used in recognition systems in combination with channel normalization techniques.

    Speech Recognition

    Chapters in the first part of the book cover the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals and methods for speech feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and other applications that can operate in real-world environments, such as mobile communication services and smart homes.

    Spatial Noise-Field Control With Online Secondary Path Modeling: A Wave-Domain Approach

    Due to strong interchannel interference in multichannel active noise control (ANC), there are fundamental problems associated with filter adaptation, and online secondary path modeling remains a major challenge. This paper proposes a wave-domain adaptation algorithm for multichannel ANC with online secondary path modeling to cancel tonal noise over an extended region of a two-dimensional plane in a reverberant room. The design exploits the diagonal-dominance property of the secondary path in the wave domain. The proposed wave-domain secondary path model is applicable to both concentric and nonconcentric circular loudspeaker and microphone array placements, and is also robust against array positioning errors. Normalized least mean squares-type algorithms are adopted for adaptive feedback control. Computational complexity is analyzed and compared with that of conventional time-domain and frequency-domain multichannel ANC systems. In simulation-based comparisons with existing methods, the proposed algorithm demonstrates more efficient adaptation with low-level auxiliary noise.
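
As a rough illustration of the normalized least mean squares (NLMS) idea behind online secondary path modeling, the sketch below identifies a short FIR path from injected auxiliary white noise. It is a single-channel, time-domain, noise-free toy and not the wave-domain multichannel algorithm of the paper; the path coefficients, step size, and the names `fir` and `nlms_identify` are assumptions chosen only for the example.

```python
import random

def fir(x, h):
    """Filter sequence x through FIR coefficients h (direct form)."""
    buf = [0.0] * len(h)
    out = []
    for xn in x:
        buf = [xn] + buf[:-1]          # shift in the newest sample
        out.append(sum(hi * bi for hi, bi in zip(h, buf)))
    return out

def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Estimate an FIR path from input x and observed output d with NLMS.
    mu is the normalized step size (0 < mu < 2); eps avoids division by zero."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # current model output
        e = dn - y                                    # modeling error
        norm = eps + sum(b * b for b in buf)          # input power normalization
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
    return w

# Usage: auxiliary white noise drives a hypothetical 3-tap secondary path.
rng = random.Random(0)
aux_noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
true_path = [0.6, -0.3, 0.1]            # assumed path, for illustration only
mic_signal = fir(aux_noise, true_path)
estimated_path = nlms_identify(aux_noise, mic_signal, num_taps=3)
```

In a real ANC system the microphone also picks up residual primary noise, so the estimate would fluctuate around the true path rather than converge to it exactly; that trade-off is why the level of the auxiliary noise matters.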

    Techniques for the enhancement of linear predictive speech coding in adverse conditions
