466 research outputs found

    DESIGN AND EVALUATION OF HARMONIC SPEECH ENHANCEMENT AND BANDWIDTH EXTENSION

    Improving the quality and intelligibility of speech signals continues to be an important topic in mobile communications and hearing aid applications. This thesis explored the possibilities of improving the quality of corrupted speech by cascading a log Minimum Mean Square Error (logMMSE) noise reduction system with a Harmonic Speech Enhancement (HSE) system. In HSE, an adaptive comb filter is deployed to harmonically filter the useful speech signal and suppress the noisy components to the noise floor. A Bandwidth Extension (BWE) algorithm was applied to the enhanced speech for further improvements in speech quality. Performance of this algorithm combination was evaluated using objective speech quality metrics across a variety of noisy and reverberant environments. Results showed that the logMMSE and HSE combination enhanced the speech quality in any reverberant environment and in the presence of multi-talker babble. The objective improvements associated with the BWE were found to be minimal
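The comb-filtering idea in this abstract can be illustrated with a minimal sketch. It is not the thesis's adaptive HSE system: it assumes a fixed, already-known pitch period, and the function names (`comb_filter`, `rms`) and test signals are hypothetical. A feed-forward comb with a delay of one pitch period passes components at multiples of the fundamental and cancels components halfway between them.

```python
import math

def comb_filter(x, period, gain=1.0):
    """Feed-forward comb: y[n] = (x[n] + gain * x[n - period]) / (1 + gain)."""
    y = []
    for n, sample in enumerate(x):
        # Samples before the first full period have no delayed partner.
        delayed = x[n - period] if n >= period else 0.0
        y.append((sample + gain * delayed) / (1.0 + gain))
    return y

def rms(x):
    """Root-mean-square level of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))

fs = 8000            # sampling rate in Hz (assumed)
f0 = 200             # fundamental frequency in Hz (assumed known)
period = fs // f0    # pitch period in samples (40)
n = 800

# A tone on the fundamental, and a tone halfway between two harmonics.
harmonic = [math.sin(2 * math.pi * f0 * k / fs) for k in range(n)]
off_harmonic = [math.sin(2 * math.pi * 1.5 * f0 * k / fs) for k in range(n)]

kept = comb_filter(harmonic, period)
nulled = comb_filter(off_harmonic, period)

# Skip the first period, where the delay line is still filling.
print(rms(kept[period:]), rms(nulled[period:]))
```

An adaptive version would re-estimate `period` frame by frame from a pitch tracker; that step is outside this sketch.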

    LACE: A light-weight, causal model for enhancing coded speech through adaptive convolutions

    Classical speech coding uses low-complexity postfilters with zero lookahead to enhance the quality of coded speech, but their effectiveness is limited by their simplicity. Deep Neural Networks (DNNs) can be much more effective, but require high complexity and model size, or added delay. We propose a DNN model that generates classical filter kernels on a per-frame basis with a model of just 300 K parameters and 100 MFLOPS complexity, which is a practical complexity for desktop or mobile device CPUs. The lack of added delay allows it to be integrated into the Opus codec, and we demonstrate that it enables effective wideband encoding for bitrates down to 6 kb/s. Comment: 5 pages, accepted at WASPAA 202

    Fundamental Frequency and Model Order Estimation Using Spatial Filtering


    Model-based Analysis and Processing of Speech and Audio Signals


    Speech Enhancement By Exploiting The Baseband Phase Structure Of Voiced Speech For Effective Non-Stationary Noise Estimation

    Speech enhancement is one of the most important and challenging issues in the speech communication and signal processing field. It aims to minimize the effect of additive noise on the quality and intelligibility of the speech signal. Speech quality is a measure of the noise remaining after processing and of how pleasant the resulting speech sounds, while intelligibility refers to the accuracy of understanding speech. Speech enhancement algorithms are designed to remove the additive noise with minimum speech distortion. The task of speech enhancement is challenging due to the lack of knowledge about the corrupting noise. Hence, the most challenging task is to estimate the noise which degrades the speech. Several approaches have been adopted for noise estimation, which mainly fall under two categories: single channel algorithms and multiple channel algorithms. Accordingly, speech enhancement algorithms are also broadly classified as single and multiple channel enhancement algorithms. In this thesis, speech enhancement is studied in the acoustic and modulation domains along with both amplitude and phase enhancement. We propose a noise estimation technique based on spectral sparsity, detected by using the harmonic property of voiced segments of the speech. We estimate the frame-to-frame phase difference for the clean speech from the available corrupted speech. This estimated frame-to-frame phase difference is used as a means of detecting the noise-only frequency bins even in voiced frames. This gives better noise estimation for highly non-stationary noises like babble, restaurant and subway noise. This noise estimation, along with the phase difference as an additional prior, is used to extend the standard spectral subtraction algorithm. We also verify the effectiveness of this noise estimation technique when used with the Minimum Mean Squared Error Short Time Spectral Amplitude Estimator (MMSE STSA) speech enhancement algorithm. The combination of MMSE STSA and spectral subtraction results in further improvement of speech quality
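As a rough illustration of the frame-to-frame phase difference mentioned in this abstract, the sketch below computes the phase advance that a stationary sinusoid accumulates over one analysis hop. This is a general property of sinusoids, not the thesis's estimator; the function name and the sampling rate, hop size and frequencies are assumptions chosen for the example.

```python
import math

def expected_phase_advance(freq_hz, hop, fs):
    """Phase advance in radians, wrapped to [-pi, pi], of a stationary
    sinusoid at freq_hz over a hop of `hop` samples at sampling rate fs."""
    raw = 2 * math.pi * freq_hz * hop / fs
    # atan2 of (sin, cos) wraps the angle into the principal range.
    return math.atan2(math.sin(raw), math.cos(raw))

fs = 8000   # sampling rate in Hz (assumed)
hop = 80    # hop between frames in samples (assumed)

# A 200 Hz harmonic completes exactly two cycles per hop, so no net advance.
on_cycle = expected_phase_advance(200, hop, fs)
# A 250 Hz component completes two and a half cycles, so it advances by pi.
half_cycle = expected_phase_advance(250, hop, fs)
print(on_cycle, half_cycle)
```

A bin whose observed phase difference is far from the value predicted for the nearest harmonic is a candidate noise-dominated bin; how the thesis turns that into a noise estimate is not stated in the abstract.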

    Pitch-scaled estimation of simultaneous voiced and turbulence-noise components in speech


    Nonlinear Spectral Subtraction Berbasis Tsallis Statistics Untuk Peningkatan Kualitas Sinyal Ucapan (Nonlinear Spectral Subtraction Based on Tsallis Statistics for Speech Signal Quality Enhancement)

    The presence of noise reduces the quality and intelligibility of speech signals, which in turn degrades the performance of speech-based applications. Spectral subtraction is one of the popular methods for removing such noise. However, spectral subtraction has a weakness: it introduces musical noise. Many variants of spectral subtraction have been proposed to reduce musical noise. One of them uses oversubtraction in the spectral subtraction formulation. This approach is called nonlinear spectral subtraction. However, the oversubtraction factor is determined heuristically. Using Tsallis statistics, nonlinear subtraction can be derived mathematically. A new variant of spectral subtraction, called q-spectral subtraction (q-SS), has been derived. This method has been shown to be effective in improving the noise robustness of speech recognition systems. However, its ability to improve speech signal quality in speech enhancement has not yet been investigated. In this paper, the performance of q-SS for speech enhancement is investigated. The experiments found that q-SS improves speech signal quality more than other spectral subtraction methods
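The oversubtraction step referred to in this abstract can be sketched as follows. This is the conventional heuristic form (an oversubtraction factor with a spectral floor), not the q-SS derivation from the paper; the function name, the parameter values `alpha` and `beta`, and the toy spectra are assumptions for illustration only.

```python
def oversubtract(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Power spectral subtraction per frequency bin.

    alpha > 1 subtracts more than the estimated noise power;
    beta sets a floor, as a fraction of the noise power, so that
    no bin becomes negative or collapses to zero.
    """
    enhanced = []
    for y, n in zip(noisy_power, noise_power):
        enhanced.append(max(y - alpha * n, beta * n))
    return enhanced

# Toy power spectra for one frame (four bins).
noisy = [10.0, 1.2, 4.0, 0.9]
noise = [1.0, 1.0, 1.0, 1.0]

clean_estimate = oversubtract(noisy, noise)
print(clean_estimate)
```

Bins dominated by speech keep most of their power, while bins near the noise level are pushed down to the floor. Choosing `alpha` is the heuristic step the abstract says the Tsallis-based derivation replaces.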

    Offline and real time noise reduction in speech signals using the discrete wavelet packet decomposition

    This thesis describes the development of an offline and real time wavelet based speech enhancement system to process speech corrupted with various amounts of white Gaussian noise and other different noise types

    Fundamental Frequency and Direction-of-Arrival Estimation for Multichannel Speech Enhancement
