466 research outputs found
DESIGN AND EVALUATION OF HARMONIC SPEECH ENHANCEMENT AND BANDWIDTH EXTENSION
Improving the quality and intelligibility of speech signals continues to be an important topic in mobile communications and hearing aid applications. This thesis explored the possibilities of improving the quality of corrupted speech by cascading a log Minimum Mean Square Error (logMMSE) noise reduction system with a Harmonic Speech Enhancement (HSE) system. In HSE, an adaptive comb filter is deployed to harmonically filter the useful speech signal and suppress the noisy components to the noise floor. A Bandwidth Extension (BWE) algorithm was applied to the enhanced speech for further improvements in speech quality. Performance of this algorithm combination was evaluated using objective speech quality metrics across a variety of noisy and reverberant environments. Results showed that the logMMSE and HSE combination enhanced the speech quality in any reverberant environment and in the presence of multi-talker babble. The objective improvements associated with the BWE were found to be minima
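As a rough illustration of the comb-filtering idea mentioned in this abstract, the sketch below applies a fixed feedforward comb filter, y[n] = 0.5 (x[n] + x[n - T]), which passes components that repeat every T samples and cancels components halfway between the harmonics. It is not the thesis's adaptive filter: the function name `comb_filter` and the fixed, known pitch period are assumptions made for the example.

```python
def comb_filter(x, period):
    """Feedforward comb filter: y[n] = 0.5 * (x[n] + x[n - period]).

    `period` is the pitch period in samples. It is assumed known and fixed
    here; an adaptive system would re-estimate it from frame to frame.
    """
    if period <= 0:
        raise ValueError("period must be a positive number of samples")
    y = []
    for n, sample in enumerate(x):
        # Samples before the start of the signal are treated as zero.
        delayed = x[n - period] if n >= period else 0.0
        y.append(0.5 * (sample + delayed))
    return y


# A component that repeats every 4 samples passes unchanged once the
# delay line has filled.
harmonic = [1.0, 0.0, -1.0, 0.0] * 2
passed = comb_filter(harmonic, 4)[4:]

# A component with period 8 lies halfway between harmonics and cancels.
between = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]
cancelled = comb_filter(between, 4)[4:]
```

With a pitch period of 4 samples, `passed` equals the original periodic pattern and `cancelled` is all zeros, which is the harmonic selectivity the abstract relies on.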
LACE: A light-weight, causal model for enhancing coded speech through adaptive convolutions
Classical speech coding uses low-complexity postfilters with zero lookahead
to enhance the quality of coded speech, but their effectiveness is limited by
their simplicity. Deep Neural Networks (DNNs) can be much more effective, but
require high complexity and model size, or added delay. We propose a DNN model
that generates classical filter kernels on a per-frame basis with a model of
just 300 K parameters and 100 MFLOPS complexity, which is a practical
complexity for desktop or mobile device CPUs. The lack of added delay allows it
to be integrated into the Opus codec, and we demonstrate that it enables
effective wideband encoding for bitrates down to 6 kb/s. Comment: 5 pages, accepted at WASPAA 202
Speech Enhancement By Exploiting The Baseband Phase Structure Of Voiced Speech For Effective Non-Stationary Noise Estimation
Speech enhancement is one of the most important and challenging issues in the speech communication and signal processing field. It aims to minimize the effect of additive noise on the quality and intelligibility of the speech signal. Speech quality is the measure of noise remaining after the processing on the speech signal and of how pleasant the resulting speech sounds, while intelligibility refers to the accuracy of understanding speech. Speech enhancement algorithms are designed to remove the additive noise with minimum speech distortion. The task of speech enhancement is challenging due to the lack of knowledge about the corrupting noise. Hence, the most challenging task is to estimate the noise that degrades the speech. Several approaches have been adopted for noise estimation, which mainly fall under two categories: single channel algorithms and multiple channel algorithms. Accordingly, speech enhancement algorithms are also broadly classified as single and multiple channel enhancement algorithms. In this thesis, speech enhancement is studied in the acoustic and modulation domains, along with both amplitude and phase enhancement. We propose a noise estimation technique based on spectral sparsity, detected by using the harmonic property of voiced segments of speech. We estimate the frame-to-frame phase difference for the clean speech from the available corrupted speech. This estimated frame-to-frame phase difference is used as a means of detecting the noise-only frequency bins even in voiced frames. This gives better noise estimation for highly non-stationary noises like babble, restaurant and subway noise. This noise estimation, along with the phase difference as an additional prior, is used to extend the standard spectral subtraction algorithm. We also verify the effectiveness of this noise estimation technique when used with the Minimum Mean Squared Error Short Time Spectral Amplitude Estimator (MMSE STSA) speech enhancement algorithm.
The combination of MMSE STSA and spectral subtraction results in further improvement of speech quality
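One way to picture the frame-to-frame phase cue described in this abstract: a stationary sinusoid at frequency f advances in phase by 2πfR/fs between frames that are R samples apart, so a bin dominated by a voiced harmonic should follow that prediction while a noise-only bin generally will not. The sketch below computes the wrapped deviation from the predicted phase. The function names, the example values, and the idea of thresholding this deviation are illustrative assumptions, not the thesis's actual estimator.

```python
import math


def expected_phase_advance(freq_hz, hop, sample_rate):
    """Phase advance (radians) of a steady sinusoid over `hop` samples."""
    return 2.0 * math.pi * freq_hz * hop / sample_rate


def wrap_phase(phi):
    """Wrap an angle to its principal value in [-pi, pi]."""
    return math.atan2(math.sin(phi), math.cos(phi))


def phase_deviation(prev_phase, curr_phase, freq_hz, hop, sample_rate):
    """Absolute wrapped gap between observed and predicted phase."""
    predicted = prev_phase + expected_phase_advance(freq_hz, hop, sample_rate)
    return abs(wrap_phase(curr_phase - predicted))


# Hypothetical numbers: a 200 Hz harmonic, 80-sample hop, 8 kHz sampling.
# The predicted advance is 4*pi, i.e. a whole number of cycles.
on_harmonic = phase_deviation(0.3, 0.3, 200.0, 80, 8000.0)
off_harmonic = phase_deviation(0.3, 1.3, 200.0, 80, 8000.0)
```

Here `on_harmonic` is numerically close to zero and `off_harmonic` is close to 1 radian; a detector built on this cue would need a threshold chosen for the data at hand.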
Nonlinear Spectral Subtraction Based on Tsallis Statistics for Speech Signal Quality Enhancement
Noise reduces the quality and intelligibility of speech signals, which in turn degrades the performance of speech-based applications. Spectral subtraction is one of the popular methods for removing such noise. However, spectral subtraction has a drawback: it introduces musical noise. Many variants of spectral subtraction have been proposed to reduce musical noise. One of them uses oversubtraction in the spectral subtraction formulation. This approach is called nonlinear spectral subtraction. However, the oversubtraction factor is determined heuristically. Using Tsallis statistics, nonlinear subtraction can be derived mathematically. A new variant of spectral subtraction, called q-spectral subtraction (q-SS), has been derived. This method has proven effective at improving the noise robustness of speech recognition systems. However, its ability to improve speech signal quality in speech enhancement has not yet been investigated. In this paper, the performance of q-SS for speech enhancement is investigated. The experimental results show that q-SS improves speech signal quality more than other spectral subtraction methods
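For orientation, the heuristic oversubtraction that this abstract contrasts with q-SS can be sketched as power spectral subtraction with an oversubtraction factor and a spectral floor. This is the conventional heuristic form, not the Tsallis-based q-SS derivation from the paper; the function name `oversubtract` and the default values of `alpha` and `beta` are assumptions chosen for illustration.

```python
def oversubtract(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Power spectral subtraction with oversubtraction and a spectral floor.

    noisy_power : per-bin power of the noisy frame, |Y(k)|^2
    noise_power : per-bin noise power estimate, |N(k)|^2
    alpha       : oversubtraction factor (assumed value; tuned by hand)
    beta        : spectral floor factor (assumed value) that limits how
                  far a bin may be attenuated, which tames musical noise
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("spectra must have the same number of bins")
    enhanced = []
    for y, n in zip(noisy_power, noise_power):
        residual = y - alpha * n      # subtract more than the estimate
        floor = beta * n              # never drop below this level
        enhanced.append(max(residual, floor))
    return enhanced


# Three bins: strong speech, noise-only, moderate speech.
enhanced = oversubtract([10.0, 1.2, 4.0], [1.0, 1.0, 1.0])
```

In this example the noise-only bin would go negative after subtraction, so it is clamped to the floor, giving `[8.0, 0.01, 2.0]`. Having to pick `alpha` by hand is the heuristic step the abstract says the Tsallis formulation replaces with a derivation.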
Offline and real time noise reduction in speech signals using the discrete wavelet packet decomposition
This thesis describes the development of an offline and real time wavelet based speech enhancement system to process speech corrupted with various amounts of white Gaussian noise and other different noise types