466 research outputs found

DESIGN AND EVALUATION OF HARMONIC SPEECH ENHANCEMENT AND BANDWIDTH EXTENSION

Author: Venkatasubramanian Arvind
Publication venue: Scholarship@Western
Publication date: 01/01/2011

Improving the quality and intelligibility of speech signals continues to be an important topic in mobile communications and hearing aid applications. This thesis explored the possibilities of improving the quality of corrupted speech by cascading a log Minimum Mean Square Error (logMMSE) noise reduction system with a Harmonic Speech Enhancement (HSE) system. In HSE, an adaptive comb filter is deployed to harmonically filter the useful speech signal and suppress the noisy components to the noise floor. A Bandwidth Extension (BWE) algorithm was applied to the enhanced speech for further improvements in speech quality. Performance of this algorithm combination was evaluated using objective speech quality metrics across a variety of noisy and reverberant environments. Results showed that the logMMSE and HSE combination enhanced the speech quality in any reverberant environment and in the presence of multi-talker babble. The objective improvements associated with the BWE were found to be minima

Scholarship@Western

LACE: A light-weight, causal model for enhancing coded speech through adaptive convolutions

Author: Büthe Jan
Mustafa Ahmed
Valin Jean-Marc
Publication date: 13/07/2023

Classical speech coding uses low-complexity postfilters with zero lookahead to enhance the quality of coded speech, but their effectiveness is limited by their simplicity. Deep Neural Networks (DNNs) can be much more effective, but require high complexity and model size, or added delay. We propose a DNN model that generates classical filter kernels on a per-frame basis with a model of just 300 K parameters and 100 MFLOPS complexity, which is a practical complexity for desktop or mobile device CPUs. The lack of added delay allows it to be integrated into the Opus codec, and we demonstrate that it enables effective wideband encoding for bitrates down to 6 kb/s. Comment: 5 pages, accepted at WASPAA 202

arXiv.org e-Print Archive

Fundamental Frequency and Model Order Estimation Using Spatial Filtering

Author: Christensen Mads Græsbøll
Jensen Jesper Rindom
Karimian-Azari Sam
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2014

Crossref

VBN

Model-based Analysis and Processing of Speech and Audio Signals

Author: Christensen Mads Græsbøll
Publication venue: Aalborg Universitetsforlag
Publication date: 01/01/2020

VBN

Speech Enhancement By Exploiting The Baseband Phase Structure Of Voiced Speech For Effective Non-Stationary Noise Estimation

Author: Patil Sanjay
Publication venue: Clemson University Libraries
Publication date: 01/12/2013

Speech enhancement is one of the most important and challenging issues in the speech communication and signal processing field. It aims to minimize the effect of additive noise on the quality and intelligibility of the speech signal. Speech quality is the measure of the noise remaining after processing of the speech signal and of how pleasant the resulting speech sounds, while intelligibility refers to the accuracy of understanding speech. Speech enhancement algorithms are designed to remove the additive noise with minimum speech distortion. The task of speech enhancement is challenging due to the lack of knowledge about the corrupting noise. Hence, the most challenging task is to estimate the noise that degrades the speech. Several approaches have been adopted for noise estimation, which mainly fall under two categories: single channel algorithms and multiple channel algorithms. Accordingly, speech enhancement algorithms are also broadly classified as single and multiple channel enhancement algorithms. In this thesis, speech enhancement is studied in the acoustic and modulation domains, along with both amplitude and phase enhancement. We propose a noise estimation technique based on spectral sparsity, detected by using the harmonic property of voiced segments of speech. We estimate the frame-to-frame phase difference for the clean speech from the available corrupted speech. This estimated frame-to-frame phase difference is used as a means of detecting the noise-only frequency bins even in voiced frames. This gives better noise estimation for highly non-stationary noises such as babble, restaurant and subway noise. This noise estimation, along with the phase difference as an additional prior, is used to extend the standard spectral subtraction algorithm. We also verify the effectiveness of this noise estimation technique when used with the Minimum Mean Squared Error Short Time Spectral Amplitude Estimator (MMSE STSA) speech enhancement algorithm. The combination of MMSE STSA and spectral subtraction results in further improvement of speech quality

Clemson University: TigerPrints

Pitch-scaled estimation of simultaneous voiced and turbulence-noise components in speech

Author: C.H. Shadle
P.J.B. Jackson
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Nonlinear Spectral Subtraction Berbasis Tsallis Statistics Untuk Peningkatan Kualitas Sinyal Ucapan

Author: Pardede H. F. (Hilman)
Publication venue: Indonesian Institute of Sciences
Publication date: 01/01/2013

The presence of noise reduces the quality and intelligibility of speech signals, which in turn degrades the performance of speech-based applications. Spectral subtraction is a popular method for removing such noise. However, spectral subtraction has a weakness: it introduces musical noise. Many variants of spectral subtraction have been proposed to reduce musical noise. One of them uses oversubtraction in the spectral subtraction formulation. This approach is called nonlinear spectral subtraction. However, the oversubtraction factor is determined heuristically. Using Tsallis statistics, nonlinear subtraction can be derived mathematically. A new variant of spectral subtraction, called q-spectral subtraction (q-SS), has been derived. This method has proven effective at improving the robustness of speech recognition systems to noise. However, its ability to improve speech signal quality in speech enhancement has not yet been investigated. In this paper, the performance of q-SS for speech enhancement is investigated. The experiments show that q-SS improves speech signal quality better than other spectral subtraction methods
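
As an illustration of the oversubtraction idea this abstract refers to, here is a minimal sketch of generic power spectral subtraction with a heuristically chosen oversubtraction factor and a spectral floor. It is not the q-SS method of the paper: the function name `oversubtract`, the parameters `alpha` and `beta`, and their default values are assumptions made for this example only.

```python
def oversubtract(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Power spectral subtraction with oversubtraction and a spectral floor.

    noisy_power, noise_power: per-bin power values for one frame (same length).
    alpha: oversubtraction factor (hypothetical default, picked heuristically).
    beta: spectral floor as a fraction of the noise power (hypothetical default).
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("spectra must have the same number of bins")
    enhanced = []
    for y, n in zip(noisy_power, noise_power):
        # Subtract alpha times the noise estimate, then clamp to the floor
        # so that no bin becomes negative.
        enhanced.append(max(y - alpha * n, beta * n))
    return enhanced


# Toy frame with three frequency bins and a flat noise estimate.
frame = oversubtract([10.0, 1.5, 4.0], [1.0, 1.0, 1.0])
print(frame)
```

In this toy frame the second bin falls below the scaled noise estimate, so it is clamped to the floor rather than going negative. A larger `alpha` tends to suppress more residual noise at the cost of more speech distortion, which is why a purely heuristic choice of this factor is a limitation.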

Neliti

INKOM Journal

Offline and real time noise reduction in speech signals using the discrete wavelet packet decomposition

Author: Oktar Mehmet Alper

This thesis describes the development of an offline and real time wavelet based speech enhancement system to process speech corrupted with various amounts of white Gaussian noise and other different noise types

UWE Bristol Research Repository

Fundamental Frequency and Direction-of-Arrival Estimation for Multichannel Speech Enhancement

Author: Karimian-Azari Sam
Publication venue: Aalborg Universitetsforlag
Publication date: 01/01/2016

VBN