
392 research outputs found

Adaptive wavelet thresholding with robust hybrid features for text-independent speaker identification system

Author: Alabbasi Hesham A.
Hasan Fadhil S.
Jalil Ali M.
Publication venue: 'Institute of Advanced Engineering and Science'
Publication date: 01/10/2020

The robustness of a speaker identification system over an additive noise channel is crucial for real-world applications. In speaker identification (SID) systems, the features extracted from each speech frame are an essential factor in building a reliable identification system. In clean environments the identification system works well; in noisy environments, additive noise degrades its performance. To mitigate the effect of additive noise and achieve high accuracy in speaker identification, a feature extraction algorithm based on speech enhancement and combined features is presented. In this paper, a wavelet thresholding pre-processing stage and feature warping (FW) techniques are used with two combined features, power normalized cepstral coefficients (PNCC) and gammatone frequency cepstral coefficients (GFCC), to improve the robustness of the identification system against different types of additive noise. A Universal Background Model Gaussian Mixture Model (UBM-GMM) is used for feature matching between the claimed and actual speakers. The results show a performance improvement for the proposed feature extraction algorithm compared with conventional features over most noise types and signal-to-noise ratios (SNRs).
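
The wavelet thresholding pre-processing stage named in this abstract can be illustrated with a minimal sketch. It assumes a one-level Haar transform and a fixed soft threshold, neither of which is stated in the abstract; the paper's adaptive threshold rule is not reproduced here, and all function names are hypothetical.

```python
import math

def haar_dwt(signal):
    """One-level Haar transform of an even-length sequence.

    Returns (approximation, detail) coefficient lists.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * s, (a - d) * s])
    return out

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def denoise(signal, threshold):
    """Threshold only the detail coefficients (assumed design choice)."""
    approx, detail = haar_dwt(signal)
    return haar_idwt(approx, soft_threshold(detail, threshold))
```

With a threshold of zero the signal is reconstructed unchanged; a larger threshold removes small sample-to-sample fluctuations while keeping the local averages.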

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institute of Advanced Engineering and Science

Wavelet-based techniques for speech recognition

Author: Omar Farooq (7204418)
Publication date: 01/01/2002

In this thesis, new wavelet-based techniques have been developed for the extraction of features from speech signals for the purpose of automatic speech recognition (ASR). One of the advantages of the wavelet transform over the short-time Fourier transform (STFT) is its capability to process non-stationary signals. Since speech signals are not strictly stationary, the wavelet transform is a better choice for time-frequency transformation of these signals. In addition, it has compactly supported basis functions, thereby reducing the amount of computation compared with the STFT, where an overlapping window is needed. [Continues.

Loughborough University Institutional Repository

Wavelet speech enhancement based on time-scale adaptation

Author: Mohammed Bahoura
Jean Rouat
Publication venue: 'Elsevier BV'
Publication date: 01/01/2006

We propose a new speech enhancement method based on time and scale adaptation of wavelet thresholds. The time dependency is introduced by approximating the Teager energy of the wavelet coefficients, while the scale dependency is introduced by extending the principle of level-dependent thresholding to wavelet packet thresholding. This technique does not require an explicit estimate of the noise level or a priori knowledge of the SNR, as most popular enhancement methods do. Performance of the proposed method is evaluated on speech recorded in real conditions (plane, sawmill, tank, subway, babble, car, exhibition hall, restaurant, street, airport, and train station) and on speech with artificially added noise. Mel-scale decomposition based on wavelet packets is also compared to the common wavelet packet scale. Comparison in terms of signal-to-noise ratio (SNR) is reported for time-adaptive and time-scale-adaptive thresholding of the wavelet coefficients. Visual inspection of spectrograms and listening experiments are also used to support the results. Speech recognition experiments based on hidden Markov models are conducted on the AURORA-2 database and show that the proposed method improves speech recognition rates at low SNRs.
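
The Teager energy mentioned in this abstract is commonly computed with the discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1]. The sketch below shows only that standard operator; how the paper approximates it on wavelet coefficients and maps it to a time-varying threshold is not given in the abstract, and the function name is hypothetical.

```python
import math

def teager_energy(x):
    """Discrete Teager energy operator.

    Returns psi[n] = x[n]**2 - x[n-1] * x[n+1] for the interior
    samples, so the output is two samples shorter than the input.
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# For a pure cosine A*cos(w*n) the operator is constant: A**2 * sin(w)**2.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(50)]
energy = teager_energy(tone)
```

Because the operator responds to both amplitude and frequency, it tends to be larger where coefficients carry structured signal than where they carry low-level noise, which is the general motivation for using it to adapt thresholds over time.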

Crossref

Savoirs UdeS

Speech enhancement by perceptual adaptive wavelet de-noising

Author: Xu Lan
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2007

This thesis summarizes and compares existing wavelet de-noising methods. The most popular methods of wavelet transform, adaptive thresholding, and musical noise suppression are analyzed theoretically and evaluated through Matlab simulation. Based on this work, a new speech enhancement system using adaptive wavelet de-noising is proposed. Each step of standard wavelet thresholding is improved by optimized adaptive algorithms. The quantile-based adaptive noise estimate and the a posteriori SNR-based threshold adjuster complement each other. Their combination integrates the advantages of the two approaches and balances noise removal against speech preservation. To improve the final perceptual quality, an innovative musical noise analysis and smoothing algorithm and a Teager Energy Operator-based silent segment smoothing module are also introduced into the system. The experimental results demonstrate the capability of the proposed system in both stationary and non-stationary noise environments.

Scholarship at UWindsor

ROBUST HYBRID FEATURES BASED TEXT INDEPENDENT SPEAKER IDENTIFICATION SYSTEM OVER NOISY ADDITIVE CHANNEL

Author: Ali Muayad Jalil
Fadhel Sahib Hasan
Hesham Adnan Alabbasi
Publication venue: Mustansiriyah University/College of Engineering
Publication date: 01/07/2020

Robustness of speaker identification systems to additive noise is crucial for real-world applications. In this paper, two robust features, Power Normalized Cepstral Coefficients (PNCC) and Gammatone Frequency Cepstral Coefficients (GFCC), are combined to improve the robustness of a speaker identification system over different types of noise. A Universal Background Model Gaussian Mixture Model (UBM-GMM) is used for feature matching and as a classifier to identify the claimed speakers. Evaluation results show that the proposed hybrid feature improves the performance of the identification system compared with conventional features over most types of noise and different signal-to-noise ratios.

Directory of Open Access Journals

Paper on Frequency based audio Noise Reduction using Butter Worth, Chebyshev & Elliptical Filters

Author: Er. Mannu Singla, Mr. Harpal Singh
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/10/2015

An audio noise reduction system removes noise from an audio signal. Audio noise reduction systems use filters to remove noise. Filters manipulate the amplitude and/or phase response of a signal according to its frequency. They are basic components of all signal processing and telecommunication systems. There are two kinds of filters: fixed and tunable. Fixed filters are those in which the passband and stopband frequencies are fixed, whereas in tunable filters the passband and stopband frequencies are variable and can be changed according to the requirements of the application. Tunable digital filters are widely employed in telecommunications, medical electronics, digital audio equipment, and control systems, which makes them a basic tool for removing noise from audio signals.
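
The difference between a fixed and a tunable filter can be illustrated with the textbook magnitude response of an analog Butterworth low-pass filter, |H(f)| = 1 / sqrt(1 + (f / fc)^(2n)). This is a minimal sketch under that standard formula; the cutoff values, order, and function name are illustrative assumptions, not taken from the paper.

```python
import math

def butterworth_lowpass_gain(freq_hz, cutoff_hz, order):
    """Magnitude response of an ideal analog Butterworth low-pass filter."""
    return 1.0 / math.sqrt(1.0 + (freq_hz / cutoff_hz) ** (2 * order))

# "Tuning" the filter means changing cutoff_hz: the same 2 kHz component
# is strongly attenuated with a 1 kHz cutoff but passes with a 4 kHz cutoff.
gain_narrow = butterworth_lowpass_gain(2000.0, 1000.0, order=4)
gain_wide = butterworth_lowpass_gain(2000.0, 4000.0, order=4)
```

At the cutoff frequency the gain is always 1/sqrt(2) (about -3 dB) regardless of order; a higher order only makes the transition from passband to stopband steeper.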

International Journal on Recent and Innovation Trends in Computing and Communication

Audio source separation with one sensor for robust speech recognition

Author: Benaroya Laurent
Bimbot Frédéric
Gravier Guillaume
Gribonval Rémi
Publication venue: HAL CCSD
Publication date: 20/05/2003

In this paper, we address the problem of noise compensation in speech signals for robust speech recognition. Several classical denoising methods from the field of speech and signal processing are compared on speech corrupted by music, which corresponds to a frequent situation in broadcast news transcription tasks. We also present two new source separation techniques, namely adaptive Wiener filtering and adaptive shrinkage. These techniques rely on a dictionary of spectral shapes to deal with the non-stationarity of the signals. The algorithms are first compared on the source separation task and assessed in terms of average distortion. Their effect on the entire transcription system is then compared in terms of word error rate. Results show that the proposed adaptive Wiener filter approach yields a significant improvement in transcription accuracy at signal-to-noise ratios greater than 15 dB.
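
The Wiener filtering idea behind this abstract can be sketched with the classical per-frequency gain G = S / (S + N), where S and N are the estimated power spectra of speech and interference. This shows only the basic gain rule; the paper's adaptive variant, which estimates the spectra from a dictionary of spectral shapes, is not reproduced, and the function names are hypothetical.

```python
def wiener_gain(speech_power, noise_power):
    """Per-bin Wiener gain S / (S + N); bins with no energy get gain 0."""
    gains = []
    for s, n in zip(speech_power, noise_power):
        total = s + n
        gains.append(s / total if total > 0.0 else 0.0)
    return gains

def apply_gain(mixture_spectrum, gains):
    """Scale each bin of the mixture spectrum by its gain."""
    return [g * x for g, x in zip(gains, mixture_spectrum)]
```

Bins dominated by speech keep most of their energy (gain near 1), while bins dominated by the interfering source are suppressed (gain near 0).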

INRIA a CCSD electronic archive server