392 research outputs found

    Adaptive wavelet thresholding with robust hybrid features for text-independent speaker identification system

    Get PDF
    The robustness of speaker identification system over additive noise channel is crucial for real-world applications. In speaker identification (SID) systems, the extracted features from each speech frame are an essential factor for building a reliable identification system. For clean environments, the identification system works well; in noisy environments, there is an additive noise, which is affect the system. To eliminate the problem of additive noise and to achieve a high accuracy in speaker identification system a proposed algorithm for feature extraction based on speech enhancement and a combined features is presents. In this paper, a wavelet thresholding pre-processing stage, and feature warping (FW) techniques are used with two combined features named power normalized cepstral coefficients (PNCC) and gammatone frequency cepstral coefficients (GFCC) to improve the identification system robustness against different types of additive noises. Universal Background Model Gaussian Mixture Model (UBM-GMM) is used for features matching between the claim and actual speakers. The results showed performance improvement for the proposed feature extraction algorithm of identification system comparing with conventional features over most types of noises and different SNR ratios

    Wavelet-based techniques for speech recognition

    Get PDF
    In this thesis, new wavelet-based techniques have been developed for the extraction of features from speech signals for the purpose of automatic speech recognition (ASR). One of the advantages of the wavelet transform over the short time Fourier transform (STFT) is its capability to process non-stationary signals. Since speech signals are not strictly stationary the wavelet transform is a better choice for time-frequency transformation of these signals. In addition it has compactly supported basis functions, thereby reducing the amount of computation as opposed to STFT where an overlapping window is needed. [Continues.

    Wavelet speech enhancement based on time-scale adaptation

    Get PDF
    Abstract : We propose a new speech enhancement method based on time and scale adaptation of wavelet thresholds. The time dependency is introduced by approximating the Teager Energy of the wavelet coefficients, while the scale dependency is introduced by extending the principle of level dependent threshold to Wavelet Packet Thresholding. This technique does not require an explicit estimation of the noise level or of the apriori knowledge of the SNR, as is usually needed in most of the popular enhancement methods. Performance of the proposed method is evaluated on speech recorded in real conditions (plane, sawmill, tank, subway, babble, car, exhibition hall, restaurant, street, airport, and train station) and artificially added noise. MELscale decomposition based on wavelet packets is also compared to the common wavelet packet scale. Comparison in terms of Signal-to-Noise Ratio (SNR) is reported for time adaptation and time-scale adaptation thresholding of the wavelet coefficients thresholding. Visual inspection of spectrograms and listening experiments are also used to support the results. Hidden Markov Models Speech recognition experiments are conducted on the AURORA–2 database and show that the proposed method improves the speech recognition rates for low SNRs

    Speech enhancement by perceptual adaptive wavelet de-noising

    Get PDF
    This thesis work summarizes and compares the existing wavelet de-noising methods. Most popular methods of wavelet transform, adaptive thresholding, and musical noise suppression have been analyzed theoretically and evaluated through Matlab simulation. Based on the above work, a new speech enhancement system using adaptive wavelet de-noising is proposed. Each step of the standard wavelet thresholding is improved by optimized adaptive algorithms. The Quantile based adaptive noise estimate and the posteriori SNR based threshold adjuster are compensatory to each other. The combination of them integrates the advantages of these two approaches and balances the effects of noise removal and speech preservation. In order to improve the final perceptual quality, an innovative musical noise analysis and smoothing algorithm and a Teager Energy Operator based silent segment smoothing module are also introduced into the system. The experimental results have demonstrated the capability of the proposed system in both stationary and non-stationary noise environments

    ROBUST HYBRID FEATURES BASED TEXT INDEPENDENT SPEAKER IDENTIFICATION SYSTEM OVER NOISY ADDITIVE CHANNEL

    Get PDF
    Robustness of speaker identification systems over additive noise is crucial for real-world applications. In this paper, two robust features named Power Normalized Cepstral Coefficients (PNCC) and Gammatone Frequency Cepstral Coefficients (GFCC) are combined together to improve the robustness of speaker identification system over different types of noise. Universal Background Model Gaussian Mixture Model (UBM-GMM) is used as a feature matching and a classifier to identify the claim speakers. Evaluation results show that the proposed hybrid feature improves the performance of identification system when compared to conventional features over most types of noise and different signal-to-noise ratios

    Paper on Frequency based audio Noise Reduction using Butter Worth, Chebyshev & Elliptical Filters

    Get PDF
    Noise from the audio signal is removed by using Audio Noise Reduction System. Audio noise reduction systems uses filters for removal of noise. Filters are the manipulation of the amplitude and/or phase response of a signal according to their frequency. These are the basic components of all signal processing and -telecommunication systems. There are two kinds of filters- fixed and tunable. Fixed filters are those in which passband frequencies and stopband frequencies are fixed whereas in case of tunable filters, passband and stopband frequencies are variable. These frequencies can be changed according to the requirement of the applications. Tunable digital filters are widely employed in telecommunications, medical electronics, digital audio equipment and control systems. This is the basic need for removal of noise from the audio signal

    Audio source separation with one sensor for robust speech recognition

    Get PDF
    International audienceIn this paper, we address the problem of noise compensation in speech signals for robust speech recognition. Several classical denoising methods in the field of speech and signal processing are compared on speech corrupted by music, which correspond to a frequent situation in broadcast news transcription tasks. We also present two new source separation techniques, namely adaptive Wiener filtering and adaptive shrinkage. These techniques rely on the use of a dictionary of spectral shapes to deal with the non stationarity of the signals. The algorithms are first compared on the source separation task and assessed in terms of average distortion. Their effect on the entire transcription system is eventually compared in terms of word error rate. Results show that the proposed adaptive Wiener filter approach yields a significant improvement of the transcription accuracy at signal/noise ratios greater than 15 dB
    • …
    corecore