1,968 research outputs found

    Adaptive wavelet thresholding with robust hybrid features for text-independent speaker identification system

    Robustness to additive channel noise is crucial for real-world speaker identification (SID) systems. The features extracted from each speech frame are essential to building a reliable identification system. Such systems work well in clean environments, but in noisy environments additive noise degrades their performance. To mitigate additive noise and achieve high identification accuracy, this paper presents a feature extraction algorithm based on speech enhancement and combined features. A wavelet thresholding pre-processing stage and feature warping (FW) are used with two combined features, power normalized cepstral coefficients (PNCC) and gammatone frequency cepstral coefficients (GFCC), to improve the robustness of the identification system against different types of additive noise. A Universal Background Model Gaussian Mixture Model (UBM-GMM) is used for feature matching between the claimed and actual speakers. The results show that the proposed feature extraction algorithm improves identification performance compared with conventional features for most noise types and SNR levels.
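The feature warping step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common rank-based formulation, in which each cepstral stream is mapped toward a standard normal distribution over a sliding window; the function name `feature_warp`, the window length, and the edge handling are assumptions and are not taken from the paper.

```python
from statistics import NormalDist

import numpy as np


def feature_warp(stream, win=301):
    """Warp one feature stream (1-D array over frames) toward N(0, 1).

    Hypothetical helper: window length and edge handling are assumptions.
    """
    stream = np.asarray(stream, dtype=float)
    n = stream.size
    half = win // 2
    std_normal = NormalDist()  # standard normal, mean 0 and sigma 1
    warped = np.empty(n)
    for t in range(n):
        # Sliding window centred on frame t, truncated at the signal edges.
        lo, hi = max(0, t - half), min(n, t + half + 1)
        window = stream[lo:hi]
        # Ascending rank of the current frame inside its window (1-based).
        rank = np.sum(window < stream[t]) + 1
        # Map the rank to the matching quantile of the standard normal.
        warped[t] = std_normal.inv_cdf((rank - 0.5) / window.size)
    return warped
```

In practice each cepstral dimension would be warped independently, for example one call per column of a frames-by-coefficients matrix.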

    A New Wavelet Denoising Method for Noise Threshold

    This study proposes a new low-complexity adaptive threshold for wavelet denoising of 1-D signals, computed for each sub-band from the noisy input speech and suited to the range of human hearing and of hearing tests. The new method: 1) calculates thresholds with a new low-complexity procedure; 2) uses a separate threshold for each sub-band; 3) divides the signal into three sub-bands according to the range of human hearing and the range of hearing tests, which are often displayed in the form of an audiogram; 4) uses a new denoising algorithm that depends on the attributes of the signal's wavelet coefficients; 5) applies denoising to the detail coefficients. Speech enhancement is performed by thresholding the adapted wavelet coefficients, and the results are compared across a variety of noisy speech signals and four well-known benchmark signals. The results, measured objectively by signal-to-noise ratio (SNR) and mean square error (MSE), are given for additive white Gaussian noise as well as two different types of noisy environment. The new method, called Adaptive Thresholding with Mean for hybrid Denoising method of hard and soft function (ATMDe), is applied to hearing loss; it increases the SNR by more than 114% and decreases the MSE. On both SNR and MSE, the new method performs better than standard denoising methods, and its adaptive threshold value is better than those of other methods.
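The hard and soft thresholding functions that the abstract above combines are standard and can be sketched as follows. The blend in `hybrid_threshold` and its `alpha` parameter are purely hypothetical illustrations of mixing the two rules; the abstract does not specify the ATMDe rule or how its thresholds are computed, so this is not the authors' method.

```python
import numpy as np


def hard_threshold(coeffs, thr):
    """Keep coefficients whose magnitude exceeds thr, zero the rest."""
    coeffs = np.asarray(coeffs, dtype=float)
    return np.where(np.abs(coeffs) > thr, coeffs, 0.0)


def soft_threshold(coeffs, thr):
    """Zero small coefficients and shrink the survivors toward zero by thr."""
    coeffs = np.asarray(coeffs, dtype=float)
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - thr, 0.0)


def hybrid_threshold(coeffs, thr, alpha=0.5):
    """Hypothetical blend: alpha=1 is pure hard, alpha=0 is pure soft."""
    return (alpha * hard_threshold(coeffs, thr)
            + (1.0 - alpha) * soft_threshold(coeffs, thr))
```

In a sub-band scheme like the one described, such a function would be applied to each band's detail coefficients with that band's own threshold.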

    Speech Signal Enhancement through Adaptive Wavelet Thresholding

    This paper demonstrates the application of the Bionic Wavelet Transform (BWT), an adaptive wavelet transform derived from a non-linear auditory model of the cochlea, to the task of speech signal enhancement. Results, measured objectively by Signal-to-Noise ratio (SNR) and Segmental SNR (SSNR) and subjectively by Mean Opinion Score (MOS), are given for additive white Gaussian noise as well as four different types of realistic noise environments. Enhancement is accomplished through the use of thresholding on the adapted BWT coefficients, and the results are compared to a variety of speech enhancement techniques, including Ephraim Malah filtering, iterative Wiener filtering, and spectral subtraction, as well as to wavelet denoising based on a perceptually scaled wavelet packet transform decomposition. Overall results indicate that SNR and SSNR improvements for the proposed approach are comparable to those of the Ephraim Malah filter, with BWT enhancement giving the best results of all methods for the noisiest (−10 dB and −5 dB input SNR) conditions. Subjective measurements using MOS surveys across a variety of 0 dB SNR noise conditions indicate enhancement quality competitive with but still lower than results for Ephraim Malah filtering and iterative Wiener filtering, but higher than the perceptually scaled wavelet method
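The two objective measures named in the abstract above, SNR and segmental SNR, can be computed roughly as in the sketch below. The frame length, the clamping range of −10 dB to 35 dB, and the small `eps` guard are common conventions assumed here, not values taken from the paper.

```python
import numpy as np


def snr_db(clean, estimate):
    """Global SNR in dB of an estimate against the clean reference."""
    clean = np.asarray(clean, dtype=float)
    noise = clean - np.asarray(estimate, dtype=float)
    return 10.0 * np.log10(np.sum(clean**2) / np.sum(noise**2))


def segmental_snr_db(clean, estimate, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi] (assumed range)."""
    clean = np.asarray(clean, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    eps = 1e-12  # guards against log of zero in silent or perfect frames
    values = []
    for start in range(0, clean.size - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = s - estimate[start:start + frame_len]
        frame_snr = 10.0 * np.log10((np.sum(s**2) + eps) / (np.sum(e**2) + eps))
        values.append(np.clip(frame_snr, lo, hi))
    return float(np.mean(values))
```

Segmental SNR averages in the log domain per frame, so quiet frames count as much as loud ones, which is why it is often reported alongside global SNR for speech.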

    Translation-Invariant Shrinkage/Thresholding of Group Sparse Signals

    This paper addresses signal denoising when large-amplitude coefficients form clusters (groups). The L1-norm and other separable sparsity models do not capture the tendency of coefficients to cluster (group sparsity). This work develops an algorithm, called 'overlapping group shrinkage' (OGS), based on the minimization of a convex cost function involving a group-sparsity promoting penalty function. The groups are fully overlapping so the denoising method is translation-invariant and blocking artifacts are avoided. Based on the principle of majorization-minimization (MM), we derive a simple iterative minimization algorithm that reduces the cost function monotonically. A procedure for setting the regularization parameter, based on attenuating the noise to a specified level, is also described. The proposed approach is illustrated on speech enhancement, wherein the OGS approach is applied in the short-time Fourier transform (STFT) domain. The denoised speech produced by OGS does not suffer from musical noise. Comment: 33 pages, 7 figures, 5 tables
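A minimal 1-D sketch of an MM iteration of the kind described in the abstract above is given below. It assumes the cost 0.5·||y − x||² + λ·Σ (Euclidean norm of each fully overlapping group of K samples), applied to a real-valued signal rather than STFT coefficients. The function names, the default parameters, and the `eps` smoothing are assumptions, and the paper's procedure for choosing λ is not reproduced.

```python
import numpy as np


def ogs_denoise(y, group_size=3, lam=0.5, n_iter=25, eps=1e-12):
    """Overlapping-group shrinkage sketch via majorization-minimization."""
    y = np.asarray(y, dtype=float)
    h = np.ones(group_size)
    x = y.copy()
    for _ in range(n_iter):
        # Euclidean norm of every fully overlapping group (zero-padded edges);
        # eps avoids division by zero once a group has shrunk to zero.
        group_norms = np.sqrt(np.convolve(x**2, h, mode="full") + eps)
        # For each sample, sum 1/norm over all groups that contain it.
        weights = np.convolve(1.0 / group_norms, h, mode="valid")
        # Minimizer of the quadratic majorizer: sample-wise attenuation of y.
        x = y / (1.0 + lam * weights)
    return x


def ogs_cost(x, y, group_size=3, lam=0.5):
    """Cost assumed above: data fidelity plus overlapping group-norm penalty."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    group_norms = np.sqrt(np.convolve(x**2, np.ones(group_size), mode="full"))
    return 0.5 * np.sum((y - x) ** 2) + lam * np.sum(group_norms)
```

Because every sample is attenuated by a factor that depends on the energy of its neighbours, isolated small values are suppressed strongly while clustered large values are mostly preserved.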