
    Wavelet analysis of speech signal

    This paper concerns the wavelet analysis of signals by continuous and discrete wavelet transforms (CWT – Continuous Wavelet Transform, DWT – Discrete Wavelet Transform). The main goal of our work was to develop a program which, through the CWT and DWT analyses, would obtain a graph of time-scale changes and transform it into a spectrum, that is, a graph of frequency changes. The program also computes Fourier Transform and Linear Prediction spectra, so the Wavelet Transform results can be compared with those from the Fourier Transform and Linear Prediction.

    Speech Compression Using Discrete Wavelet Transform

    Speech compression is an area of digital processing that focuses on reducing the bit rate of the speech signal for transmission or storage without significant loss of quality. The wavelet transform has recently been proposed for signal analysis, and speech signal compression using the wavelet transform is given considerable attention in this thesis. Speech coding is a lossy scheme and is implemented here to compress a one-dimensional speech signal. The scheme consists of four operations: the transform, thresholding (by level and global threshold), quantization, and entropy encoding. The reconstruction of the compressed signal and the detailed steps needed are discussed. The performance of wavelet compression is compared against Linear Predictive Coding and Global System for Mobile Communication (GSM) algorithms using SNR, PSNR, NRMSE and compression ratio. Software simulating the lossy compression scheme is developed using Matlab 6. This software provides basic speech analysis as well as the compression and decompression operations. The results obtained show a reasonably high compression ratio and good signal quality.
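The first three operations named in this abstract (transform, global threshold, quantization) and the reconstruction can be sketched as below. This is only an illustrative sketch, not the thesis software: the Haar filter, the decomposition level, the threshold and the quantizer step are all assumptions chosen for brevity, the function names are hypothetical, and the entropy-encoding stage is left out.

```python
import math

def haar_dwt(x):
    """One level of the orthonormal Haar DWT (assumed filter); len(x) must be even."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x

def wavedec(x, levels):
    """Multi-level decomposition, returned as [approx_L, detail_L, ..., detail_1]."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    return [approx] + details[::-1]

def waverec(coeffs):
    """Rebuild the signal from the list produced by wavedec."""
    approx = coeffs[0]
    for d in coeffs[1:]:
        approx = haar_idwt(approx, d)
    return approx

def compress(x, levels=3, threshold=0.1, step=0.05):
    """Transform, zero small detail coefficients (one global threshold),
    then quantize every coefficient uniformly to an integer index."""
    coeffs = wavedec(x, levels)
    quantized = [[round(c / step) for c in coeffs[0]]]
    for band in coeffs[1:]:
        quantized.append([0 if abs(c) < threshold else round(c / step) for c in band])
    return quantized

def decompress(quantized, step=0.05):
    """Dequantize the integer indices and invert the transform."""
    return waverec([[q * step for q in band] for band in quantized])

def snr_db(x, y):
    """Signal-to-noise ratio of reconstruction y against original x, in dB."""
    signal = sum(v * v for v in x)
    noise = sum((a - b) ** 2 for a, b in zip(x, y))
    return float("inf") if noise == 0 else 10 * math.log10(signal / noise)

# Usage on a synthetic tone standing in for a speech frame (64 samples).
frame = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
q = compress(frame)
nonzero = sum(1 for band in q for v in band if v != 0)
print("nonzero coefficients:", nonzero, "of", len(frame))
print("SNR (dB):", snr_db(frame, decompress(q)))
```

The count of nonzero coefficients is only a rough proxy for the achievable compression; an actual compression ratio would depend on the entropy coder, which this sketch does not model.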

    Identification of Transient Speech Using Wavelet Transforms

    It is generally believed that abrupt stimulus changes, which in speech may be time-varying frequency edges associated with consonants, transitions between consonants and vowels and transitions within vowels, are critical to the perception of speech by humans and for speech recognition by machines. Noise affects speech transitions more than it affects quasi-steady-state speech. I believe that identifying and selectively amplifying speech transitions may enhance the intelligibility of speech in noisy conditions. The purpose of this study is to evaluate the use of wavelet transforms to identify speech transitions. Using wavelet transforms may be computationally efficient and allow for real-time applications. The discrete wavelet transform (DWT), stationary wavelet transform (SWT) and wavelet packets (WP) are evaluated. Wavelet analysis is combined with variable frame rate processing to improve the identification process. Variable frame rate can identify time segments when speech feature vectors are changing rapidly and when they are relatively stationary. Energy profiles for words, which show the energy in each node of a speech signal decomposed using wavelets, are used to identify nodes that include predominately transient information and nodes that include predominately quasi-steady-state information, and these are used to synthesize transient and quasi-steady-state speech components. These speech components are estimates of the tonal and nontonal speech components, which Yoo et al. identified using time-varying band-pass filters. Comparison of spectra, a listening test and mean-squared-errors between the transient components synthesized using wavelets and Yoo's nontonal components indicated that wavelet packets identified the best estimates of Yoo's components. An algorithm that incorporates variable frame rate analysis into wavelet packet analysis is proposed. The development of this algorithm involves choosing a wavelet function and a decomposition level. The algorithm itself has four steps: wavelet packet decomposition; classification of terminal nodes; incorporation of variable frame rate processing; synthesis of speech components. Combining wavelet analysis with variable frame rate analysis provides the best estimates of Yoo's speech components.
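The energy profile mentioned in this abstract, that is, the energy in each terminal node of a wavelet packet decomposition, can be illustrated with a short sketch. The Haar filter and the decomposition level used here are assumptions made for brevity (the abstract says both are chosen as part of developing the algorithm), the function names are hypothetical, and the classification, variable frame rate and synthesis steps are not shown.

```python
import math

def haar_split(x):
    """Split a sequence into low-pass and high-pass halves (orthonormal Haar, assumed)."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high

def wavelet_packet_nodes(x, level):
    """Full wavelet packet tree: split every node at every level.
    Returns the 2**level terminal nodes in natural (not frequency) order."""
    nodes = [list(x)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            low, high = haar_split(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes

def energy_profile(nodes):
    """Energy (sum of squared coefficients) in each terminal node."""
    return [sum(c * c for c in node) for node in nodes]

# Usage: a steady tone with a short click added, standing in for a word.
signal = [math.sin(2 * math.pi * 2 * n / 32) for n in range(32)]
signal[16] += 1.0  # abrupt change
for index, energy in enumerate(energy_profile(wavelet_packet_nodes(signal, 2))):
    print("node", index, "energy", energy)
```

Because the Haar transform is orthonormal, the node energies sum to the energy of the input, so the profile shows how the signal energy is distributed across nodes.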

    Analisis Fungsi Wavelet Daubechies untuk Sinyal Suara dengan Panjang Segmen Berbeda (Analysis of Daubechies Wavelet Functions for Speech Signals with Different Segment Lengths)

    Daubechies wavelets have been widely applied in signal processing, for example in automatic speech recognition systems. The Daubechies wavelet is a wavelet family whose members are distinguished by their order, N. The order N influences the wavelet decomposition: a greater N increases the smoothness of the multiresolution analysis results. However, not every order of Daubechies wavelet gives equally good recognition results, so choosing the order is still a matter of trial and error. It is therefore necessary to determine the order of the Daubechies wavelet basis function for Indonesian voice signals through its similarity level. The similarity level between a speech signal and a Daubechies wavelet function of order N can be determined by calculating their cross-correlation coefficient. The results show that the best Daubechies wavelet basis function is inconsistent across the Indonesian vowels a, i, u, e, è, o, and ò; db45 and db44 are the best Daubechies wavelet basis functions for segment lengths of 2048 and 1024, respectively.

    Wavelet methods in speech recognition

    In this thesis, novel wavelet techniques are developed to improve parametrization of speech signals prior to classification. It is shown that non-linear operations carried out in the wavelet domain improve the performance of a speech classifier and consistently outperform classical Fourier methods. This is because of the localised nature of the wavelet, which captures correspondingly well-localised time-frequency features within the speech signal. Furthermore, by taking advantage of the approximation ability of wavelets, efficient representation of the non-stationarity inherent in speech can be achieved in a relatively small number of expansion coefficients. This is an attractive option when faced with the so-called 'Curse of Dimensionality' problem of multivariate classifiers such as Linear Discriminant Analysis (LDA) or Artificial Neural Networks (ANNs). Conventional time-frequency analysis methods such as the Discrete Fourier Transform either miss irregular signal structures and transients due to spectral smearing or require a large number of coefficients to represent such characteristics efficiently. Wavelet theory offers an alternative insight in the representation of these types of signals. As an extension to the standard wavelet transform, adaptive libraries of wavelet and cosine packets are introduced which increase the flexibility of the transform. This approach is observed to be yet more suitable for the highly variable nature of speech signals in that it results in a time-frequency sampled grid that is well adapted to irregularities and transients. They result in a corresponding reduction in the misclassification rate of the recognition system. However, this is necessarily at the expense of added computing time. Finally, a framework based on adaptive time-frequency libraries is developed which invokes the final classifier to choose the nature of the resolution for a given classification problem. The classifier then performs dimensionality reduction on the transformed signal by choosing the top few features based on their discriminant power. This approach is compared and contrasted to an existing discriminant wavelet feature extractor. The overall conclusions of the thesis are that wavelets and their relatives are capable of extracting useful features for speech classification problems. The use of adaptive wavelet transforms provides the flexibility within which powerful feature extractors can be designed for these types of application.

    Discrete Wavelet Transform Based Cancelable Biometric System for Speaker Recognition

    Biometric template characteristics and privacy conquest are challenging issues. To resolve such limitations, cancelable biometric systems have been introduced. In this paper, an efficient cancelable biometric system based on a cryptosystem is presented. It depends on permutation using a chaotic Baker map and substitution using masks in various transform domains. The feature extraction phase of the proposed cancelable system is based on cepstral analysis of the encrypted speech signal in the time domain combined with the encrypted speech signal in the discrete wavelet transform (DWT) domain. The resultant features are then applied to an artificial neural network for classification. Furthermore, wavelet denoising is used at the receiver side to enhance the proposed system. The cryptosystem provides a robust protection level for the speech template. This speech template can be replaced and recertified if it is breached. Our proposed system enables the generation of various templates from the same speech signal under the constraint of linkability between them. The simulation results confirmed that the proposed cancelable biometric system achieved a higher level of performance than traditional biometric systems, reaching a 97.5% recognition rate at a low signal-to-noise ratio (SNR) of -25 dB and 100% at -15 dB and above.

    A Fully Time-domain Neural Model for Subband-based Speech Synthesizer

    This paper introduces a deep neural network model for a subband-based speech synthesizer. The model benefits from the narrow bandwidth of the subband signals to reduce the complexity of the time-domain speech generator. We employed multi-level wavelet analysis/synthesis to decompose/reconstruct the signal into subbands in the time domain. Inspired by WaveNet, a convolutional neural network (CNN) model predicts subband speech signals fully in the time domain. Due to the narrow bandwidth of the subbands, a simple network architecture is enough to learn the simple patterns of the subbands accurately. In the ground truth experiments with teacher-forcing, the subband synthesizer outperforms the fullband model significantly in terms of both subjective and objective measures. In addition, by conditioning the model on the phoneme sequence using a pronunciation dictionary, we have achieved a fully time-domain neural model for a subband-based text-to-speech (TTS) synthesizer, which is nearly end-to-end. The generated speech of the subband TTS shows quality comparable to that of the fullband one, with a lighter network architecture for each subband.

    A comparison of soft and hard thresholding by using discrete wavelet transforms

    This paper is about reducing noise with an adaptive time-frequency block thresholding procedure using the discrete wavelet transform, to achieve a better SNR for the audio signal. Algorithms based on the discrete wavelet transform are used for audio signal denoising. The resulting algorithm is robust to variations of signal structures such as short transients and long harmonics. Analysis is done on noisy speech signals corrupted by white noise at signal-to-noise ratio levels of 0 dB, 5 dB, 10 dB and 15 dB. Both hard thresholding and soft thresholding are used for denoising. Simulations are performed in MATLAB 7.10.0 (R2010a). In this paper we compare the results of soft thresholding and hard thresholding.
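For readers unfamiliar with the two rules being compared, here is a minimal sketch of the elementary hard and soft thresholding operators applied to a list of wavelet coefficients. It shows only the standard textbook definitions; the adaptive block thresholding procedure of the paper is not reproduced, and the function names and the threshold value are illustrative assumptions.

```python
def hard_threshold(coeffs, t):
    """Keep coefficients whose magnitude exceeds t unchanged; zero the rest."""
    return [c if abs(c) > t else 0.0 for c in coeffs]

def soft_threshold(coeffs, t):
    """Zero coefficients with magnitude at most t; shrink the others toward zero by t."""
    out = []
    for c in coeffs:
        if abs(c) > t:
            out.append((abs(c) - t) * (1.0 if c > 0 else -1.0))
        else:
            out.append(0.0)
    return out

# Usage with an arbitrary threshold of 1.0 (an assumption for illustration).
detail = [-2.0, -0.5, 0.3, 1.5]
print(hard_threshold(detail, 1.0))
print(soft_threshold(detail, 1.0))
```

Hard thresholding leaves the surviving coefficients untouched, while soft thresholding also reduces their magnitude by the threshold, which is the difference a comparison of the two rules turns on.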

    Image compression using discrete cosine transform and wavelet transform and performance comparison

    Image compression deals with reducing the size of an image, which is performed with the help of transforms. In this project we took an input image, applied wavelet techniques for image compression, and compared the result with the popular DCT image compression. The WT provided better results as far as properties like RMS error, image intensity and execution time are concerned. Nowadays techniques based on wavelet theory have emerged in different signal and image processing applications, including speech, image processing and computer vision. In particular, the Wavelet Transform is of interest for the analysis of non-stationary signals. In the WT, short windows are used at high frequencies and long windows at low frequencies. Since the discrete wavelet transform is essentially a sub-band coding system, and sub-band coders have been quite successful in speech and image compression, it is clear that the DWT has potential application in compression problems.