    A new Automatic Formant Tracking approach based on scalogram maxima detection using complex wavelets

    In this paper we present a new formant tracking algorithm in which the formant frequencies are estimated by detecting local maxima of a time-frequency representation. This representation is a scalogram obtained from a complex wavelet transform. The formant frequency candidates are validated as local maxima of the scalogram, which correspond to wavelet ridges. The proposed algorithm then introduces the computation of a center of gravity as a tracking constraint. We tested the new algorithm on synthesized and natural voiced speech signals. The formant trajectories obtained by our algorithm were compared with the manually edited trajectories of our Arabic database, used as the reference, and with those given by a Fourier transform method and by the LPC analysis used in Praat. The comparison showed that, overall, the first three formant trajectories obtained with the complex Morlet wavelet agree with the manually edited formant tracks
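The scalogram-maxima idea in this abstract can be sketched in a few lines. The following is a minimal illustration only, not the authors' algorithm: it computes one time column of a complex Morlet scalogram and keeps the local maxima along frequency as formant candidates. The function names, the `n_cycles` setting, the frequency grid, and the relative threshold are all assumptions made for the example, and the centre-of-gravity tracking constraint is not included.

```python
import cmath
import math


def morlet_scalogram_column(x, fs, n0, freqs, n_cycles=8.0):
    """Magnitude of a complex Morlet transform of x at sample n0, one value per frequency.

    Assumed wavelet: Gaussian-windowed complex exponential whose width is
    n_cycles / (2*pi*f) seconds, normalised by the sum of the window weights.
    """
    column = []
    for f in freqs:
        sigma = n_cycles / (2.0 * math.pi * f)      # time spread in seconds
        half = int(4.0 * sigma * fs)                # truncate the window at 4 sigma
        acc = 0j
        norm = 0.0
        for n in range(max(0, n0 - half), min(len(x), n0 + half + 1)):
            t = (n - n0) / fs
            g = math.exp(-t * t / (2.0 * sigma * sigma))
            acc += x[n] * g * cmath.exp(-2j * math.pi * f * t)
            norm += g
        column.append(abs(acc) / norm)
    return column


def scalogram_peaks(freqs, column, rel_threshold=0.1):
    """Frequencies of local maxima that reach rel_threshold times the column maximum."""
    floor = rel_threshold * max(column)
    return [
        freqs[i]
        for i in range(1, len(column) - 1)
        if column[i] >= floor and column[i] > column[i - 1] and column[i] >= column[i + 1]
    ]


if __name__ == "__main__":
    fs = 8000
    # Two steady sinusoids stand in for two formants (synthetic test signal).
    x = [math.cos(2 * math.pi * 500 * n / fs) + math.cos(2 * math.pi * 1500 * n / fs)
         for n in range(800)]
    freqs = [200.0 + 25.0 * i for i in range(113)]
    print(scalogram_peaks(freqs, morlet_scalogram_column(x, fs, 400, freqs)))
```

Repeating this for successive values of `n0` gives candidate tracks over time. A real tracker would still need a continuity rule between frames, which is the role the abstract assigns to the centre of gravity.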

    An Evaluation of Formant Tracking methods on an Arabic Database

    In this paper we present a formant database of Arabic used to evaluate our new automatic formant tracking algorithm based on Fourier ridge detection. In this method we introduce a continuity constraint based on the computation of centres of gravity for a set of formant candidates. This connects each speech frame to its neighbours and thus improves the robustness of tracking. The formant trajectories obtained by the proposed algorithm are compared with those of the hand-edited formant database and with those given by Praat with LPC data

    Power-Weighted LPC Formant Estimation

    A power-weighted formant frequency estimation procedure based on Linear Predictive Coding (LPC) is presented. It works by pre-emphasizing the dominant spectral components of an input signal, which allows a subsequent estimation step to extract formant frequencies with greater accuracy. The accuracy of traditional LPC formant estimation is improved by this new power-weighted formant estimator for different classes of synthetic signals and for speech. Power-weighted LPC significantly and reliably outperforms LPC and variants of LPC at the task of formant estimation using the VTR formants dataset, a database consisting of the Vocal Tract Resonance (VTR) frequency trajectories obtained by human experts for the first three formant frequencies. This performance gain is evident over a range of filter orders
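For orientation, the conventional LPC baseline that this abstract improves on can be sketched as follows. This is not the power-weighted estimator, whose details the abstract does not give. It is a plain autocorrelation-method LPC fit (Levinson-Durbin recursion) followed by peak picking on the all-pole envelope. All function names, the filter order, and the 5 Hz frequency grid are assumptions for illustration, and the test signal is a synthetic two-resonator impulse response rather than speech.

```python
import cmath
import math


def resonator_cascade(formants, bandwidths, fs, n):
    """Impulse response of cascaded two-pole resonators (synthetic all-pole test signal)."""
    x = [1.0] + [0.0] * (n - 1)
    for f, bw in zip(formants, bandwidths):
        r = math.exp(-math.pi * bw / fs)
        c1 = 2.0 * r * math.cos(2.0 * math.pi * f / fs)
        c2 = -r * r
        y = []
        for i in range(n):
            y1 = y[i - 1] if i >= 1 else 0.0
            y2 = y[i - 2] if i >= 2 else 0.0
            y.append(x[i] + c1 * y1 + c2 * y2)
        x = y
    return x


def autocorrelation(x, max_lag):
    """Biased autocorrelation sums for lags 0..max_lag."""
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag)) for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns [1, a1, ..., ap] of A(z) = 1 + sum a_k z^-k."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def lpc_formants(x, fs, order, step_hz=5.0):
    """Formant estimates as local maxima of the LPC envelope 1/|A(e^jw)| on a frequency grid."""
    a = levinson_durbin(autocorrelation(x, order), order)
    n_points = int((fs / 2.0) / step_hz) + 1
    freqs = [i * step_hz for i in range(n_points)]
    env = []
    for f in freqs:
        w = 2.0 * math.pi * f / fs
        env.append(1.0 / abs(sum(a[k] * cmath.exp(-1j * w * k) for k in range(order + 1))))
    return [freqs[i] for i in range(1, len(env) - 1)
            if env[i] > env[i - 1] and env[i] >= env[i + 1]]


if __name__ == "__main__":
    signal = resonator_cascade([700.0, 1800.0], [80.0, 80.0], 8000, 800)
    print(lpc_formants(signal, 8000, order=4))
```

Peak picking on the envelope is used here instead of polynomial root finding only to keep the sketch free of external dependencies. On real speech the frame would normally be pre-emphasised and windowed first, and a higher order would be needed.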

    Evaluation d'une nouvelle méthode de suivi de formants sur un corpus Arabe

    This paper develops a formant tracking technique based on Fourier ridge detection. In this method we introduce a tracking constraint based on the computation of the centre of gravity for a set of formant frequency candidates, which connects each speech frame to its neighbours and thus improves the robustness of tracking. The formant trajectories obtained by the proposed algorithm are compared with those of a hand-edited Arabic formant database, created especially for this work, and with those given by Praat with LPC data

    Wavelet methods in speech recognition

    In this thesis, novel wavelet techniques are developed to improve parametrization of speech signals prior to classification. It is shown that non-linear operations carried out in the wavelet domain improve the performance of a speech classifier and consistently outperform classical Fourier methods. This is because of the localised nature of the wavelet, which captures correspondingly well-localised time-frequency features within the speech signal. Furthermore, by taking advantage of the approximation ability of wavelets, efficient representation of the non-stationarity inherent in speech can be achieved in a relatively small number of expansion coefficients. This is an attractive option when faced with the so-called 'Curse of Dimensionality' problem of multivariate classifiers such as Linear Discriminant Analysis (LDA) or Artificial Neural Networks (ANNs). Conventional time-frequency analysis methods such as the Discrete Fourier Transform either miss irregular signal structures and transients due to spectral smearing or require a large number of coefficients to represent such characteristics efficiently. Wavelet theory offers an alternative insight in the representation of these types of signals. As an extension to the standard wavelet transform, adaptive libraries of wavelet and cosine packets are introduced which increase the flexibility of the transform. This approach is observed to be yet more suitable for the highly variable nature of speech signals in that it results in a time-frequency sampled grid that is well adapted to irregularities and transients. They result in a corresponding reduction in the misclassification rate of the recognition system. However, this is necessarily at the expense of added computing time. Finally, a framework based on adaptive time-frequency libraries is developed which invokes the final classifier to choose the nature of the resolution for a given classification problem. 
The classifier then performs dimensionality reduction on the transformed signal by choosing the top few features based on their discriminant power. This approach is compared and contrasted with an existing discriminant wavelet feature extractor. The overall conclusions of the thesis are that wavelets and their relatives are capable of extracting useful features for speech classification problems. The use of adaptive wavelet transforms provides the flexibility within which powerful feature extractors can be designed for these types of application

    Seismic characterisation based on time-frequency spectral analysis

    We present high-resolution time-frequency spectral analysis schemes to better resolve seismic images for the purpose of seismic and petroleum reservoir characterisation. Seismic characterisation is based on the physical properties of the Earth's subsurface media, and these properties are represented implicitly by seismic attributes. Because seismic traces originally presented in the time domain are non-stationary signals, for which the properties vary with time, we characterise those signals by obtaining seismic attributes that also vary with time. Among the widely used attributes are spectral attributes calculated through time-frequency decomposition. Time-frequency spectral decomposition methods are employed to capture variations of a signal within the time-frequency domain. These decomposition methods generate a frequency vector at each time sample, referred to as the spectral component. The computed spectral component enables us to explore the additional frequency dimension which exists jointly with the original time dimension, enabling localisation and characterisation of patterns within the seismic section. Conventional time-frequency decomposition methods include the continuous wavelet transform and the Wigner-Ville distribution. These methods have limitations that hinder accurate seismic interpretation. The continuous wavelet transform aims to decompose signals on a basis of elementary signals which have to be localised in time and frequency, but this method suffers from resolution and localisation limitations in the time-frequency spectrum. In addition, smearing often arises from this poor localisation. The Wigner-Ville distribution distributes the energy of the signal over the two variables time and frequency and results in highly localised signal components. Yet, the method suffers from spurious cross-term interference due to its quadratic nature. 
This interference is misleading when the spectrum is used for interpretation purposes. For the specific application to seismic data, the interference obscures geological features and distorts geophysical details. This thesis focuses on developing high-fidelity and high-resolution time-frequency spectral decomposition methods as an extension to the existing conventional methods. These methods are then adopted as a means to resolve seismic images for petroleum reservoirs. These methods are validated in terms of physics, robustness, and accurate energy localisation, using an extensive set of synthetic and real data sets including both carbonate and clastic reservoir settings. The novel contributions achieved in this thesis include developing time-frequency analysis algorithms for seismic data, allowing improved interpretation and accurate characterisation of petroleum reservoirs. The first algorithm established in this thesis is the Wigner-Ville distribution (WVD) with an additional masking filter. The standard WVD spectrum has high resolution but suffers from the cross-term interference caused by multiple components in the signal. To suppress the cross-term interference, I designed a masking filter based on the spectrum of the smoothed-pseudo WVD (SP-WVD). The original SP-WVD incorporates smoothing filters in both time and frequency directions to suppress the cross-term interference, which reduces the resolution of the time-frequency spectrum. In order to overcome this side-effect, I used the SP-WVD spectrum as a reference to design a masking filter, and applied it to the standard WVD spectrum. Therefore, the mask-filtered WVD (MF-WVD) can preserve the high-resolution feature of the standard WVD while suppressing the cross-term interference as effectively as the SP-WVD. The second algorithm developed in this thesis is the synchrosqueezing wavelet transform (SWT) equipped with a directional filter. 
A transformation algorithm such as the continuous wavelet transform (CWT) might cause smearing in the time-frequency spectrum, i.e. a lack of localisation. The SWT attempts to improve the localisation of the time-frequency spectrum generated by the CWT. The real part of the complex SWT spectrum, after directional filtering, is capable of resolving the stratigraphic boundaries of thin layers within target reservoirs. In terms of seismic characterisation, I tested the high-resolution spectral results on a complex clastic reservoir interbedded with coal seams from the Ordos basin, northern China. I used the spectral results generated using the MF-WVD method to facilitate the interpretation of the sand distribution within the dataset. In another implementation I used the SWT spectral results and the original seismic data together as the input to a deep convolutional neural network (dCNN), to track the horizons within a 3D volume. Using these application-based procedures, I have effectively extracted the spatial variation and the thickness of thinly layered sandstone in a coal-bearing reservoir. I also tested the algorithm on a carbonate reservoir from the Tarim basin, western China. I used the spectrum generated by the synchrosqueezing wavelet transform equipped with directional filtering to characterise faults, karsts, and direct hydrocarbon indicators within the reservoir. Finally, I investigated pore-pressure prediction in carbonate layers. Pore-pressure variation generates subtle changes in the P-wave velocity of carbonate rocks. This suggests that existing empirical relations capable of predicting pore-pressure in clastic rocks are unsuitable for the prediction in carbonate rocks. I implemented the prediction based on the P-wave velocity and the wavelet transform multi-resolution analysis (WT-MRA). The WT-MRA method can unfold information within the frequency domain via decomposing the P-wave velocity. 
This enables us to extract and amplify hidden information embedded in the signal. Using Biot's theory, WT-MRA decomposition results can be divided into contributions from the pore-fluid and the rock framework. Therefore, I proposed a pore-pressure prediction model which is based on the pore-fluid contribution, calculated through WT-MRA, to the P-wave velocity.
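The cross-term interference of the Wigner-Ville distribution mentioned in this abstract can be reproduced on a toy signal. The sketch below is an illustration under stated assumptions, not the thesis's MF-WVD: it evaluates a Hann-windowed (pseudo) WVD of two complex exponentials at single time-frequency points. The function names, the window half-length, and the test frequencies are hypothetical choices for the example. With components at f1 and f2, a cross term appears midway at (f1 + f2)/2 and oscillates in time at the difference frequency, so it can exceed the true components at one sample and vanish two samples later.

```python
import cmath
import math


def two_tone(f1, f2, fs, n):
    """Analytic (complex) signal made of two unit-amplitude exponentials."""
    return [cmath.exp(2j * math.pi * f1 * k / fs) + cmath.exp(2j * math.pi * f2 * k / fs)
            for k in range(n)]


def pseudo_wvd(z, n, freq_hz, fs, half_len=64):
    """Hann-windowed Wigner-Ville value of analytic signal z at sample n and frequency freq_hz.

    Requires half_len <= n < len(z) - half_len so that all lags stay inside the signal.
    """
    total = 0.0
    for m in range(-half_len, half_len + 1):
        w = 0.5 * (1.0 + math.cos(math.pi * m / half_len))   # Hann lag window
        kernel = z[n + m] * z[n - m].conjugate()             # quadratic (bilinear) kernel
        # The lag between the two samples is 2*m, hence the factor 4*pi.
        total += (w * kernel * cmath.exp(-4j * math.pi * freq_hz * m / fs)).real
    return total


if __name__ == "__main__":
    fs = 8000
    z = two_tone(500.0, 1500.0, fs, 512)
    for n in (256, 258):
        print(n, pseudo_wvd(z, n, 500.0, fs), pseudo_wvd(z, n, 1000.0, fs))
```

No signal energy exists at 1000 Hz in this example, yet the distribution is non-zero there at some time samples. That is the interference the abstract says a masking filter derived from the SP-WVD is designed to suppress.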