An Evaluation of Formant Tracking methods on an Arabic Database
In this paper we present a formant database of Arabic used to evaluate our new automatic formant tracking algorithm based on Fourier ridge detection. The method introduces a continuity constraint based on the centres of gravity computed for a set of formant candidates. This connects each speech frame to its neighbours and thus improves the robustness of tracking. The formant trajectories obtained by the proposed algorithm are compared to those of the hand-edited formant database and to those given by Praat with LPC data.
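The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that formant candidates are local maxima of each frame's Fourier magnitude spectrum and that the continuity constraint is a magnitude-weighted centre of gravity over candidates lying close to the previous frame's estimate. All names and parameters (`frame_peak_candidates`, `track_band`, `max_jump_hz`, the 1024-point FFT) are hypothetical.

```python
import numpy as np


def frame_peak_candidates(frame, sample_rate, n_fft=1024):
    """Return (frequencies, magnitudes) of local maxima in one frame's spectrum.

    Assumes len(frame) <= n_fft so that the frame is zero-padded, not truncated.
    """
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed, n=n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    # A candidate is an interior bin strictly larger than both neighbours.
    is_peak = (spectrum[1:-1] > spectrum[:-2]) & (spectrum[1:-1] > spectrum[2:])
    idx = np.flatnonzero(is_peak) + 1
    return freqs[idx], spectrum[idx]


def centre_of_gravity(freqs, mags):
    """Magnitude-weighted mean frequency of a set of candidates."""
    return float(np.sum(freqs * mags) / np.sum(mags))


def track_band(frames, sample_rate, band, max_jump_hz=200.0):
    """Track one formant inside `band` = (low_hz, high_hz), frame by frame.

    Hypothetical continuity rule: after the first frame, only candidates
    within `max_jump_hz` of the previous estimate are kept, and the new
    estimate is their centre of gravity.
    """
    track = []
    previous = None
    for frame in frames:
        freqs, mags = frame_peak_candidates(frame, sample_rate)
        keep = (freqs >= band[0]) & (freqs <= band[1])
        if previous is not None:
            keep &= np.abs(freqs - previous) <= max_jump_hz
        if np.any(keep):
            previous = centre_of_gravity(freqs[keep], mags[keep])
        # If no candidate survives, the previous estimate is carried over.
        track.append(previous)
    return track
```

Here `frames` is expected to be a sequence of equal-length sample arrays, for example the rows of a two-dimensional NumPy array. Averaging the surviving candidates instead of picking a single peak is what ties each frame to its neighbours in this sketch.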
A new Automatic Formant Tracking approach based on scalogram maxima detection using complex wavelets
In this paper we present a new formant tracking algorithm in which formant frequencies are estimated by detecting local maxima of a time-frequency representation. This representation is a scalogram obtained from a complex wavelet transform. Formant frequency candidates are validated as local maxima of the scalogram, which correspond to wavelet ridges. The algorithm then uses the computation of the center of gravity as a tracking constraint. We tested the algorithm on synthesized and natural voiced speech signals. The formant trajectories obtained by our algorithm were compared to the manually edited trajectories of our Arabic database, used as reference, and to those given by the Fourier transform method and by the LPC analysis used in Praat. Overall, the comparison showed that the first three formant trajectories obtained with the complex Morlet wavelet agree well with the manually edited formant tracks.
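As an illustration of the scalogram step only, the sketch below builds a complex Morlet scalogram by direct convolution and lists the local maxima along the frequency axis at one time instant. It is an assumption-laden example, not the paper's implementation: the wavelet parameterisation (`n_cycles`), the L1 normalisation, the truncation at four standard deviations, and the function names are all hypothetical.

```python
import numpy as np


def morlet_scalogram(signal, sample_rate, freqs, n_cycles=6.0):
    """Magnitude of a complex Morlet transform, one row per analysis frequency.

    Assumes the signal is longer than the longest wavelet.
    """
    rows = []
    for f in freqs:
        sigma_t = n_cycles / (2.0 * np.pi * f)  # Gaussian width in seconds
        half = int(np.ceil(4.0 * sigma_t * sample_rate))
        t = np.arange(-half, half + 1) / sample_rate
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2.0 * sigma_t**2))
        # L1 normalisation keeps magnitudes comparable across frequencies.
        wavelet /= np.sum(np.abs(wavelet))
        rows.append(np.abs(np.convolve(signal, wavelet, mode="same")))
    return np.array(rows)


def ridge_candidates(scalogram, freqs, time_index):
    """Frequencies and magnitudes of local maxima along frequency at one time."""
    column = scalogram[:, time_index]
    is_peak = (column[1:-1] > column[:-2]) & (column[1:-1] > column[2:])
    idx = np.flatnonzero(is_peak) + 1
    return np.asarray(freqs)[idx], column[idx]
```

For a pure tone, the strongest candidate returned by `ridge_candidates` should sit at the analysis frequency closest to the tone. A tracking constraint such as the center of gravity described in the abstract would then be applied to these candidates from frame to frame.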
Evaluation d'une nouvelle méthode de suivi de formants sur un corpus Arabe (Evaluation of a new formant tracking method on an Arabic corpus)
This paper develops a formant tracking technique based on Fourier ridge detection. The method introduces a tracking constraint based on the centre of gravity computed for a set of formant frequency candidates, which connects each speech frame to its neighbours and thus improves the robustness of tracking. The formant trajectories obtained by the proposed algorithm are compared to those of a hand-edited Arabic formant database, created especially for this work, and to those given by Praat with LPC data.
Wavelet methods in speech recognition
In this thesis, novel wavelet techniques are developed to improve parametrization of
speech signals prior to classification. It is shown that non-linear operations carried out
in the wavelet domain improve the performance of a speech classifier and consistently
outperform classical Fourier methods. This is because of the localised nature of the
wavelet, which captures correspondingly well-localised time-frequency features
within the speech signal. Furthermore, by taking advantage of the approximation
ability of wavelets, efficient representation of the non-stationarity inherent in speech
can be achieved in a relatively small number of expansion coefficients. This is an
attractive option when faced with the so-called 'Curse of Dimensionality' problem of
multivariate classifiers such as Linear Discriminant Analysis (LDA) or Artificial
Neural Networks (ANNs). Conventional time-frequency analysis methods such as the
Discrete Fourier Transform either miss irregular signal structures and transients due to
spectral smearing or require a large number of coefficients to represent such
characteristics efficiently. Wavelet theory offers an alternative insight into the
representation of these types of signals.
As an extension to the standard wavelet transform, adaptive libraries of wavelet and
cosine packets are introduced which increase the flexibility of the transform. This
approach is observed to be yet more suitable for the highly variable nature of speech
signals in that it results in a time-frequency sampled grid that is well adapted to
irregularities and transients. This results in a corresponding reduction in the
misclassification rate of the recognition system. However, this is necessarily at the
expense of added computing time.
Finally, a framework based on adaptive time-frequency libraries is developed which
invokes the final classifier to choose the nature of the resolution for a given
classification problem. The classifier then performs dimensionality reduction on the
transformed signal by choosing the top few features based on their discriminant
power. This approach is compared and contrasted with an existing discriminant wavelet
feature extractor.
The overall conclusions of the thesis are that wavelets and their relatives are capable
of extracting useful features for speech classification problems. The use of adaptive
wavelet transforms provides the flexibility within which powerful feature extractors
can be designed for these types of application.
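The abstract mentions choosing the top few features by discriminant power but does not say how that power is measured. The sketch below assumes a per-feature Fisher ratio (between-class variance over within-class variance) as one common choice; the function names and the small stabilising constant are hypothetical and are not taken from the thesis.

```python
import numpy as np


def fisher_scores(features, labels):
    """Per-feature ratio of between-class to within-class scatter.

    `features` has shape (n_samples, n_features), for example wavelet
    coefficients; `labels` holds one class label per sample.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    overall_mean = features.mean(axis=0)
    between = np.zeros(features.shape[1])
    within = np.zeros(features.shape[1])
    for cls in np.unique(labels):
        group = features[labels == cls]
        group_mean = group.mean(axis=0)
        between += len(group) * (group_mean - overall_mean) ** 2
        within += ((group - group_mean) ** 2).sum(axis=0)
    # Small constant avoids division by zero for constant features.
    return between / (within + 1e-12)


def top_k_features(features, labels, k):
    """Indices of the k features with the highest Fisher score, best first."""
    scores = fisher_scores(features, labels)
    return np.argsort(scores)[::-1][:k]
```

Keeping only the indices returned by `top_k_features` reduces the dimensionality of the transformed signal before it reaches a classifier such as LDA.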
Models and Analysis of Vocal Emissions for Biomedical Applications
The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the strongly felt need to share know-how, objectives and results between areas that until then had seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the newborn to the adult and elderly. Over the years the initial issues have grown and spread to other fields of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years in Firenze, Italy. This edition celebrates twenty-two years of uninterrupted and successful research in the field of voice analysis.