5,870 research outputs found
Extraction of vocal-tract system characteristics from speechsignals
We propose methods to track natural variations in the characteristics of the vocal-tract system from speech signals. We are especially interested in the cases where these characteristics vary over time, as happens in dynamic sounds such as consonant-vowel transitions. We show that the selection of appropriate analysis segments is crucial in these methods, and we propose a selection based on estimated instants of significant excitation. These instants are obtained by a method based on the average group-delay property of minimum-phase signals. In voiced speech, they correspond to the instants of glottal closure. The vocal-tract system is characterized by its formant parameters, which are extracted from the analysis segments. Because the segments are always at the same relative position in each pitch period, in voiced speech the extracted formants are consistent across successive pitch periods. We demonstrate the results of the analysis for several difficult cases of speech signals
Analysis of a Modern Voice Morphing Approach using Gaussian Mixture Models for Laryngectomees
This paper proposes a voice morphing system for people suffering from
Laryngectomy, which is the surgical removal of all or part of the larynx or the
voice box, particularly performed in cases of laryngeal cancer. A primitive
method of achieving voice morphing is by extracting the source's vocal
coefficients and then converting them into the target speaker's vocal
parameters. In this paper, we deploy Gaussian Mixture Models (GMM) for mapping
the coefficients from source to destination. However, the use of the
traditional/conventional GMM-based mapping approach results in the problem of
over-smoothening of the converted voice. Thus, we hereby propose a unique
method to perform efficient voice morphing and conversion based on GMM,which
overcomes the traditional-method effects of over-smoothening. It uses a
technique of glottal waveform separation and prediction of excitations and
hence the result shows that not only over-smoothening is eliminated but also
the transformed vocal tract parameters match with the target. Moreover, the
synthesized speech thus obtained is found to be of a sufficiently high quality.
Thus, voice morphing based on a unique GMM approach has been proposed and also
critically evaluated based on various subjective and objective evaluation
parameters. Further, an application of voice morphing for Laryngectomees which
deploys this unique approach has been recommended by this paper.Comment: 6 pages, 4 figures, 4 tables; International Journal of Computer
Applications Volume 49, Number 21, July 201
Generalized Perceptual Linear Prediction (gPLP) Features for Animal Vocalization Analysis
A new feature extraction model, generalized perceptual linear prediction (gPLP), is developed to calculate a set of perceptually relevant features for digital signal analysis of animalvocalizations. The gPLP model is a generalized adaptation of the perceptual linear prediction model, popular in human speech processing, which incorporates perceptual information such as frequency warping and equal loudness normalization into the feature extraction process. Since such perceptual information is available for a number of animal species, this new approach integrates that information into a generalized model to extract perceptually relevant features for a particular species. To illustrate, qualitative and quantitative comparisons are made between the species-specific model, generalized perceptual linear prediction (gPLP), and the original PLP model using a set of vocalizations collected from captive African elephants (Loxodonta africana) and wild beluga whales (Delphinapterus leucas). The models that incorporate perceptional information outperform the original human-based models in both visualization and classification tasks
- …