Wavelet transforms for non-uniform speech recognition
An algorithm for nonuniform speech segmentation and its application in speech recognition systems is presented. A method based on the Modulated Gaussian Wavelet Transform based Speech Analyser (MGWTSA) and the subsequent parametrization block is used to transform a uniform signal into a set of nonuniformly separated frames, with the accurate information being fed into a speech recognition system. The algorithm places a frame only where one is needed to characterize the signal, reducing the number of frames per signal as much as possible without an appreciable reduction in the recognition rate of the system. Peer reviewed; postprint (published version).
Novel Pitch Detection Algorithm With Application to Speech Coding
This thesis introduces a novel method for accurate pitch detection and speech segmentation, named Multi-feature, Autocorrelation (ACR) and Wavelet Technique (MAWT). MAWT combines feature extraction and ACR applied to Linear Predictive Coding (LPC) residuals with a wavelet-based refinement step. MAWT opens the way for a unique approach to modeling: although speech is divided into segments, the success of voicing decisions is not crucial. Experiments demonstrate the superiority of MAWT in pitch period detection accuracy over existing methods, and illustrate its advantages for speech segmentation. These advantages are more pronounced for gain-varying and transitional speech, and under noisy conditions.
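The autocorrelation step mentioned in this abstract can be illustrated with a minimal sketch. This is not the MAWT method: it applies plain autocorrelation to a raw frame rather than to LPC residuals, and it omits the feature extraction and wavelet refinement. The function name `estimate_pitch_acr`, the search range `f_min`/`f_max`, and the synthetic test tone are assumptions made for illustration.

```python
import numpy as np

def estimate_pitch_acr(frame, sample_rate, f_min=50.0, f_max=500.0):
    """Estimate the pitch (Hz) of one voiced frame from its autocorrelation peak.

    Hypothetical helper: f_min and f_max bound the pitch search range.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()  # remove any DC offset
    # Full autocorrelation; keep the non-negative lags only.
    acr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(acr) - 1)
    # The strongest peak inside the search range gives the pitch period in samples.
    best_lag = lag_min + int(np.argmax(acr[lag_min:lag_max + 1]))
    return sample_rate / best_lag

# Synthetic example: a 200 Hz tone sampled at 8 kHz (50 ms frame).
sample_rate = 8000
t = np.arange(400) / sample_rate
frame = np.sin(2 * np.pi * 200.0 * t)
pitch_hz = estimate_pitch_acr(frame, sample_rate)
```

On real speech the autocorrelation peak is far less clean than on a pure tone, which is presumably why the thesis works on LPC residuals and adds a refinement step.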
Segmentation of a Speech Signal with Application of Fast Wavelet Transformation
The article describes a two-stage method for preliminary segmentation of a speech
signal using the wavelet transform. The first stage isolates sibilants and pauses;
the second stage further segments the remaining parts of the signal.
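The first stage described above might be sketched as follows. The abstract does not give the actual decision rule, so this sketch assumes one: a frame with very low energy is labelled a pause, and a frame whose energy sits mostly in the detail (high-frequency) band of a one-level Haar transform is labelled a sibilant. The function names and both thresholds are hypothetical.

```python
import numpy as np

def haar_dwt(frame):
    """One level of the Haar wavelet transform: (approximation, detail)."""
    frame = np.asarray(frame, dtype=float)
    if len(frame) % 2:
        frame = frame[:-1]  # drop the last sample so pairs line up
    even, odd = frame[0::2], frame[1::2]
    approx = (even + odd) / np.sqrt(2.0)
    detail = (even - odd) / np.sqrt(2.0)
    return approx, detail

def label_frame(frame, pause_energy=1e-4, sibilant_ratio=0.5):
    """Label a frame as 'pause', 'sibilant', or 'other' (assumed thresholds)."""
    approx, detail = haar_dwt(frame)
    low = float(np.sum(approx ** 2))
    high = float(np.sum(detail ** 2))
    total = low + high
    if total / max(len(frame), 1) < pause_energy:
        return "pause"      # almost no energy in the frame
    if high / total > sibilant_ratio:
        return "sibilant"   # energy concentrated in the high band
    return "other"          # left for the second segmentation stage

n = np.arange(256)
silence = np.zeros(256)
hiss = 0.5 * (-1.0) ** n                          # fastest possible oscillation
vowel = 0.5 * np.sin(2 * np.pi * 100 * n / 8000)  # low-frequency tone
labels = [label_frame(f) for f in (silence, hiss, vowel)]
```

Frames labelled "other" correspond to the remaining parts of the signal that the second stage would segment further.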
Analisis Fungsi Wavelet Daubechies untuk Sinyal Suara dengan Panjang Segmen Berbeda (Analysis of Daubechies Wavelet Functions for Speech Signals with Different Segment Lengths)
Daubechies wavelets have been widely applied in signal processing, for example in automatic speech recognition systems. The Daubechies wavelets form a wavelet family whose members are distinguished by their order, N. The order affects the wavelet decomposition: the larger N is, the smoother the results of the multiresolution analysis. However, not every order gives equally good recognition results, so choosing one is still a matter of trial and error. It is therefore necessary to determine the order of the Daubechies wavelet basis function for Indonesian voice signals through its similarity level. The similarity level between a speech signal and the Daubechies wavelet function of order N can be determined by calculating their cross-correlation coefficient. The results show that the best Daubechies wavelet basis function is inconsistent across the Indonesian vowels a, i, u, e, è, o, and ò. db45 and db44 are the best Daubechies wavelet basis functions for segment lengths of 2048 and 1024, respectively.
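The similarity measure described in this abstract can be sketched as a peak normalized cross-correlation. The sketch makes several assumptions: it compares a segment against short wavelet filters (Haar and db2, hard-coded from their closed forms under one common sign convention) as stand-ins for the sampled high-order wavelet functions used in the paper, and the name `peak_cross_correlation` and the toy segment are hypothetical.

```python
import numpy as np

def peak_cross_correlation(segment, template):
    """Peak absolute normalized cross-correlation, a value in [0, 1]."""
    segment = np.asarray(segment, dtype=float)
    template = np.asarray(template, dtype=float)
    norm = np.linalg.norm(segment) * np.linalg.norm(template)
    if norm == 0.0:
        return 0.0
    xcorr = np.correlate(segment, template, mode="full")
    return float(np.max(np.abs(xcorr)) / norm)

# Stand-in candidates: wavelet (high-pass) filters of two low-order Daubechies wavelets.
s3 = np.sqrt(3.0)
db2_scaling = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
candidates = {
    "db1": np.array([1.0, -1.0]) / np.sqrt(2.0),           # Haar
    "db2": db2_scaling[::-1] * np.array([1, -1, 1, -1]),   # g[k] = (-1)^k h[3-k]
}

# Toy "speech segment": the db2 filter embedded in silence.
segment = np.zeros(32)
segment[10:14] = candidates["db2"]

scores = {name: peak_cross_correlation(segment, w) for name, w in candidates.items()}
best = max(scores, key=scores.get)
```

Ranking real vowel segments against many orders would follow the same pattern, with each candidate replaced by a sampled wavelet function of that order.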
Multiscale Discriminant Saliency for Visual Attention
Bottom-up saliency, an early stage of human visual attention, can be
considered as a binary classification problem between center and surround
classes. The discriminant power of features for the classification is measured as
the mutual information between the features and the distribution of the two classes.
The estimated discrepancy between the two feature classes depends strongly on the
scale levels considered; multi-scale structure and discriminant power are therefore
integrated by employing discrete wavelet features and a Hidden Markov Tree (HMT).
With wavelet coefficients and Hidden Markov Tree parameters, quad-tree-like label
structures are constructed and used in maximum a posteriori (MAP) estimation of the
hidden class variables at the corresponding dyadic sub-squares. Then, the saliency
value for each dyadic square at each scale level is computed with the discriminant
power principle and the MAP estimate. Finally, the final saliency map is integrated
across multiple scales by an information maximization rule. Both standard
quantitative tools such as NSS, LCC, and AUC and qualitative assessments are used
for evaluating the proposed multiscale discriminant saliency method (MDIS) against
the well-known information-based saliency method AIM on its Bruce Database with
eye-tracking data. Simulation results are presented and analyzed to verify the
validity of MDIS as well as to point out its disadvantages for further research.
Comment: 16 pages, ICCSA 2013 - BIOCA session
Combining local regularity estimation and total variation optimization for scale-free texture segmentation
Texture segmentation constitutes a standard image processing task, crucial to
many applications. The present contribution focuses on the particular subset of
scale-free textures, and its originality resides in the combination of three key
ingredients. First, texture characterization relies on the concept of local
regularity. Second, estimation of local regularity is based on new multiscale
quantities referred to as wavelet leaders. Third, segmentation from local
regularity faces a fundamental bias-variance trade-off: by nature, local
regularity estimation shows high variability that impairs the detection of
changes, while a posteriori smoothing of regularity estimates prevents changes
from being located correctly. Instead, the present contribution proposes several
variational problem formulations based on total variation and proximal
resolutions that effectively circumvent this trade-off. Estimation and
segmentation performance for the proposed procedures are quantified and
compared on synthetic as well as on real-world textures.