4 research outputs found

    Improving SVF with DISTBIC for Phoneme Segmentation

    In this paper we examine the application of DISTBIC, a two-pass, text-independent method traditionally used for speaker segmentation, to phoneme segmentation. The novelty of this paper is its experimentation with the spectral variation function (SVF), a simple non-parametric method for phone segmentation, as a replacement for the distance measure in the first pass of DISTBIC. In doing so we aim to produce a computationally efficient method for text-independent phoneme segmentation that provides good performance. Experiments are carried out on the TIMIT database. We give a performance comparison between the SVF as previously used for segmentation, our DISTBIC-SVF algorithm, and another state-of-the-art algorithm.

    Malay articulation system for early screening diagnostic using hidden markov model and genetic algorithm

    Speech recognition is an important technology that can greatly aid individuals with sight or hearing disabilities. There has been extensive research interest and development in this area over the past decades. However, its usage and exposure in Malaysia are still immature, even though there is demand from the medical and healthcare sector. The aim of this research is to assess the quality and impact of using a computerized method for early screening of speech articulation disorders among Malaysians, such as omission, substitution, addition and distortion in their speech. In this study, a statistical probabilistic approach using the Hidden Markov Model (HMM) is adopted, with a newly designed Malay corpus for articulation disorder cases following the SAMPA and IPA guidelines. The front-end processing for feature vector selection is improved by applying a silence region calibration algorithm for start and end point detection. The classifier is also modified significantly by incorporating Viterbi search with a Genetic Algorithm (GA) to obtain high recognition accuracy and for lexical unit classification. The results were evaluated following National Institute of Standards and Technology (NIST) benchmarking. The tests show that recognition accuracy improved by 30% to 40% using the Genetic Algorithm technique compared with the conventional technique. A new corpus was built in this study with medical expert verification and justification. In conclusion, a computerized method for early screening can ease human effort in tackling speech disorders, and the proposed Genetic Algorithm technique has been shown to improve recognition performance in search and classification tasks.

    Using Explicit Segmentation to Improve HMM Phone Recognition

    We show that many of the errors in a context-dependent phone recognition system are due to poor segmentation. We then suggest a method to incorporate explicit segmentation information directly into the HMM paradigm. The utility of explicit segmentation information is illustrated with experiments involving five types of segmentation information and three methods of smoothing.

    1. INTRODUCTION
    One of the most attractive features of HMMs for speech recognition is that segmentation and classification are solved simultaneously. However, the maximum likelihood training criterion may not lead to a model that best utilizes the acoustic information for segmentation. In this study, we investigate the possibility of improving HMM performance by providing explicit segmentation information. We define a change function as a function that directly measures the spectral variation of the acoustic signal. The change function is integrated into the HMM as the cost of making a transition from one phone to..

    Novel multiscale methods for nonlinear speech analysis

    This thesis presents exploratory research on the application of a nonlinear multiscale formalism, the Microcanonical Multiscale Formalism (MMF), to the analysis of speech signals. Derived from principles in statistical physics, the MMF allows accurate geometric analysis of the nonlinear dynamics of complex signals. It relies on the estimation of local geometrical parameters, the singularity exponents (SE), which quantify the degree of predictability at each point of the signal domain. When correctly defined and estimated, these exponents provide valuable information about the local dynamics of complex signals, and they have been used successfully in many applications ranging from signal representation to inference and prediction. We show the relevance of the MMF to speech analysis and develop several applications to show the strength and potential of the formalism. Using the MMF, in this thesis we introduce: a novel and accurate text-independent phonetic segmentation algorithm, a novel waveform coder, a robust and accurate algorithm for detection of the Glottal Closure Instants, a closed-form solution for the problem of sparse linear prediction analysis and, finally, an efficient algorithm for multipulse estimation of the excitation source signal.