2,116 research outputs found

    Glottal Spectral Separation for Speech Synthesis

    Identification of persons via voice imprint

    This work deals with text-dependent speaker recognition in systems where only a few training samples exist. For this purpose, a voice imprint based on different features (e.g. MFCC, PLP, ACW) is proposed. The work first describes how the speech signal is produced and mentions some speech characteristics that are important for speaker recognition. The next part deals with speech signal analysis, covering preprocessing and feature extraction methods. The following part describes the process of speaker recognition and the ways of evaluating the methods used: speaker identification and verification. The last theoretical part deals with classifiers suitable for text-dependent recognition, namely classifiers based on fractional distances, dynamic time warping, dispersion matching and vector quantization. The work continues with the design and implementation of a system that evaluates all the described classifiers for a voice imprint based on different features.
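Two of the classifier ingredients named in this abstract, fractional distances and dynamic time warping, can be combined in a few lines. The following is a minimal sketch, not the thesis implementation: the function names, the default exponent p = 0.5 and the unconstrained warping path are all assumptions made for illustration. Feature frames are plain lists of floats (for example, one MFCC vector per frame).

```python
from typing import Sequence


def fractional_distance(a: Sequence[float], b: Sequence[float], p: float = 0.5) -> float:
    """Minkowski-style distance with exponent p; 'fractional' means 0 < p < 1.

    For p < 1 this is a dissimilarity measure rather than a true metric,
    because the triangle inequality no longer holds.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def dtw_distance(seq_a: Sequence[Sequence[float]],
                 seq_b: Sequence[Sequence[float]],
                 p: float = 0.5) -> float:
    """Accumulated cost of the best time alignment between two frame sequences.

    Classic dynamic-programming DTW with steps (i-1, j), (i, j-1), (i-1, j-1)
    and no window constraint (an assumption of this sketch).
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning the first i and first j frames
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = fractional_distance(seq_a[i - 1], seq_b[j - 1], p)
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

In a text-dependent setting, a test utterance would be compared against each enrolled speaker's reference sequence and assigned to the speaker with the smallest accumulated cost; that decision rule is likewise an assumption, not a detail given in the abstract.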

    Determination of articulatory parameters from speech waveforms

    Analysis and correction of the helium speech effect by autoregressive signal processing

    SIGLE. LD:D48902/84. BLDSC - British Library Document Supply Centre, GB, United Kingdom

    Spectral analysis of pathological acoustic speech waveforms

    Biomedical engineering is the application of engineering principles and techniques to the medical field. The design and problem-solving skills of engineering are combined with medical and biological science to improve the diagnosis and treatment of medical disorders. The purpose of this study is to develop an automated procedure for detecting excessive jitter in speech signals, which is useful for differentiating normal from pathologic speech. The fundamental motivation for this research is that speech pathologists and laryngologists need tools for the early detection and treatment of laryngeal disorders. Acoustical analysis of speech was performed to analyze various features of a speech signal. Earlier research established a relation between pitch period jitter and harmonic bandwidth. This concept was used for detecting laryngeal disorders in speech, since pathologic speech has been found to have larger amounts of jitter than normal speech. Our study was performed using vowel samples from the voice disorder database recorded at the Massachusetts Eye and Ear Infirmary (MEEI) in 1994. The KAYPENTAX company markets this database. Software development was conducted using MATLAB, a user-friendly programming language that has been widely applied to signal processing. An algorithm was developed to compute harmonic bandwidths for various speech samples of sustained vowel sounds. Open and closed tests were conducted on 23 samples each of pathologic and normal speech. Classification results showed a 69.56% probability of correct detection of pathologic speech samples during an open test.
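For readers unfamiliar with the quantity being detected, the sketch below computes pitch-period jitter directly in the time domain, as the mean absolute difference between consecutive pitch periods divided by the mean period. This is a common textbook definition and is not the harmonic-bandwidth procedure of the study, which was written in MATLAB; the function name and the example period values are hypothetical.

```python
from typing import Sequence


def local_jitter(periods: Sequence[float]) -> float:
    """Relative cycle-to-cycle variation of the pitch period.

    periods: consecutive pitch-period durations in seconds (at least two).
    Returns mean(|T[i+1] - T[i]|) / mean(T), a dimensionless ratio.
    """
    if len(periods) < 2:
        raise ValueError("need at least two pitch periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return mean_diff / mean_period


# Hypothetical period tracks for a sustained vowel near 100 Hz.
steady = [0.0100, 0.0100, 0.0100, 0.0100]
perturbed = [0.0100, 0.0110, 0.0100, 0.0110]
jitter_steady = local_jitter(steady)
jitter_perturbed = local_jitter(perturbed)
```

A perfectly periodic voice gives zero jitter, and larger values indicate more irregular vocal fold vibration. Any decision threshold separating normal from pathologic speech would have to be estimated from labelled data and is not given in the abstract.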

    Alternating minimisation for glottal inverse filtering

    A new method is proposed for solving the glottal inverse filtering (GIF) problem. The goal of GIF is to separate an acoustical speech signal into two parts: the glottal airflow excitation and the vocal tract filter. To recover such information one has to deal with a blind deconvolution problem. This ill-posed inverse problem is solved under a deterministic setting, considering unknowns on both sides of the underlying operator equation. A stable reconstruction is obtained using a double regularization strategy, alternating between fixing either the glottal source signal or the vocal tract filter. This not only splits the nonlinear and nonconvex problem into two linear and convex problems, but also allows the use of the best parameters and constraints to recover one variable at a time. This new technique, called alternating minimization glottal inverse filtering (AM-GIF), is compared with two other approaches, Markov chain Monte Carlo glottal inverse filtering (MCMC-GIF) and iterative adaptive inverse filtering (IAIF), using synthetic speech signals. The recent MCMC-GIF has good reconstruction quality but high computational cost. The state-of-the-art IAIF method is computationally fast, but its accuracy deteriorates, particularly for speech signals of high fundamental frequency (F0). The results show the competitive performance of the new method: with high F0, the reconstruction quality is better than that of IAIF and close to that of MCMC-GIF, while the computational complexity is reduced by two orders of magnitude. Peer reviewed.
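The alternating idea described in this abstract can be illustrated on a toy blind deconvolution problem. The sketch below is not AM-GIF itself: it assumes plain Tikhonov (ridge) penalties on both unknowns, a finite-length filter, and hypothetical names and parameter values throughout. With one unknown fixed, the other is the solution of a linear least-squares problem, so each half-step is convex and the regularized objective cannot increase.

```python
def convolve(x, h):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            out[i + j] += xi * hj
    return out


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def ridge_deconvolve(y, known, n_unknown, lam):
    """Minimise ||y - known * u||^2 + lam * ||u||^2 over u of length n_unknown."""
    # Column j of the convolution matrix is `known` delayed by j samples.
    cols = [[0.0] * j + list(known) + [0.0] * (n_unknown - 1 - j)
            for j in range(n_unknown)]
    # Regularised normal equations (C^T C + lam I) u = C^T y.
    A = [[sum(a * b for a, b in zip(cols[i], cols[j])) + (lam if i == j else 0.0)
          for j in range(n_unknown)] for i in range(n_unknown)]
    rhs = [sum(a * b for a, b in zip(cols[i], y)) for i in range(n_unknown)]
    return solve(A, rhs)


def objective(y, x, h, lam_x, lam_h):
    """Data misfit plus both Tikhonov penalties."""
    r = [yi - ci for yi, ci in zip(y, convolve(x, h))]
    return (sum(v * v for v in r)
            + lam_x * sum(v * v for v in x)
            + lam_h * sum(v * v for v in h))


def alternating_minimisation(y, n_source, n_filter, lam_x=1e-3, lam_h=1e-3, iters=20):
    """Alternate two ridge problems; requires len(y) == n_source + n_filter - 1."""
    h = [1.0] + [0.0] * (n_filter - 1)  # assumed starting point: identity filter
    x = [0.0] * n_source
    history = []
    for _ in range(iters):
        x = ridge_deconvolve(y, h, n_source, lam_x)  # filter fixed, update source
        h = ridge_deconvolve(y, x, n_filter, lam_h)  # source fixed, update filter
        history.append(objective(y, x, h, lam_x, lam_h))
    return x, h, history


# Toy data: a pulse-like source passed through a short filter (both made up).
x_true = [0.0, 1.0, 0.5, 0.0, 0.0, 1.0, 0.5, 0.0]
h_true = [1.0, 0.6, 0.2]
y = convolve(x_true, h_true)
x_est, h_est, history = alternating_minimisation(y, len(x_true), len(h_true))
```

The recorded objective values in `history` form a non-increasing sequence. A small objective does not by itself mean the true source and filter were recovered: blind deconvolution has inherent ambiguities (for example, scaling one factor up and the other down), which is why a practical method adds problem-specific constraints such as those the abstract alludes to.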