879 research outputs found

    Phoneme Recognition on the TIMIT Database

    Assessment of severe apnoea through voice analysis, automatic speech, and speaker recognition techniques

    The electronic version of this article is the complete one and can be found online at: http://asp.eurasipjournals.com/content/2009/1/982531 This study is part of an ongoing collaborative effort between the medical and the signal processing communities to promote research on applying standard Automatic Speech Recognition (ASR) techniques for the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patients can receive early treatment. Effective ASR-based detection could dramatically cut medical testing time. Working with a carefully designed speech database of healthy and apnoea subjects, we describe an acoustic search for distinctive apnoea voice characteristics. We also study abnormal nasalization in OSA patients by modelling vowels in nasal and nonnasal phonetic contexts using Gaussian Mixture Model (GMM) pattern recognition on speech spectra. Finally, we present experimental findings regarding the discriminative power of GMMs applied to severe apnoea detection. We have achieved an 81% correct classification rate, which is very promising and underpins the interest in this line of inquiry. The activities described in this paper were funded by the Spanish Ministry of Science and Technology as part of the TEC2006-13170-C02-02 Project.
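
As a rough illustration of the GMM-based detection the abstract above describes, the sketch below scores a set of feature frames against two Gaussian mixture models and picks the class with the higher average log-likelihood. It is only a sketch under stated assumptions: the features are one-dimensional, the mixture parameters in `MODELS` are hand-set and hypothetical rather than trained, and the function names are invented here. The study itself models speech spectra, which this toy example does not attempt.

```python
import math

def gmm_avg_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood of 1-D frames under a GMM."""
    total = 0.0
    for x in frames:
        # Mixture density: weighted sum of Gaussian component densities.
        p = sum(
            w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        )
        total += math.log(p)
    return total / len(frames)

def classify(frames, models):
    """Return the name of the model that gives the highest average log-likelihood."""
    return max(models, key=lambda name: gmm_avg_log_likelihood(frames, *models[name]))

# Hypothetical, hand-set parameters: (weights, means, variances) per class.
MODELS = {
    "control": ([0.5, 0.5], [0.0, 2.0], [1.0, 1.0]),
    "apnoea": ([0.5, 0.5], [3.0, 5.0], [1.0, 1.0]),
}

if __name__ == "__main__":
    print(classify([0.1, 1.8, 0.5], MODELS))
    print(classify([4.2, 3.1, 5.0], MODELS))
```

In practice the mixtures would be trained on labelled recordings, and the log-likelihoods would be computed in a numerically safer way than taking the logarithm of a raw density.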

    Spectral analysis of pathological acoustic speech waveforms

    Biomedical engineering is the application of engineering principles and techniques to the medical field. The design and problem-solving skills of engineering are combined with medical and biological science, which improves the diagnosis and treatment of medical disorders. The purpose of this study is to develop an automated procedure for detecting excessive jitter in speech signals, which is useful for differentiating normal from pathologic speech. The fundamental motivation for this research is that speech pathologists and laryngologists need tools for the early detection and treatment of laryngeal disorders. Acoustical analysis of speech was performed to analyze various features of a speech signal. Earlier research established a relation between pitch period jitter and harmonic bandwidth. This concept was used for detecting laryngeal disorders in speech, since pathologic speech has been found to have larger amounts of jitter than normal speech. Our study was performed using vowel samples from the voice disorder database recorded at the Massachusetts Eye and Ear Infirmary (MEEI) in 1994. The KAYPENTAX company markets this database. Software development was conducted using MATLAB, a user-friendly programming language which has been applied widely for signal processing. An algorithm was developed to compute harmonic bandwidths for various speech samples of sustained vowel sounds. Open and closed tests were conducted on 23 samples each of pathologic and normal speech. Classification results showed a 69.56% probability of correct detection of pathologic speech samples during an open test.
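
For readers unfamiliar with jitter, the sketch below computes one common time-domain measure, relative local jitter: the mean absolute difference between consecutive pitch periods divided by the mean period. This is not the harmonic-bandwidth algorithm the abstract above describes. It only illustrates the quantity that algorithm is meant to detect, and the period values are hypothetical.

```python
def local_jitter(periods):
    """Relative local jitter: mean |T[i+1] - T[i]| divided by the mean period."""
    if len(periods) < 2:
        raise ValueError("need at least two pitch periods")
    # Cycle-to-cycle variation between neighbouring pitch periods.
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return mean_diff / mean_period

# Hypothetical pitch periods in milliseconds for a sustained vowel.
steady = [5.00, 5.01, 5.00, 4.99, 5.00]
rough = [5.0, 5.4, 4.7, 5.3, 4.6]

if __name__ == "__main__":
    print(f"steady voice jitter: {local_jitter(steady):.4f}")
    print(f"rough voice jitter:  {local_jitter(rough):.4f}")
```

A detector in this spirit would compare the measure against a threshold chosen on labelled data. The threshold is deliberately left out because the abstract does not give one.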

    CLASS - A Study of methods for coarse phonetic classification

    The objective of this thesis was to examine computer techniques for classifying speech signals into four coarse phonetic classes: vowel-like, strong fricative, weak fricative and silence. The study compared classification results from the K-means clustering algorithm using Euclidean distance measurements with classification using a multivariate maximum likelihood distance measure. In addition to the comparison of statistical methods, this study compared classification using several tree-structured decision-making processes. The system was trained on ten speakers using 98 utterances with both known and unknown speakers. Results showed very little difference between the Euclidean distance and maximum likelihood measures; however, the introduction of the tree structure on both systems had a positive influence on their performance.
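
The two distance measures compared in the abstract above can be contrasted with a small sketch. It assigns a feature vector to the nearest class mean by Euclidean distance, and separately to the class with the highest Gaussian log-likelihood. The assumptions are mine, not the thesis's: covariances are diagonal, the feature is a single hypothetical value, and the class statistics are invented for illustration.

```python
import math

def euclidean_label(x, means):
    """Pick the class whose mean vector is closest in Euclidean distance."""
    return min(means, key=lambda c: math.dist(x, means[c]))

def max_likelihood_label(x, means, variances):
    """Pick the class with the highest diagonal-Gaussian log-likelihood."""
    def log_lik(c):
        return sum(
            -0.5 * math.log(2 * math.pi * v) - 0.5 * (xi - m) ** 2 / v
            for xi, m, v in zip(x, means[c], variances[c])
        )
    return max(means, key=log_lik)

# Hypothetical one-dimensional class statistics (for example, frame energy).
MEANS = {"silence": [0.0], "weak fricative": [4.0]}
VARIANCES = {"silence": [1.0], "weak fricative": [9.0]}

if __name__ == "__main__":
    frame = [1.9]
    print(euclidean_label(frame, MEANS))
    print(max_likelihood_label(frame, MEANS, VARIANCES))
```

With these made-up statistics the frame at 1.9 is slightly closer to the "silence" mean, yet more likely under the wider "weak fricative" Gaussian, so the two rules disagree. When class variances are similar the rules agree more often, which is consistent with a small difference between the two measures.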

    A detection-based pattern recognition framework and its applications

    The objective of this dissertation is to present a detection-based pattern recognition framework and demonstrate its applications in automatic speech recognition and broadcast news video story segmentation. Inspired by the studies of modern cognitive psychology and real-world pattern recognition systems, a detection-based pattern recognition framework is proposed to provide an alternative solution for some complicated pattern recognition problems. The primitive features are first detected and the task-specific knowledge hierarchy is constructed level by level; then a variety of heterogeneous information sources are combined and the high-level context is incorporated as additional information at certain stages. A detection-based framework is a "divide-and-conquer" design paradigm for pattern recognition problems, which decomposes a conceptually difficult problem into many elementary sub-problems that can be handled directly and reliably. Information fusion strategies are employed to integrate the evidence from a lower level to form the evidence at a higher level. This fusion procedure continues until the top level is reached. Generally, a detection-based framework has many advantages: (1) more flexibility in both detector design and fusion strategies, as these two parts can be optimized separately; (2) parallel and distributed computational components in primitive feature detection, since in such a component-based framework any primitive component can be replaced by a new one while other components remain unchanged; (3) incremental information integration; (4) high-level context information as additional information sources, which can be combined with bottom-up processing at any stage. This dissertation presents the basic principles, criteria, and techniques for detector design and hypothesis verification based on statistical detection and decision theory. In addition, evidence fusion strategies were investigated in this dissertation. Several novel detection algorithms and evidence fusion methods were proposed and their effectiveness was demonstrated in automatic speech recognition and broadcast news video segmentation systems. We believe such a detection-based framework can be employed in more applications in the future. Ph.D. Committee Chair: Lee, Chin-Hui; Committee Member: Clements, Mark; Committee Member: Ghovanloo, Maysam; Committee Member: Romberg, Justin; Committee Member: Yuan, Min
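
To make the idea of fusing lower-level evidence into higher-level evidence concrete, the sketch below combines the posterior probabilities from several detectors by summing their log-odds. The abstract above does not say which fusion methods the dissertation uses, so this is a generic naive-Bayes-style rule chosen for illustration. It assumes the detectors are conditionally independent, and the function names and detector outputs are hypothetical.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def fuse_posteriors(detector_probs, prior=0.5):
    """Fuse detector posteriors into one posterior, assuming independence."""
    prior_log_odds = logit(prior)
    # Each detector contributes its log-odds shift relative to the prior.
    log_odds = prior_log_odds + sum(logit(p) - prior_log_odds for p in detector_probs)
    return 1.0 / (1.0 + math.exp(-log_odds))

if __name__ == "__main__":
    # Hypothetical outputs of two primitive-feature detectors for one segment.
    print(f"{fuse_posteriors([0.7, 0.8]):.4f}")
    # Conflicting evidence of equal strength cancels out.
    print(f"{fuse_posteriors([0.9, 0.1]):.4f}")
```

Because each detector contributes only an additive term, a detector can be replaced without touching the fusion step, which matches the separation of detector design and fusion strategy the abstract highlights.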