7,055 research outputs found

    Automatic pronunciation verification for speech recognition

    Towards an Automated Screening Tool for Pediatric Speech Delay

    Speech delay is a childhood language problem that sometimes resolves on its own but sometimes leads to more serious language difficulties later. Therapists therefore screen children at an early age to detect it and prevent future problems. Using the Goldman-Fristoe Test of Articulation (GFTA), therapists listen to a child's pronunciation of certain phonemes and phoneme pairs in specified words and judge the child's stage of speech development. The goal of this paper is to develop an Automatic Speech Recognition (ASR) tool and related speech processing methods that emulate the knowledge of speech therapists. Two feature extraction methods (MFCC and DCTC) were used as the baseline for training an HMM-based utterance verification system, which was then tested on the utterances of 63 young children (ages 4-10), both typically developed and speech delayed. The ASR results show the value of augmenting static spectral information with spectral trajectory information for better prediction of therapists' judgments.

    Automatic Phonetic Transcription of Non-Prompted Speech

    A reliable method for automatic phonetic transcription of non-prompted German speech has been developed at th

    Alcohol Language Corpus

    The Alcohol Language Corpus (ALC) is the first publicly available speech corpus comprising intoxicated and sober speech of 162 female and male German speakers. Recordings were made in an automotive environment to support the development of automatic alcohol detection and to ensure a consistent acoustic environment for the intoxicated and the sober recordings. The recorded speech covers a variety of contents and speech styles. Breath and blood alcohol concentration measurements are provided for all speakers. A transcription according to SpeechDat/Verbmobil standards, disfluency tagging, and an automatic phonetic segmentation are part of the corpus. An Emu version of ALC allows easy access to basic speech parameters as well as the use of R for statistical analysis of selected parts of ALC. ALC is available without restriction for scientific or commercial use at the Bavarian Archive for Speech Signals.

    Subjective tests of speaker recognition for selected voice disguise techniques

    Research on the effectiveness of voice disguise techniques is important for the development of biometric systems (surveillance) as well as for phonoscopic research (forensics). A speaker recognition system or a listener can be misled, deliberately or non-deliberately, by technical or natural methods. It is important to determine the impact of these techniques on both automatic systems and live listeners. This paper presents the results of listening tests conducted on a group of 40 people. The effectiveness of speaker recognition was investigated using selected natural voice disguise techniques (chosen from four groups of deliberate natural techniques: phonation, phonemic, prosodic and deformation) and technical ones (pitch shifting, GSM coding). The results were related to previously obtained outcomes for automatic verification carried out using a classical speaker recognition system based on MFCC (Mel Frequency Cepstral Coefficients) parameterisation and GMM (Gaussian Mixture Models) classification.