
7,055 research outputs found

Automatic pronunciation verification for speech recognition

Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Towards an Automated Screening Tool for Pediatric Speech Delay

Author: Sadeghian Roozbeh
Zahorian Stephen A.
Publication venue: Digital Commons at Harrisburg University
Publication date: 01/09/2015

Speech delay is a childhood language problem that sometimes resolves on its own but sometimes causes more serious language difficulties later. This leads therapists to screen children at early ages in order to prevent future problems. Using the Goldman-Fristoe Test of Articulation (GFTA) method, therapists listen to a child's pronunciation of certain phonemes and phoneme pairs in specified words and judge the child's stage of speech development. The goal of this paper is to develop an Automatic Speech Recognition (ASR) tool and related speech processing methods which emulate the knowledge of speech therapists. In this paper two methods of feature extraction (MFCC and DCTC) were used as the baseline for training an HMM-based utterance verification system which was later used for testing the utterances of 63 young children (ages 4-10), both typically developed and speech delayed. The ASR results show the value of augmenting static spectral information with spectral trajectory information for better prediction of therapist's judgments.

Digital Commons @ Harrisburg University of Science and Technology

Automatic Phonetic Transcription of Non-Prompted Speech

Author: Ohala John J.
Schiel Florian
Publication date: 01/01/1999

A reliable method for automatic phonetic transcription of non-prompted German speech has been developed at th

CiteSeerX

Open Access LMU

Alcohol Language Corpus

Author: Barfüßer Sabine
Heinrich Christian
Schiel Florian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

The Alcohol Language Corpus (ALC) is the first publicly available speech corpus comprising intoxicated and sober speech of 162 female and male German speakers. Recordings are made in an automotive environment to allow for the development of automatic alcohol detection and to ensure a consistent acoustic environment for the intoxicated and the sober recordings. The recorded speech covers a variety of contents and speech styles. Breath and blood alcohol concentration measurements are provided for all speakers. A transcription according to SpeechDat/Verbmobil standards and disfluency tagging as well as an automatic phonetic segmentation are part of the corpus. An Emu version of ALC allows easy access to basic speech parameters as well as the use of R for statistical analysis of selected parts of ALC. ALC is available without restriction for scientific or commercial use at the Bavarian Archive for Speech Signals.

CiteSeerX

Crossref

Open Access LMU

The Production of Speech Corpora

Author: Baumann Angela
Draxler Christoph
Ellbogen Tania
Schiel Florian
Steffen Alexander
Publication date: 21/03/2012

Open Access LMU

Subjective tests of speaker recognition for selected voice disguise techniques

Author: Staroniewicz Piotr
Publication venue: Electronics and Telecommunications Committee
Publication date: 18/07/2024

Research work on the effectiveness of voice disguise techniques is important for the development of biometric systems (surveillance) as well as phonoscopic research (forensics). A speaker recognition system or a listener can be deliberately or non-deliberately misled by technical or natural methods. It is important to determine the impact of these techniques on both automatic systems and live listeners. This paper presents the results of listening tests conducted on a group of 40 people. The effectiveness of speaker recognition was investigated using selected natural (chosen from four groups of deliberate natural techniques: phonation, phonemic, prosodic and deformation) and technical (pitch shifting, GSM coding) voice disguise techniques. The results were related to the previously obtained outcomes for the automatic method of verification carried out using a classical speaker recognition system based on MFCC (Mel Frequency Cepstral Coefficients) parameterisation and GMM (Gaussian Mixture Models) classification.

International Journal of Electronics and Telecommunications (Warsaw University of Technology)