
    Acoustic-Phonetic Features for the Automatic Classification of Stop Consonants

    In this paper, the acoustic–phonetic characteristics of American English stop consonants are investigated. Features studied in the literature are evaluated for their information content, and new features are proposed. A statistically guided, knowledge-based, acoustic–phonetic system for the automatic classification of stops in speaker-independent continuous speech is proposed. The system uses a new auditory-based front end and incorporates new algorithms for the extraction and manipulation of the acoustic–phonetic features that proved to be rich in information content. Recognition experiments are performed using hard-decision algorithms on stops extracted from the TIMIT database: continuous speech from 60 speakers (not used in the design process) representing seven different dialects of American English. An accuracy of 96% is obtained for voicing detection, 90% for place-of-articulation detection, and 86% for the overall classification of stops.


    Robust Classification of Stop Consonants Using Auditory-Based Speech Processing

    In this work, a feature-based system for the automatic classification of stop consonants in speaker-independent continuous speech is reported. The system uses a new auditory-based speech processing front end built on the biologically rooted property of average localized synchrony detection (ALSD). It incorporates new algorithms for the extraction and manipulation of the acoustic-phonetic features that proved, statistically, to be rich in information content. The experiments are performed on stop consonants extracted from the TIMIT database with additive white Gaussian noise at various signal-to-noise ratios. The obtained classification accuracy compares favorably with previous work. The results also show a consistent improvement of 3% in place detection over the Generalized Synchrony Detector (GSD) system under identical conditions on clean and noisy speech. This illustrates the superior ability of the ALSD to suppress spurious peaks and produce a consistent and robust formant (peak) representation.
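
The noisy-speech condition mentioned in the abstract above (additive white Gaussian noise at a chosen signal-to-noise ratio) can be illustrated with a minimal sketch. This is not the authors' procedure; the function name `add_awgn`, the test tone, and the 10 dB target are assumptions made only for illustration.

```python
import math
import random


def add_awgn(signal, snr_db, rng=None):
    """Return a copy of `signal` with white Gaussian noise added at `snr_db`.

    Hypothetical helper: the noise variance is chosen so that
    10 * log10(signal_power / noise_power) equals snr_db on average.
    """
    rng = rng or random.Random()
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


# Illustrative input: a 440 Hz tone sampled at 16 kHz (values are assumptions).
clean = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(8000)]
noisy = add_awgn(clean, snr_db=10, rng=random.Random(0))

# Measure the SNR that was actually realized for this noise draw.
noise = [y - x for x, y in zip(clean, noisy)]
clean_power = sum(x * x for x in clean) / len(clean)
noise_power = sum(e * e for e in noise) / len(noise)
measured_snr_db = 10 * math.log10(clean_power / noise_power)
```

The measured SNR differs slightly from the target because the noise is a finite random draw; the gap shrinks as the signal gets longer.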

    Phonetic drift

    This chapter provides an overview of research on the phonetic changes that occur in one’s native language (L1) due to recent experience in another language (L2), a phenomenon known as phonetic drift. Through a survey of empirical findings on segmental and suprasegmental acoustic properties, the chapter examines the features of the L1 that are subject to phonetic drift, the cognitive mechanism(s) behind phonetic drift, and the various factors that influence the likelihood of phonetic drift. In short, virtually all aspects of L1 speech are subject to drift, but different aspects do not drift in the same manner, possibly due to multiple routes of L2 influence coexisting at different levels of L1 phonological structure. In addition to the timescale of these changes, the chapter discusses the relationship between phonetic drift and attrition as well as some of the enduring questions in this area. Accepted manuscript: https://drive.google.com/open?id=1eQbh17Z4YsH8vY_XjCHGqi5QChfBKcAZ

    Computer classification of stop consonants in a speaker independent continuous speech environment

    The English language has six stop consonants, /b, d, g, p, t, k/, which account for over 17% of all phonemic occurrences. In continuous speech, phonetic recognition of stop consonants requires the ability to explicitly characterize the acoustic signal. Prior work has shown that high classification accuracy on discrete syllables and words can be achieved by characterizing the shape of the spectrally transformed acoustic signal. This thesis extends that concept to a multispeaker continuous speech database and uses the statistical moments of a distribution to characterize shape. A multivariate maximum likelihood classifier was used to discriminate classes. To reduce the number of features used by the discriminant model, a dynamic programming scheme was employed to optimize subset combinations. The top six moments were the mean, variance, and skewness in both frequency and energy. Results showed 85% classification accuracy on the full database of 952 utterances. Performance improved to 97% when the discriminant model was trained separately for male and female talkers.
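
The idea of describing spectral shape by statistical moments can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's feature extraction: the spectrum is treated as a probability distribution over frequency, only the frequency-axis moments are computed, and the function name `spectral_moments` and the three-bin toy spectrum are hypothetical.

```python
def spectral_moments(freqs_hz, magnitudes):
    """Return (mean, variance, skewness) of a spectrum along the frequency axis.

    The magnitudes are normalized to sum to 1 so that they act as
    probability weights over the frequency bins.
    """
    total = sum(magnitudes)
    if total <= 0:
        raise ValueError("spectrum must have positive total magnitude")
    weights = [m / total for m in magnitudes]

    mean = sum(w * f for w, f in zip(weights, freqs_hz))
    variance = sum(w * (f - mean) ** 2 for w, f in zip(weights, freqs_hz))
    if variance == 0:
        skewness = 0.0  # all energy in one bin: no asymmetry to measure
    else:
        third = sum(w * (f - mean) ** 3 for w, f in zip(weights, freqs_hz))
        skewness = third / variance ** 1.5
    return mean, variance, skewness


# Toy symmetric spectrum (values are assumptions for illustration only).
freqs = [1000.0, 2000.0, 3000.0]
mags = [1.0, 2.0, 1.0]
mean_hz, variance_hz2, skewness = spectral_moments(freqs, mags)
```

For this symmetric toy spectrum the mean sits at the center bin and the skewness is zero; a burst spectrum tilted toward high or low frequencies would give a nonzero skewness, which is the kind of shape cue such moments are meant to capture.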

    Phonetics of segmental F0 and machine recognition of Korean speech
