35 research outputs found

    Pitch-scaled estimation of simultaneous voiced and turbulence-noise components in speech

    Almost all speech contains simultaneous contributions from more than one acoustic source within the speaker's vocal tract. In this paper, we propose a method, the pitch-scaled harmonic filter (PSHF), which aims to separate the voiced and turbulence-noise components of the speech signal during phonation, based on a maximum-likelihood approach. The PSHF outputs periodic and aperiodic components that are estimates of the respective contributions of the different types of acoustic source. It produces four reconstructed time-series signals by decomposing the original speech signal, first according to amplitude, and then according to power of the Fourier coefficients. Thus, one pair of periodic and aperiodic signals is optimized for subsequent time-series analysis, and another pair for spectral analysis. The performance of the PSHF algorithm was tested on synthetic signals, using three forms of disturbance (jitter, shimmer and additive noise), and the results were used to predict the performance on real speech. Processing recorded speech examples elicited latent features from the signals, demonstrating the PSHF's potential for analysis of mixed-source speech.

    Speech Communication

    Contains reports on two research projects. National Institutes of Health (Grant 2 ROl1 NS04332); National Institutes of Health (Training Grant 5 T32 NS07040); C.J. LeBel Fellowships; National Science Foundation (Grant BNS77-26871)

    Multitaper analysis of fundamental frequency variations during voiced fricatives

    A method for tracking fundamental frequency variations in speech is proposed, based on multitaper analysis. Using the multitaper technique, a statistical test is developed for detecting the presence of harmonic components at multiples of a fundamental frequency, embedded in coloured noise. It is shown that this test can be applied to speech to estimate the fundamental frequency, when present, as well as the amplitude and phase of each harmonic. The method is validated on synthetic data to determine accuracy and robustness, and evaluated on a small corpus of real speech data, comparing simultaneous acoustic and electroglottographic measurements to assess performance. Acoustic measurements are marginally less accurate than electroglottographic measurements, but often continue to provide useful fundamental frequency estimates in situations where electroglottography fails.

    A parametric study of the spectral characteristics of European Portuguese fricatives

    Studies of Portuguese phonetics and phonology indicate that fricatives are central to some interesting features of the language, yet studies of Portuguese fricatives have been few and limited. In this study, Portuguese fricatives were analyzed in ways designed to enhance our description of the language and to increase our understanding of the production of fricatives. Corpora of Portuguese words containing /f, v, s, z, sh, zh/, nonsense words of the pattern /V1FV2/ that follow Portuguese phonological rules, and sustained fricatives were recorded by four native speakers of European Portuguese (two men, two women). Results of the analysis show that more than half of the voiced fricatives devoice; devoicing occurs more often in word-final fricatives. Averaged power spectra were computed for all fricatives and parameterized in order to aid comparisons across speakers and across corpora, and to gain insight into the production mechanisms underlying the language-specific variations. Substantial differences were found between the spectra of voiced and unvoiced fricatives with the same place of articulation. The parameters spectral slope, frequency of maximum amplitude, and dynamic amplitude, derived from previous studies, behaved as predicted for changes in effort level, voicing, and location within the fricative. Changes in syllable stress, however, did not affect the fricatives in a manner consistent with effort level variation. Some combinations of these parameters were also useful for separating the fricatives by place or by sibilance.

    The Aerodynamics of Speech

    Modelling the acoustics of fricative consonants using distributed noise sources with spatial and temporal correlation

    Most existing computer models of noise sources in fricatives simulate either point sources or distributed sources composed of independent source elements. This paper describes a model that allows spatial and temporal correlations within a distributed source to be manipulated in a time-domain finite-difference model of vocal tract acoustics. Results from mechanical modelling experiments are used to validate the model where possible. Computer simulations using the model show that the sharpness of spectral zeros depends strongly on the source length and the correlation length of the source, which is consistent with measurements of real speech.

    Aerodynamic and aeroacoustic aspects of vowel production

    Speech communication is fundamental to human society, yet the mechanism of sound production during speech has not been fully quantified, owing to the relative inaccessibility of the larynx and vocal tract to instrumentation and the complexity of the aerodynamics therein. In this paper we consider vowel production modelling. We describe those aspects of the anatomy and physiology of the speech system that are relevant to the generation of vowel sounds and outline the means by which the larynx is caused to vibrate. We then discuss the equations of fluid mechanics required to model the laryngeal airflow, describing the approximations commonly used to reduce them to a solvable set and assessing the validity of those approximations for speech airflows during voicing. We next consider the sources of sound that generate the radiated vowel waveforms; we include the traditional glottal-waveform source and a number of other mechanisms that may contribute to the output speech wave. Finally, we outline the difficulties in obtaining data from live subjects and some of the methods used to overcome these difficulties.

    Temporal and Devoicing Analysis of European Portuguese Fricatives

    Duration and devoicing of Portuguese fricatives have been studied using a set of corpora that include real words and nonsense words following Portuguese phonological rules; these were recorded by four subjects (two male, two female). Results show that fricative duration varies most with voicing (voiceless fricatives are longer), and also significantly by speaker, place, and position within the word. Devoicing occurs most often word-finally, and varies significantly by place; devoicing occurs more often than in English.