64 research outputs found

    Testing the assumptions of linear prediction analysis in normal vowels

    This paper develops an improved surrogate data test to show experimental evidence, for all the simple vowels of US English and for both male and female speakers, that Gaussian linear prediction analysis, a ubiquitous technique in current speech technologies, cannot be used to extract all the dynamical structure of real speech time series. The test provides robust evidence undermining the validity of these linear techniques, supporting the assumptions of dynamical nonlinearity and/or non-Gaussianity common to more recent, complex efforts at dynamical modelling of speech time series. However, an additional finding is that the classical assumptions cannot be ruled out entirely, and plausible evidence is given to explain the success of the linear Gaussian theory as a weak approximation to the true, nonlinear/non-Gaussian dynamics. This supports the use of appropriate hybrid linear/nonlinear/non-Gaussian modelling. With a calibrated calculation of the statistic and a particular choice of experimental protocol, some of the known systematic problems of the method of surrogate data testing are circumvented to obtain results that support the conclusions to a high level of significance.

    Frication and Voicing Classification


    Auditory-inspired morphological processing of speech spectrograms: applications in automatic speech recognition and speech enhancement

    New auditory-inspired speech processing methods are presented in this paper, combining spectral subtraction and two-dimensional non-linear filtering techniques originally conceived for image processing purposes. In particular, mathematical morphology operations, like erosion and dilation, are applied to noisy speech spectrograms using specifically designed structuring elements inspired by the masking properties of the human auditory system. This is effectively complemented with a pre-processing stage including the conventional spectral subtraction procedure and auditory filterbanks. These methods were tested in both speech enhancement and automatic speech recognition tasks. For the first, time-frequency anisotropic structuring elements over grey-scale spectrograms were found to provide a better perceptual quality than isotropic ones, proving more appropriate (under a number of perceptual quality estimation measures and several signal-to-noise ratios on the Aurora database) for retaining the structure of speech while removing background noise. For the second, the combination of Spectral Subtraction and auditory-inspired Morphological Filtering was found to improve recognition rates in a noise-contaminated version of the Isolet database. This work has been partially supported by the Spanish Ministry of Science and Innovation CICYT Project No. TEC2008-06382/TEC.

    Tone burst-evoked otoacoustic emissions in neonates: normative data

    Background: Tone-burst otoacoustic emissions (TBOAEs) have not been routinely studied in pediatric populations, although tone burst stimuli have greater frequency specificity compared with click sound stimuli. The present study aimed (1) to determine an appropriate stimulus level for neonatal TBOAE measurements when the stimulus center frequency was 1 kHz, and (2) to explore the characteristics of 1 kHz TBOAEs in a neonatal population. Methods: A total of 395 normal neonates (745 ears) were recruited. The study consisted of two parts, reflecting the two study aims. Part I included 40 normal neonatal ears, and TBOAE measurement was performed at five stimulus levels in the range 60–80 dB peSPL, with 5 dB incremental steps. Part II investigated the characteristics of the 1 kHz TBOAE response in a large group of 705 neonatal ears, and provided clinical reference criteria based on these characteristics. Results: The study provided a series of reference parameters for 1 kHz TBOAE measurement in neonates. Based on the results, a suggested stimulus level and reference criteria for 1 kHz TBOAE measures with neonates were established. In addition, time-frequency analysis of the data gave new insight into the energy distribution of the neonatal TBOAE response. Conclusion: TBOAE measures may be a useful method for investigating cochlear function at specific frequency ranges in neonates. However, further studies of both TBOAE time-frequency analysis and measurements in newborns are needed.

    Phase estimation with application to speech analysis-synthesis

    Thesis (Sc.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1980. Microfiche copy available in Archives and Engineering. Vita. Includes bibliographical references. By Thomas F. Quatieri, Jr.

    The design of two-dimensional digital filters by generalized McClellan transformations

    Thesis. 1975. M.S.--Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Includes bibliographical references. By Thomas Francis Quatieri, Jr.