6 research outputs found

    Automated Classification of Vowel Category and Speaker Type in the High-Frequency Spectrum

    The high-frequency region of vowel signals (above the third formant, F3) has received little research attention. Recent evidence, however, has documented the perceptual utility of high-frequency information in the speech signal above the traditional frequency bandwidth known to contain important cues for speech and speaker recognition. The purpose of this study was to determine whether high-pass filtered vowels could be separated by vowel category and speaker type in a supervised learning framework. Mel-frequency cepstral coefficients (MFCCs) were extracted from productions of six vowel categories produced by two male, two female, and two child speakers. Results revealed that the filtered vowels were well separated by vowel category and speaker type using MFCCs from the high-frequency spectrum. This demonstrates the presence of useful information for automated classification in the high-frequency region; this is the first study to report findings of this nature in a supervised learning framework.
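    The abstract describes a pipeline of MFCC feature extraction followed by supervised classification, but does not name the classifier used. A minimal sketch of the classification stage, assuming a simple nearest-centroid decision rule and fabricated stand-in feature vectors (the `make_cluster` helper and the cluster centers are illustrative inventions, not data from the study):

    ```python
    import math
    import random

    def hz_to_mel(f):
        """Convert frequency in Hz to the mel scale (O'Shaughnessy formula),
        the warping that underlies MFCC filterbank spacing."""
        return 2595.0 * math.log10(1.0 + f / 700.0)

    # Hypothetical stand-in features: in the study, MFCC vectors would be
    # extracted from high-pass filtered (above-F3) vowel tokens; here we
    # fabricate two well-separated clusters to exercise the classifier.
    random.seed(0)
    def make_cluster(center, n=20, spread=0.3):
        return [[c + random.gauss(0, spread) for c in center] for _ in range(n)]

    train = {"i": make_cluster([1.0, 0.0, 0.5]),
             "a": make_cluster([-1.0, 0.5, -0.5])}

    def centroid(vectors):
        dim = len(vectors[0])
        return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

    centroids = {label: centroid(vecs) for label, vecs in train.items()}

    def classify(x):
        """Nearest-centroid decision rule (Euclidean distance)."""
        return min(centroids, key=lambda lab: math.dist(x, centroids[lab]))

    print(classify([0.9, 0.1, 0.4]))    # near the "i" centroid
    print(classify([-1.1, 0.6, -0.4]))  # near the "a" centroid
    ```

    A nearest-centroid rule is only one of many supervised learners that could play this role; the point is that once MFCCs are extracted from the filtered signal, any standard classifier can test whether the categories separate.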

    Effects of signal bandwidth and noise on individual speaker identification

    Two experiments were conducted to evaluate the effects of increasing spectral bandwidth from 3 to 10 kHz on individual speaker recognition in noisy conditions (+5, 0, and −5 dB signal-to-noise ratio). Experiment 1 utilized h(Vowel)d (hVd) signals, while experiment 2 utilized sentences from the Rainbow Passage. Both experiments showed significant improvements in individual speaker identification in the 10 kHz bandwidth condition (6% for hVds; 10% for sentences). These results coincide with the extant machine recognition literature demonstrating significant amounts of individual speaker information present in the speech signal above approximately 3–4 kHz. Cues from the high-frequency region for speaker identity warrant further study.
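    Presenting stimuli at fixed signal-to-noise ratios, as in these experiments, amounts to scaling the noise relative to the signal's power before mixing. A minimal sketch of that computation with synthetic signals (the `mix_at_snr` helper and the tone/noise inputs are illustrative, not the study's stimuli):

    ```python
    import math
    import random

    def rms(x):
        """Root-mean-square amplitude of a sample sequence."""
        return math.sqrt(sum(s * s for s in x) / len(x))

    def mix_at_snr(signal, noise, snr_db):
        """Scale noise so that 20*log10(rms(signal)/rms(noise)) equals
        snr_db, then add it to the signal sample-by-sample."""
        gain = rms(signal) / (rms(noise) * 10 ** (snr_db / 20.0))
        scaled = [gain * n for n in noise]
        mixed = [s + n for s, n in zip(signal, scaled)]
        return mixed, scaled

    # Synthetic stimulus: a 440 Hz tone at 8 kHz sampling, plus Gaussian noise.
    random.seed(1)
    sig = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
    noi = [random.gauss(0, 1) for _ in range(8000)]

    mixed, scaled_noise = mix_at_snr(sig, noi, 5.0)
    achieved = 20 * math.log10(rms(sig) / rms(scaled_noise))
    print(round(achieved, 2))  # 5.0, by construction
    ```

    The same helper covers all three conditions reported (+5, 0, −5 dB) by changing `snr_db`.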

    Hearing and orally mimicking different acoustic-semantic categories of natural sound engage distinct left hemisphere cortical regions

    Oral mimicry is thought to represent an essential process for the neurodevelopment of spoken language systems in infants, the evolution of language in hominins, and a process that could possibly aid recovery in stroke patients. Using functional magnetic resonance imaging (fMRI), we previously reported a divergence of auditory cortical pathways mediating perception of specific categories of natural sounds. However, it remained unclear whether or how this fundamental sensory organization by the brain might relate to motor output, such as sound mimicry. Here, using fMRI, we revealed a dissociation of activated brain regions preferential for hearing with the intent to imitate and the oral mimicry of animal action sounds versus animal vocalizations as distinct acoustic-semantic categories. This functional dissociation may reflect components of a rudimentary cortical architecture that links systems for processing acoustic-semantic universals of natural sound with motor-related systems mediating oral mimicry at a category level. The observation of different brain regions involved in different aspects of oral mimicry may inform targeted therapies for rehabilitation of functional abilities after stroke.