
    Physiologically-Motivated Feature Extraction Methods for Speaker Recognition

    Speaker recognition has received a great deal of attention from the speech community, and significant gains in robustness and accuracy have been obtained over the past decade. However, the features used for identification are still primarily representations of overall spectral characteristics, and thus the models are primarily phonetic in nature, differentiating speakers based on overall pronunciation patterns. This creates difficulties in terms of the amount of enrollment data and the complexity of the models required to cover the phonetic space, especially in tasks such as identification, where enrollment and testing data may not have similar phonetic coverage. This dissertation introduces new features based on vocal source characteristics, intended to capture physiological information related to the laryngeal excitation energy of a speaker. These features, including RPCC, GLFCC and TPCC, represent unique characteristics of speech production that are not represented in current state-of-the-art speaker identification systems. The proposed features are evaluated through three experimental paradigms: cross-lingual speaker identification, cross song-type avian speaker identification and mono-lingual speaker identification. The experimental results show that the proposed features provide information about speaker characteristics that is significantly different in nature from the phonetically-focused information present in traditional spectral features. The incorporation of the proposed glottal source features offers significant overall improvement to the robustness and accuracy of speaker identification tasks.

    Language, perception and production in profoundly deaf children

    Prelingually profoundly deaf children usually experience problems with language learning (Webster, 1986; Campbell, Burden & Wright, 1992). The acquisition of written language would be no problem for them if normal development of reading and writing were not dependent on spoken language (Pattison, 1986). However, such children cannot be viewed as a homogeneous group since some, the minority, do develop good linguistic skills. Group studies have identified several factors relating to language skills: hearing loss and level of loss, IQ, intelligibility, lip-reading, use of phonology and memory capacity (Furth, 1966; Conrad, 1979; Trybus & Karchmer, 1977; Jensema, 1975; Baddeley, Papagno & Vallar, 1988; Baddeley & Wilson, 1988; Hanson, 1989; Lake, 1980; Daneman & Carpenter, 1980). These various factors appear to be interrelated, with phonological awareness being implicated in most. So to understand behaviour, measures of all these factors must be obtained. The present study aimed to achieve this whilst investigating the prediction that performance success may be due to better use of phonological information. Because linguistic success for the deaf child is exceptional, a case study approach was taken to avoid obscuring subtle differences in performance. Subjects were screened to meet six research criteria: profound prelingual deafness, no other known handicap, English the first language in the home, at least average non-verbal IQ, reading age 7-9 years and inter-subject dissimilarities between chronological reading age discrepancies. Case histories were obtained from school records and home interviews. Six subjects with diverse linguistic skills were selected, four of whom undertook all tests. Phonological awareness and development were assessed across several variables: immediate memory span, intelligibility, spelling, rhyme judgement, speech discrimination and production. There was considerable inter-subject performance difference. One boy's speech production was singled out for a more detailed analysis. Useful aided hearing and consistent contrastive speech appear to be implicated in other English language skills. It was concluded that for phonological awareness to develop, the deaf child must receive useful inputs from as many media as possible (e.g., vision, audition, articulation, sign and orthography). When input is biased toward the more reliable modalities of audition and articulation, there is a greater possibility of a robust and useful phonology being derived and thus better access to the English language.