11,559 research outputs found

    Employing Emotion Cues to Verify Speakers in Emotional Talking Environments

    Full text link
    Usually, people talk neutrally in environments where there are no abnormal talking conditions such as stress and emotion. Other emotional conditions that might affect people talking tone like happiness, anger, and sadness. Such emotions are directly affected by the patient health status. In neutral talking environments, speakers can be easily verified, however, in emotional talking environments, speakers cannot be easily verified as in neutral talking ones. Consequently, speaker verification systems do not perform well in emotional talking environments as they do in neutral talking environments. In this work, a two-stage approach has been employed and evaluated to improve speaker verification performance in emotional talking environments. This approach employs speaker emotion cues (text-independent and emotion-dependent speaker verification problem) based on both Hidden Markov Models (HMMs) and Suprasegmental Hidden Markov Models (SPHMMs) as classifiers. The approach is comprised of two cascaded stages that combines and integrates emotion recognizer and speaker recognizer into one recognizer. The architecture has been tested on two different and separate emotional speech databases: our collected database and Emotional Prosody Speech and Transcripts database. The results of this work show that the proposed approach gives promising results with a significant improvement over previous studies and other approaches such as emotion-independent speaker verification approach and emotion-dependent speaker verification approach based completely on HMMs.Comment: Journal of Intelligent Systems, Special Issue on Intelligent Healthcare Systems, De Gruyter, 201

    Factor analysis modelling for speaker verification with short utterances

    Get PDF
    This paper examines combining both relevance MAP and subspace speaker adaptation processes to train GMM speaker models for use in speaker verification systems with a particular focus on short utterance lengths. The subspace speaker adaptation method involves developing a speaker GMM mean supervector as the sum of a speaker-independent prior distribution and a speaker dependent offset constrained to lie within a low-rank subspace, and has been shown to provide improvements in accuracy over ordinary relevance MAP when the amount of training data is limited. It is shown through testing on NIST SRE data that combining the two processes provides speaker models which lead to modest improvements in verification accuracy for limited data situations, in addition to improving the performance of the speaker verification system when a larger amount of available training data is available

    A Corpus-Based, Pilot Study of Lexical Stress Variation in American English

    Get PDF
    Phonological free variation describes the phenomenon of there being more than one pronunciation for a word without any change in meaning (e.g. because, schedule, vehicle). The term also applies to words that exhibit different stress patterns (e.g. academic, resources, comparable) with no change in meaning or grammatical category. A corpus-based analysis of free variation is a useful tool for testing the validity of surveys of speakers' pronunciation preferences for certain variants. The current paper presents the results of a corpus-based pilot study of American English, in an attempt to replicate Mompéan's 2009 study of British English
    corecore