Improved I-vector-based Speaker Recognition for Utterances with Speaker Generated Non-speech sounds
Conversational speech not only contains several variants of neutral speech
but is also prominently interlaced with several speaker generated non-speech
sounds such as laughter and breath. A robust speaker recognition system should
be capable of recognizing a speaker irrespective of these variations in their
speech. Understanding whether these variations represent similar
speaker-specific information helps in building a good speaker
recognition system. In this paper, the speaker variations captured by a
speaker's neutral speech are analyzed by considering the speaker's speech-laugh
(a variant of neutral speech) and laughter (a non-speech sound). We study an
i-vector-based speaker recognition system trained only on neutral speech and
evaluate its performance on speech-laugh and laughter. Further, we analyze the
effect of including laughter sounds during training of an i-vector-based speaker
recognition system. Our experimental results show that including laughter
sounds during training seems to provide complementary speaker-specific
information, resulting in an overall improvement in the performance of the
speaker recognition system, especially on utterances with speech-laugh segments.
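In an i-vector system like the one the abstract describes, each utterance is mapped to a fixed-length vector, and a trial is typically scored by comparing the enrollment and test i-vectors, for example with cosine similarity. The sketch below illustrates only that final scoring step under simplified assumptions (random 400-dimensional vectors standing in for real i-vectors; cosine scoring rather than the paper's specific backend, which the abstract does not name):

```python
import numpy as np

def cosine_score(enroll_ivec, test_ivec):
    """Cosine similarity between an enrollment i-vector and a test i-vector."""
    return float(np.dot(enroll_ivec, test_ivec) /
                 (np.linalg.norm(enroll_ivec) * np.linalg.norm(test_ivec)))

# Toy stand-ins for real i-vectors: 400 is a common i-vector dimensionality.
rng = np.random.default_rng(0)
speaker_a = rng.standard_normal(400)
# Same speaker, slightly perturbed (e.g., a speech-laugh variant of the utterance).
speaker_a_test = speaker_a + 0.1 * rng.standard_normal(400)
speaker_b = rng.standard_normal(400)          # a different speaker

same = cosine_score(speaker_a, speaker_a_test)
diff = cosine_score(speaker_a, speaker_b)
# A matched-speaker trial should score higher than a mismatched one.
assert same > diff
```

This is only the trial-scoring stage; extracting real i-vectors requires a trained universal background model and total-variability matrix, which are outside the scope of this sketch.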