Human abnormal behavior impact on speaker verification systems
Human behavior plays a major role in improving human-machine communication. Because systems are trained on normal utterances, their performance is inevitably affected by abnormal behavior. Abnormal behavior is often associated with a change in the human emotional state. Different emotional states cause physiological changes in the human body that affect the vocal tract. We regard fear, anger, or even happiness as a deviation from normal behavior. The whole spectrum of human-machine applications is susceptible to behavioral changes. Abnormal behavior is a major factor, especially for security applications such as verification systems. Face, fingerprint, iris, and speaker verification are among the most common approaches to biometric authentication today. This paper discusses normal and abnormal human behavior and its impact on the accuracy and effectiveness of automatic speaker verification (ASV). The inputs to the support vector machine classifier are Mel-frequency cepstral coefficients and their dynamic changes. The Berlin Database of Emotional Speech was used for this purpose. The research shows that abnormal behavior has a major impact on verification accuracy, with the equal error rate increasing to 37%. This paper also describes a new design and application of the ASV system that is much less likely to reject a target user who exhibits abnormal behavior.
Human and Machine Speaker Recognition Based on Short Trivial Events
Trivial events are ubiquitous in human-to-human conversations, e.g., cough,
laugh and sniff. Compared to regular speech, these trivial events are usually
short and unclear, thus generally regarded as not speaker discriminative and so
are largely ignored by present speaker recognition research. However, these
trivial events are highly valuable in some particular circumstances such as
forensic examination, as they are less subject to intentional change, so can
be used to discover the genuine speaker from disguised speech. In this paper,
we collect a trivial event speech database that involves 75 speakers and 6
types of events, and report preliminary speaker recognition results on this
database, by both human listeners and machines. Particularly, the deep feature
learning technique recently proposed by our group is utilized to analyze and
recognize the trivial events, which leads to acceptable equal error rates
(EERs) despite the extremely short durations (0.2-0.5 seconds) of these events.
Comparing different types of events, 'hmm' seems more speaker discriminative.
Comment: ICASSP 201