Speaker verification using sequence discriminant support vector machines
This paper presents a text-independent speaker verification system using support vector machines (SVMs) with score-space kernels. Score-space kernels generalize Fisher kernels and are based on underlying generative models such as Gaussian mixture models (GMMs). This approach provides direct discrimination between whole sequences, in contrast with the frame-level approaches at the heart of most current systems. The resultant SVMs have a very high dimensionality, since the dimensionality is related to the number of parameters in the underlying generative model. To address problems that arise in the resultant optimization, we introduce a technique called spherical normalization that preconditions the Hessian matrix. We have performed speaker verification experiments using the PolyVar database. The SVM system presented here achieves a 34% relative reduction in error rate compared to a GMM likelihood ratio system.
Effectiveness in the Realisation of Speaker Authentication
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
An important consideration for the deployment of speaker recognition in authentication applications is the approach to the formation of training and testing utterances. Whilst defining this for a specific scenario is influenced by the associated requirements and conditions, the process can be further guided through the establishment of the relative usefulness of alternative frameworks for composing the training and testing material. In this regard, the present paper provides an analysis of the effects, on the speaker recognition accuracy, of various bases for the formation of the training and testing data. The experimental investigations are conducted based on the use of digit utterances taken from the XM2VTS database. The paper presents a detailed description of the individual approaches considered and discusses the experimental results obtained in different cases.
Investigation of Frame Alignments for GMM-based Digit-prompted Speaker Verification
Frame alignments can be computed by different methods in GMM-based speaker
verification. By incorporating a phonetic Gaussian mixture model (PGMM), we are
able to compare the performance of alignments extracted from a deep neural
network (DNN) and from a conventional hidden Markov model (HMM) in digit-prompted
speaker verification. Based on the different characteristics of these two
alignments, we present a novel content verification method to improve
system security without much computational overhead. Our experiments on the
RSR2015 Part-3 digit-prompted task show that the DNN-based alignment performs
on par with the HMM alignment. The results also demonstrate the effectiveness
of the proposed Kullback-Leibler (KL) divergence-based scoring in rejecting speech
with incorrect pass-phrases.
Comment: accepted by APSIPA ASC 201