Cross match-CHMM fusion for speaker adaptation of voice biometric

Author: Ariff A. K.
Kamarulafizam I.
Noor A. M.
Salleh S. H.
Publication venue: Asian Research Publishing Network
Publication date: 01/01/2017

The most significant factor affecting automatic voice biometric performance is variation in signal characteristics caused by speaker-based variability, conversation-based variability and technology variability. These variations make it very challenging to model and verify a speaker accurately. To address these variability effects, the cross match (CM) technique is proposed to provide a speaker model that can adapt to variability over time. Using a limited number of enrollment utterances, a client barcode is generated and can be updated by cross matching the client barcode with new data. Furthermore, CM adds multimodality at the fusion level, since the similarity score from CM can be fused with the score from the default speaker modeling. The scores need to be normalized before fusion takes place. Fusing CM with a continuous Hidden Markov Model (CHMM) gave the new adapted model a significant improvement in the identification and verification tasks: the equal error rate (EER) decreased from 6.51% to 1.23% in speaker identification and from 5.87% to 1.04% in speaker verification. EER also decreased over time (across five sessions) when CM was applied. The best combination of normalization and fusion techniques is the piecewise-linear method with weighted sum.
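
The normalize-then-fuse step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, thresholds, weight and sample scores are all hypothetical, and the piecewise-linear rule is assumed to be the common form that maps scores at or below a lower threshold to 0, scores at or above an upper threshold to 1, and scales linearly in between.

```python
def piecewise_linear_normalize(scores, lower, upper):
    """Map raw scores to [0, 1] (assumed form of piecewise-linear normalization).

    Scores <= lower become 0, scores >= upper become 1,
    and scores in between are scaled linearly.
    """
    if upper <= lower:
        raise ValueError("upper threshold must exceed lower threshold")
    span = upper - lower
    return [min(1.0, max(0.0, (s - lower) / span)) for s in scores]


def weighted_sum_fusion(scores_a, scores_b, weight_a=0.5):
    """Fuse two lists of normalized scores with a convex weighted sum."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(scores_a, scores_b)]


# Hypothetical raw scores for three trials (not taken from the paper).
cm_raw = [0.2, 0.9, 0.55]            # cross-match similarity scores
chmm_raw = [-120.0, -40.0, -80.0]    # CHMM log-likelihood-style scores

# Hypothetical thresholds; in practice they would be tuned on development data.
cm_norm = piecewise_linear_normalize(cm_raw, lower=0.3, upper=0.8)
chmm_norm = piecewise_linear_normalize(chmm_raw, lower=-100.0, upper=-50.0)

fused = weighted_sum_fusion(cm_norm, chmm_norm, weight_a=0.4)
print(fused)
```

Normalizing first puts both scores on the same [0, 1] scale, so the fusion weight expresses the relative trust in each matcher and not a difference in raw score ranges.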

Universiti Teknologi Malaysia Institutional Repository

HMM-based on-line signature verification: Feature extraction and signature modeling

Author: Fiérrez Julián
González-Rodríguez Joaquín
Ortega-García Javier
Ramos Daniel
Publication venue: Elsevier BV
Publication date: 01/01/2007

This is the author’s version of a work that was accepted for publication in Pattern Recognition Letters. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Pattern Recognition Letters 28.16 (2007): 2325 – 2334, DOI: 10.1016/j.patrec.2007.07.012. A function-based approach to on-line signature verification is presented. The system uses a set of time sequences and Hidden Markov Models (HMMs). Development and evaluation experiments are reported on a subcorpus of the MCYT bimodal biometric database comprising more than 7,000 signatures from 145 subjects. The system is compared to other state-of-the-art systems based on the results of the First International Signature Verification Competition (SVC 2004). A number of practical findings related to feature extraction and modeling are obtained. This work has been supported by the Spanish projects TIC2003-08382-C05-01 and TEC2006-13141-C03-03, and by the European NoE Biosecure.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Biblos-e Archivo

Latent Class Model with Application to Speaker Diarization

Author: Chen Xianhong
He Liang
Johnson Michael T
Liu Jia
Liu Yi
Xu Can
Publication date: 24/04/2019

In this paper, we apply a latent class model (LCM) to the task of speaker diarization. LCM is similar to Patrick Kenny's variational Bayes (VB) method in that it uses soft information and avoids premature hard decisions in its iterations. In contrast to the VB method, which is based on a generative model, LCM provides a framework allowing both generative and discriminative models. The discriminative property is realized in this work through the use of i-vectors (Ivec), probabilistic linear discriminant analysis (PLDA), and a support vector machine (SVM). Systems denoted as LCM-Ivec-PLDA, LCM-Ivec-SVM, and LCM-Ivec-Hybrid are introduced. In addition, three further improvements are applied to enhance performance: 1) adding neighbor windows to extract more speaker information for each short segment; 2) using a hidden Markov model to avoid frequent speaker change points; 3) using agglomerative hierarchical clustering for initialization and to present hard and soft priors, in order to reduce sensitivity to initialization. Experiments on the National Institute of Standards and Technology Rich Transcription 2009 speaker diarization database, under the single distant microphone condition, show that the diarization error rate (DER) of the proposed methods improves substantially in relative terms over mainstream systems. Compared to the VB method, the relative improvements of the LCM-Ivec-PLDA, LCM-Ivec-SVM, and LCM-Ivec-Hybrid systems are 23.5%, 27.1%, and 43.0%, respectively. Experiments on our collected database and on the CALLHOME97, CALLHOME00 and SRE08 short2-summed trial conditions also show that the proposed LCM-Ivec-Hybrid system has the best overall performance.

arXiv.org e-Print Archive

University of Kentucky