541 research outputs found

Speaker verification using sequence discriminant support vector machines

Author: Renals S.
Wan V.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2005

This paper presents a text-independent speaker verification system using support vector machines (SVMs) with score-space kernels. Score-space kernels generalize Fisher kernels and are based on underlying generative models such as Gaussian mixture models (GMMs). This approach provides direct discrimination between whole sequences, in contrast with the frame-level approaches at the heart of most current systems. The resultant SVMs have a very high dimensionality, since the dimensionality is related to the number of parameters in the underlying generative model. To address problems that arise in the resultant optimization, we introduce a technique called spherical normalization that preconditions the Hessian matrix. We have performed speaker verification experiments using the PolyVar database. The SVM system presented here reduces error rates by a relative 34% compared to a GMM likelihood ratio system.
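The abstract names spherical normalization without defining it. A minimal sketch of one possible formulation follows, assumed here rather than taken from this listing: each score vector is extended with a constant `d` and rescaled to unit length, so every kernel self-product equals 1 and the kernel matrix is better conditioned. The names `spherical_normalize`, `linear_kernel` and the parameter `d` are illustrative.

```python
import math

def spherical_normalize(x, d=1.0):
    """Append the constant d to x and rescale the result to unit length.

    Assumption: this is one way to project vectors onto a unit sphere;
    the value of d is a free parameter chosen for illustration.
    """
    augmented = list(x) + [d]
    norm = math.sqrt(sum(v * v for v in augmented))
    return [v / norm for v in augmented]

def linear_kernel(a, b):
    """Plain inner product between two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

# Two hypothetical score vectors with very different magnitudes.
u = spherical_normalize([3.0, 4.0])
v = spherical_normalize([300.0, -400.0])

# After normalization the kernel values are bounded in [-1, 1].
k_uu = linear_kernel(u, u)
k_uv = linear_kernel(u, v)
```

Because all mapped vectors have unit norm, widely varying input magnitudes no longer dominate the kernel matrix.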

Crossref

Edinburgh Research Archive

Edinburgh Research Explorer

White Rose Research Online

CLIENT / WORLD MODEL SYNCHRONOUS ALIGNEMENT FOR SPEAKER VERIFICATION

Author: Bimbot Frédéric
Genoud Dominique
Mariéthoz Johnny
Mokbel Chafic
Publication venue: Budapest, Hungary
Publication date: 10/03/2006

In speaker verification, two independent stochastic models, i.e. a client model and a non-client (world) model, are generally used to verify the claimed identity using a likelihood ratio score. This paper investigates a variant of this approach based on a common hidden process for both models. In this framework, both models share the same topology, which is conditioned by the underlying phonetic structure of the utterance. Then, two different output distributions are defined corresponding to the client vs. world hypotheses. Based on this idea, a synchronous decoding algorithm and the corresponding training algorithm are derived. Our first experiments on the SESP telephone database indicate a slight improvement with respect to a baseline system using independent alignments. Moreover, synchronous alignment offers a reduced complexity during the decoding process. Interesting perspectives can be expected. Keywords: Stochastic Modeling, HMM, Synchronous Alignment, EM algorithm
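The likelihood ratio score used by the baseline can be sketched as follows. This is only an illustration of the conventional two-independent-model scoring, not the synchronous alignment algorithm of the paper; the one-dimensional GMMs, their parameters and the function names are all hypothetical.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def gmm_log_likelihood(x, components):
    """Log-likelihood of x under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian(x, m, v) for w, m, v in components]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def llr_score(frames, client, world):
    """Average per-frame log-likelihood ratio, client vs. world."""
    total = sum(
        gmm_log_likelihood(x, client) - gmm_log_likelihood(x, world)
        for x in frames
    )
    return total / len(frames)

# Hypothetical models: a narrow client model and a broad world model.
client_gmm = [(1.0, 2.0, 1.0)]
world_gmm = [(0.5, -1.0, 4.0), (0.5, 1.0, 4.0)]

genuine = llr_score([1.9, 2.1, 2.0], client_gmm, world_gmm)
impostor = llr_score([-3.0, -2.5], client_gmm, world_gmm)

# The claim is accepted when the score exceeds a threshold (0.0 assumed here).
accept_genuine = genuine > 0.0
accept_impostor = impostor > 0.0
```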

Infoscience - École polytechnique fédérale de Lausanne

Synchronous Alignment

Author: Mariéthoz Johnny
Mokbel Chafic
Publication venue: IDIAP
Publication date: 10/03/2006

In speaker verification, the maximum likelihood criterion is generally used to verify the claimed identity. This is done using two independent models, i.e. a Client model and a World model. It may be interesting to make both models share the same topology, which represents the underlying phonetic structure, and then to consider two different output distributions corresponding to the Client/World hypotheses. Based on this idea, a decoding algorithm and the corresponding training algorithm were derived. The first experiments on a significant telephone database show a small improvement with respect to the reference system. We can conclude that synchronous alignment provides results at least equivalent to the reference system, with a reduced-complexity decoding algorithm. Other important perspectives can be derived.

Infoscience - École polytechnique fédérale de Lausanne

Exploration of small enrollment speaker verification on handheld devices

Author: Woo Ram H. (Ram Han)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2005

Thesis (M.Eng. and S.B.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005. Includes bibliographical references (p. 77-78). This thesis explores the problem of robust speaker verification for handheld devices in the context of extremely limited training data. Although speaker verification technology is an area of great promise for security applications, the implementation of such a system on handheld devices presents its own unique challenges arising from the highly mobile nature of the devices. This work first independently analyzes the impact of a number of key factors, such as speech features, basic modeling techniques, and highly variable environmental/microphone conditions, on speaker verification accuracy. We then present and evaluate methods for improving speaker verification robustness. In particular, we focus on normalization techniques, such as handset normalization (H-norm) and zero normalization (Z-norm), as well as model training methodologies (multistyle training) to minimize the detrimental impact of highly variable environment and microphone conditions on speaker verification robustness.
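As a rough illustration of zero normalization (Z-norm), the sketch below shifts and scales a raw verification score using the mean and standard deviation of impostor scores obtained against the same speaker model. This is the standard textbook form of Z-norm, not code from the thesis; the impostor scores and function names are hypothetical.

```python
import statistics

def znorm_params(impostor_scores):
    """Estimate the impostor score mean and standard deviation for one model."""
    mu = statistics.mean(impostor_scores)
    sigma = statistics.stdev(impostor_scores)
    return mu, sigma

def znorm(raw_score, mu, sigma):
    """Express a raw score in impostor standard deviations above the mean."""
    return (raw_score - mu) / sigma

# Hypothetical scores of impostor utterances against one speaker model.
impostor_scores = [-2.0, -1.0, 0.0, 1.0, 2.0]
mu, sigma = znorm_params(impostor_scores)

# A hypothetical raw score for a test utterance.
normalized = znorm(3.0, mu, sigma)
```

After normalization, a single decision threshold can be shared across speaker models whose raw score ranges differ.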

DSpace@MIT

DIAS Research Report 2006

Author: DIAS Council
Publication date: 01/01/2005

DIAS Access to Institutional Repository

Faculty Resources Handbook 2004

Author: The Pedro Arrupe S.J. Center for Community-Based Learning
Publication venue: Scholar Commons
Publication date: 01/05/2004

Scholar Commons - Santa Clara University

Information Outlook, October 2007

Author: Special Libraries Association
Publication venue: SJSU ScholarWorks
Publication date: 01/10/2007

Volume 11, Issue 10

SJSU ScholarWorks

Reconceptualizing Privacy: An Examination Of The Developing Regulatory Regime For Facial Recognition Technology

Author: Potter David James
Publication venue: UND Scholarly Commons
Publication date: 01/01/2015

The National Telecommunications and Information Administration has convened a series of meetings to create a voluntary code of conduct for the commercial use of facial recognition technology (FRT). This research asks and answers three questions related to the creation of the voluntary code of conduct: 1) How is the regulatory regime of FRT emerging in the U.S.? 2) What are the roles of the various stakeholders in shaping the commercial regulation of FRT? 3) How does FRT challenge our current conceptions of privacy? Data was gathered to answer these questions using participant observation and semi-structured interviews. The data was analyzed via mediated discourse analysis. Findings of the research include: the highly sensitive nature of the biometric data that facial recognition technology collects, the data’s ability to be linked across multiple databases, the surreptitious way the data can be collected, the potential chilling effect the technology can have on the First Amendment, and the various threats the technology poses to privacy. Keywords: Privacy, Facial Recognition Technology, Multistakeholder, and Biometric Data

UND Scholarly Commons (University of North Dakota)

Design and semantics of form and movement : DeSForM 2007

Publication venue: Koninklijke Philips Electronics
Publication date: 01/01/2008

A strong theme that has emerged in our previous two conferences is the importance of narrative to the process of generating, developing and communicating new modalities of interaction between people, things and environments. Our research has identified aspects of importance in the design and has begun to establish orders of priority of approach and representation for these aspects as components of interaction. We have begun to grapple with the growth in the complexity of the interaction design process for truly ‘animated’ functionality in products, especially where this manifests itself as apparent behavioural characteristics resident in or portrayed by products. The findings and experience of researchers are that this increase in complexity is likely to be exponential compared to the rigours relating to the resolution of static physical product configuration or even system-operated products with screen-based interfaces. The emerging sense is that narrative in the process is essential to bring meaning and to ‘touch’ our humanity or connect with human experience. ‘The science of the artificial in conversation with the poetics of human experience’! Through this conference we will once again engage in presentations, debate and demonstrations on these issues. In this respect we, the conference co-chairs, have sought to bring together researchers from academia, industry and professional design practice and related disciplines connected with interactive product, service and system development to share our latest thinking in the field, to assess its outcomes and to identify further research questions, opportunities and territories for future investigation and exploration.

Pure OAI Repository
