T-Norm and Lexical and Acoustic Mismatch in Text-Dependent Speaker Recognition
Actas de las V Jornadas en Tecnología del Habla (JTH 2008). This paper presents an extensive study of T-norm applied to text-dependent speaker recognition, also analyzing the problems of lexical and acoustic mismatch. We examine how results vary when gender dependence is taken into account and when T-norm is applied at the sentence, phoneme, and state levels with impostor cohorts of different sizes. The study shows that applying T-norm per phoneme or per state can achieve relative improvements of up to 16%, and that gender-based cohort selection can improve results further with respect to the gender-independent case.
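The T-norm technique discussed above shifts and scales a raw verification score using statistics gathered from an impostor cohort. As a minimal sketch (the function name and toy scores are illustrative, not from the paper):

```python
import numpy as np

def t_norm(raw_score, cohort_scores):
    """T-normalize a verification score against an impostor cohort.

    raw_score: score of the test utterance against the claimed speaker model.
    cohort_scores: scores of the same utterance against each impostor
    model in the cohort.
    """
    cohort = np.asarray(cohort_scores, dtype=float)
    # Center on the cohort mean and scale by the cohort standard deviation.
    return (raw_score - cohort.mean()) / cohort.std()

# Toy example: a raw score of 2.0 against a small cohort.
normed = t_norm(2.0, [0.0, 0.0, 2.0, 2.0])
```

Gender-dependent cohort selection, as studied in the paper, simply means choosing `cohort_scores` from impostor models of the claimed speaker's gender.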
Speaker verification using sequence discriminant support vector machines
This paper presents a text-independent speaker verification system using support vector machines (SVMs) with score-space kernels. Score-space kernels generalize Fisher kernels and are based on underlying generative models such as Gaussian mixture models (GMMs). This approach provides direct discrimination between whole sequences, in contrast with the frame-level approaches at the heart of most current systems. The resultant SVMs have a very high dimensionality, since the dimension of the score space is related to the number of parameters in the underlying generative model. To address the problems that arise in the resulting optimization, we introduce a technique called spherical normalization that preconditions the Hessian matrix. We have performed speaker verification experiments using the PolyVar database. The SVM system presented here reduces relative error rates by 34% compared to a GMM likelihood ratio system.
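A rough sketch of the score-space idea: each variable-length frame sequence is mapped to a fixed-length vector of gradients of its log-likelihood under a background GMM (here only the gradient with respect to the means, a simplified Fisher expansion), and an SVM is trained on those vectors. The toy data, the unit-norm step standing in for spherical normalization, and all names are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy frame sequences for two "speakers" (each row of a sequence is a frame).
seqs = [rng.normal(loc=m, scale=1.0, size=(50, 2)) for m in (0.0, 0.0, 2.0, 2.0)]
labels = [0, 0, 1, 1]

# Background GMM trained on all frames (the underlying generative model).
ubm = GaussianMixture(n_components=4, covariance_type="diag",
                      random_state=0).fit(np.vstack(seqs))

def fisher_vector(frames, gmm):
    """Gradient of the sequence log-likelihood w.r.t. the GMM means
    (a simplified score-space expansion)."""
    post = gmm.predict_proba(frames)                    # (T, K) responsibilities
    diff = frames[:, None, :] - gmm.means_[None, :, :]  # (T, K, D)
    grad = (post[:, :, None] * diff / gmm.covariances_[None, :, :]).sum(axis=0)
    return grad.ravel() / len(frames)

X = np.array([fisher_vector(s, ubm) for s in seqs])
# Crude stand-in for spherical normalization: project each expansion
# onto the unit sphere to condition the kernel.
X /= np.linalg.norm(X, axis=1, keepdims=True)

clf = SVC(kernel="linear").fit(X, labels)
```

The key point is that the SVM now discriminates between whole sequences, since each training vector summarizes an entire sequence rather than a single frame.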
Phoneme and Sub-Phoneme T-Normalization for Text-Dependent Speaker Recognition
Test normalization (T-Norm) is a score normalization technique that is regularly and successfully applied in the context of text-independent speaker recognition. It is less frequently applied, however, to text-dependent or text-prompted speaker recognition, mainly because its improvement in this context is more modest. In this paper we present a novel way to improve the performance of T-Norm for text-dependent systems. It consists in applying score T-normalization at the phoneme or sub-phoneme level instead of at the sentence level. Experiments on the YOHO corpus show that, while standard sentence-level T-Norm does not improve equal error rate (EER), phoneme- and sub-phoneme-level T-Norm produce relative EER reductions of 18.9% and 20.1% respectively on a state-of-the-art HMM-based text-dependent speaker recognition system. Results are even better at operating points with low false acceptance rates.
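The phoneme-level variant described above normalizes each phoneme's score with that phoneme's own impostor cohort statistics before combining them. A minimal sketch, assuming per-phoneme scores are already available and that the combined score is a simple average (the function and data names are hypothetical):

```python
import numpy as np

def phoneme_t_norm(phone_scores, cohort_phone_scores):
    """T-normalize each phoneme score with that phoneme's impostor
    cohort statistics, then average the normalized scores.

    phone_scores: dict phoneme -> raw score for the test utterance.
    cohort_phone_scores: dict phoneme -> list of impostor cohort scores
    for the same phoneme.
    """
    normed = []
    for ph, s in phone_scores.items():
        cohort = np.asarray(cohort_phone_scores[ph], dtype=float)
        normed.append((s - cohort.mean()) / cohort.std())
    return float(np.mean(normed))

score = phoneme_t_norm(
    {"ah": 1.5, "k": 0.5},
    {"ah": [0.0, 1.0, 0.5, 1.5], "k": [-1.0, 0.0, 1.0, 0.0]},
)
```

Sentence-level T-Norm would instead pool everything into a single score before normalizing; the per-phoneme version lets each phoneme be calibrated against its own impostor distribution.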
Enabling Massive Deep Neural Networks with the GraphBLAS
Deep Neural Networks (DNNs) have emerged as a core tool for machine learning.
The computations performed during DNN training and inference are dominated by
operations on the weight matrices describing the DNN. As DNNs incorporate more
stages and more nodes per stage, these weight matrices may be required to be
sparse because of memory limitations. The GraphBLAS.org math library standard
was developed to provide high performance manipulation of sparse weight
matrices and input/output vectors. For sufficiently sparse matrices, a sparse
matrix library requires significantly less memory than the corresponding dense
matrix implementation. This paper provides a brief description of the
mathematics underlying the GraphBLAS. In addition, the equations of a typical
DNN are rewritten in a form designed to use the GraphBLAS. An implementation of
the DNN is given using a preliminary GraphBLAS C library. The performance of
the GraphBLAS implementation is measured relative to a standard dense linear
algebra library implementation. For various sizes of DNN weight matrices, it is
shown that the GraphBLAS sparse implementation outperforms a BLAS dense
implementation as the weight matrix becomes sparser. Comment: 10 pages, 7 figures, to appear in the 2017 IEEE High Performance
Extreme Computing (HPEC) conference
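The DNN equations the paper rewrites for the GraphBLAS reduce, per layer, to a sparse matrix-vector product followed by a nonlinearity. A sketch of that idea using `scipy.sparse` in place of the GraphBLAS C library (the ReLU nonlinearity and the 1% density are assumptions for illustration):

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)

def sparse_layer(W, y, b):
    """One DNN layer, y' = ReLU(W y + b), where W is a sparse weight
    matrix, mirroring the sparse products the GraphBLAS provides."""
    z = W @ y + b
    return np.maximum(z, 0.0)

# A 1000x1000 weight matrix with ~1% nonzeros stored in CSR form;
# a dense representation would hold all 10^6 entries.
W = sp.random(1000, 1000, density=0.01, format="csr", random_state=0)
y = rng.random(1000)
b = -0.3 * np.ones(1000)  # bias term

out = sparse_layer(W, y, b)
```

The memory argument in the abstract follows directly: the CSR matrix stores only its nonzeros (plus index arrays), so for sufficiently sparse weights it is far smaller than the dense equivalent, at the cost of indexed rather than contiguous access.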
AI Enabled Maneuver Identification via the Maneuver Identification Challenge
Artificial intelligence (AI) has enormous potential to improve Air Force
pilot training by providing actionable feedback to pilot trainees on the
quality of their maneuvers and enabling instructor-less flying familiarization
for early-stage trainees in low-cost simulators. Historically, AI challenges
consisting of data, problem descriptions, and example code have been critical
to fueling AI breakthroughs. The Department of the Air Force-Massachusetts
Institute of Technology AI Accelerator (DAF-MIT AI Accelerator) developed such
an AI challenge using real-world Air Force flight simulator data. The Maneuver
ID challenge assembled thousands of virtual reality simulator flight recordings
collected by actual Air Force student pilots at Pilot Training Next (PTN). This
dataset has been publicly released at Maneuver-ID.mit.edu and represents the
first of its kind public release of USAF flight training data. Using this
dataset, we have applied a variety of AI methods to separate "good" vs "bad"
simulator data and categorize and characterize maneuvers. These data,
algorithms, and software are being released as baselines of model performance
for others to build upon to enable the AI ecosystem for flight simulator
training. Comment: 10 pages, 7 figures, 4 tables, accepted to and presented at I/ITSE