
12 research outputs found

Analysis of Large-Scale SVM Training Algorithms for Language and Speaker Recognition

Author: Cumani Sandro
Laface Pietro
Publication venue: Piscataway, N.J. : IEEE
Publication date: 01/01/2012

This paper compares a set of large-scale support vector machine (SVM) training algorithms for language and speaker recognition tasks. We analyze five approaches for training phonetic and acoustic SVM models for language recognition. We compare the performance of these approaches as a function of the training time each requires to reach convergence, and we discuss their scalability towards large corpora. Two of these algorithms can be used in speaker recognition to train an SVM that classifies pairs of utterances as belonging either to the same speaker or to two different speakers. Our results show that the accuracy of these algorithms is asymptotically equivalent, but that they behave differently with respect to the time required to converge. Some of these algorithms not only scale linearly with the training set size, but also give their best results after just a few iterations. State-of-the-art performance has been obtained on the female subset of the NIST 2010 Speaker Recognition Evaluation extended core test using a single SVM system.

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Nouveaux paradigmes en traitement de la parole et de ses troubles [New paradigms in the processing of speech and its disorders]

Author: Daoudi Khalid
Publication venue: HAL CCSD
Publication date: 17/03/2021

INRIA a CCSD electronic archive server

A study into automatic speaker verification with aspects of deep learning

Author: Jellyman Keith Andrew
Publication date: 01/07/2018

Advancements in automatic speaker verification (ASV) can be considered to be primarily limited to improvements in modelling and classification techniques capable of capturing ever larger amounts of speech data. This thesis begins by presenting a fairly extensive review of developments in ASV, up to the current state of the art with i-vectors and PLDA. A series of practical tuning experiments then follows. It is found, somewhat surprisingly, that even the training of the total variability matrix required for i-vector extraction is potentially susceptible to unwanted variabilities. The thesis then explores the use of deep learning in ASV. A literature review is first made, with two training methodologies appearing evident: indirectly, using a deep neural network trained for automatic speech recognition, and directly, with speaker-related output classes. The review finds that interest in direct training appears to be increasing, underpinned by the intent to discover new robust 'speaker embedding' representations. Last, a preliminary experiment is presented, investigating the use of a deep convolutional network for speaker identification. The small set of results shows that the network successfully identifies two test speakers out of 84 possible enrolled speakers. It is hoped that subsequent research might lead to new robust speaker representations or features.

University of Birmingham Research Archive, E-theses Repository

XVII. Magyar Számítógépes Nyelvészeti Konferencia [17th Hungarian Conference on Computational Linguistics]

Publication date: 01/01/2021

University of Szeged

Robust gesture recognition

Author: Cheng You-Chi
Publication venue: Georgia Institute of Technology
Publication date: 08/06/2015

Making a general hand gesture recognition system work in a practical operating environment is a challenging problem. This study focuses mainly on recognizing English letters and digits performed near the steering wheel of a car and captured by a video camera. Like most human-computer interaction (HCI) scenarios, in-car gesture recognition suffers from various robustness issues, including multiple human factors and highly varying lighting conditions. It therefore raises quite a few research issues to be addressed. First, multiple gesturing alternatives may share the same meaning, which is not typical in most previous systems. Next, gestures may differ from what is expected because users cannot see exactly what has been written, which increases gesture diversity significantly. In addition, varying illumination conditions make hand detection non-trivial and thus result in noisy hand gestures. Most severely, users tend to perform letters at a fast pace, which may leave too few frames to describe the gestures well. Since users are allowed to perform gestures free-style, multiple alternatives and variations should be considered when modeling gestures. The main contribution of this work is to analyze and address these challenging issues step by step so that the robustness of the whole system is effectively improved. By choosing a color-space representation and applying compensation techniques for varying recording conditions, hand detection performance under multiple illumination conditions is first enhanced. Furthermore, the issues of low frame rate and different gesturing tempo are resolved separately via cubic B-spline interpolation and the i-vector method for feature extraction. Finally, the remaining issues are handled by other modeling techniques such as sub-letter stroke modeling.
According to experimental results based on the above strategies, the proposed framework clearly improved system robustness and thus encourages future research on more discriminative features and modeling techniques.
Ph.D.

Scholarly Materials And Research @ Georgia Tech

XVII. Magyar Számítógépes Nyelvészeti Konferencia [17th Hungarian Conference on Computational Linguistics]

Author: Berend Gábor
Gosztolya Gábor
Vincze Veronika
Publication venue: Szegedi Tudományegyetem, Informatikai Intézet
Publication date: 01/01/2021

SZTE Publicatio Repozitórium - SZTE - Repository of Publications

EVALUATION OF SCIENTIFIC EVIDENCE : A PROPOSAL ON ONTOLOGICAL AND EPISTEMOLOGICAL BASES, AND SOME STATISTICAL APPLICATIONS

Author: Lucena Molina José Juan
Publication venue: Université de Lausanne, Faculté de droit, des sciences criminelles et d'administration publique
Publication date: 01/01/2017

Serveur académique lausannois

Proceedings of the Sixteenth Australasian International Conference on Speech Science and Technology

Publication venue: ASSTA
Publication date: 31/12/2016

UCL Discovery

Preface

Author: Pape-Haugaard Louise B.
Scott Philip
Publication venue: IOS Press
Publication date: 16/06/2020

Portsmouth University Research Portal (Pure)

XXV Congreso Argentino de Ciencias de la Computación - CACIC 2019: libro de actas [25th Argentine Congress of Computer Science - CACIC 2019: proceedings]

Author: Arroyo Marcelo
Pesado Patricia Mabel
Publication venue: UniRío Editora
Publication date: 06/03/2020

Papers presented at the XXV Congreso Argentino de Ciencias de la Computación (CACIC), held in the city of Río Cuarto from 14 to 18 October 2019 and organized by the Red de Universidades con Carreras en Informática (RedUNCI) and the Facultad de Ciencias Exactas, Físico-Químicas y Naturales, Universidad Nacional de Río Cuarto.

Servicio de Difusión de la Creación Intelectual