
170 research outputs found

New time-frequency derived cepstral coefficients for automatic speech recognition

Author: Chollet Gérard
Wassner Hubert
Publication date: 10/03/2006

The goal is to improve the recognition rate by optimising Mel Frequency Cepstral Coefficients (MFCCs): the modifications concern the time-frequency representation used to estimate these coefficients. There are many ways to obtain a spectrum from a signal, which differ in the method itself (Fourier, wavelets, ...) and in the normalisation. We show here that we can obtain noise-resistant cepstral coefficients for speaker-independent connected word recognition. The recognition system is based on a continuous whole-word hidden Markov model. An error reduction rate of approximately 50% is achieved. Moreover, evaluation tests demonstrate that these results can be obtained with smaller databases: halving the training database has only a small effect on recognition rates (which is not the case with traditional MFCCs).

Infoscience - École polytechnique fédérale de Lausanne

Quantification de séquences spectrales de longueurs variables pour le codage de la parole à très bas débit

Author: BAUDOIN Geneviève
CERNOCKY Jan
CHOLLET Gérard
Publication venue: GRETSI, Groupe d’Etudes du Traitement du Signal et des Images
Publication date: 01/01/1997

This paper addresses the coding of spectral parameters for very low bit rate speech coding. We present a new interpretation of research previously published by Chou-Lockabaugh and Cernocky-Baudoin-Chollet on the quantization of variable-length spectral sequences, under the respective names "Variable to Variable length Vector Quantization" (VVVQ) and multigram quantization (MGQ). We also studied the influence of limiting the delay introduced by the method and propose a technique to optimize performance when a maximum delay is imposed. We found that a delay of 400 ms is generally sufficient. Finally, we propose introducing long sequences into the codebook by linear interpolation of short sequences.

I-Revues

Home monitoring for frailty detection through sound and speaker diarization analysis

Author: Boudy Jérôme
Boutamine Sami
Chollet Gérard
Istrate Dan
Petitpont Frédéric
Tevissen Yannis
Zalc Vincent
Publication date: 17/08/2023

As the French, European and worldwide populations are aging, there is strong interest in new systems that guarantee reliable and privacy-preserving home monitoring for frailty prevention. This work is part of a global environmental audio analysis system that aims to help identify Activities of Daily Life (ADL) through recognition of human and everyday life sounds, speech presence detection, and detection of the number of speakers. The focus here is on detecting the number of speakers. In this article, we present how recent advances in sound processing and speaker diarization can improve existing embedded systems. We study the performance of two new methods and discuss the benefits of DNN-based approaches, which improve performance by about 100%.
Comment: JETSAN, Jun 2023, Aubervilliers & Paris, France

arXiv.org e-Print Archive

Combining methods to improve speaker verification decision

Author: Bimbot Frédéric
Chollet Gérard
Genoud Dominique
Gravier Guillaume
Publication venue: IDIAP
Publication date: 10/03/2006

The aim of this paper is to describe how the combination of speaker verification algorithms with a priori decision thresholds can improve the overall robustness of a real application. The evaluation is performed in the context of a field application where each client is verified from a 7-digit PIN code. This paper demonstrates that it is possible to increase the overall performance of the system by combining the results of several algorithms.

Infoscience - École polytechnique fédérale de Lausanne

Towards a Practical Silent Speech Interface Based on Vocal Tract Imaging

Author: Cai Jun
Chollet Gérard
Crevier-Buchman Lise
Denby Bruce
Dreyfus Gérard
Hueber Thomas
Manitsaris Sotiris
Pillot-Loiseau Claire
Roussel Pierre
Stone Maureen
Publication venue: HAL CCSD
Publication date: 20/07/2011

The full proceedings of this conference are available at the following link: http://www.issp2011.uqam.ca/upload/files/proceedings.pdf
The paper describes advances in the development of an ultrasound silent speech interface for use in silent communications applications or as a speaking aid for persons who have undergone a laryngectomy. It reports some first steps towards making such a device lightweight, portable, interactive, and practical to use. Simple experimental tests of an interactive silent speech interface for everyday applications are described. Possible future improvements, including extension to continuous speech and real-time operation, are discussed.

Hal - Université Grenoble Alpes

Secured vocal access to telephone servers

Author: Bornet Olivier
Chollet Gérard
Cochard Jean-Luc
Constantinescu Andrei
Genoud Dominique
Publication venue: IDIAP / CNRS
Publication date: 10/03/2006

A number of applications of man-machine interaction over the telephone require a combination of speech recognition and speaker verification. This paper describes current work carried out at IDIAP in the framework of national and European projects. A generic Interactive Voice Server (IVS) is described by means of a graphical formalism. It includes speech recognition based on speaker-independent flexible vocabulary technology, and speaker verification performed by a number of techniques executed in parallel and combined for an optimal decision.

Infoscience - École polytechnique fédérale de Lausanne

Swiss French PolyPhone and PolyVar: telephone speech databases to model inter- and intra-speaker variability

Author: Chollet Gérard
Cochard Jean-Luc
Constantinescu Andrei
Jaboulet Cédric
Langlais Philippe
Publication venue: IDIAP
Publication date: 10/03/2006

Following the demand of the speech technology market, a number of companies and research laboratories joined forces to produce valuable and reusable resources, especially speech databases. The collected databases are used for developing, testing, enhancing and evaluating speech technology products such as interactive voice servers, listening typewriters, and speaker verification and identification systems. The PolyVar database was designed and recorded at IDIAP specifically to capture intra-speaker variability, as a complement to the Swiss French PolyPhone database, which addresses inter-speaker variability. We detail the specific problems of speech database collection (sampling the speaker population, selection of vocabulary items, ...) and present the actual development carried out at IDIAP through the PolyPhone and PolyVar databases.

Infoscience - École polytechnique fédérale de Lausanne

Ageing in the 21st century in Europe: social challenges and innovation opportunities to support elderly independency and wellbeing.

Author: Chollet Gérard
Cordasco Gennaro
Esposito Anna
Fernández Ruanova Begoña
González Fraile Eduardo
Korsnes Maria S.
Tenorio Laranga Joffre
Torres Barañano María Inés
Publication date: 15/03/2021

This chapter describes the vision of the EMPATHIC project regarding the social challenges and innovation opportunities involved in supporting elderly independency and well-being. As an introduction, we first identify the main challenges that result from the current demographic status in Europe. Then, we show the vision and approach proposed by the EMPATHIC project to deal with some of these challenges. The next section develops the main concepts, goals and outcomes of the EMPATHIC project. The following section reports the impact of the project, and the final section describes the exploitation of the results as well as the concluding remarks.
The research leading to the results in this paper has been conducted in the project EMPATHIC (Grant N: 769872), which received funding from the European Union’s Horizon 2020 research and innovation programme.

Archivo Digital para la Docencia y la Investigación
