Von Mises-Fisher models in the total variability subspace for language recognition

Author: González Domínguez Javier
González-Rodríguez Joaquín
López Moreno Ignacio
Ramos Daniel
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2011

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblos-e Archivo

ATVS-UAM NIST LRE 2009 System Description

Author: Franco-Pedroso Javier
González-Domínguez Joaquín
González-Rodríguez Joaquín
López-Moreno Ignacio
Ramos Daniel
Toledano Doroteo T.
Publication venue: National Institute of Standards and Technology (NIST)
Publication date: 01/01/2009

Official contribution of the National Institute of Standards and Technology; not subject to copyright in the United States. ATVS-UAM submits a fast, light and efficient single system. The use of a task-adapted nonspeech-recognition-based VAD (apart from NIST conversation labels) and gender-dependent total variability compensation technology allows our submitted system to obtain excellent development results with SRE08 data with exceptional computational efficiency. In order to test the VAD influence on the evaluation results, a contrastive equivalent system has been submitted, changing only the ATVS VAD labels for the ones publicly contributed by BUT. In all contributed systems, two gender-independent calibrations have been trained, with telephone-only and mic (either mic-tel, tel-mic or mic-mic) data respectively. The submitted systems have been designed for English speech in an application-independent way, all results being interpretable in the form of calibrated likelihood ratios to be properly evaluated with Cllr. Sample development results with English SRE08 data are 0.53% (male) and 1.11% (female) EER in tel-tel data (optimistic, as all English speakers in SRE08 are included in total variability matrices), going up to 3.5% (tel-tel) to 5.1% EER (tel-mic) in pessimistic cross-validation experiments (25% of test speakers totally excluded from development data in each xval set). The submitted system is extremely light in computational resources, running 77 times faster than real time. Moreover, once VAD and feature extraction are performed (the heaviest components of our system), training and testing are performed at 5300 and 2950 times faster than real time, respectively.
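The abstract above reports results as calibrated likelihood ratios evaluated with Cllr. As a rough illustration of that metric only, here is a minimal sketch of the standard log-likelihood-ratio cost. It assumes the scores are already calibrated natural-log likelihood ratios; the function name `cllr` and the sample scores are hypothetical and are not taken from the ATVS-UAM system.

```python
import math

def cllr(target_llrs, nontarget_llrs):
    """Log-likelihood-ratio cost (in bits) from natural-log likelihood ratios.

    Assumption: inputs are calibrated natural-log LRs, one per trial.
    """
    if not target_llrs or not nontarget_llrs:
        raise ValueError("need at least one target and one non-target score")
    # Target trials are penalized when their LLR is low.
    tar = sum(math.log2(1.0 + math.exp(-s)) for s in target_llrs) / len(target_llrs)
    # Non-target trials are penalized when their LLR is high.
    non = sum(math.log2(1.0 + math.exp(s)) for s in nontarget_llrs) / len(nontarget_llrs)
    return 0.5 * (tar + non)

# Hypothetical scores: an uninformative system outputs LLR = 0 for every trial.
uninformative = cllr([0.0, 0.0], [0.0])
# Hypothetical scores from a system that separates the two trial types.
informative = cllr([4.0, 3.0], [-5.0, -2.5])
```

A system that always outputs a log-likelihood ratio of 0 scores exactly 1 bit, which is the usual reference point: values below 1 indicate that the scores carry useful, reasonably calibrated information.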

Biblos-e Archivo

Frame-by-frame language identification in short utterances using deep neural networks

Author: González Domínguez Javier
González-Rodríguez Joaquín
López-Moreno Ignacio
Moreno Pedro J.
Publication venue: Elsevier BV
Publication date: 01/04/2017

This is the author’s version of a work that was accepted for publication in Neural Networks. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Neural Networks, vol. 64 (2015), DOI 10.1016/j.neunet.2014.08.006. This work addresses the use of deep neural networks (DNNs) in automatic language identification (LID) focused on short test utterances. Motivated by their recent success in acoustic modelling for speech recognition, we adapt DNNs to the problem of identifying the language in a given utterance from the short-term acoustic features. We show how DNNs are particularly suitable to perform LID in real-time applications, due to their capacity to emit a language identification posterior at each new frame of the test utterance. We then analyse different aspects of the system, such as the amount of required training data, the number of hidden layers, the relevance of contextual information and the effect of the test utterance duration. Finally, we propose several methods to combine frame-by-frame posteriors. Experiments are conducted on two different datasets: the public NIST Language Recognition Evaluation 2009 (3 s task) and a much larger corpus (of 5 million utterances) known as Google 5M LID, obtained from different Google Services. Reported results show relative improvements of DNNs versus the i-vector system of 40% in the LRE09 3 s task and 76% in Google 5M LID.
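The abstract above mentions combining frame-by-frame posteriors into an utterance-level decision but does not say how. One common option is to average the per-frame log posteriors and renormalize; the following is a minimal sketch of that option only, not necessarily one of the methods proposed in the paper. The function names, the `eps` floor, and the toy posteriors are all hypothetical.

```python
import math

def combine_frame_posteriors(frame_posteriors, eps=1e-12):
    """Average per-frame log posteriors, then renormalize across languages.

    Assumption: each frame is a list of language posteriors in the same order.
    """
    if not frame_posteriors:
        raise ValueError("need at least one frame")
    n_langs = len(frame_posteriors[0])
    log_sums = [0.0] * n_langs
    for frame in frame_posteriors:
        for k, p in enumerate(frame):
            # Floor the posterior so that log(0) cannot occur.
            log_sums[k] += math.log(max(p, eps))
    avg = [s / len(frame_posteriors) for s in log_sums]
    # Softmax over the averaged log posteriors (shifted for stability).
    top = max(avg)
    exps = [math.exp(a - top) for a in avg]
    total = sum(exps)
    return [e / total for e in exps]

def running_decisions(frame_posteriors):
    """Return the best language index after each new frame (streaming use)."""
    decisions = []
    for t in range(1, len(frame_posteriors) + 1):
        scores = combine_frame_posteriors(frame_posteriors[:t])
        decisions.append(max(range(len(scores)), key=scores.__getitem__))
    return decisions

# Toy posteriors for two languages over three frames (made-up numbers).
frames = [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
utterance_scores = combine_frame_posteriors(frames)
decisions = running_decisions(frames)
```

Because a score is available after every frame, the decision can be updated as audio arrives, which is the property the abstract highlights for real-time use.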

Biblos-e Archivo

Automatic language identification using deep neural networks

Author: González-Domínguez Javier
González-Rodríguez Joaquín
López-Moreno Ignacio
Martínez David R.
Oldrich Plchot
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2014

Crossref

Biblos-e Archivo

Coupled-oscillator model to analyze the interaction between a quartz resonator and trapped ions

Author: Altozano Ruiz Emilio
Berrocal Sánchez Joaquín
Domínguez González Francisco
Rodríguez Rubiales Daniel
Publication venue: American Physical Society
Publication date: 30/05/2023

The novel application of a piezoelectric quartz resonator for the detection of trapped ions has enabled the observation of the quartz-ions interaction under nonequilibrium conditions, opening new perspectives for highly sensitive motional frequency measurements of radioactive particles. Energized quartz crystals have (long) decay-time constants on the order of milliseconds, permitting the coherent detection of charged particles within short time scales. In this paper we develop a detailed model governing the interaction between trapped 40Ca+ ions and a quartz resonator connected to a low-noise amplifier. We apply this model to experimental data and extract the ions’ reduced-cyclotron frequency in our 7-T Penning trap setup. We also obtain an upper limit for the coupling constant g with the present quartz-amplifier-trap (QAT) configuration. The study of the reduced-cyclotron frequency is especially important for the use of this resonator in precision Penning-trap mass spectrometry. The improvement in sensitivity can be accomplished by increasing the quality factor of the QAT configuration, which in turn will improve the performance of the system towards the strong-coupling regime.

Repositorio Institucional Universidad de Granada

On the use of high-level information in speaker and language recognition

Author: González Domínguez Javier
González-Rodríguez Joaquín
López Moreno Ignacio
Montero-Asenjo Alberto
Ramos Daniel
Toledano Doroteo T.
Publication venue
Publication date: 01/01/2006

Actas de las IV Jornadas de Tecnología del Habla (JTH 2006). Automatic Speaker Recognition systems have been largely dominated by acoustic-spectral based systems, relying on proper modelling of the short-term vocal tract of speakers. However, there is scientific and intuitive evidence that speaker-specific information is embedded in the speech signal in multiple short- and long-term characteristics. In this work, a multilevel speaker recognition system combining acoustic, phonotactic and prosodic subsystems is presented and assessed using NIST 2005 Speaker Recognition Evaluation data. For language recognition, the NIST 2005 Language Recognition Evaluation was selected to measure the performance of high-level language recognition systems.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblos-e Archivo

Genetic study of the hepcidin gene (HAMP) promoter and functional analysis of the c.-582A > G variant

Author: Campos Joaquín
Domínguez Fernando
González-Quintela Arturo
Loidi Lourdes
Parajes Silvia
Quinteiro Celsa
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Hepcidin acts as the main regulator of iron homeostasis through regulation of intestinal absorption and macrophage release. Hepcidin deficiency causes iron overload whereas its overproduction is associated with anaemia of chronic diseases. The aims of the study were: to identify genetic variants in the hepcidin gene (HAMP) promoter, to assess the associations between the variants found and iron status parameters, and to functionally study the role on HAMP expression of the most frequent variant. Results: The sequencing of the HAMP promoter from 103 healthy individuals revealed two genetic variants: the c.-153C > T with a frequency of 0.014 for allele T, which is known to reduce hepcidin expression, and the c.-582A > G with a 0.218 frequency for allele G. In an additional group of 224 individuals, the c.-582A > G variant genotype showed no association with serum iron, transferrin or ferritin levels. The c.-582G HAMP promoter variant decreased the transcriptional activity by 20% compared to the c.-582A variant in cells from the human hepatoma cell line HepG2 when cotransfected with luciferase reporter constructs and plasmid expressing upstream stimulatory factor 1 (USF1), and by 12-14% when cotransfected with plasmid expressing upstream stimulatory factor 2 (USF2). Conclusions: The c.-582A > G HAMP promoter variant is not associated with serum iron, transferrin or ferritin levels in the healthy population. The in vitro effect of the c.-582A > G variant resulted in a small reduction of the gene transactivation by allele G compared to allele A. Therefore the effect of the variant on the hepcidin levels in vivo would likely be negligible. Finally, the c.-153C > T variant showed a frequency high enough to be considered when a genetic analysis is done in iron overload patients. This work was supported by a grant from the Fondo de Investigaciones Sanitarias del Instituto de Salud Carlos III (PI052249 to LL) and Xunta de Galicia (PGIDIT06PXIC9101136PN).

Crossref

Springer - Publisher Connector

PubMed Central

Repositorio Institucional da Universidade de Santiago de Compostela

Multilevel and session variability compensated language recognition: ATVS-UAM systems at NIST LRE 2009

Author: Franco-Pedroso Javier
González Domínguez Javier
González-Rodríguez Joaquín
López Moreno Ignacio
Ramos Daniel
Toledano Doroteo T.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. J. Gonzalez-Dominguez, I. Lopez-Moreno, J. Franco-Pedroso, D. Ramos, D. T. Toledano, and J. Gonzalez-Rodriguez, "Multilevel and Session Variability Compensated Language Recognition: ATVS-UAM Systems at NIST LRE 2009," IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 6, pp. 1084–1093, December 2010. This work presents the systems submitted by the ATVS Biometric Recognition Group to the 2009 Language Recognition Evaluation (LRE’09), organized by NIST. New challenges included in this LRE edition can be summarized by three main differences with respect to past evaluations. Firstly, the number of languages to be recognized expanded to 23 languages from 14 in 2007, and 7 in 2005. Secondly, the data variability has been increased by including telephone speech excerpts extracted from Voice of America (VOA) radio broadcasts through Internet in addition to Conversational Telephone Speech (CTS). The third difference was the volume of data, involving in this evaluation up to 2 terabytes of speech data for development, which is an order of magnitude greater than past evaluations. LRE’09 thus required participants to develop robust systems able not only to successfully face the session variability problem but also to do it with reasonable computational resources. ATVS participation consisted of state-of-the-art acoustic and high-level systems focussing on these issues. Furthermore, the problem of finding a proper combination and calibration of the information obtained at different levels of the speech signal was widely explored in this submission.
In this work, two original contributions were developed. The first contribution was applying a session variability compensation scheme based on Factor Analysis (FA) within the statistics domain into a SVM-supervector (SVM-SV) approach. The second contribution was the employment of a novel backend based on anchor models in order to fuse individual systems prior to one-vs-all calibration via logistic regression. Results both in development and evaluation corpora show the robustness and excellent performance of the submitted systems, exemplified by our system ranked 2nd in the 30 second open-set condition, with remarkably scarce computational resources. This work has been supported by the Spanish Ministry of Education under project TEC2006-13170-C02-01. Javier Gonzalez-Dominguez also thanks the Spanish Ministry of Education for supporting his doctoral research under project TEC2006-13141-C03-03. Special thanks are given to Dr. David Van Leeuwen from TNO Human Factors (Utrecht, The Netherlands) for his strong collaboration, valuable discussions and ideas. The authors also thank Dr. Patrick Lucey for his final support on (non-target) Australian English review of the manuscript.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblos-e Archivo

Assessment of air management strategies to improve the transient response of advanced gasoline engines operating under high EGR conditions

Author: Climent H.
De La Morena Joaquín
Galindo José
González-Domínguez David
Guilain Stéphane
Publication venue: Elsevier
Publication date: 01/01/2023

Advanced gasoline engines may lead the medium-term future of the passenger vehicle market, working in conventional and hybrid powertrains. Downsizing with turbocharging is the most widespread way to improve fuel economy in gasoline engines. It is also proven that exhaust gas recirculation (EGR) reduces fuel consumption, but extracting the maximum benefit from EGR requires operating with high EGR rates. This fact can compromise the transient engine operation due to the greater turbocharger dependence. This research evaluates the EGR influence on the transient response of a turbocharged gasoline engine and, mainly, the potential of three air management strategies to accelerate that response. Tip-in maneuvers at 1500 rpm (6-12 bar BMEP) were tested and simulated to this end. The three strategies are: reducing the EGR dilution by closing the EGR valve simultaneously with the throttle opening, using a pressurized air tank (PAT), and installing an electric supercharger at the compressor outlet in series. Engine tests show that the torque response time with EGR is 2 s slower than without EGR. 1D modeling results reveal that the PAT connected to the intake manifold provides the fastest response, and the electric supercharger guarantees an excellent tradeoff between fuel consumption and torque response. Galindo, J.; Climent, H.; De La Morena, J.; González-Domínguez, D.; Guilain, S. (2023). Assessment of air management strategies to improve the transient response of advanced gasoline engines operating under high EGR conditions. Energy. 262. https://doi.org/10.1016/j.energy.2022.125586

RiuNet

A linguistically-motivated speaker recognition front-end through session variability compensated cepstral trajectories in phone units

Author: Franco-Pedroso Javier
González Domínguez Javier
González-Rodríguez Joaquín
Ramos Daniel
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2012

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Biblos-e Archivo