31 research outputs found

Multilingual audio information management system based on semantic knowledge in complex environments

Author: Barroso Moreno Nora
Calvo Salomón Pilar María
Ezeiza Ramos Aitzol
Fernández Gómez de Segura Elsa
Hernández Gómez María del Carmen
López de Ipiña Peña Miren Karmele
Susperregui Aseguinolaza Unai
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

This paper proposes a multilingual audio information management system based on semantic knowledge in complex environments. The complex environment is defined by the limited resources (financial, material, human, and audio resources); the poor quality of the audio signal taken from an internet radio channel; the multilingual context (Spanish, French, and Basque, which is in an under-resourced situation in some areas); and the regular appearance of cross-lingual elements between the three languages. In addition, the system is constrained by the requirements of the local multilingual industrial sector. We present the first evolutionary system based on a scalable architecture that is able to fulfill these specifications with automatic adaptation based on automatic semantic speech recognition, folksonomies, automatic configuration selection, machine learning, neural computing methodologies, and collaborative networks. As a result, the initial goals have been accomplished and the usability of the final application has been tested successfully, even with non-experienced users. This work is being funded by Grants: TEC201677791-C4 from Plan Nacional de I + D + i, Ministry of Economic Affairs and Competitiveness of Spain and from the DomusVi Foundation Kms para recorder, the Basque Government (ELKARTEK KK-2018/00114, GEJ IT1189-19), the Government of Gipuzkoa (DG18/14 DG17/16), UPV/EHU (GIU19/090), COST ACTION (CA18106, CA15225).

Archivo Digital para la Docencia y la Investigación

Multidialectal Spanish acoustic modeling for speech recognition

Author: Albino Nogueiras
Asunción Moreno
Ferreiros
Gibbon
Heeringa
Imperl
Kirchhoff
Köhler
Lipski
Mónica Caballero
Schultz
Publication venue: 'Elsevier BV'

Crossref

Multilingual audio information management system based on semantic knowledge in complex environments

Author: Barroso Nora
Calvo Pilar M
Ezeiza Aitzol
Fernández Elsa
Hernandez Carmen
Lopez-de-Ipina Karmele
Susperregi Unai
Publication venue: 'Organisation for Economic Co-Operation and Development (OECD)'
Publication date: 02/02/2021

This paper proposes a multilingual audio information management system based on semantic knowledge in complex environments. The complex environment is defined by the limited resources (financial, material, human, and audio resources); the poor quality of the audio signal taken from an internet radio channel; the multilingual context (Spanish, French, and Basque, which is in an under-resourced situation in some areas); and the regular appearance of cross-lingual elements between the three languages. In addition, the system is constrained by the requirements of the local multilingual industrial sector. We present the first evolutionary system based on a scalable architecture that is able to fulfill these specifications with automatic adaptation based on automatic semantic speech recognition, folksonomies, automatic configuration selection, machine learning, neural computing methodologies, and collaborative networks. As a result, the initial goals have been accomplished and the usability of the final application has been tested successfully, even with non-experienced users.

Apollo (Cambridge)

Multilingual audio information management system based on semantic knowledge in complex environments

Author: Barroso Nora
Calvo Pilar M
Ezeiza Aitzol
Fernández Elsa
Hernandez Carmen
Lopez-de-Ipina Karmele
Susperregi Unai
Publication venue: Neural Computing and Applications
Publication date: 03/12/2020

Apollo (Cambridge)

Multidialectal acoustic modeling: a comparative study

Author: Caballero Galeote Mónica
Moreno Bilbao M. Asunción
Nogueiras Rodríguez Albino
Publication date: 01/01/2006

In this paper, multidialectal acoustic modeling based on sharing data across dialects is addressed. A comparative study of different methods of combining data based on decision tree clustering algorithms is presented. The approaches differ in the way they evaluate the similarity of sounds between dialects and in the decision tree structure applied. The proposed systems are tested with Spanish dialects across Spain and Latin America. All proposed multidialectal systems improve on monodialectal performance by using data from another dialect, but it is shown that the way the data is shared is critical. The best combination of similarity measure and tree structure achieves an improvement of 7% over the results obtained with monodialectal systems. Peer reviewed. Postprint (published version).

UPCommons. Portal del coneixement obert de la UPC

Speech Recognition

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation for speech signals and the methods for speech-features extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes

Directory of Open Access Books (DOAB)

Automatic Understanding of ATC Speech: Study of Prospectives and Field Experiments for Several Controller Positions

Author: Córdoba Herralde Ricardo de
D'haro Enríquez Luis Fernando
Fernández Martínez Fernando
Ferreiros López Javier
González Germán
Macías Guarasa Javier
Montero Martínez Juan Manuel
Pardo Muñoz José Manuel
Sama Valentin
San Segundo Hernández Rubén
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

Although there has been a lot of interest in recognizing and understanding air traffic control (ATC) speech, none of the published works have obtained detailed field data results. We have developed a system able to identify the language spoken and recognize and understand sentences in both Spanish and English. We also present field results for several in-tower controller positions. To the best of our knowledge, this is the first time that field ATC speech (not simulated) is captured, processed, and analyzed. The use of stochastic grammars allows variations in the standard phraseology that appear in field data. The robust understanding algorithm developed has 95% concept accuracy from ATC text input. It also allows changes in the presentation order of the concepts and the correction of errors created by the speech recognition engine improving it by 17% and 25%, respectively, absolute in the percentage of fully correctly understood sentences for English and Spanish in relation to the percentages of fully correctly recognized sentences. The analysis of errors due to the spontaneity of the speech and its comparison to read speech is also carried out. A 96% word accuracy for read speech is reduced to 86% word accuracy for field ATC data for Spanish for the "clearances" task confirming that field data is needed to estimate the performance of a system. A literature review and a critical discussion on the possibilities of speech recognition and understanding technology applied to ATC speech are also given

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

Adaptation of voice sever to automotive environment

Author: Salinas Vila David
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2009

This project is part of a research project named "Movilidad y Automoción para Redes de Transporte Avanzados" (MARTA). Its fundamental strategic goal is to consolidate the scientific and technological basis of 21st-century mobility so that the Spanish ITS ("Intelligent Transport Systems") sector can answer the challenges of efficiency, sustainability, etc., which European society, and especially Spanish society, has to confront in the coming years. In this project Telefónica I+D (TID) is in charge of the study, specification and implementation of speech technology in the automotive environment, considering vehicle usability conditions. The student's work in this project is to adapt a voice server, which contains speech tools, to the automotive environment: adding new libraries that provide new functions, and extending and developing the XML communication needed to use these new functions.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

DEVELOPMENT AND EVALUATION OF THE ATOS SPONTANEOUS SPEECH CONVERSATIONAL SYSTEM

Author: C Crespo
D Tapias
F Martínez
I Cortazar
J Álvarez
Publication date: 12/02/2020

In this paper we report our recent development work in Spanish spontaneous speech conversational systems. We describe the Automatic Telephone Operator Service (ATOS) and present the improvements introduced into it to deal with spontaneous speech, which are: (a) a task-independent dialogue manager that can be adapted to a new semantic domain by changing a configuration file, and that also generates a prediction about the user's expected utterance to constrain the language model used by the speech recognizer; (b) a language modeling strategy that allows the statistical language model to be adapted to a new task with just a few hundred sentences. This strategy reduces the word error rate by 27%. We also report the results, conclusions and the speech database collected in the evaluation of the ATOS system, which has been tested by 30 real users.

CiteSeerX

Modularity and Neural Integration in Large-Vocabulary Continuous Speech Recognition

Author: Kilgour Kevin
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2015

This thesis tackles the problems of modularity in large-vocabulary continuous speech recognition using neural networks.

KITopen