Speech technologies for the audiovisual and multimedia interaction environments

Author: Alvarez Muniain Aitor
Publication date: 22/07/2016

361 p

Archivo Digital para la Docencia y la Investigación

Semisupervised Speech Data Extraction from Basque Parliament Sessions and Validation on Fully Bilingual Basque–Spanish ASR

Author: Bordel García German
Peñagarikano Badiola Mikel
Rodríguez Fuentes Luis Javier
Varona Fernández Amparo
Publication venue: MDPI
Publication date: 28/07/2023

In this paper, a semisupervised speech data extraction method is presented and applied to create a new dataset designed for the development of fully bilingual Automatic Speech Recognition (ASR) systems for Basque and Spanish. The dataset is drawn from an extensive collection of Basque Parliament plenary sessions containing frequent code-switching. Since session minutes are not exact, only the most reliable speech segments are kept for training. To that end, we use phonetic similarity scores between nominal and recognized phone sequences. The process starts with baseline acoustic models trained on generic out-of-domain data, then iteratively updates the models with the extracted data and applies the updated models to refine the training dataset until the observed improvement between two iterations becomes small enough. A development dataset, comprising five plenary sessions not used for training, has been manually audited for tuning and evaluation purposes. Cross-validation experiments (with 20 random partitions) have been carried out on the development dataset, using the baseline and the iteratively updated models. On average, Word Error Rate (WER) falls from 16.57% (baseline) to 4.41% (first iteration) and further to 4.02% (second iteration), which corresponds to relative WER reductions of 73.4% and 8.8%, respectively. When considering only Basque segments, WER falls on average from 16.57% (baseline) to 5.51% (first iteration) and further to 5.13% (second iteration), which corresponds to relative WER reductions of 66.7% and 6.9%, respectively.
As a result of this work, a new bilingual Basque–Spanish resource has been produced based on Basque Parliament sessions, including 998 h of training data (audio segments + transcriptions), a development set (17 h long) designed for tuning and evaluation under a cross-validation scheme, and a fully bilingual trigram language model. This work was partially funded by the Spanish Ministry of Science and Innovation (OPEN-SPEECH project, PID2019-106424RB-I00) and by the Basque Government under the general support program to research groups (IT-1704-22).
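
The relative WER reductions quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `relative_reduction` is an assumption made for this example and does not come from the paper, and the only data used are the WER percentages stated in the abstract.

```python
def relative_reduction(before: float, after: float) -> float:
    """Relative reduction, in percent, when an error rate drops from `before` to `after`."""
    return 100.0 * (before - after) / before


# WER values (in percent) as quoted in the abstract:
# baseline, first iteration, second iteration.
overall = [16.57, 4.41, 4.02]
basque_only = [16.57, 5.51, 5.13]

for name, wers in [("overall", overall), ("Basque only", basque_only)]:
    # Compare each stage with the one immediately before it.
    steps = [round(relative_reduction(a, b), 1) for a, b in zip(wers, wers[1:])]
    print(name, steps)
```

Each relative reduction is measured against the preceding stage rather than against the baseline, which is why the second-iteration figures (8.8% and 6.9%) are much smaller than the first-iteration ones (73.4% and 66.7%).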

Archivo Digital para la Docencia y la Investigación

On the use of high-level information in speaker and language recognition

Author: González Domínguez Javier
González-Rodríguez Joaquín
López Moreno Ignacio
Montero-Asenjo Alberto
Ramos Daniel
Toledano Doroteo T.
Publication date: 01/01/2006

Proceedings of the IV Jornadas de Tecnología del Habla (JTH 2006). Automatic Speaker Recognition systems have been largely dominated by acoustic-spectral based systems, relying on proper modelling of the short-term vocal tract of speakers. However, there is scientific and intuitive evidence that speaker-specific information is embedded in the speech signal in multiple short- and long-term characteristics. In this work, a multilevel speaker recognition system combining acoustic, phonotactic and prosodic subsystems is presented and assessed using NIST 2005 Speaker Recognition Evaluation data. For language recognition, the NIST 2005 Language Recognition Evaluation was selected to measure the performance of high-level language recognition systems.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Biblos-e Archivo

Multilingual audio information management system based on semantic knowledge in complex environments

Author: Barroso Moreno Nora
Calvo Salomón Pilar María
Ezeiza Ramos Aitzol
Fernández Gómez de Segura Elsa
Hernández Gómez María del Carmen
López de Ipiña Peña Miren Karmele
Susperregui Aseguinolaza Unai
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

This paper proposes a multilingual audio information management system based on semantic knowledge in complex environments. The complex environment is defined by the limited resources (financial, material, human, and audio resources); the poor quality of the audio signal taken from an internet radio channel; the multilingual context (Spanish, French, and Basque, the last of which is under-resourced in some areas); and the regular appearance of cross-lingual elements between the three languages. In addition, the system is constrained by the requirements of the local multilingual industrial sector. We present the first evolutionary system based on a scalable architecture that is able to fulfill these specifications with automatic adaptation based on automatic semantic speech recognition, folksonomies, automatic configuration selection, machine learning, neural computing methodologies, and collaborative networks. As a result, the initial goals have been accomplished and the usability of the final application has been tested successfully, even with non-experienced users. This work is being funded by Grants: TEC201677791-C4 from Plan Nacional de I + D + i, Ministry of Economic Affairs and Competitiveness of Spain and from the DomusVi Foundation Kms para recorder, the Basque Government (ELKARTEK KK-2018/00114, GEJ IT1189-19), the Government of Gipuzkoa (DG18/14 DG17/16), UPV/EHU (GIU19/090), COST ACTION (CA18106, CA15225).

Archivo Digital para la Docencia y la Investigación

Review of Research on Speech Technology: Main Contributions From Spanish Research Groups

Author: Martínez Hinarejos Carlos D.
Ortega Alfonso
San Segundo Hernández Rubén
Publication venue: E.T.S.I. Telecomunicación (UPM)
Publication date: 01/01/2011

In the last two decades, there has been an important increase in research on speech technology in Spain, mainly due to a higher level of funding from European, Spanish and local institutions and also due to a growing interest in these technologies for developing new services and applications. This paper provides a review of the main areas of speech technology addressed by research groups in Spain, their main contributions in recent years and their current focus of interest. The description is organized into five main areas: audio processing including speech, speaker characterization, speech and language processing, text-to-speech conversion and spoken language applications. This paper also introduces the Spanish Network of Speech Technologies (RTTH, Red Temática en Tecnologías del Habla) as the research network that includes almost all the researchers working in this area, presenting some figures, its objectives and the main activities it has carried out in recent years.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

RiuNet

Archivo Digital UPM

Exploring Cross-linguistic Effects and Phonetic Interactions in the Context of Bilingualism

Publication venue: 'MDPI AG'
Publication date: 11/01/2022

This Special Issue includes fifteen original state-of-the-art research articles from leading scholars that examine cross-linguistic influence in bilingual speech. These experimental studies contribute to the growing number of studies on multilingual phonetics and phonology by introducing novel empirical data collection techniques, sophisticated methodologies, and acoustic analyses, while also presenting findings that provide robust theoretical implications for a variety of subfields, such as L2 acquisition, L3 acquisition, laboratory phonology, acoustic phonetics, psycholinguistics, sociophonetics, bilingualism, and language contact. The studies in this book further elucidate the nature of phonetic interactions in the context of bilingualism and multilingualism and outline future directions in multilingual phonetics and phonology research.

Directory of Open Access Books (DOAB)

How do Spanish speakers read words? Insights from a crowdsourced lexical decision megastudy

Author: Aguasvivas José Armando
Brysbaert Marc
Carreiras Manuel
Duñabeitia Jon Andoni
Keuleers Emmanuel
Mandera Paweł
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Published online: 18 February 2020. Vocabulary size seems to be affected by multiple factors, including those that belong to the properties of the words themselves and those that relate to the characteristics of the individuals assessing the words. In this study, we present results from a crowdsourced lexical decision megastudy in which more than 150,000 native speakers from around 20 Spanish-speaking countries performed a lexical decision task on 70 target word items selected from a list of about 45,000 Spanish words. We examined how demographic characteristics such as age, education level, and multilingualism affected participants’ vocabulary size. Also, we explored how common factors related to words like frequency, length, and orthographic neighbourhood influenced the knowledge of a particular item. Results indicated important contributions of age to overall vocabulary size, with vocabulary size increasing in a logarithmic fashion with this factor. Furthermore, a contrast between monolingual and bilingual communities within Spain revealed no significant vocabulary size differences between the communities. Additionally, we replicated the standard effects of the words’ properties and their interactions, accurately accounting for the estimated knowledge of a particular word. These results highlight the value of crowdsourced approaches to uncover effects that are traditionally masked by small-sample in-lab factorial experimental designs. This research is supported by the Basque Government through the BERC 2018-2021 program and by the Spanish State Research Agency through BCBL Severo Ochoa excellence accreditation SEV-2015-0490. This study was also partially supported by grants PGC2018-097145-B-I00, RED2018-102615-T, and RTI2018-093547-B-I00 from the Spanish State Research Agency. Work by JA was supported by “la Caixa” Foundation and the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 713673, and fellowship code LCF/BQ/IN17/116200154004. We would also like to thank the reviewers for their insightful comments and efforts towards improving this manuscript.

Ghent University Academic Bibliography

Archivo Digital para la Docencia y la Investigación

Tilburg University Repository

2017-2018 Boise State University Undergraduate Catalog

Author: Boise State University Office of the Registrar
Publication venue: 'IUScholarWorks'
Publication date: 01/04/2017

This catalog is directed primarily at students. However, it serves many audiences, such as high school counselors, academic advisors, and the public. In this catalog you will find an overview of Boise State University and information on admission, registration, grades, tuition and fees, financial aid, housing, student services, and other important policies and procedures. Most of the catalog, however, is devoted to describing the various programs and courses offered at Boise State.

Boise State University - ScholarWorks