
    Development of a speech recognition system for Spanish broadcast news

    This paper reports on the development process of a speech recognition system for Spanish broadcast news within the MESH FP6 project. The system uses the SONIC recognizer developed at the Center for Spoken Language Research (CSLR), University of Colorado. Acoustic and language models were trained using Hub4 broadcast news data. Experiments and evaluation results are reported

    SCOLA: What it is and How to Get it


    An Illustrated Methodology for Evaluating ASR Systems

    Proceedings of: 9th International Workshop on Adaptive Multimedia Retrieval (AMR 2011), held 18-19 July 2011 in Barcelona, Spain. The event Web site is http://stel.ub.edu/amr2011/. Automatic speech recognition technology can be integrated into an information retrieval process to allow searching of multimedia contents. However, in order to ensure adequate retrieval performance, it is necessary to establish the quality of the recognition phase, especially in speaker-independent and domain-independent environments. This paper introduces a methodology for evaluating different speech recognition systems in several scenarios, considering also the creation of new corpora of different types (broadcast news, interviews, etc.), especially in languages other than English that are not widely addressed in the speech research community. This work has been partially supported by the Spanish Center for Industry Technological Development (CDTI, Ministry of Industry, Tourism and Trade) through the BUSCAMEDIA Project (CEN-20091026), and also by MA2VICMR: Improving the access, analysis and visibility of the multilingual and multimedia information in web for the Region of Madrid (S2009/TIC-1542).
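Evaluation methodologies such as the one described above typically center on word error rate (WER), computed by aligning the recognizer's hypothesis against a reference transcript. As a rough illustration (not code from the paper; the function name is hypothetical), WER can be computed with a word-level edit distance:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / N,
    via a word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)
```

For example, recognizing "the cat sat on mat" against the reference "the cat sat on the mat" yields one deletion over six reference words, i.e. a WER of 1/6.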

    Investigating cross-language speech retrieval for a spontaneous conversational speech collection

    Cross-language retrieval of spontaneous speech combines the challenges of working with noisy automated transcription and language translation. The CLEF 2005 Cross-Language Speech Retrieval (CL-SR) task provides a standard test collection to investigate these challenges. We show that we can improve retrieval performance: by careful selection of the term weighting scheme; by decomposing automated transcripts into phonetic substrings to help ameliorate transcription errors; and by combining automatic transcriptions with manually-assigned metadata. We further show that topic translation with online machine translation resources yields effective CL-SR
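One of the techniques mentioned above, decomposing automated transcripts into phonetic substrings, indexes overlapping fragments so that a partly misrecognized word can still share terms with the query. A minimal sketch of the idea, using character n-grams as a stand-in for phone sequences (the helper name and parameters are illustrative, not from the paper):

```python
def ngram_decompose(tokens, n=4):
    """Decompose each token into overlapping n-grams so that a partly
    misrecognized word still shares index terms with the query."""
    terms = []
    for tok in tokens:
        if len(tok) <= n:
            terms.append(tok)  # short tokens are kept whole
        else:
            terms.extend(tok[i:i + n] for i in range(len(tok) - n + 1))
    return terms
```

A transcription error that corrupts only part of a long word leaves most of its n-grams intact, so term-matching degrades gracefully rather than failing outright.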

    Albayzin 2018 Evaluation: The IberSpeech-RTVE Challenge on Speech Technologies for Spanish Broadcast Media

    The IberSpeech-RTVE Challenge presented at IberSpeech 2018 is a new Albayzin evaluation series supported by the Spanish Thematic Network on Speech Technologies (Red Temática en Tecnologías del Habla (RTTH)). The series focused on speech-to-text transcription, speaker diarization, and multimodal diarization of television programs. For this purpose, the Corporación Radio Televisión Española (RTVE), the main public service broadcaster in Spain, and the RTVE Chair at the University of Zaragoza made more than 500 h of broadcast content and subtitles available to scientists. The dataset included about 20 programs of different kinds and topics produced and broadcast by RTVE between 2015 and 2018. The programs presented different challenges from the point of view of speech technologies, such as the diversity of Spanish accents, overlapping speech, spontaneous speech, acoustic variability, background noise, and specific vocabulary. This paper describes the database and the evaluation process and summarizes the results obtained

    Multimode delivery in the classroom

    Because of recent technological advances, subtitling is now easier and more versatile than in the past. There is increasing interest in the use of digitally-recorded audiovisual materials with both soundtrack and subtitles in the same language as a language-learning aid. The full potential of this is not currently attained because of poor-quality subtitling and the use of "caption" or "synopsis" subtitles where "transcription" subtitles would be more appropriate. An adaptation of a format successful over two decades in Europe might be of value for South-East Asian language learners

    Beyond English text: Multilingual and multimedia information retrieval.


    Archaeologies of Sound: Reconstructing Louis MacNeice’s Wartime Radio Publics

    This article approaches the problem of reconstructing the culturally situated audience experience of radio programming through the example of Louis MacNeice's wartime radio broadcasts, notably "Alexander Nevsky" and "Christopher Columbus". The article draws on audience research reports, internal correspondence, and close analysis of the broadcasts themselves in order to triangulate a listening experience that, though it ultimately cannot be recovered, can be better understood through its proximate cultural traces

    An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies

    Evaluation campaigns provide a common framework with which the progress of speech technologies can be effectively measured. The aim of this paper is to present a detailed overview of the IberSpeech-RTVE 2022 Challenges, which were organized as part of the IberSpeech 2022 conference under the ongoing series of Albayzin evaluation campaigns. In the 2022 edition, four challenges were launched: (1) speech-to-text transcription; (2) speaker diarization and identity assignment; (3) text and speech alignment; and (4) search on speech. Different databases that cover different domains (e.g., broadcast news, conference talks, parliament sessions) were released for those challenges. The submitted systems also cover a wide range of speech processing methods, which include hidden Markov model-based approaches, end-to-end neural network-based methods, hybrid approaches, etc. This paper describes the databases, the tasks and the performance metrics used in the four challenges. It also provides the most relevant features of the submitted systems and briefly presents and discusses the obtained results. Despite employing state-of-the-art technology, the relatively poor performance attained in some of the challenges reveals that there is still room for improvement. 
This encourages us to carry on with the Albayzin evaluation campaigns in the coming years. This work was partially supported by Radio Televisión Española through the RTVE Chair at the University of Zaragoza, and by the Red Temática en Tecnologías del Habla (RED2022-134270-T), funded by AEI (Ministerio de Ciencia e Innovación). It was also partially funded by the European Union's Horizon 2020 research and innovation program under Marie Skłodowska-Curie Grant 101007666; in part by MCIN/AEI/10.13039/501100011033 and by the European Union "NextGenerationEU"/PRTR under Grants PDC2021-120846C41 and PID2021-126061OB-C44; and in part by the Government of Aragon (Grant Group T3623R). It was also partially funded by the Spanish Ministry of Science and Innovation (OPEN-SPEECH project, PID2019-106424RB-I00), by the Basque Government under the general support program for research groups (IT-1704-22), and by projects RTI2018-098091-B-I00 and PID2021-125943OB-I00 (Spanish Ministry of Science and Innovation and ERDF) as well

    Sub-Sync: automatic synchronization of subtitles in the broadcasting of true live programs in Spanish

    Individuals with sensory impairment (hearing or visual) encounter serious communication barriers within society and the world around them. These barriers hinder the communication process and make access to information an obstacle they must overcome on a daily basis. In this context, one of the most common complaints made by television (TV) users with sensory impairment is the lack of synchronism between audio and subtitles in some types of programs. In addition, synchronization remains one of the most significant factors in audience perception of quality in live-originated TV subtitles for the deaf and hard of hearing. This paper introduces the Sub-Sync framework, intended for the automatic synchronization of audiovisual contents and subtitles, taking advantage of current well-known techniques for symbol sequence alignment. In this particular case, the symbol sequences are the subtitles produced by the broadcaster's subtitling system and the word flow generated by an automatic speech recognition procedure. The goal of Sub-Sync is to address the lack of synchronism that occurs in subtitles produced during the broadcast of live TV programs or other programs that have some improvised parts. Furthermore, it also aims to resolve the problematic transition between synchronized and unsynchronized parts of mixed-type programs. In addition, the framework is able to synchronize the subtitles even when they do not correspond literally to the original audio and/or the audio cannot be completely transcribed by an automatic process. Sub-Sync has been successfully tested in different live broadcasts, including mixed programs, in which the synchronized parts (recorded, scripted) are interspersed with desynchronized (improvised) ones
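The core idea described in the abstract, aligning the subtitle word sequence against the ASR word flow and borrowing timestamps from matched words, can be sketched with a standard-library sequence matcher. This is only an illustration of symbol-sequence alignment under assumed inputs (the function name, argument names, and data layout are hypothetical, and the paper's actual alignment technique may differ):

```python
import difflib

def align_subtitles(subtitle_words, asr_words, asr_times):
    """Align broadcaster subtitle words against the ASR word flow and
    borrow timestamps from matched ASR words.

    subtitle_words: list of words from the subtitling system
    asr_words:      list of words from the speech recognizer
    asr_times:      per-word timestamps (seconds), parallel to asr_words
    Returns a dict mapping subtitle word index -> timestamp.
    """
    sm = difflib.SequenceMatcher(a=subtitle_words, b=asr_words, autojunk=False)
    timed = {}
    for block in sm.get_matching_blocks():
        # Each matching block pairs block.size consecutive words.
        for k in range(block.size):
            timed[block.a + k] = asr_times[block.b + k]
    return timed
```

Words the recognizer missed (or that the subtitler paraphrased) simply receive no timestamp here; in practice their timing would be interpolated from neighboring matches, which is consistent with the abstract's claim that alignment works even when subtitles are not literal transcriptions.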