
4 research outputs found

Corpora compilation for prosody-informed speech processing

Author: Bonafonte Antonio
Farrús Mireia
Öktem Alp
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2021

Research on speech technologies requires spoken data, which is usually obtained from recordings of read speech and adapted specifically to the research needs. When the aim is to deal with the prosody involved in speech, the available data must reflect natural and conversational speech, which is usually costly and difficult to obtain. This paper presents a machine learning-oriented toolkit for collecting, handling, and visualizing speech data using prosodic heuristics. We present two corpora resulting from these methodologies: the PANTED corpus, containing 250 h of English speech from TED Talks, and the Heroes corpus, containing 8 h of parallel English and Spanish movie speech. We demonstrate their use in two deep learning-based applications: punctuation restoration and machine translation. The presented corpora are freely available to the research community.

UPF Digital Repository

Diposit Digital de la Universitat de Barcelona

Visualizing punctuation restoration in speech transcripts with prosograph

Author: Bonafonte Antonio
Farrús Mireia
Öktem Alp
Publication venue: International Speech Communication Association (ISCA)
Publication date: 01/01/2018

Paper presented at: Interspeech 2018, held 2-6 September 2018 in Hyderabad, India. We have developed a neural architecture that tests the effect of lexical, morphosyntactic and prosodic features on restoring punctuation in speech transcriptions. Having outperformed a baseline model in terms of precision and recall, we further extend our performance tests by integrating it into a speech recognition pipeline. The visual and interactive testing environment that we prepared helps us observe how our models generalize to unseen data and plan our next steps for improvement. The first author received the Maria de Maeztu Reproducibility Award from the Department of Information and Communication Technologies of Universitat Pompeu Fabra in 2018 through the presentation of this work. The second author is funded by the Spanish Ministry of Economy, Industry and Competitiveness through the Ramón y Cajal program.

UPCommons. Portal del coneixement obert de la UPC

UPF Digital Repository

Visualizing punctuation restoration in speech transcripts with prosograph

Author: Bonafonte Antonio
Farrús Mireia
Öktem Alp
Publication venue: International Speech Communication Association (ISCA)

RECERCAT

Visualizing punctuation restoration in speech transcripts with prosograph

Author: Bonafonte Cávez Antonio
Farrús M.
Oktem A.
Publication venue: International Speech Communication Association (ISCA)

We have developed a neural architecture that tests the effect of lexical, morphosyntactic and prosodic features on restoring punctuation in speech transcriptions. Having outperformed a baseline model in terms of precision and recall, we further extend our performance tests by integrating it into a speech recognition pipeline. The visual and interactive testing environment that we prepared helps us observe how our models generalize to unseen data and plan our next steps for improvement. Peer Reviewed

RECERCAT