10 research outputs found

    Error Correction based on Error Signatures applied to automatic speech recognition

    Get PDF

    Automating Behavioral Testing in Machine Translation

    Full text link
    Behavioral testing in NLP allows fine-grained evaluation of systems by examining their linguistic capabilities through the analysis of input-output behavior. Unfortunately, existing work on behavioral testing in Machine Translation (MT) is currently restricted to largely handcrafted tests covering a limited range of capabilities and languages. To address this limitation, we propose to use Large Language Models (LLMs) to generate a diverse set of source sentences tailored to test the behavior of MT models in a range of situations. We can then verify whether the MT model exhibits the expected behavior by matching its output against candidate sets that are also generated using LLMs. Our approach aims to make behavioral testing of MT systems practical while requiring only minimal human effort. In our experiments, we apply the proposed evaluation framework to assess multiple available MT systems, revealing that, while pass rates generally follow the trends observable from traditional accuracy-based metrics, our method is able to uncover several important differences and potential bugs that go unnoticed when relying on accuracy alone.
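    The pipeline sketched in this abstract (an LLM generates test sources, the MT system translates them, and an LLM-generated candidate set decides pass/fail) can be outlined in a few lines. The sketch below is an illustration only: query_llm and translate are hypothetical callables, and the simplified substring match stands in for whatever matching procedure the paper actually uses.

        # Illustrative sketch of LLM-driven behavioral testing for MT.
        # query_llm and translate are hypothetical stand-ins, not the paper's code.
        def behavioral_pass_rate(capability, n_cases, query_llm, translate):
            """Generate test sources with an LLM, translate them, and check each
            hypothesis against an LLM-generated set of acceptable candidates."""
            passed = 0
            for _ in range(n_cases):
                # 1. LLM writes a source sentence targeting one capability
                #    (e.g., negation, named entities, numbers).
                source = query_llm(f"Write one sentence that tests: {capability}")
                # 2. LLM lists acceptable target-side realizations.
                candidates = query_llm(f"List acceptable translations of: {source}")
                # 3. The MT system under test translates the source.
                hypothesis = translate(source)
                # 4. Pass if the hypothesis contains any acceptable candidate.
                if any(c.strip() and c.strip() in hypothesis
                       for c in candidates.splitlines()):
                    passed += 1
            return passed / n_cases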

    Towards Real-World Streaming Speech Translation for Code-Switched Speech

    Full text link
    Code-switching (CS), i.e., mixing different languages in a single sentence, is a common phenomenon in communication and can be challenging in many Natural Language Processing (NLP) settings. Previous studies on CS speech have shown promising results for end-to-end speech translation (ST), but have been limited to offline scenarios and to translation into one of the languages present in the source (monolingual transcription). In this paper, we focus on two essential yet unexplored areas for real-world CS speech translation: streaming settings, and translation into a third language (i.e., a language not included in the source). To this end, we extend the Fisher and Miami test and validation datasets to include new targets in Spanish and German. Using this data, we train a model for both offline and streaming ST and establish baseline results for both settings.
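    The streaming setting means the model must begin emitting target text before the source utterance has finished. The abstract does not say which streaming policy the paper uses; purely as an illustration, a wait-k style read/write loop (a common choice for streaming ST, assumed here rather than taken from the paper) looks like this, with encode_chunk and decode_step as hypothetical model hooks.

        # Minimal wait-k style streaming loop (illustration only).
        def streaming_translate(audio_chunks, encode_chunk, decode_step, k=3):
            """Emit target tokens while source audio is still arriving:
            wait for k encoded chunks, then alternate reading and writing."""
            encoder_states, output = [], []
            for chunk in audio_chunks:            # READ: consume one audio chunk
                encoder_states.append(encode_chunk(chunk))
                if len(encoder_states) >= k:      # WRITE: emit one token per chunk
                    token = decode_step(encoder_states, output)
                    if token is not None:
                        output.append(token)
            while (token := decode_step(encoder_states, output)) is not None:
                output.append(token)              # flush once the source has ended
            return output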

    Integration of Language Identification into a Recognition System for Spoken Conversations Containing Code-Switches

    Get PDF
    This paper describes the integration of language identification (LID) into a multilingual automatic speech recognition (ASR) system for spoken conversations containing code-switches between Mandarin and English. We apply a multistream approach to combine, at the frame level, the acoustic model score and the language information, where the latter is provided by an LID component. Furthermore, we advance this multistream approach with a new method called "Language Lookahead", in which the language information of subsequent frames is used to improve accuracy. Both methods are evaluated using a set of controlled LID results with varying frame accuracies. Our results show that both approaches improve the ASR performance by at least 4% relative if the LID achieves a minimum frame accuracy of 85%.
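    At its core, the multistream combination is a weighted fusion of two per-frame scores, and Language Lookahead additionally consults LID evidence from upcoming frames. The snippet below is a minimal sketch of that idea; the stream weight and lookahead window are illustrative assumptions, not the paper's settings.

        # Sketch of frame-level multistream fusion of acoustic and LID scores.
        # w_lid and window are illustrative values, not the paper's.
        def combined_score(am_logprob, lid_logprob, w_lid=0.3):
            """Log-linear combination of the acoustic and LID streams."""
            return (1.0 - w_lid) * am_logprob + w_lid * lid_logprob

        def lookahead_lid(lid_logprobs, t, window=5):
            """'Language Lookahead': average LID evidence over the current
            frame and the next `window` frames."""
            future = lid_logprobs[t : t + window + 1]
            return sum(future) / len(future)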

    Brain-to-text: Decoding spoken phrases from phone representations in the brain

    Get PDF
    It has long been speculated whether communication between humans and machines based on cortical activity related to natural speech is possible. Over the past decade, studies have suggested that it is feasible to recognize isolated aspects of speech from neural signals, such as auditory features, phones, or one of a few isolated words. However, until now it remained an unsolved challenge to decode continuously spoken speech from the neural substrate associated with speech and language processing. Here, we show for the first time that continuously spoken speech can be decoded into the expressed words from intracranial electrocorticographic (ECoG) recordings. Specifically, we implemented a system, which we call Brain-to-Text, that models single phones, employs techniques from automatic speech recognition (ASR), and thereby transforms brain activity during speaking into the corresponding textual representation. Our results demonstrate that our system can achieve word error rates as low as 25% and phone error rates below 50%. Additionally, our approach contributes to the current understanding of the neural basis of continuous speech production by identifying those cortical regions that hold substantial information about individual phones. In conclusion, the Brain-to-Text system described in this paper represents an important step toward human-machine communication based on imagined speech.
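    For reference, the word error rate reported above is the standard ASR metric: the minimum number of substitutions, insertions, and deletions needed to turn the hypothesis into the reference, divided by the reference length. A minimal Levenshtein-based computation (generic, not the paper's evaluation code):

        # Standard word error rate via Levenshtein alignment.
        def word_error_rate(reference, hypothesis):
            ref, hyp = reference.split(), hypothesis.split()
            # dp[i][j]: edit distance between ref[:i] and hyp[:j]
            dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
            for i in range(len(ref) + 1):
                dp[i][0] = i
            for j in range(len(hyp) + 1):
                dp[0][j] = j
            for i in range(1, len(ref) + 1):
                for j in range(1, len(hyp) + 1):
                    sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
                    dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
            return dp[len(ref)][len(hyp)] / max(len(ref), 1)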

    BioKIT - Real-time decoder for biosignal processing

    Get PDF
    We introduce BioKIT, a new Hidden Markov Model based toolkit to preprocess, model and interpret biosignals such as speech, motion, muscle and brain activities. The focus of this toolkit is to enable researchers from various communities to pursue their experiments and integrate real-time biosignal interpretation into their applications. BioKIT boasts a flexible two-layer structure with a modular C++ core that interfaces with a Python scripting layer, to facilitate development of new applications. BioKIT employs sequence-level parallelization and memory sharing across threads. Additionally, a fully integrated error blaming component facilitates in-depth analysis. A generic terminology keeps the barrier to entry for researchers from multiple fields to a minimum. We describe our online-capable dynamic decoder and report on initial experiments on three different tasks. The presented speech recognition experiments employ deep neural networks trained with Kaldi [1], with the results set in relation to the real-time factor needed to obtain them.
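    Sequence-level parallelization means whole recordings are decoded independently in parallel, so no locking is needed inside the decoder's inner loops. The sketch below illustrates that pattern generically; it is not BioKIT's actual API, and where BioKIT's C++ core shares model memory across threads, this illustration uses Python worker processes instead.

        from multiprocessing import Pool

        # Generic illustration of sequence-level parallelization;
        # decode_sequence is a placeholder, not a BioKIT function.
        def decode_sequence(sequence):
            """Stand-in for decoding one biosignal recording."""
            return f"hypothesis for {sequence!r}"

        def decode_corpus(sequences, n_workers=4):
            # Each worker handles complete sequences, so the decoder's
            # inner loops need no synchronization.
            with Pool(processes=n_workers) as pool:
                return pool.map(decode_sequence, sequences)

        if __name__ == "__main__":
            print(decode_corpus(["utt1.wav", "utt2.wav", "utt3.wav"]))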

    A first speech recognition system for Mandarin-English code-switch conversational speech

    No full text
    This paper presents first steps toward a large vocabulary continuous speech recognition (LVCSR) system for conversational Mandarin-English code-switching (CS) speech. We applied state-of-the-art techniques such as speaker adaptive and discriminative training to build the first baseline system on the SEAME corpus [1] (South East Asia Mandarin-English). For acoustic modeling, we applied different phone merging approaches based on the International Phonetic Alphabet (IPA) and the Bhattacharyya distance, in combination with discriminative training, to improve accuracy. At the language model level, we investigated statistical machine translation (SMT) based text generation approaches for building code-switching language models. Furthermore, we integrated the information provided by a language identification (LID) system into the decoding process by using a multi-stream approach. Our best 2-pass system achieves a Mixed Error Rate (MER) of 36.6% on the SEAME development set.
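    The Bhattacharyya-distance phone merging can be made concrete: for Gaussian phone models the distance has a closed form, and phone pairs whose models fall below a threshold become merge candidates. The sketch below uses the standard closed-form expression for two Gaussians; the threshold value and the assumption of single-Gaussian phone models are illustrative, not the paper's configuration.

        import numpy as np

        # Closed-form Bhattacharyya distance between two Gaussian phone models.
        def bhattacharyya(mu1, cov1, mu2, cov2):
            cov = 0.5 * (cov1 + cov2)
            diff = mu1 - mu2
            term1 = 0.125 * diff @ np.linalg.solve(cov, diff)
            _, logdet = np.linalg.slogdet(cov)
            _, logdet1 = np.linalg.slogdet(cov1)
            _, logdet2 = np.linalg.slogdet(cov2)
            term2 = 0.5 * (logdet - 0.5 * (logdet1 + logdet2))
            return term1 + term2

        def merge_candidates(phone_models, threshold=0.5):
            """Phone pairs whose Gaussian models are close enough to merge.
            phone_models maps a phone name to a (mean, covariance) pair."""
            names = sorted(phone_models)
            return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
                    if bhattacharyya(*phone_models[a], *phone_models[b]) < threshold]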
