4 research outputs found

FT Speech: Danish Parliament Speech Corpus

Author: Kirkedal Andreas Søeborg
Plank Barbara
Stepanovic Marija
Publication venue: 'International Speech Communication Association'
Publication date: 01/01/2020

This paper introduces FT Speech, a new speech corpus created from the recorded meetings of the Danish Parliament, otherwise known as the Folketing (FT). The corpus contains over 1,800 hours of transcribed speech by a total of 434 speakers. It is significantly larger in duration, vocabulary, and amount of spontaneous speech than the existing public speech corpora for Danish, which are largely limited to read-aloud and dictation data. We outline design considerations, including the preprocessing methods and the alignment procedure. To evaluate the quality of the corpus, we train automatic speech recognition systems on the new resource and compare them to the systems trained on the Danish part of Språkbanken, the largest public ASR corpus for Danish to date. Our baseline results show that we achieve a 14.01 WER on the new corpus. A combination of FT Speech with in-domain language data provides comparable results to models trained specifically on Språkbanken, showing that FT Speech transfers well to this data set. Interestingly, our results demonstrate that the opposite is not the case. This shows that FT Speech provides a valuable resource for promoting research on Danish ASR with more spontaneous speech. Comment: Submitted to Interspeech 202

arXiv.org e-Print Archive

The IT University of Copenhagen's Repository

The Lacunae of Danish Natural Language Processing

Author: Derczynski Leon
Kirkedal Andreas Søeborg
Plank Barbara
Schluter Natalie
Publication date: 01/01/2019

The IT University of Copenhagen's Repository

Analysis of phonetic transcriptions for Danish automatic speech recognition

Author: Andreas Søeborg Kirkedal
Publication date: 06/03/2020

Automatic speech recognition (ASR) relies on three resources: audio, orthographic transcriptions and a pronunciation dictionary. The dictionary or lexicon maps orthographic words to sequences of phones or phonemes that represent the pronunciation of the corresponding word. The quality of a speech recognition system depends heavily on the dictionary and the transcriptions therein. This paper presents an analysis of phonetic/phonemic features that are salient for current Danish ASR systems. This preliminary study consists of a series of experiments using an ASR system trained on the DK-PAROLE corpus. The analysis indicates that transcribing features such as stress or vowel duration has a negative impact on performance. The best performance is obtained with coarse phonetic annotation, which improves performance by 1% word error rate and 3.8% sentence error rate.

CiteSeerX

The Danish Gigaword Corpus

Author: Baglini Rebekah
Christiansen Morten
Ciosici Manuel Rafael
Dalsgaard Jacob Aarup
Derczynski Leon
Fusaroli Riccardo
Henrichsen Peter Juel
Hvingelby Rasmus
Kirkedal Andreas Søeborg
Kjeldsen Alex Speed
Ladefoged Claus
Madsen Jens
Nielsen Finn Årup
Petersen Malte Lau
Rystrøm Jonathan Hvithamar
Varab Daniel
Publication date: 01/01/2021

Copenhagen University Research Information System

The IT University of Copenhagen's Repository