32 research outputs found

A Danish phonetically annotated spontaneous speech corpus (DanPASS)

Author: Anderson
Boersma
Brown
Fletcher
Grønnum
Horiuchi
Kohler
Nina Grønnum
Silverman
Swerts
Terken
Publication venue: 'Elsevier BV'

Soft d in Danish: Acoustic characteristics and issues in transcription

Author: Block Aleese
Brotherton Chloe
Publication venue: 'Linguistic Society of America'
Publication date: 15/04/2020

Danish, like the closely related Swedish and Norwegian, is descended from Old Norse (Haugen 1976). While the three contemporary languages are mutually intelligible to varying degrees, Danish has diverged phonologically from the other Scandinavian languages (Gooskens 2006). This divergence is caused by extensive consonant lenition and vowel reduction within Danish (Basbøll 2005). The lenition of and in syllable coda positions into a sound that Danish linguists have called soft-d is seemingly unique to Danish. In most phonological descriptions, it is transcribed using the phonetic symbol /ð/, a voiced interdental fricative. We assert that this is not accurate; not all phonologists agree that the soft-d is a fricative. Some describe it as an alveolar semi-vowel (Haberland 1994), while others transcribe it as a velarized, retracted, and lowered alveolar approximant (Basbøll 2005). Many observe that the sound resembles lateral /l/, a distinct phoneme of Danish (Wells 2010). Through acoustic analysis of tokens taken from the DanPASS corpus (Grønnum 2016), we show that the acoustic properties (HNR) of soft-d are not those of a fricative, but rather those of an approximant or vowel. Therefore, the use of /ð/ to transcribe this sound is inaccurate and does not align with the goals of the International Phonetic Association.

Proceedings Published by the LSA (Linguistic Society of America)

FT Speech: Danish Parliament Speech Corpus

Author: Kirkedal Andreas Søeborg
Plank Barbara
Stepanovic Marija
Publication venue: 'International Speech Communication Association'
Publication date: 01/01/2020

This paper introduces FT Speech, a new speech corpus created from the recorded meetings of the Danish Parliament, otherwise known as the Folketing (FT). The corpus contains over 1,800 hours of transcribed speech by a total of 434 speakers. It is significantly larger in duration, vocabulary, and amount of spontaneous speech than the existing public speech corpora for Danish, which are largely limited to read-aloud and dictation data. We outline design considerations, including the preprocessing methods and the alignment procedure. To evaluate the quality of the corpus, we train automatic speech recognition systems on the new resource and compare them to the systems trained on the Danish part of Språkbanken, the largest public ASR corpus for Danish to date. Our baseline results show that we achieve a 14.01 WER on the new corpus. A combination of FT Speech with in-domain language data provides comparable results to models trained specifically on Språkbanken, showing that FT Speech transfers well to this data set. Interestingly, our results demonstrate that the opposite is not the case. This shows that FT Speech provides a valuable resource for promoting research on Danish ASR with more spontaneous speech.

Comment: Submitted to Interspeech 202

arXiv.org e-Print Archive

The IT University of Copenhagen's Repository

Fishing in a speech stream, angling for a lexicon

Author: Henrichsen Peter Juel
Publication date: 09/05/2011

Proceedings of the 18th Nordic Conference of Computational Linguistics NODALIDA 2011. Editors: Bolette Sandford Pedersen, Gunta Nešpore and Inguna Skadiņa. NEALT Proceedings Series, Vol. 11 (2011), 90-97. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/16955

DSpace at Tartu University Library

The rarity of intervocalic voicing of stops in Danish spontaneous speech

Author: Horslund C.S.
Jørgensen H.
Puggaard-Rode R.
Publication date: 24/05/2022

Previous studies of the phonetics of Danish stops have neglected closure voicing. Danish is an aspiration language, but the aspirated stops /p t k/ are produced with shorter closure duration and less articulatory effort than the unaspirated stops /b d ɡ/. Furthermore, all Danish stops are characterized by some degree of glottal spreading during the closure. In this study, we use a corpus of Danish spontaneous speech (DanPASS) to investigate the intervocalic voicing: its distribution across the two laryngeal categories, whether it patterns as a lenition phenomenon, and whether the aerodynamic environment predicts its distribution. We find that intervocalic voicing is not the norm for either set of stops and is particularly rare in /p t k/. Voiced tokens are mostly found in environments associated with lenition. We suggest that the glottal spreading gesture found in all Danish stops is a phonological mechanism blocking voicing, which is probabilistically lost in spontaneous speech. This predicts our results better than relying on laryngeal features like [voice] or [spread glottis]. The study fills a gap in our knowledge of Danish phonetics and phonology, and is also one of the most extensive corpus studies of intervocalic stop voicing in an ‘aspiration language.’

Theoretical and Experimental Linguistics; Descriptive and Comparative Linguistic

Leiden University Scholarly Publications

Praat for begyndere

Author: Tøndering John
Publication date: 01/01/2010

Copenhagen University Research Information System

Feedback and gestural behaviour in a conversational corpus of Danish

Author: Navarretta Costanza
Paggio Patrizia
Publication date: 01/01/2011

Proceedings of the 3rd Nordic Symposium on Multimodal Communication. Editors: Patrizia Paggio, Elisabeth Ahlsén, Jens Allwood, Kristiina Jokinen, Costanza Navarretta. NEALT Proceedings Series, Vol. 15 (2011), 33–39. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/22532

Copenhagen University Research Information System

DSpace at Tartu University Library

Information based speech transduction

Author: Henrichsen Peter Juel
Publication venue: 2011
Publication date: 29/02/2012

Modern hearing aids use a variety of advanced digital signal processing methods in order to improve speech intelligibility. These methods are based on knowledge about the acoustics outside the ear as well as psychoacoustics. We present a novel observation based on the fact that acoustic prominence is not equal to information prominence for time intervals at the syllabic and sub-syllabic levels. The idea is that speech elements with a high degree of information can be robustly identified based on basic acoustic properties. We evaluated the correlation of (information-rich) content words in the DanPASS corpus with fundamental frequency (F0) and spectral tilt across four frequency bands. Our results show a correlation between certain band-level differences and the presence of content words. Similarly, but to a lesser extent, a correlation between F0 and the presence of content words was found. The principle described here has the potential to improve the “information-to-noise” ratio in hearing aids. In addition, this concept may also be applicable in automatic speech recognition systems.

OpenArchive@CBS

Prosodiske fraser og syntaktisk struktur i spontan tale

Author: Tøndering John
Publication date: 01/01/2010

Copenhagen University Research Information System

Creating Comparable Multimodal Corpora for Nordic Languages

Author: Ahlsén Elisabeth
Allwood Jens
Jokinen Kristiina
Navarretta Costanza
Paggio Patrizia
Publication date: 01/01/2011

Proceedings of the 18th Nordic Conference of Computational Linguistics NODALIDA 2011. Editors: Bolette Sandford Pedersen, Gunta Nešpore and Inguna Skadiņa. NEALT Proceedings Series, Vol. 11 (2011), 153-160. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/16955

CiteSeerX

Copenhagen University Research Information System

DSpace at Tartu University Library