Soft d in Danish: Acoustic characteristics and issues in transcription
Danish, like the closely related Swedish and Norwegian, descends from Old Norse (Haugen 1976). While the three contemporary languages are mutually intelligible to varying degrees, Danish has diverged phonologically from the other Scandinavian languages (Gooskens 2006), owing to extensive consonant lenition and vowel reduction (Basbøll 2005). The lenition of and in syllable coda positions into a sound that Danish linguists have called soft-d is seemingly unique to Danish. In most phonological descriptions, it is transcribed with the phonetic symbol /ð/, a voiced interdental fricative. We argue that this transcription is inaccurate, and not all phonologists agree that soft-d is a fricative. Some describe it as an alveolar semi-vowel (Haberland 1994), while others transcribe it as a velarized, retracted, and lowered alveolar approximant (Basbøll 2005). Many observe that the sound resembles the lateral /l/, a distinct phoneme of Danish (Wells 2010). Through acoustic analysis of tokens taken from the DanPASS corpus (Grønnum 2016), we show that the acoustic properties (harmonics-to-noise ratio, HNR) of soft-d are not those of a fricative but rather those of an approximant or vowel. Therefore, the use of /ð/ to transcribe this sound is inaccurate and does not align with the goals of the International Phonetic Association.
FT Speech: Danish Parliament Speech Corpus
This paper introduces FT Speech, a new speech corpus created from the recorded meetings of the Danish Parliament, otherwise known as the Folketing (FT). The corpus contains over 1,800 hours of transcribed speech by a total of 434 speakers. It is significantly larger in duration, vocabulary, and amount of spontaneous speech than the existing public speech corpora for Danish, which are largely limited to read-aloud and dictation data. We outline design considerations, including the preprocessing methods and the alignment procedure. To evaluate the quality of the corpus, we train automatic speech recognition systems on the new resource and compare them to the systems trained on the Danish part of Språkbanken, the largest public ASR corpus for Danish to date. Our baseline results show that we achieve a 14.01 WER on the new corpus. A combination of FT Speech with in-domain language data provides comparable results to models trained specifically on Språkbanken, showing that FT Speech transfers well to this data set. Interestingly, our results demonstrate that the opposite is not the case. This shows that FT Speech provides a valuable resource for promoting research on Danish ASR with more spontaneous speech.
Comment: Submitted to Interspeech 202
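The WER figure quoted in the FT Speech abstract is conventionally defined as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' scoring pipeline. The function name `word_error_rate`, the whitespace tokenization, the percentage scaling, and the Danish example sentences are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()  # assumption: simple whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein dynamic programming).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical example: one of five reference words is dropped by the recognizer.
print(word_error_rate("det er en god dag", "det er god dag"))
```

Because insertions count as errors, WER computed this way can exceed 100 when the hypothesis is much longer than the reference.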
Fishing in a speech stream, angling for a lexicon
Proceedings of the 18th Nordic Conference of Computational Linguistics NODALIDA 2011.
Editors: Bolette Sandford Pedersen, Gunta Nešpore and Inguna Skadiņa.
NEALT Proceedings Series, Vol. 11 (2011), 90-97.
© 2011 The editors and contributors.
Published by Northern European Association for Language Technology (NEALT), http://omilia.uio.no/nealt.
Electronically published at Tartu University Library (Estonia), http://hdl.handle.net/10062/16955
The rarity of intervocalic voicing of stops in Danish spontaneous speech
Previous studies of the phonetics of Danish stops have neglected closure voicing. Danish is an aspiration language, but the aspirated stops /p t k/ are produced with shorter closure duration and less articulatory effort than the unaspirated stops /b d ɡ/. Furthermore, all Danish stops are characterized by some degree of glottal spreading during the closure. In this study, we use a corpus of Danish spontaneous speech (DanPASS) to investigate intervocalic voicing: its distribution across the two laryngeal categories, whether it patterns as a lenition phenomenon, and whether the aerodynamic environment predicts its distribution. We find that intervocalic voicing is not the norm for either set of stops and is particularly rare in /p t k/. Voiced tokens are mostly found in environments associated with lenition. We suggest that the glottal spreading gesture found in all Danish stops is a phonological mechanism blocking voicing, which is probabilistically lost in spontaneous speech. This predicts our results better than relying on laryngeal features like [voice] or [spread glottis]. The study fills a gap in our knowledge of Danish phonetics and phonology, and is also one of the most extensive corpus studies of intervocalic stop voicing in an ‘aspiration language.’
Feedback and gestural behaviour in a conversational corpus of Danish
Proceedings of the 3rd Nordic Symposium on Multimodal Communication.
Editors: Patrizia Paggio, Elisabeth Ahlsén, Jens Allwood, Kristiina Jokinen, Costanza Navarretta.
NEALT Proceedings Series, Vol. 15 (2011), 33–39.
Electronically published at Tartu University Library (Estonia), http://hdl.handle.net/10062/22532
Information based speech transduction
Modern hearing aids use a variety of advanced digital signal processing methods to improve speech intelligibility. These methods are based on knowledge about the acoustics outside the ear as well as psychoacoustics. We present a novel observation based on the fact that acoustic prominence is not equal to information prominence for time intervals at the syllabic and sub-syllabic levels. The idea is that speech elements with a high degree of information can be robustly identified based on basic acoustic properties. We evaluated the correlation of (information-rich) content words in the DanPASS corpus with fundamental frequency (F0) and spectral tilt across four frequency bands. Our results show a correlation between certain band-level differences and the presence of content words. A similar but weaker correlation was found between F0 and the presence of content words. The principle described here has the potential to improve the “information-to-noise” ratio in hearing aids. In addition, this concept may also be applicable in automatic speech recognition systems.
Creating Comparable Multimodal Corpora for Nordic Languages
Proceedings of the 18th Nordic Conference of Computational Linguistics NODALIDA 2011.
Editors: Bolette Sandford Pedersen, Gunta Nešpore and Inguna Skadiņa.
NEALT Proceedings Series, Vol. 11 (2011), 153-160.
Electronically published at Tartu University Library (Estonia), http://hdl.handle.net/10062/16955