Understanding the acoustic implications of digital transmission on fricatives

Abstract

The aim of this thesis is to provide a better understanding of the acoustic implications of digital transmission on fricatives, in a way that is relevant across research fields. This is motivated by the increasing amount of digitally transmitted speech across the world and by the limited knowledge of the effects of digital transmission on consonants. The thesis investigates the fricatives /f/, /θ/, /s/, /ʃ/, /z/, /ð/ and [fj]. Fricatives were expected to be particularly affected by codec compression because of their noise-like, aperiodic structure, which the codecs might mistake for noise. The thesis investigates the effects of the AMR-WB, Opus, and MP3 codecs at three different bitrates and in live transmission. The acoustic implications were measured using the first four spectral moments and peak frequency, and were also examined through spectrographic analysis. These measures were compared between baseline uncompressed WAV files and each of the codec-compressed versions. This resulted in three studies. The first two were carried out in controlled conditions, i.e. the WAV files were codec-compressed on a computer, whereas the third study was live, with the speech transmitted between two mobile phones with and without background noise. The findings indicate significant effects of the codec compressions on the spectral measures, with segment-, codec-, and bitrate-dependent tendencies. The live transmission and background noise generally produced larger effects than the controlled conditions. Intensity played a key role in the magnitude of the effects of both the codec compressions and the live transmission. This has implications for the use of codec-compressed speech as data, especially in socio- and forensic phonetics, with regard to the possible diffusion of sound changes and to speaker comparisons. In addition, the results have implications beyond linguistics, e.g. in psychology, where clarity of speech plays a role in perceived charisma, and in hearing aid and cochlear implant technology, both of which approach speech digitally and incorporate noise reduction.
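
As an illustration of the spectral measures named above, the following is a minimal sketch of how the first four spectral moments and the peak frequency could be computed for a single fricative segment. It is not the procedure used in the thesis: the function name spectral_measures, the Hamming window, the power-spectrum weighting, and the use of excess kurtosis (kurtosis minus 3) are all assumptions made for this example.

```python
import numpy as np


def spectral_measures(signal, sample_rate):
    """Return spectral moments and peak frequency of one segment.

    Assumptions (not taken from the thesis): Hamming window,
    power-spectrum weighting, and excess kurtosis (kurtosis - 3).
    """
    signal = np.asarray(signal, dtype=float)
    windowed = signal * np.hamming(len(signal))

    # One-sided power spectrum and the frequency of each bin in Hz.
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

    # Treat the normalised power spectrum as a probability distribution.
    p = power / power.sum()

    cog = np.sum(freqs * p)                    # 1st moment: centre of gravity
    variance = np.sum((freqs - cog) ** 2 * p)  # 2nd central moment
    sd = np.sqrt(variance)
    skewness = np.sum((freqs - cog) ** 3 * p) / sd ** 3
    kurtosis = np.sum((freqs - cog) ** 4 * p) / sd ** 4 - 3.0

    # Peak frequency: the bin with the highest power.
    peak = freqs[np.argmax(power)]

    return {
        "centre_of_gravity": cog,
        "standard_deviation": sd,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "peak_frequency": peak,
    }


if __name__ == "__main__":
    # Hypothetical input: 100 ms of white noise at 16 kHz standing in
    # for a fricative segment cut from a WAV file.
    rng = np.random.default_rng(0)
    segment = rng.standard_normal(1600)
    for name, value in spectral_measures(segment, 16000).items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, comparing the values returned for an uncompressed segment with those returned for the same segment after codec compression would give the kind of per-measure difference the studies describe.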
