research

Strength of forensic voice comparison evidence from the acoustics of filled pauses

Abstract

This study investigates the evidential value of filled pauses (FPs, i.e. um, uh) as variables in forensic voice comparison. FPs for 60 young male speakers of standard southern British English were analysed. The following acoustic properties were analysed: midpoint frequencies of the first three formants in the vocalic portion; ‘dynamic’ characterisations of formant trajectories (i.e. quadratic polynomial equations fitted to nine measurement points over the entire vowel); vowel duration; and nasal duration for um. Likelihood ratio (LR) scores were computed using the Multivariate Kernel Density formula (MVKD; Aitken and Lucy, 2004) and converted to calibrated log10 LRs (LLRs) using logistic-regression (Brümmer et al., 2007). System validity was assessed using both equal error rate (EER) and the log LR cost function (Cllr; Brümmer and du Preez, 2006). The system with the best performance combines dynamic measurements of all three formants with vowel and nasal duration for um, achieving an EER of 4.08% and Cllr of 0.12. In terms of general patterns, um consistently outperformed uh. For um, the formant dynamic systems generated better validity than those based on midpoints, presumably reflecting the additional degree of formant movement in um caused by the transition from vowel to nasal. By contrast, midpoints outperformed dynamics for the more monophthongal uh. Further, the addition of duration (vowel or vowel and nasal) consistently improved system performance. The study supports the view that FPs have excellent potential as variables in forensic voice comparison cases

    Similar works