Lexical and Prosodic Pitch Modifications in Cantonese Infant-directed Speech
Published online 03 February 2021
The functions of acoustic-phonetic modifications in infant-directed speech (IDS) remain a
question: do they specifically serve to facilitate language learning via enhanced phonemic
contrasts (the hyperarticulation hypothesis) or primarily to improve communication via
prosodic exaggeration (the prosodic hypothesis)? The study of lexical tones provides a
unique opportunity to shed light on this, as lexical tones are phonemically contrastive,
yet their primary cue, pitch, is also a prosodic cue. This study investigated Cantonese
IDS and found increased intra-talker variation of lexical tones, which more likely posed
a challenge to, rather than facilitated, phonetic learning. Although tonal space was
expanded, which could facilitate phonetic learning, its expansion was a function of
overall intonational modifications. Similar findings were observed in speech to pets,
who should not benefit from larger phonemic distinctions. We conclude that lexical-tone
adjustments in IDS mainly serve to broadly enhance communication rather than
specifically increase phonemic contrast for learners.
This work was supported by the University Grants Committee (HKSAR) (RGC34000118), the Innovation and
Technology Fund (HKSAR) (ITS/067/18), Dr. Stanley Ho Medical Development Foundation, and the
Global Parent Child Resource Centre Limited. The second author’s work is supported by the Basque
Government through the BERC 2018-2021 program and by the Spanish Ministry of Science and
Innovation through the Ramon y Cajal Research Fellowship, PID2019-105528GA-I00.
The listening talker: A review of human and algorithmic context-induced modifications of speech
Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised - at least for some listeners - by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns as a response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work in improving the robustness of speech output.
Dynamic Formant Trajectories in German Read Speech: Impact of Predictability and Prominence
Phonetic structures expand temporally and spectrally when they are difficult to predict from their context. To some extent, effects of predictability are modulated by prosodic structure. So far, studies on the impact of contextual predictability and prosody on phonetic structures have neglected the dynamic nature of the speech signal. This study investigates the impact of predictability and prominence on the dynamic structure of the first and second formants of German vowels. We expect to find differences in the formant movements between vowels standing in different predictability contexts and a modulation of this effect by prominence. First and second formant values are extracted from a large German corpus. Formant trajectories of peripheral vowels are modeled using generalized additive mixed models, which estimate nonlinear regressions between a dependent variable and predictors. Contextual predictability is measured as biphone and triphone surprisal based on a statistical German language model. We test for the effects of the information-theoretic measures surprisal and word frequency, as well as prominence, on formant movement, while controlling for vowel phonemes and duration. Primary lexical stress and vowel phonemes are significant predictors of first and second formant trajectory shape. We replicate previous findings that vowels are more dispersed in stressed syllables than in unstressed syllables. The interaction of stress and surprisal explains formant movement: unstressed vowels show more variability in their formant trajectory shape at different surprisal levels than stressed vowels. This work shows that effects of contextual predictability on fine phonetic detail can be observed not only in pointwise measures but also in dynamic features of phonetic segments.
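For readers unfamiliar with the surprisal measure named in the abstract above, the following is a minimal sketch of biphone surprisal, -log2 P(phone | preceding phone). It assumes unsmoothed maximum-likelihood estimates from raw counts; the abstract does not describe the actual language model, and the function name `biphone_surprisal` and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def biphone_surprisal(corpus, context, target):
    """Return -log2 P(target | context) using maximum-likelihood counts.

    `corpus` is a list of words, each a list of phone symbols.
    Assumption: no smoothing, so unseen biphones raise an error.
    """
    pair_counts = Counter()
    context_counts = Counter()
    for word in corpus:
        # Count every adjacent (previous phone, current phone) pair.
        for prev, cur in zip(word, word[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    if pair_counts[(context, target)] == 0:
        raise ValueError("unseen biphone; a real model would apply smoothing")
    probability = pair_counts[(context, target)] / context_counts[context]
    return -math.log2(probability)


# Hypothetical toy data, not taken from the study's corpus.
toy_corpus = [["b", "a", "t"], ["b", "a", "n"], ["b", "i", "n"], ["t", "a", "n"]]
print(biphone_surprisal(toy_corpus, "b", "i"))
```

In this toy corpus "i" follows "b" in one of three cases, so its surprisal is higher than that of "a" after "b", which occurs in two of three. Triphone surprisal would condition on the two preceding phones in the same way.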
The experimental state of mind in elicitation: illustrations from tonal fieldwork
This paper illustrates how an “experimental state of mind”, i.e. principles of experimental design, can inform hypothesis generation and testing in structured fieldwork elicitation. The application of these principles is demonstrated with case studies in toneme discovery. Pike’s classic toneme discovery procedure is shown to be a special case of the application of experimental design. It is recast in two stages: (1) the inference of the hidden structure of tonemes based on unexplained variability in the pitch contour remaining, even after other sources of influence on the pitch contour are accounted for, and (2) the confirmation of systematic effects of hypothesized tonal classes on the pitch contour in elicitations structured to control for confounding variables that could obscure the relation between tonal classes and the pitch contour. Strategies for controlling the confounding variables, such as blocking and randomization, are discussed. The two stages are exemplified using data elicited from the early stages of toneme discovery in Kirikiri, a language of New Guinea. *This paper is in the series How to Study a Tone Language, edited by Steven Bird and Larry Hyman. National Foreign Language Resource Cente
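The blocking and randomization strategies mentioned in the abstract above can be sketched as follows. This is only an illustration under assumed conditions: every item is elicited once per block, and the order is shuffled independently within each block. The function name `blocked_random_order`, the block labels, and the item labels are hypothetical and do not come from the paper.

```python
import random


def blocked_random_order(items, blocks, seed=0):
    """Return a dict mapping each block to an independently shuffled item order.

    Blocking: every item appears exactly once in every block, so block-level
    influences (e.g. session or carrier frame) affect all items equally.
    Randomization: order within a block is shuffled so position effects are
    not systematically tied to any one item.
    """
    rng = random.Random(seed)  # seeded so the schedule is reproducible
    schedule = {}
    for block in blocks:
        order = list(items)  # copy so the caller's list is left untouched
        rng.shuffle(order)
        schedule[block] = order
    return schedule


# Hypothetical labels for illustration only.
words = ["word_1", "word_2", "word_3", "word_4"]
frames = ["frame_A", "frame_B"]
schedule = blocked_random_order(words, frames, seed=42)
for frame, order in schedule.items():
    print(frame, order)
```

Fixing the seed keeps the elicitation schedule reproducible, which may help when a session has to be repeated or audited.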
The effect of multitalker background noise on speech intelligibility in Parkinson's disease and controls
This study investigated the effect of multi-talker background noise on speech intelligibility in participants with hypophonia due to Parkinson’s disease (PD). Ten individuals with PD and 10 geriatric controls were tested on four speech intelligibility tasks at the single word, sentence, and conversation level in various conditions of background noise. Listeners assessed speech intelligibility using word identification or orthographic transcription procedures. Results revealed non-significant differences between groups when intelligibility was assessed in no background noise. PD speech intelligibility decreased significantly relative to controls in the presence of background noise. A phonetic error analysis revealed a distinct error profile for PD speech in background noise. The four most frequent phonetic errors were glottal-null, consonant-null in final position, stop place of articulation, and initial position cluster-singleton. The results demonstrate that individuals with PD have significant and distinctive deficits in speech intelligibility and phonetic errors in the presence of background noise.