    The listening talker: A review of human and algorithmic context-induced modifications of speech

    Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised - at least for some listeners - by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns as a response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work in improving the robustness of speech output

    Subphonemic and suballophonic consonant variation : the role of the phoneme inventory

    Consonants exhibit more variation in their phonetic realization than is typically acknowledged, but that variation is linguistically constrained. Acoustic analysis of both read and spontaneous speech reveals that consonants are not necessarily realized with the manner of articulation they would have in careful citation form. Although the variation is wider than one would imagine, it is limited by the phoneme inventory. The phoneme inventory of the language restricts the range of variation to protect the system of phonemic contrast. That is, consonants may stray phonetically into unfilled areas of the language's sound space. Listeners are seldom consciously aware of the consonant variation, and perceive the consonants phonemically as in their citation forms. A better understanding of surface phonetic consonant variation can help make predictions in theoretical domains and advances in applied domains

    Speech Communication

    Contains table of contents for Part IV, table of contents for Section 1, an introduction, reports on seven research projects and a list of publications. C.J. Lebel Fellowship; Dennis Klatt Memorial Fund; National Institutes of Health Grant T32-DC00005; National Institutes of Health Grant R01-DC00075; National Institutes of Health Grant F32-DC00015; National Institutes of Health Grant R01-DC00266; National Institutes of Health Grant P01-DC00361; National Institutes of Health Grant R01-DC00776; National Science Foundation Grant IRI 89-10561; National Science Foundation Grant IRI 88-05680; National Science Foundation Grant INT 90-2471

    Cross-Linguistic Influence in the Bilingual Mental Lexicon: Evidence of Cognate Effects in the Phonetic Production and Processing of a Vowel Contrast.

    The present study examines cognate effects in the phonetic production and processing of the Catalan back mid-vowel contrast (/o/-/ɔ/) by 24 early and highly proficient Spanish-Catalan bilinguals in Majorca (Spain). Participants completed a picture-naming task and a forced-choice lexical decision task in which they were presented with either words (e.g., /bɔsk/ "forest") or non-words based on real words, but with the alternate mid-vowel pair in stressed position ((*)/bosk/). The same cognate and non-cognate lexical items were included in the production and lexical decision experiments. The results indicate that even though these early bilinguals maintained the back mid-vowel contrast in their productions, they had great difficulties identifying non-words and real words based on the identity of the Catalan mid-vowel. The analyses revealed language dominance and cognate effects: Spanish-dominants exhibited higher error rates than Catalan-dominants, and production and lexical decision accuracy were also affected by cognate status. The present study contributes to the discussion of the organization of early bilinguals' dominant and non-dominant sound systems, and proposes that exemplar theoretic approaches can be extended to include bilingual lexical connections that account for the interactions between the phonetic and lexical levels of early bilingual individuals

    The role of gesture delay in coda /r/ weakening: an articulatory, auditory and acoustic study

    The cross-linguistic tendency of coda consonants to weaken, vocalize, or be deleted is shown to have a phonetic basis, resulting from gesture reduction, or variation in gesture timing. This study investigates the effects of the timing of the anterior tongue gesture for coda /r/ on acoustics and perceived strength of rhoticity, making use of two sociolects of Central Scotland (working- and middle-class) where coda /r/ is weakening and strengthening, respectively. Previous articulatory analysis revealed a strong tendency for these sociolects to use different coda /r/ tongue configurations—working- and middle-class speakers tend to use tip/front raised and bunched variants, respectively; however, this finding does not explain working-class /r/ weakening. A correlational analysis in the current study showed a robust relationship between anterior lingual gesture timing, F3, and percept of rhoticity. A linear mixed effects regression analysis showed that both speaker social class and linguistic factors (word structure and the checked/unchecked status of the prerhotic vowel) had significant effects on tongue gesture timing and formant values. This study provides further evidence that gesture delay can be a phonetic mechanism for coda rhotic weakening and apparent loss, but social class emerges as the dominant factor driving lingual gesture timing variation

    Frequency-predicted shifts independent of word-specific phonetic details

    Some sound changes seem to proceed at different rates depending on lexical frequency; these are often interpreted as reflecting phonetically detailed exemplar memories, with changes spreading via lexical diffusion (Pierrehumbert 2002; Bybee 2012). However, such patterns do not necessarily require word-specific phonetic details. Variation associated with lexical frequency also exists when there is no evidence for a change in progress, which might be explained by the process of lexical access: Higher lexical frequency facilitates activation, causing faster and more reduced productions (Gahl et al. 2012, Kahn & Arnold 2012, Jurafsky et al. 2002). This work examines how repeated exposure to particular words influences listeners’ category boundary between aspirated and unaspirated stops in those words. Listeners’ VOT category boundary is lowered after exposure to shortened VOT stimuli and also after exposure to lengthened VOT stimuli. These results suggest that frequency-related sound change can largely be explained by frequency directly influencing reduction in phonetic implementation and perceptual access. The size of the effect differed based on the acoustic characteristics of the exposure stimuli; this may suggest a role of word-specific phonetic details, but could also reflect different levels of activation due to the prototypicality of the stimuli

    Phonetic Realisation and Phonemic Categorisation of the Final Reduced Corner Vowels in the Finnic Languages of Ingria

    Individual variability in sound change was explored at three stages of final vowel reduction and loss in the endangered Finnic varieties of Ingria (subdialects of Ingrian, Votic and Ingrian Finnish). The correlation between the realisation of reduced vowels and their phonemic categorisation by speakers was studied. The correlated results showed that if V was pronounced > 70%, its starting loss was not yet perceived, apart from certain frequent elements, but after > 70% loss, V was not perceived any more. A split of 50/50 between V and loss in production correlated with the same split in categorisation. At the beginning of a sound change, production is, therefore, more innovative, but after reanalysis, categorisation becomes more innovative and leads the change. The vowel a was the most innovative in terms of loss, u/o were the most conservative, and i was in the middle, while consonantal palatalisation was more salient than labialisation. These differences are based on acoustics, articulation and perception

    Acoustic Realization and Perception of English Lexical Stress by Mandarin Learners

    The acquisition of English lexical stress by Mandarin L2 learners was examined. An acoustic study focusing on the implementation of mean F0, max F0, duration, intensity, and F2 in stressed and unstressed vowels in noun-verb word pairs contrasting in stress location (e.g. object-object) was conducted. The results indicate that native English speakers use all correlates in nouns but rely mostly on duration in verbs. The learners use these cues more consistently across different contexts. A perceptual study utilizing the disyllabic nonword 'dada', with resynthesized max F0, duration, and vowel quality indicates that full vowels induce stronger stress perception in all listener groups. Beginning listeners relied on duration, advanced listeners focused on max F0, while native listeners used both in perception. The similarities and differences in prosodic systems between Mandarin and English, as well as the possible discrepancies in production and perception data from second language learning research were discussed