2,368 research outputs found

    Speech rhythm: a metaphor?

    Is speech rhythmic? In the absence of evidence for a traditional view that languages strive to coordinate either syllables or stress-feet with regular time intervals, we consider the alternative that languages exhibit contrastive rhythm subsisting merely in the alternation of stronger and weaker elements. This is initially plausible, particularly for languages with a steep ‘prominence gradient’, i.e. a large disparity between stronger and weaker elements; but we point out that alternation is poorly achieved even by a ‘stress-timed’ language such as English, and, historically, languages have conspicuously failed to adopt simple phonological remedies that would ensure alternation. Languages seem more concerned to allow ‘syntagmatic contrast’ between successive units and to use durational effects to support linguistic functions than to facilitate rhythm. Furthermore, some languages (e.g. Tamil, Korean) lack the lexical prominence which would most straightforwardly underpin prominence alternation. We conclude that speech is not incontestably rhythmic, and may even be antirhythmic. However, its linguistic structure and patterning allow the metaphorical extension of rhythm in varying degrees and in different ways depending on the language, and it is this analogical process which allows speech to be matched to external rhythms.

    Production and perception of speaker-specific phonetic detail at word boundaries

    Experiments show that learning about familiar voices affects speech processing in many tasks. However, most studies focus on isolated phonemes or words and do not explore which phonetic properties are learned about or retained in memory. This work investigated inter-speaker phonetic variation involving word boundaries, and its perceptual consequences. A production experiment found significant variation in the extent to which speakers used a number of acoustic properties to distinguish junctural minimal pairs, e.g. 'So he diced them' vs. 'So he'd iced them'. A perception experiment then tested intelligibility in noise of the junctural minimal pairs before and after familiarisation with a particular voice. Subjects who heard the same voice during testing as during the familiarisation period showed significantly more improvement in identification of words and syllable constituents around word boundaries than those who heard different voices. These data support the view that perceptual learning about the particular pronunciations associated with individual speakers helps listeners to identify syllabic structure and the location of word boundaries.

    The listening talker: A review of human and algorithmic context-induced modifications of speech

    Speech output technology is finding widespread application, including in scenarios where intelligibility might be compromised - at least for some listeners - by adverse conditions. Unlike most current algorithms, talkers continually adapt their speech patterns as a response to the immediate context of spoken communication, where the type of interlocutor and the environment are the dominant situational factors influencing speech production. Observations of talker behaviour can motivate the design of more robust speech output algorithms. Starting with a listener-oriented categorisation of possible goals for speech modification, this review article summarises the extensive set of behavioural findings related to human speech modification, identifies which factors appear to be beneficial, and goes on to examine previous computational attempts to improve intelligibility in noise. The review concludes by tabulating 46 speech modifications, many of which have yet to be perceptually or algorithmically evaluated. Consequently, the review provides a roadmap for future work in improving the robustness of speech output.

    The production and perception of word boundaries


    The effect of L1 regional variation on the perception and production of standard L1 and L2 vowels

    This study reports on the perception and production of Standard Dutch and Standard British English vowels by speakers of two regional varieties of Belgian Dutch (East Flemish and Brabantine) which differ in their vowel realizations. Twenty-four native speakers of Dutch performed two picture-naming tasks and two vowel categorization tasks, in which they heard Standard Dutch or English vowels and were asked to map these onto orthographic representations of Dutch vowels. The results of the Dutch production and categorization tasks revealed that the participants’ L1 regional variety importantly influenced their production and especially perception of vowels in the standard variety of their L1. The two groups also differed in how they assimilated non-native English vowels to native vowel categories, but no major differences could be observed in their productions of non-native vowels. The study therefore only partly confirms earlier studies showing that L1 regional variation may have an influence on the acquisition of non-native language varieties.

    Pauses and the temporal structure of speech

    Natural-sounding speech synthesis requires close control over the temporal structure of the speech flow. This includes a full predictive scheme for the durational structure, in particular the prolongation of final syllables of lexemes, as well as for the pausal structure in the utterance. In this chapter, a description of the temporal structure and a summary of the numerous factors that modify it are presented. In the second part, predictive schemes for the temporal structure of speech ("performance structures") are introduced, and their potential for characterising the overall prosodic structure of speech is demonstrated.

    Phonotactic and acoustic cues for word segmentation in English

    This study investigates the influence of both phonotactic and acoustic cues on the segmentation of spoken English. Listeners detected embedded English words in nonsense sequences (word spotting). Words aligned with phonotactic boundaries were easier to detect than words without such alignment. Acoustic cues to boundaries could also have signaled word boundaries, especially when word onsets lacked phonotactic alignment. However, only one of several durational boundary cues showed a marginally significant correlation with response times (RTs). The results suggest that word segmentation in English is influenced primarily by phonotactic constraints and only secondarily by acoustic aspects of the speech signal.

    Infants' Sensitivity to Fine Durational Cues in Speech Perception

    Previous research has indicated that infants as young as 3 days of age show sensitivity to prosodic stress patterns and can use this information to distinguish word boundaries (Christophe et al., 1994). Older infants have also exhibited an ability to use prosodic stress patterns to segment streams of speech (Echols et al., 1997) and have shown a preference for samples of speech with the prosodic patterns of their native language over those typical of non-native languages (Werker & Tees, 1984; Jusczyk et al., 1993). Adults have demonstrated the ability to discriminate languages based strictly on fine durational cues rather than a broad sensitivity to rhythm. The purpose of the current research was to investigate this ability in infants. Sixteen 6- to 10-month-old infants were presented with two different trisyllabic non-words, each consisting of three consonant-vowel pairs varying in rhythmic duration, one with a previously familiarized rhythmic duration and one with a novel rhythmic duration. Infants were tested using a head-turn preference procedure. Results indicated that infants significantly preferred to listen to a novel durational pattern, which suggests that infants are able to rely entirely on fine durational cues to discriminate between speech samples.

    Syllabic lengthening as a word boundary cue

    Bisyllabic sequences which could be interpreted as one word or two were produced in sentence contexts by a trained speaker, and syllabic durations measured. Listeners judged whether the bisyllables, excised from context, were one word or two. The proportion of two-word choices correlated positively with measured duration, but only for bisyllables stressed on the second syllable. The results may suggest a limit for listener sensitivity to syllabic lengthening as a word boundary cue.

    Lexical Effects in Perception of Tamil Geminates

    Lexical status effects are a phenomenon in which listeners use their prior lexical knowledge of a language to identify ambiguous speech sounds in a word based on its word or nonword status. This phenomenon has been demonstrated for ambiguous initial English consonants (one example being the Ganong Effect, a phenomenon in which listeners perceive an ambiguous speech sound as a phoneme that would complete a real word rather than a nonsense word) as a supporting factor for top-down lexical processing affecting listeners' subsequent acoustic judgement, but not for ambiguous mid-word consonants in non-English languages. In this experiment, we look at ambiguous mid-word consonants in Tamil, a South Asian language, in order to see whether the same top-down lexical effect applies outside of English. These Tamil consonants can present as either singletons (single speech sounds) or geminates (doubled speech sounds). We hypothesized that by creating ambiguous stimuli between a geminate word kuppam and a singleton nonword like kubam, participants would be more likely to perceive the ambiguous sound as a phoneme that completes the real word rather than the nonword (in this case, perceiving the ambiguous sound as a /p/ for kuppam instead of kubam). Participants listened to the ambiguous stimuli in two separate sets of continua (kuppam/suppam and nakkam/pakkam) and then indicated which word they heard in a four-alternative forced choice word identification task. Results showed that participants identified the ambiguous sounds as the sound that completed the actual word, but only for one set of continua (kuppam/suppam). These data suggest that there may be strong top-down lexical effects for ambiguous sounds in certain stimuli in Tamil, but not others.