
    Does training with amplitude modulated tones affect tone-vocoded speech perception?

    Temporal-envelope cues are essential for successful speech perception. We asked here whether training on stimuli containing temporal-envelope cues without speech content can improve the perception of spectrally degraded (vocoded) speech, in which the temporal envelope (but not the temporal fine structure) is mainly preserved. Two groups of listeners were trained on different amplitude-modulation (AM) based tasks, either AM detection or AM-rate discrimination (21 blocks of 60 trials over two days, 1260 trials in total; AM rates: 4, 8, and 16 Hz), while an additional control group did not undertake any training. Consonant identification in vocoded vowel-consonant-vowel stimuli was tested before and after training on the AM tasks (or after an equivalent time interval for the control group). Following training, only the trained groups showed a significant improvement in the perception of vocoded speech, but the improvement did not significantly differ from that observed for controls. Thus, we find no convincing evidence that this amount of training with temporal-envelope cues without speech content provides a significant benefit for vocoded speech intelligibility. Alternative training regimens using vocoded speech along the linguistic hierarchy should be explored.
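The training stimuli described above are amplitude-modulated tones. A minimal sketch of how such a stimulus can be generated, assuming a 1 kHz carrier, full modulation depth, and a 16 kHz sampling rate (the abstract specifies only the AM rates of 4, 8, and 16 Hz; all other parameter values here are illustrative assumptions):

```python
import math

def am_tone(carrier_hz=1000.0, rate_hz=8.0, depth=1.0,
            dur_s=1.0, fs=16000):
    """Sinusoidally amplitude-modulated pure tone.

    carrier_hz, depth, dur_s, and fs are hypothetical values;
    the study only reports AM rates of 4, 8, and 16 Hz.
    Returns a list of samples in [-1-depth, 1+depth].
    """
    n = int(dur_s * fs)
    samples = []
    for i in range(n):
        t = i / fs
        # Slow envelope at the AM rate, riding on a fast carrier.
        env = 1.0 + depth * math.sin(2.0 * math.pi * rate_hz * t)
        samples.append(env * math.sin(2.0 * math.pi * carrier_hz * t))
    return samples
```

Setting `depth=0.0` yields the unmodulated reference tone that an AM-detection task would pair against the modulated interval.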

    Temporal markers of prosodic boundaries in children's speech production

    It is often thought that the ability to use prosodic features accurately is mastered in early childhood. However, research to date has produced conflicting evidence, notably about the development of children's ability to mark prosodic boundaries. This paper investigates (i) whether, by the age of eight, children use temporal boundary features in their speech in a systematic way, and (ii) to what extent adult listeners are able to interpret their production accurately and unambiguously. The material consists of minimal pairs of utterances: one utterance includes a compound noun, in which there is no prosodic boundary after the first noun, e.g. ‘coffee-cake and tea’, while the other utterance includes simple nouns, separated by a prosodic boundary, e.g. ‘coffee, cake and tea’. Ten eight-year-old children took part, and their productions were rated by 23 adult listeners. Two phonetic exponents of prosodic boundaries were analysed: pause duration and phrase-final lengthening. The results suggest that, at the age of eight, there is considerable variability among children in their ability to mark phrase boundaries of the kind analysed in the experiment, with some children failing to differentiate reliably between the members of the minimal pairs. The differences between the children in their use of boundary features were reflected in the adults' perceptual judgements. Both temporal cues to prosodic boundaries significantly affected the perceptual ratings, with pause duration being a more salient determinant of ratings than phrase-final lengthening.
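The two temporal cues analysed above can be operationalised from time-aligned annotations. A minimal sketch, assuming word-offset and word-onset timestamps and a non-final baseline syllable duration are available (the study's exact segmentation criteria are not given, so this measurement scheme is an assumption):

```python
def temporal_boundary_cues(n1_offset_s, n2_onset_s,
                           final_syll_dur_s, baseline_syll_dur_s):
    """Compute the two temporal exponents of a prosodic boundary:

    - pause duration: the silent gap between the offset of the first
      noun and the onset of the next word, in seconds;
    - phrase-final lengthening: the duration of the phrase-final
      syllable relative to a non-final baseline (ratio > 1 indicates
      lengthening).

    All arguments are hypothetical annotation values in seconds.
    """
    pause_s = max(0.0, n2_onset_s - n1_offset_s)
    lengthening_ratio = final_syll_dur_s / baseline_syll_dur_s
    return pause_s, lengthening_ratio
```

For a boundary production ('coffee, cake ...') one would expect a positive pause and a ratio above 1, while a compound ('coffee-cake ...') should show a near-zero pause and a ratio near 1.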

    Talker identification is not improved by lexical access in the absence of familiar phonology

    Listeners identify talkers more accurately when they are familiar with both the sounds and words of the language being spoken. It is unknown whether lexical information alone can facilitate talker identification in the absence of familiar phonology. To dissociate the roles of familiar words and phonology, we developed English-Mandarin “hybrid” sentences, spoken in Mandarin, which can be convincingly coerced to sound like English when presented with corresponding subtitles (e.g., “wei4 gou3 chi1 kao3 li2 zhi1” becomes “we go to college”). Across two experiments, listeners learned to identify talkers in three conditions: listeners' native language (English), an unfamiliar, foreign language (Mandarin), and a foreign language paired with subtitles that primed native-language lexical access (subtitled Mandarin). In Experiment 1, listeners underwent a single session of talker identity training; in Experiment 2, listeners completed three days of training. Talkers in a foreign language were identified no better when native-language lexical representations were primed (subtitled Mandarin) than from foreign-language speech alone, regardless of whether listeners had received one or three days of talker identity training. These results suggest that the facilitatory effect of lexical access on talker identification depends on the availability of familiar phonological forms.

    Pitch ability as an aptitude for tone learning

    Tone languages such as Mandarin use voice pitch to signal lexical contrasts, presenting a challenge for second/foreign language (L2) learners whose native languages do not use pitch in this manner. The present study examined components of an aptitude for mastering L2 lexical tone. Native English speakers with no previous tone language experience completed a Mandarin word learning task, as well as tests of pitch ability, musicality, L2 aptitude, and general cognitive ability. Pitch ability measures improved predictions of learning performance beyond musicality, L2 aptitude, and general cognitive ability, and also predicted transfer of learning to new talkers. In sum, although certain nontonal measures help predict successful tone learning, the central components of tonal aptitude are pitch-specific perceptual measures.

    The Sound Manifesto

    Computing practice today depends on visual output to drive almost all user interaction. Other senses, such as audition, may be totally neglected, used only tangentially, or used in highly restricted, specialized ways. We have excellent audio rendering through D-A conversion, but we lack rich, general facilities for modeling and manipulating sound comparable in quality and flexibility to those for graphics. We need co-ordinated research in several disciplines to improve the use of sound as an interactive information channel. Incremental and separate improvements in synthesis, analysis, speech processing, audiology, acoustics, music, etc. will not alone produce the radical progress that we seek in sonic practice. We also need to create a new central topic of study in digital audio research. The new topic will assimilate the contributions of different disciplines on a common foundation. The key central concept that we lack is sound as a general-purpose information channel. We must investigate the structure of this information channel, which is driven by the co-operative development of auditory perception and physical sound production. Particular audible encodings, such as speech and music, illuminate sonic information by example, but they are no more sufficient for its characterization than typography is sufficient for a characterization of visual information.

    Comment: To appear in the conference on Critical Technologies for the Future of Computing, part of SPIE's International Symposium on Optical Science and Technology, 30 July to 4 August 2000, San Diego, CA.

    Perceptual Calibration of F0 Production: Evidence from Feedback Perturbation

    Hearing one’s own speech is important for language learning and for the maintenance of accurate articulation. For example, people with postlinguistically acquired deafness often show a gradual deterioration of many aspects of speech production. In this manuscript, data are presented that address the role played by acoustic feedback in the control of voice fundamental frequency (F0). Eighteen subjects produced vowels under a control condition (normal F0 feedback) and two experimental conditions: F0 shifted up and F0 shifted down. In each experimental condition, subjects produced vowels during a training period in which their F0 was slowly shifted without their awareness. Following this exposure to transformed F0, their acoustic feedback was returned to normal. Two effects were observed: subjects compensated for the change in F0, and they showed negative aftereffects. When F0 feedback was returned to normal, the subjects modified their produced F0 in the direction opposite to the shift. The results suggest that fundamental frequency is controlled using auditory feedback and with reference to an internal pitch representation. This is consistent with current work on internal models of speech motor control.
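Feedback perturbations of this kind are commonly specified on a logarithmic (cents) scale so that shifts are perceptually comparable across speakers with different baseline pitches. A minimal sketch of that conversion (the abstract does not state which units the authors used, so the cents formulation is an assumption):

```python
def shift_f0(f0_hz, cents):
    """Shift a fundamental frequency by a signed amount in cents.

    100 cents = 1 semitone; 1200 cents = 1 octave. Positive values
    shift F0 up, negative values shift it down.
    """
    return f0_hz * 2.0 ** (cents / 1200.0)
```

For example, shifting a 200 Hz voice up by a full octave (+1200 cents) yields 400 Hz; a compensating speaker would be expected to lower produced F0 toward the opposite direction of the applied shift.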