Does training with amplitude modulated tones affect tone-vocoded speech perception?
Temporal-envelope cues are essential for successful speech perception. We asked here whether training on stimuli containing temporal-envelope cues without speech content can improve the perception of spectrally-degraded (vocoded) speech in which the temporal envelope (but not the temporal fine structure) is mainly preserved. Two groups of listeners were trained on different amplitude-modulation (AM) based tasks, either AM detection or AM-rate discrimination (21 blocks of 60 trials over two days, 1260 trials; modulation frequencies: 4, 8, and 16 Hz), while an additional control group did not undertake any training. Consonant identification in vocoded vowel-consonant-vowel stimuli was tested before and after training on the AM tasks (or at an equivalent time interval for the control group). Following training, only the trained groups showed a significant improvement in the perception of vocoded speech, but the improvement did not significantly differ from that observed for controls. Thus, we do not find convincing evidence that this amount of training with temporal-envelope cues without speech content provides significant benefit for vocoded speech intelligibility. Alternative training regimens using vocoded speech along the linguistic hierarchy should be explored.
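The training stimuli described above are sinusoidally amplitude-modulated (SAM) tones: a pure-tone carrier whose amplitude follows a slow sinusoidal envelope. A minimal sketch of how such a stimulus can be generated is shown below; the carrier frequency, modulation depth, duration, and sample rate are illustrative assumptions, not the study's exact settings — only the modulation rates (4, 8, and 16 Hz) come from the abstract.

```python
import numpy as np

def am_tone(carrier_hz=1000.0, mod_hz=8.0, depth=1.0,
            dur_s=1.0, fs=44100):
    """Sinusoidally amplitude-modulated (SAM) tone.

    Illustrative stimulus of the kind used for AM-detection and
    AM-rate-discrimination training; all parameter defaults here
    are assumptions for demonstration purposes.
    """
    t = np.arange(int(dur_s * fs)) / fs
    # Slow sinusoidal envelope: 1 + m*sin(2*pi*f_m*t), m = depth
    envelope = 1.0 + depth * np.sin(2 * np.pi * mod_hz * t)
    carrier = np.sin(2 * np.pi * carrier_hz * t)
    # Scale so the peak amplitude stays within [-1, 1]
    return envelope * carrier / (1.0 + depth)

# The three modulation rates used in the study
stimuli = {f_m: am_tone(mod_hz=f_m) for f_m in (4, 8, 16)}
```

An AM-detection trial would then contrast such a stimulus against an unmodulated tone (depth = 0), while AM-rate discrimination contrasts two stimuli with different `mod_hz` values.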
Temporal markers of prosodic boundaries in children's speech production
It is often thought that the ability to use prosodic features accurately is mastered in early childhood. However, research to date has produced conflicting evidence, notably about the development of children's ability to mark prosodic boundaries. This paper investigates (i) whether, by the age of eight, children use temporal boundary features in their speech in a systematic way, and (ii) to what extent adult listeners are able to interpret their production accurately and unambiguously. The material consists of minimal pairs of utterances: one utterance includes a compound noun, in which there is no prosodic boundary after the first noun, e.g. ‘coffee-cake and tea’, while the other utterance includes simple nouns, separated by a prosodic boundary, e.g. ‘coffee, cake and tea’. Ten eight-year-old children took part, and their productions were rated by 23 adult listeners. Two phonetic exponents of prosodic boundaries were analysed: pause duration and phrase-final lengthening. The results suggest that, at the age of eight, there is considerable variability among children in their ability to mark phrase boundaries of the kind analysed in the experiment, with some children failing to differentiate between the members of the minimal pairs reliably. The differences between the children in their use of boundary features were reflected in the adults' perceptual judgements. Both temporal cues to prosodic boundaries significantly affected the perceptual ratings, with pause being a more salient determinant of ratings than phrase-final lengthening.
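The two temporal cues analysed above can be read directly off a time-aligned word annotation: the silent gap after the pre-boundary word gives pause duration, and that word's duration relative to a non-boundary baseline indexes phrase-final lengthening. A minimal sketch, assuming hypothetical interval annotations and a hypothetical baseline duration (none of these numbers come from the study):

```python
def boundary_cues(intervals, baseline_dur):
    """Extract pause duration and a final-lengthening ratio.

    intervals: list of (label, start_s, end_s) tuples in utterance
    order; only the first two words are inspected here. The
    baseline_dur is an assumed non-boundary duration for the same
    word, so a ratio > 1 suggests phrase-final lengthening.
    """
    (w1, s1, e1), (w2, s2, e2) = intervals[0], intervals[1]
    pause = max(0.0, s2 - e1)               # silence between the words
    lengthening = (e1 - s1) / baseline_dur  # duration ratio
    return pause, lengthening

# Hypothetical 'coffee, cake and tea' reading: a 230 ms pause after
# 'coffee', whose duration exceeds an assumed 350 ms baseline
cues = boundary_cues([("coffee", 0.00, 0.48), ("cake", 0.71, 1.02)],
                     baseline_dur=0.35)
```

In the compound-noun reading ('coffee-cake and tea') one would instead expect a pause near zero and a lengthening ratio near one.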
Talker identification is not improved by lexical access in the absence of familiar phonology
Listeners identify talkers more accurately when they are familiar with both the sounds and words of the language being spoken. It is unknown whether lexical information alone can facilitate talker identification in the absence of familiar phonology. To dissociate the roles of familiar words and phonology, we developed English-Mandarin “hybrid” sentences, spoken in Mandarin, which can be convincingly coerced to sound like English when presented with corresponding subtitles (e.g., “wei4 gou3 chi1 kao3 li2 zhi1” becomes “we go to college”). Across two experiments, listeners learned to identify talkers in three conditions: listeners' native language (English), an unfamiliar, foreign language (Mandarin), and a foreign language paired with subtitles that primed native language lexical access (subtitled Mandarin). In Experiment 1 listeners underwent a single session of talker identity training; in Experiment 2 listeners completed three days of training. Talkers in a foreign language were identified no better when native language lexical representations were primed (subtitled Mandarin) than from foreign-language speech alone, regardless of whether listeners had received one or three days of talker identity training. These results suggest that the facilitatory effect of lexical access on talker identification depends on the availability of familiar phonological forms.
Pitch ability as an aptitude for tone learning
Tone languages such as Mandarin use voice pitch to signal lexical contrasts, presenting a challenge for second/foreign language (L2) learners whose native languages do not use pitch in this manner. The present study examined components of an aptitude for mastering L2 lexical tone. Native English speakers with no previous tone language experience completed a Mandarin word learning task, as well as tests of pitch ability, musicality, L2 aptitude, and general cognitive ability. Pitch ability measures improved predictions of learning performance beyond musicality, L2 aptitude, and general cognitive ability, and also predicted transfer of learning to new talkers. In sum, although certain nontonal measures help predict successful tone learning, the central components of tonal aptitude are pitch-specific perceptual measures.
The Sound Manifesto
Computing practice today depends on visual output to drive almost all user interaction. Other senses, such as audition, may be totally neglected, or used tangentially, or used in highly restricted specialized ways. We have excellent audio rendering through D-A conversion, but we lack rich general facilities for modeling and manipulating sound comparable in quality and flexibility to graphics. We need co-ordinated research in several disciplines to improve the use of sound as an interactive information channel.

Incremental and separate improvements in synthesis, analysis, speech processing, audiology, acoustics, music, etc. will not alone produce the radical progress that we seek in sonic practice. We also need to create a new central topic of study in digital audio research. The new topic will assimilate the contributions of different disciplines on a common foundation. The key central concept that we lack is sound as a general-purpose information channel. We must investigate the structure of this information channel, which is driven by the co-operative development of auditory perception and physical sound production. Particular audible encodings, such as speech and music, illuminate sonic information by example, but they are no more sufficient for a characterization than typography is sufficient for a characterization of visual information.

Comment: To appear in the conference on Critical Technologies for the Future of Computing, part of SPIE's International Symposium on Optical Science and Technology, 30 July to 4 August 2000, San Diego, C
Perceptual Calibration of F0 Production: Evidence from Feedback Perturbation
Hearing one’s own speech is important for language learning and maintenance of accurate articulation. For example, people with postlinguistically acquired deafness often show a gradual deterioration of many aspects of speech production. In this manuscript, data are presented that address the role played by acoustic feedback in the control of voice fundamental frequency (F0). Eighteen subjects produced vowels under a control (normal F0 feedback) and two experimental conditions: F0 shifted up and F0 shifted down. In each experimental condition subjects produced vowels during a training period in which their F0 was slowly shifted without their awareness. Following this exposure to transformed F0, their acoustic feedback was returned to normal. Two effects were observed. Subjects compensated for the change in F0 and showed negative aftereffects: when F0 feedback was returned to normal, the subjects modified their produced F0 in the opposite direction to the shift. The results suggest that fundamental frequency is controlled using auditory feedback and with reference to an internal pitch representation. This is consistent with current work on internal models of speech motor control.
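Feedback-perturbation results of this kind are conventionally expressed in cents, a logarithmic pitch unit (100 cents = 1 semitone, 1200 cents = 1 octave), so that shifts are comparable across speakers with different baseline F0. A minimal sketch of the arithmetic, using hypothetical numbers (a 200 Hz baseline, a 100-cent upward feedback shift, a partial 40-cent opposing compensation) that are illustrations, not the study's data:

```python
import math

def cents(f_produced, f_baseline):
    """Pitch difference in cents relative to a baseline F0.

    100 cents = 1 semitone; positive values mean the produced
    F0 is higher than the baseline.
    """
    return 1200.0 * math.log2(f_produced / f_baseline)

# Hypothetical illustration of compensation: feedback is shifted
# up by 100 cents, and the speaker opposes it by lowering
# produced F0 by 40 cents (a partial compensation, as is typical).
baseline = 200.0
shifted_feedback = baseline * 2 ** (100 / 1200)  # what the speaker hears
compensated = baseline * 2 ** (-40 / 1200)       # what the speaker produces
compensation_cents = cents(compensated, baseline)  # about -40 cents
```

The negative aftereffect reported above would appear as a residual nonzero value of `compensation_cents` after feedback is returned to normal.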