
    Relative Pitch Perception and the Detection of Deviant Tone Patterns

    Most people are able to recognise familiar tunes even when they are played in a different key. It is assumed that this depends on a general capacity for relative pitch perception: the ability to recognise the pattern of inter-note intervals that characterises the tune. However, when healthy adults are required to detect rare deviant melodic patterns in a sequence of randomly transposed standard patterns, they perform close to chance. Musically experienced participants perform better than naïve participants, but even they find the task difficult, despite the fact that musical education includes training in interval recognition. To understand the source of this difficulty, we designed an experiment to explore the relative influence of the size of within-pattern intervals and between-pattern transpositions on detecting deviant melodic patterns. We found that task difficulty increases when patterns contain large intervals (5-7 semitones) rather than small intervals (1-3 semitones). While task difficulty increases substantially when transpositions are introduced, the effect of transposition size (large vs. small) is weaker. Widening the range of permissible intervals also makes the task more difficult. Furthermore, providing an initial exact repetition followed by subsequent transpositions does not improve performance. Although musical training correlates with task performance, we found no evidence that violations of musical intervals important in Western music (i.e. the perfect fifth or fourth) are more easily detected. In summary, relative pitch perception does not appear to be amenable to simple explanations based exclusively on invariant physical ratios.
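
    A minimal sketch, assuming melodies coded as MIDI note numbers (the melodies and function name are illustrative, not the study's stimuli), of why a transposed tune preserves its inter-note interval pattern while a deviant pattern does not:

        # Melodies as MIDI note numbers; relative pitch as the sequence of
        # inter-note intervals in semitones, which transposition preserves.
        def intervals(notes):
            return [b - a for a, b in zip(notes, notes[1:])]

        standard = [60, 64, 67, 65]              # C4 E4 G4 F4
        transposed = [n + 5 for n in standard]   # same tune, a fourth higher
        deviant = [60, 64, 66, 65]               # one note altered

        assert intervals(standard) == intervals(transposed)  # both [4, 3, -2]
        assert intervals(standard) != intervals(deviant)     # pattern deviates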

    Effect of stimulus type and pitch salience on pitch-sequence processing

    Using a same-different discrimination task, it has been shown that discrimination performance for sequences of complex tones varying just detectably in pitch is less dependent on sequence length (1, 2, or 4 elements) when the tones contain resolved harmonics than when they do not [Cousineau, Demany, and Pressnitzer (2009). J. Acoust. Soc. Am. 126, 3179-3187]. This effect had been attributed to the activation of automatic frequency-shift detectors (FSDs) by the shifts in resolved harmonics. The present study provides evidence against this hypothesis by showing that the sequence-processing advantage found for complex tones with resolved harmonics is not found for pure tones or other sounds thought to activate FSDs (narrow bands of noise and wide-band noises eliciting pitch sensations due to interaural phase shifts). The present results also indicate that for pitch sequences, processing performance is largely unrelated to pitch salience per se: for a fixed level of discriminability between sequence elements, sequences of elements with salient pitches are not necessarily better processed than sequences of elements with less salient pitches. An ideal-observer model for the same-different binary-sequence discrimination task is also developed in the present study. The model allows the computation of d' for this task using numerical methods.
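
    The paper's ideal-observer model is not specified in the abstract; the toy Monte Carlo sketch below (an assumption, not the paper's model) only illustrates how a sensitivity index for a same-different decision can be estimated numerically: simulate noisy observations, apply a criterion to their difference, and z-transform the resulting hit and false-alarm rates.

        import numpy as np
        from statistics import NormalDist

        z = NormalDist().inv_cdf              # probability -> z-score
        rng = np.random.default_rng(0)
        n, sigma, delta = 100_000, 1.0, 1.0   # trials, noise SD, pitch shift

        # "Same" trials share one mean; on "different" trials means differ by delta.
        d_same = rng.normal(0, sigma, n) - rng.normal(0, sigma, n)
        d_diff = rng.normal(delta, sigma, n) - rng.normal(0, sigma, n)

        crit = delta / 2                      # illustrative fixed criterion
        hits = float(np.mean(np.abs(d_diff) > crit))
        fas = float(np.mean(np.abs(d_same) > crit))
        print(z(hits) - z(fas))               # z(H) - z(F) sensitivity index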

    Dimension-specific attention directs learning and listening on auditory training tasks

    The relative contributions of bottom-up versus top-down sensory inputs to auditory learning are not well established. In our experiment, listeners were instructed to perform either a frequency discrimination (FD) task ("FD-train group") or an intensity discrimination (ID) task ("ID-train group") during training on a set of physically identical tones that were impossible to discriminate consistently above chance, allowing us to vary top-down attention whilst keeping bottom-up inputs fixed. A third, control group did not receive any training. Only the FD-train group improved on an FD probe following training, whereas all groups improved on the ID probe. However, only the ID-train group also showed changes in performance accuracy as a function of interval with training on the ID task. These findings suggest that top-down, dimension-specific attention can direct auditory learning, even when this learning is not reflected in conventional performance measures of threshold change.

    The contribution of visual information to the perception of speech in noise with and without informative temporal fine structure

    Understanding what is said in demanding listening situations is assisted greatly by looking at the face of a talker. Previous studies have observed that normal-hearing listeners can benefit from this visual information when a talker’s voice is presented in background noise. These benefits have also been observed in quiet listening conditions in cochlear-implant users, whose device does not convey the informative temporal fine structure cues in speech, and when normal-hearing individuals listen to speech processed to remove these informative temporal fine structure cues. The current study (1) characterised the benefits of visual information when listening in background noise; and (2) used sine-wave vocoding to compare the size of the visual benefit when speech is presented with or without informative temporal fine structure. The accuracy with which normal-hearing individuals reported words in spoken sentences was assessed across three experiments. The availability of visual information and informative temporal fine structure cues was varied within and across the experiments. The results showed a visual benefit with both open- and closed-set tests of speech perception. The size of the benefit increased when informative temporal fine structure cues were removed. This finding suggests that visual information may play an important role in the ability of cochlear-implant users to understand speech in many everyday situations. Models of audio-visual integration were able to account for the additional benefit of visual information when speech was degraded, and suggested that auditory and visual information was being integrated in a similar way in all conditions. The modelling results were consistent with the notion that audio-visual benefit is derived from the optimal combination of auditory and visual sensory cues.
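
    "Optimal combination" of sensory cues is usually formalised as maximum-likelihood cue integration, in which each cue is weighted by its reliability. The sketch below shows that standard textbook formalisation (an assumption here; the paper's exact models may differ).

        # Maximum-likelihood cue combination: each cue gives a noisy Gaussian
        # estimate; the optimal combined estimate weights each cue by its
        # inverse variance, and combined variance beats either cue alone.
        def combine(mu_a, var_a, mu_v, var_v):
            w_a = (1 / var_a) / (1 / var_a + 1 / var_v)  # auditory weight
            mu = w_a * mu_a + (1 - w_a) * mu_v           # weighted mean
            var = 1 / (1 / var_a + 1 / var_v)            # reduced variance
            return mu, var

        # In noise the auditory cue is unreliable (high variance), so the
        # visual cue dominates the combined percept.
        print(combine(mu_a=0.4, var_a=4.0, mu_v=0.6, var_v=1.0))  # ≈ (0.56, 0.8)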

    Individual Differences in Sound-in-Noise Perception Are Related to the Strength of Short-Latency Neural Responses to Noise

    Important sounds can be easily missed or misidentified in the presence of extraneous noise. We describe an auditory illusion in which a continuous ongoing tone becomes inaudible during a brief, non-masking noise burst more than one octave away, which is unexpected given the frequency resolution of human hearing. Participants strongly susceptible to this illusory discontinuity did not perceive illusory auditory continuity (in which a sound subjectively continues during a burst of masking noise) when the noises were short, yet did so at longer noise durations. Participants who were not prone to illusory discontinuity showed robust early electroencephalographic responses at 40–66 ms after noise burst onset, whereas those prone to the illusion lacked these early responses. These data suggest that short-latency neural responses to auditory scene components reflect subsequent individual differences in the parsing of auditory scenes.

    Enhanced Syllable Discrimination Thresholds in Musicians

    Speech processing inherently relies on the perception of specific, rapidly changing spectral and temporal acoustic features. Advanced acoustic perception is also integral to musical expertise, and accordingly several studies have demonstrated a significant relationship between musical training and superior processing of various aspects of speech. Speech and music appear to overlap in spectral and temporal features; however, it remains unclear which of these acoustic features, crucial for speech processing, are most closely associated with musical training. The present study examined the perceptual acuity of musicians to the acoustic components of speech necessary for intra-phonemic discrimination of synthetic syllables. We compared musicians and non-musicians on discrimination thresholds for three synthetic speech-syllable continua that varied in their spectral and temporal discrimination demands, specifically voice onset time (VOT) and amplitude-envelope cues in the temporal domain. Musicians demonstrated superior discrimination only for syllables that required resolution of temporal cues. Furthermore, performance on the temporal syllable continua positively correlated with the length and intensity of musical training. These findings support one potential mechanism by which musical training may selectively enhance speech perception, namely the reinforcement of temporal acuity and/or perception of amplitude rise time, and they have implications for the translation of musical training to long-term linguistic abilities.
    Funding: Grammy Foundation; William F. Milton Fund
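
    A minimal sketch of the amplitude rise-time cue mentioned above: two otherwise identical tones whose envelopes rise at different rates (all parameter values are illustrative; this is not the study's syllable synthesis).

        import numpy as np

        def tone_with_rise(f0=500.0, dur=0.3, rise=0.015, fs=44_100):
            """Sine tone with a linear onset ramp of `rise` seconds."""
            t = np.arange(int(dur * fs)) / fs
            env = np.minimum(t / rise, 1.0)   # linear rise, then sustain
            return env * np.sin(2 * np.pi * f0 * t)

        fast = tone_with_rise(rise=0.005)     # abrupt onset
        slow = tone_with_rise(rise=0.060)     # gradual onset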

    A Corticothalamic Circuit Model for Sound Identification in Complex Scenes

    The identification of the sound sources present in the environment is essential for the survival of many animals. However, these sounds are not presented in isolation, as natural scenes consist of a superposition of sounds originating from multiple sources. The identification of a source under these circumstances is a complex computational problem that is readily solved by most animals. We present a model of the thalamocortical circuit that performs level-invariant recognition of auditory objects in complex auditory scenes. The circuit identifies the objects present from a large dictionary of possible elements and operates reliably for real sound signals with multiple concurrently active sources. The key model assumption is that the activities of some cortical neurons encode the difference between the observed signal and an internal estimate. Reanalysis of awake auditory cortex recordings revealed neurons with patterns of activity corresponding to such an error signal.
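
    A toy sketch of that key assumption: some units carry the residual between the observed signal and an internal estimate built from a dictionary of candidate sources. The dimensions, learning rate, and gradient-descent inference below are assumptions for illustration, not the paper's circuit.

        import numpy as np

        rng = np.random.default_rng(1)
        D = rng.normal(size=(64, 10))            # dictionary of 10 candidate sources
        true = np.array([0, 1, 0, 0, 2, 0, 0, 0, 0, 0], float)
        x = D @ true                             # scene: sources 1 and 4 superposed

        a = np.zeros(10)                         # inferred source activations
        for _ in range(1000):
            error = x - D @ a                    # "error" units: signal minus estimate
            a += 0.005 * (D.T @ error)           # feedback step shrinks the residual
            a = np.maximum(a, 0.0)               # firing rates stay non-negative

        print(np.round(a, 2))                    # ≈ true: the active sources recovered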

    Pitch Comparisons between Electrical Stimulation of a Cochlear Implant and Acoustic Stimuli Presented to a Normal-hearing Contralateral Ear

    Four cochlear-implant users with normal hearing in the unimplanted ear compared the pitches of electrical and acoustic stimuli presented to the two ears. Comparisons were either between 1,031-pps pulse trains and pure tones, or between 12- and 25-pps electric pulse trains and bandpass-filtered acoustic pulse trains of the same rate. Three methods were used: pitch adjustment, constant stimuli, and interleaved adaptive procedures. For all methods, we showed that the results can be strongly influenced by non-sensory biases arising from the range of acoustic stimuli presented, and we proposed a series of checks that should be made to alert the experimenter to such biases. We then showed that the results of comparisons that survived these checks do not deviate consistently from the predictions of a widely used cochlear frequency-to-place formula or of a computational cochlear model. We also demonstrate that substantial range effects occur with other widely used experimental methods, even for normal-hearing listeners.
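
    The abstract does not name the frequency-to-place formula; Greenwood's (1990) human map is the widely used candidate, so the sketch below uses its constants as an assumption.

        import math

        A, a, k = 165.4, 2.1, 0.88   # Greenwood constants for the human cochlea

        def place_to_freq(x):
            """Characteristic frequency (Hz) at relative distance x (0=apex, 1=base)."""
            return A * (10 ** (a * x) - k)

        def freq_to_place(f):
            """Inverse map: relative cochlear position for frequency f in Hz."""
            return math.log10(f / A + k) / a

        print(round(place_to_freq(0.5), 1))     # mid-cochlea, roughly 1.7 kHz
        print(round(freq_to_place(1000.0), 3))  # 1 kHz sits ~40% along the cochlea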

    Finding Your Mate at a Cocktail Party: Frequency Separation Promotes Auditory Stream Segregation of Concurrent Voices in Multi-Species Frog Choruses

    Vocal communication in crowded social environments is a difficult problem for both humans and nonhuman animals. Yet many important social behaviors require listeners to detect, recognize, and discriminate among signals in a complex acoustic milieu comprising the overlapping signals of multiple individuals, often of multiple species. Humans exploit a relatively small number of acoustic cues to segregate overlapping voices (as well as other mixtures of concurrent sounds, like polyphonic music). By comparison, we know little about how nonhuman animals are adapted to solve similar communication problems. One important cue enabling source segregation in human speech communication is frequency separation between concurrent voices: differences in frequency promote perceptual segregation of overlapping voices into separate “auditory streams” that can be followed through time. In this study, we show that frequency separation (ΔF) also enables frogs to segregate concurrent vocalizations, such as those routinely encountered in mixed-species breeding choruses. We presented female gray treefrogs (Hyla chrysoscelis) with a pulsed target signal (simulating an attractive conspecific call) in the presence of a continuous stream of distractor pulses (simulating an overlapping, unattractive heterospecific call). When the ΔF between target and distractor was small (e.g., ≤3 semitones), females exhibited low levels of responsiveness, indicating a failure to recognize the target as an attractive signal when the distractor had a similar frequency. Subjects became increasingly responsive to the target, as indicated by shorter latencies for phonotaxis, as the ΔF between target and distractor increased (e.g., ΔF = 6–12 semitones). These results support the conclusion that gray treefrogs, like humans, can exploit frequency separation as a perceptual cue to segregate concurrent voices in noisy social environments. The ability of these frogs to segregate concurrent voices based on frequency separation may involve ancient hearing mechanisms for source segregation shared with humans and other vertebrates.
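
    For reference, the ΔF measure in semitones is 12·log2 of the frequency ratio; the example frequencies below are illustrative, not the study's stimuli.

        import math

        def semitones(f1, f2):
            """Frequency separation between f1 and f2 in semitones."""
            return abs(12 * math.log2(f1 / f2))

        print(round(semitones(2400, 2200), 2))  # ~1.5 st: small ΔF, voices fuse
        print(round(semitones(2400, 1200), 2))  # 12 st (an octave): voices segregate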