13 research outputs found
Modulation-frequency acts as a primary cue for auditory stream segregation
In our surrounding acoustic world, sounds are produced by different sources and interfere with each other before arriving at the ears. A key function of the auditory system is to provide consistent and robust descriptions of the coherent sound groupings and sequences (auditory objects), which likely correspond to the various sound sources in the environment. This function has been termed auditory stream segregation. In the current study, we tested the effects of a difference in amplitude-modulation frequency on the segregation of concurrent sound sequences in the auditory stream-segregation paradigm (van Noorden, 1975). The aim of the study was to assess 1) whether differential amplitude modulation would help in separating concurrent sound sequences and 2) whether this cue would interact with previously studied static cues (carrier-frequency and location differences) in segregating concurrent streams of sound. We found that an amplitude-modulation frequency difference is utilized as a primary cue for stream segregation and that it interacts with other primary cues, such as frequency and location differences.
Neuronal Correlates of Informational and Energetic Masking in the Human Brain in a Multi-Talker Situation
Human listeners can follow the voice of one speaker while several others are talking at the same time. This process requires segregating the speech streams from each other and continuously directing attention to the target stream. We investigated the functional brain networks underlying this ability. Two speech streams were presented simultaneously to participants, who followed one of them and detected targets within it (target stream). The loudness of the distractor speech stream varied over five levels: moderately softer, slightly softer, equal, slightly louder, or moderately louder than the attended stream. Performance measures showed that the most demanding task was the moderately softer distractor condition, which indicates that a softer distractor speech stream may receive more covert attention than louder distractors and, therefore, requires more cognitive resources. EEG-based measurement of functional connectivity between various brain regions revealed frequency-band-specific networks: (1) energetic masking (comparing the louder distractor conditions with the equal-loudness condition) was predominantly associated with stronger connectivity between the frontal and temporal regions in the lower alpha (8–10 Hz) and gamma (30–70 Hz) bands; (2) informational masking (comparing the softer distractor conditions with the equal-loudness condition) was associated with a distributed network between parietal, frontal, and temporal regions in the theta (4–8 Hz) and beta (13–30 Hz) bands. These results suggest the presence of distinct cognitive and neural processes for resolving interference from energetic vs. informational masking.
Effects of multiple congruent cues on concurrent sound segregation during passive and active listening: An event-related potential (ERP) study
In two experiments, we assessed the effects of combining different cues of concurrent sound
segregation on the object-related negativity (ORN) and the P400 event-related potential components.
Participants were presented with sequences of complex tones, half of which contained some
manipulation: One or two harmonic partials were mistuned, delayed, or presented from a different
location than the rest. In separate conditions, one, two, or three of these manipulations were combined.
Participants watched a silent movie (passive listening) or reported after each tone whether they
perceived one or two concurrent sounds (active listening). ORN was found in almost all conditions
except for location difference alone during passive listening. Combining several cues or manipulating
more than one partial consistently led to sub-additive effects on the ORN amplitude. These results
support the view that ORN reflects an integrated, feature-unspecific assessment of the auditory system
regarding the contribution of two sources to the incoming sound.
Different roles of similarity and predictability in auditory stream segregation
Sound sources often emit trains of discrete sounds, such as a series of footsteps. Previously, two different principles have been suggested for how the human auditory system binds discrete sounds together into perceptual units. The feature similarity principle is based on linking sounds with similar characteristics over time. The predictability principle is based on linking sounds that follow each other in a predictable manner. The present study compared the effects of these two principles. Participants were presented with tone sequences and instructed to continuously indicate whether they perceived a single coherent sequence or two concurrent streams of sound. We investigated the influence of separate manipulations of similarity and predictability on these perceptual reports. Both grouping principles affected perception of the tone sequences, albeit with different characteristics. In particular, the results suggest that whereas predictability is only analyzed for the currently perceived sound organization, feature similarity is also analyzed for alternative groupings of sound. Moreover, changing similarity or predictability within an ongoing sound sequence led to markedly different dynamic effects. Taken together, these results provide evidence for different roles of similarity and predictability in auditory scene analysis, suggesting that forming auditory stream representations and competition between alternatives rely on partly different processes.
Foreground-background discrimination indicated by event-related brain potentials in a new auditory multistability paradigm
For studying multistable auditory perception, we propose a paradigm that evokes integrated or segregated perception of a sound sequence and permits decomposition of the segregated grouping into foreground and background sounds. The paradigm combines 3-tone pitch patterns with alternating timbres, resulting in a repeating 6-tone structure that can be perceived as rising based on temporal proximity, or as falling based on timbre similarity. Listeners continuously report their percept while EEG is recorded. Results show an ERP modulation starting at ~70 ms after sound onset that can be explained by whether a sound belongs to the perceived foreground or background, with no additional effect of integrated vs. segregated grouping. Auditory grouping as indexed by the mismatch negativity did not correspond with the reported sound grouping. The paradigm offers a new possibility for investigating effects of conscious perceptual organization on sound processing.
The effects of rhythm and melody on auditory stream segregation
Whilst many studies have assessed the efficacy of similarity-based cues for auditory stream segregation, much less is known about whether and how the larger-scale structure of sound sequences supports stream formation and the choice of sound organization. Two experiments investigated the effects of musical melody and rhythm on the segregation of two interleaved tone sequences. The two sets of tones fully overlapped in pitch range, but differed from each other in interaural time and intensity. Unbeknownst to the listener, each of the interleaved sequences was separately created from the notes of a different song. In different experimental conditions, the notes and/or their timing could either follow those of the songs, or they could be scrambled or, in the case of timing, set to be isochronous. Listeners were asked to continuously report whether they heard a single coherent sequence (integrated) or two concurrent streams (segregated). Although temporal overlap between tones from the two streams proved to be the strongest cue for stream segregation, significant effects of tonality and familiarity with the songs were also observed. These results suggest that regular temporal patterns are utilized as cues in auditory stream segregation and that long-term memory is involved in this process.
Attention and speech-processing related functional brain networks activated in a multi-speaker environment
Human listeners can focus on one speech stream out of several concurrent ones. The present study aimed to assess the whole-brain functional networks underlying a) the process of focusing attention on a single speech stream vs. dividing attention between two streams and b) speech processing at different time scales and depths. Two spoken narratives were presented simultaneously while listeners were instructed to a) track and memorize the contents of a speech stream and b) detect the presence of numerals or syntactic violations in the same ("focused attended condition") or in the parallel stream ("divided attended condition"). Speech content tracking was found to be associated with stronger connectivity in lower frequency bands (delta band, 0.5–4 Hz), whereas the detection tasks were linked with networks operating in the faster alpha (8–10 Hz) and beta (13–30 Hz) bands. These results suggest that the oscillation frequencies of the dominant brain networks during speech processing may be related to the duration of the time window within which information is integrated. We also found that focusing attention on a single speaker, compared to dividing attention between two concurrent speakers, was predominantly associated with connections involving the frontal cortices in the delta (0.5–4 Hz), alpha (8–10 Hz), and beta (13–30 Hz) bands, whereas dividing attention between two parallel speech streams was linked with stronger connectivity involving the parietal cortices in the delta and beta frequency bands. Overall, connections strengthened by focused attention may reflect control over information selection, whereas connections strengthened by divided attention may reflect the need for maintaining two streams in parallel and the related control processes necessary for performing the tasks.