350 research outputs found

    MRC Cognition and Brain Sciences Unit

    Acoustic sequences such as speech and music are generally perceived as coherent auditory “streams,” which can be individually attended to and followed over time. Although the psychophysical stimulus parameters governing this “auditory streaming” are well established, the brain mechanisms underlying the formation of auditory streams remain largely unknown. In particular, an essential feature of the phenomenon, which corresponds to the fact that the segregation of sounds into streams typically takes several seconds to build up, remains unexplained. Here, we show that this and other major features of auditory stream formation measured in humans using alternating-tone sequences can be quantitatively accounted for based on single-unit responses recorded in the primary auditory cortex (A1) of awake rhesus monkeys listening to the same sound sequences

    Neural Correlates of Auditory Perceptual Awareness under Informational Masking

    Our ability to detect target sounds in complex acoustic backgrounds is often limited not by the ear's resolution, but by the brain's information-processing capacity. The neural mechanisms and loci of this “informational masking” are unknown. We combined magnetoencephalography with simultaneous behavioral measures in humans to investigate neural correlates of informational masking and auditory perceptual awareness in the auditory cortex. Cortical responses were sorted according to whether or not target sounds were detected by the listener in a complex, randomly varying multi-tone background known to produce informational masking. Detected target sounds elicited a prominent, long-latency response (50–250 ms), whereas undetected targets did not. In contrast, both detected and undetected targets produced equally robust auditory middle-latency, steady-state responses, presumably from the primary auditory cortex. These findings indicate that neural correlates of auditory awareness in informational masking emerge between early and late stages of processing within the auditory cortex

    Auditory Stream Segregation and the Perception of Across-Frequency Synchrony

    This study explored the extent to which sequential auditory grouping affects the perception of temporal synchrony. In Experiment 1, listeners discriminated between 2 pairs of asynchronous "target" tones at different frequencies, A and B, in which the B tone either led or lagged. Thresholds were markedly higher when the target tones were temporally surrounded by "captor tones" at the A frequency than when the captor tones were absent or at a remote frequency. Experiment 2 extended these findings to asynchrony detection, revealing that the perception of synchrony, one of the most potent cues for simultaneous auditory grouping, is not immune to competing effects of sequential grouping. Experiment 3 examined the influence of ear separation on the interactions between sequential and simultaneous grouping cues. The results showed that, although ear separation could facilitate perceptual segregation and impair asynchrony detection, it did not prevent the perceptual integration of simultaneous sounds. Keywords: perceptual organization, auditory perception, stream segregation, asynchrony. Sensitivity to seemingly low-level sensory features in a scene can be profoundly influenced by the way in which the scene is perceptually organized; striking examples of this have been reported in visual perception. In the auditory modality, one of the most compelling examples of the influence of perceptual organization on sensitivity comes from demonstrations that listeners are largely unable to correctly perceive the relative timing of sounds that form part of separate perceptual "streams"

    Relative Pitch Perception and the Detection of Deviant Tone Patterns.

    Most people are able to recognise familiar tunes even when played in a different key. It is assumed that this depends on a general capacity for relative pitch perception; the ability to recognise the pattern of inter-note intervals that characterises the tune. However, when healthy adults are required to detect rare deviant melodic patterns in a sequence of randomly transposed standard patterns they perform close to chance. Musically experienced participants perform better than naïve participants, but even they find the task difficult, despite the fact that musical education includes training in interval recognition. To understand the source of this difficulty we designed an experiment to explore the relative influence of the size of within-pattern intervals and between-pattern transpositions on detecting deviant melodic patterns. We found that task difficulty increases when patterns contain large intervals (5-7 semitones) rather than small intervals (1-3 semitones). While task difficulty increases substantially when transpositions are introduced, the effect of transposition size (large vs small) is weaker. Increasing the range of permissible intervals to be used also makes the task more difficult. Furthermore, providing an initial exact repetition followed by subsequent transpositions does not improve performance. Although musical training correlates with task performance, we find no evidence that violations to musical intervals important in Western music (i.e. the perfect fifth or fourth) are more easily detected. In summary, relative pitch perception does not appear to be amenable to simple explanations based exclusively on invariant physical ratios

    Signal detection in animal psychoacoustics: analysis and simulation of sensory and decision-related influences

    Signal detection theory (SDT) provides a framework for interpreting psychophysical experiments, separating the putative internal sensory representation and the decision process. SDT was used to analyse ferret behavioural responses in a (yes–no) tone-in-noise detection task. Instead of measuring the receiver-operating characteristic (ROC), we tested SDT by comparing responses collected using two common psychophysical data collection methods. These methods (Constant Stimuli and Limits) differ in the set of signal levels presented within and across behavioural sessions. The results support the use of SDT as a method of analysis: the SDT sensory component was unchanged between the two methods, even though decisions depended on the stimuli presented within a behavioural session. Decision criterion varied trial-by-trial: a ‘yes’ response was more likely after a correct rejection trial than a hit trial. Simulation using an SDT model with several decision components reproduced the experimental observations accurately, leaving only ∼10% of the variance unaccounted for. The model also showed that trial-by-trial dependencies were unlikely to influence measured psychometric functions or thresholds. An additional model component suggested that inattention did not contribute substantially. Further analysis showed that ferrets were changing their decision criteria, almost optimally, to maximise the reward obtained in a session. The data suggest trial-by-trial reward-driven optimization of the decision process. Understanding the factors determining behavioural responses is important for correlating neural activity and behaviour. SDT provides a good account of animal psychoacoustics, and can be validated using standard psychophysical methods and computer simulations, without recourse to ROC measurements
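    The core SDT quantities this abstract relies on — sensitivity (d′) and decision criterion (c) for a yes–no task — can be sketched in a few lines. This is an illustrative computation from trial counts (function name and the log-linear count correction are my choices, not the paper's code):

```python
from statistics import NormalDist

def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Compute d-prime and criterion c from yes/no trial counts.

    A log-linear correction (add 0.5 to counts) keeps rates of 0 or 1
    finite before the inverse-normal transform.
    """
    z = NormalDist().inv_cdf
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    d_prime = z(hit_rate) - z(fa_rate)          # sensory separation
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # bias: >0 means conservative
    return d_prime, criterion

d, c = sdt_measures(hits=80, misses=20, false_alarms=10, correct_rejections=90)
```

    A shift in `criterion` with reward contingencies, at constant `d_prime`, is the kind of dissociation between sensory and decision components the study describes.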

    Dimension-specific attention directs learning and listening on auditory training tasks

    The relative contributions of bottom-up versus top-down sensory inputs to auditory learning are not well established. In our experiment, listeners were instructed to perform either a frequency discrimination (FD) task ("FD-train group") or an intensity discrimination (ID) task ("ID-train group") during training on a set of physically identical tones that were impossible to discriminate consistently above chance, allowing us to vary top-down attention whilst keeping bottom-up inputs fixed. A third, control group did not receive any training. Only the FD-train group improved on a FD probe following training, whereas all groups improved on ID following training. However, only the ID-train group also showed changes in performance accuracy as a function of interval with training on the ID task. These findings suggest that top-down, dimension-specific attention can direct auditory learning, even when this learning is not reflected in conventional performance measures of threshold change

    Informational masking and the effects of differences in fundamental frequency and fundamental-frequency contour on phonetic integration in a formant ensemble

    This study explored the effects on speech intelligibility of across-formant differences in fundamental frequency (ΔF0) and F0 contour. Sentence-length speech analogues were presented dichotically (left=F1+F3; right=F2), either alone or—because competition usually reveals grouping cues most clearly—accompanied in the left ear by a competitor for F2 (F2C) that listeners must reject to optimize recognition. F2C was created by inverting the F2 frequency contour. In experiment 1, all left-ear formants shared the same constant F0 and the ΔF0 for F2 was 0 or ±4 semitones. In experiment 2, all left-ear formants shared the natural F0 contour and that for F2 was natural, constant, exaggerated, or inverted. Adding F2C lowered keyword scores, presumably because of informational masking. The results for experiment 1 were complicated by effects associated with the direction of the ΔF0 for F2; this problem was avoided in experiment 2 because all four F0 contours had the same geometric mean frequency. When the target formants were presented alone, scores were relatively high and did not depend on the F0 contour of F2. F2C impact was greater when F2 had a different F0 contour from the other formants. This effect was a direct consequence of the associated ΔF0; the F0 contour of F2 per se did not influence competitor impact

    The contribution of visual information to the perception of speech in noise with and without informative temporal fine structure

    Understanding what is said in demanding listening situations is assisted greatly by looking at the face of a talker. Previous studies have observed that normal-hearing listeners can benefit from this visual information when a talker’s voice is presented in background noise. These benefits have also been observed in quiet listening conditions in cochlear-implant users, whose device does not convey the informative temporal fine structure cues in speech, and when normal-hearing individuals listen to speech processed to remove these informative temporal fine structure cues. The current study (1) characterised the benefits of visual information when listening in background noise; and (2) used sine-wave vocoding to compare the size of the visual benefit when speech is presented with or without informative temporal fine structure. The accuracy with which normal-hearing individuals reported words in spoken sentences was assessed across three experiments. The availability of visual information and informative temporal fine structure cues was varied within and across the experiments. The results showed that visual benefit was observed using open- and closed-set tests of speech perception. The size of the benefit increased when informative temporal fine structure cues were removed. This finding suggests that visual information may play an important role in the ability of cochlear-implant users to understand speech in many everyday situations. Models of audio-visual integration were able to account for the additional benefit of visual information when speech was degraded and suggested that auditory and visual information was being integrated in a similar way in all conditions. The modelling results were consistent with the notion that audio-visual benefit is derived from the optimal combination of auditory and visual sensory cues
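    The closing claim — that audio-visual benefit reflects optimal combination of auditory and visual cues — is commonly formalized as maximum-likelihood (inverse-variance weighted) cue integration. A minimal sketch of that rule, assuming independent Gaussian cue noise (an illustrative model, not the specific model fitted in the study):

```python
def mle_combine(mu_a, var_a, mu_v, var_v):
    """Maximum-likelihood combination of two independent Gaussian cues.

    Each cue is weighted by its reliability (inverse variance); the
    combined estimate is never less reliable than the better single cue.
    """
    reliability_a = 1.0 / var_a
    reliability_v = 1.0 / var_v
    w_a = reliability_a / (reliability_a + reliability_v)
    mu_combined = w_a * mu_a + (1.0 - w_a) * mu_v
    var_combined = 1.0 / (reliability_a + reliability_v)
    return mu_combined, var_combined

mu, var = mle_combine(mu_a=0.0, var_a=1.0, mu_v=1.0, var_v=1.0)
```

    Under this rule, degrading the auditory cue (raising `var_a`) shifts weight toward vision, which is consistent with the finding that visual benefit grows when informative temporal fine structure is removed.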

    Effect of stimulus type and pitch salience on pitch-sequence processing

    Using a same-different discrimination task, it has been shown that discrimination performance for sequences of complex tones varying just detectably in pitch is less dependent on sequence length (1, 2, or 4 elements) when the tones contain resolved harmonics than when they do not [Cousineau, Demany, and Pressnitzer (2009). J. Acoust. Soc. Am. 126, 3179-3187]. This effect had been attributed to the activation of automatic frequency-shift detectors (FSDs) by the shifts in resolved harmonics. The present study provides evidence against this hypothesis by showing that the sequence-processing advantage found for complex tones with resolved harmonics is not found for pure tones or other sounds supposed to activate FSDs (narrow bands of noise and wide-band noises eliciting pitch sensations due to interaural phase shifts). The present results also indicate that for pitch sequences, processing performance is largely unrelated to pitch salience per se: for a fixed level of discriminability between sequence elements, sequences of elements with salient pitches are not necessarily better processed than sequences of elements with less salient pitches. An ideal-observer model for the same-different binary-sequence discrimination task is also developed in the present study. The model allows the computation of d' for this task using numerical methods
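    The logic of a same-different observer can be illustrated by Monte Carlo simulation. The sketch below is a single-element differencing observer with unit-variance Gaussian internal noise — a deliberately simplified stand-in for the paper's full binary-sequence ideal-observer model (names and the differencing rule are my assumptions):

```python
import random

def same_different_pc(d_prime, criterion, n_trials=100_000, seed=1):
    """Monte Carlo estimate of percent correct for a differencing observer.

    On each trial two noisy internal observations are drawn; the observer
    responds 'different' when their absolute difference exceeds the
    criterion. Half of the trials are 'same', half are 'different'.
    """
    rng = random.Random(seed)
    correct = 0
    for t in range(n_trials):
        is_different = (t % 2 == 0)
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(d_prime if is_different else 0.0, 1.0)
        says_different = abs(x1 - x2) > criterion
        correct += (says_different == is_different)
    return correct / n_trials

pc = same_different_pc(d_prime=2.0, criterion=1.5)
```

    With `d_prime=0` the two trial types are statistically identical and performance falls to chance, which is the baseline against which sequence-length effects are measured.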

    Stream segregation in the anesthetized auditory cortex

    Auditory stream segregation describes the way that sounds are perceptually segregated into groups or streams on the basis of perceptual attributes such as pitch or spectral content. For sequences of pure tones, segregation depends on the tones' proximity in frequency and time. In the auditory cortex (and elsewhere) responses to sequences of tones are dependent on stimulus conditions in a similar way to the perception of these stimuli. However, although highly dependent on stimulus conditions, perception is also clearly influenced by factors unrelated to the stimulus, such as attention. Exactly how ‘bottom-up’ sensory processes and non-sensory ‘top-down’ influences interact is still not clear. Here, we recorded responses to alternating tones (ABAB...) of varying frequency difference (FD) and rate of presentation (PR) in the auditory cortex of anesthetized guinea pigs. These data complement previous studies, in that top-down processing resulting from conscious perception should be absent or at least considerably attenuated. Under anesthesia, the responses of cortical neurons to the tone sequences adapted rapidly, in a manner sensitive to both the FD and PR of the sequences. While the responses to tones at frequencies more distant from neuron best frequencies (BFs) decreased as the FD increased, the responses to tones near to BF increased, consistent with a release from adaptation, or forward suppression. Increases in PR resulted in reductions in responses to all tones, but the reduction was greater for tones further from BF. Although asymptotically adapted responses to tones showed behavior that was qualitatively consistent with perceptual stream segregation, responses reached asymptote within 2 s, and responses to all tones were very weak at high PRs (>12 tones per second). A signal-detection model, driven by the cortical population response, made decisions that were dependent on both FD and PR in ways consistent with perceptual stream segregation. This included showing a range of conditions over which decisions could be made either in favor of perceptual integration or segregation, depending on the model ‘decision criterion’. However, the rate of ‘build-up’ was more rapid than seen perceptually, and at high PR responses to tones were sometimes so weak as to be undetectable by the model. Under anesthesia, adaptation occurs rapidly, and at high PRs tones are generally poorly represented, which compromises the interpretation of the experiment. However, within these limitations, these results complement experiments in awake animals and humans. They generally support the hypothesis that ‘bottom-up’ sensory processing plays a major role in perceptual organization, and that processes underlying stream segregation are active in the absence of attention
