24 research outputs found

    A temporal integration mechanism enhances frequency selectivity of broadband inputs to inferior colliculus.

    Accurately resolving frequency components in sounds is essential for sound recognition, yet there is little direct evidence for how frequency selectivity is preserved or newly created across auditory structures. We demonstrate that prepotentials (PPs) with physiological properties resembling presynaptic potentials from broadly tuned brainstem inputs can be recorded concurrently with postsynaptic action potentials in the inferior colliculus (IC). These putative brainstem inputs (PBIs) are broadly tuned and exhibit delayed and spectrally interleaved excitation and inhibition not present in the simultaneously recorded IC neurons (ICNs). Sharpening of tuning is accomplished locally, at the expense of spike-timing precision, through nonlinear temporal integration of broadband inputs. A neuron model replicates this finding and demonstrates that temporal integration alone can degrade timing precision while enhancing frequency tuning through interference of spectrally in- and out-of-phase inputs. These findings suggest that, in contrast to current models that require local inhibition, frequency selectivity can be sharpened through temporal integration, supporting an alternative computational strategy for quickly refining frequency selectivity.
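    The sketch below (not the authors' model; the best frequency, bandwidth, and delay are illustrative assumptions) shows the general principle in numpy: summing two delayed, phase-locked copies of a broadly tuned input produces interference that is constructive at the best frequency and destructive at neighboring frequencies, narrowing the tuning curve at the cost of temporal smearing on the order of the delay.

```python
# Illustrative numpy sketch of the general principle only (not the authors'
# model): summing two delayed, phase-locked copies of a broadly tuned input
# creates interference that sharpens frequency tuning around the best
# frequency. Best frequency, bandwidth, and delay are assumed values.
import numpy as np

f = np.linspace(500.0, 8000.0, 400)                # probe tone frequency (Hz)
bf = 4000.0                                         # assumed best frequency (Hz)
broad = np.exp(-0.5 * ((f - bf) / 1500.0) ** 2)     # broad brainstem-like tuning

# Two inputs separated by one period of the best frequency; integrating
# (summing) them makes their phase-locked drives interfere.
delay = 1.0 / bf                                    # 0.25 ms
interference = np.abs(0.5 * (1.0 + np.exp(-2j * np.pi * f * delay)))
sharpened = broad * interference                    # tuning after integration

def half_max_bandwidth(freq, curve):
    """Width of the region where the curve exceeds half its maximum."""
    above = freq[curve >= 0.5 * curve.max()]
    return above.max() - above.min()

print(f"input tuning bandwidth (Hz):      {half_max_bandwidth(f, broad):.0f}")
print(f"integrated tuning bandwidth (Hz): {half_max_bandwidth(f, sharpened):.0f}")
# The trade-off: integrating across the delay span smears response timing by
# roughly the delay, i.e., sharper tuning at the cost of timing precision.
```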

    Origins of scale invariance in vocalization sequences and speech.

    To communicate effectively, animals need to detect temporal vocalization cues that vary over several orders of magnitude in their amplitude and frequency content. This large range of temporal cues is evident in the power-law, scale-invariant relationship between the power of temporal fluctuations in sounds and the sound modulation frequency (f). Though various forms of scale invariance have been described for natural sounds, the origins and implications of this scale-invariant phenomenon remain unknown. Using animal vocalization sequences, including continuous human speech, and a stochastic model of temporal amplitude fluctuations, we demonstrate that temporal acoustic edges are the primary acoustic cue accounting for the scale-invariant phenomenon. The modulation spectra of the vocalization sequences and the model both exhibit a dual-regime lowpass structure, with a flat region at low modulation frequencies and a scale-invariant 1/f² trend at high modulation frequencies. Moreover, we find a time-frequency tradeoff between the average vocalization duration of each vocalization sequence and the cutoff frequency beyond which scale-invariant behavior is observed. These results indicate that temporal edges are universal features responsible for scale invariance in vocalized sounds. This is significant because temporal acoustic edges are perceptually salient, and the auditory system could exploit such statistical regularities to minimize redundancies and generate compact neural representations of vocalized sounds.
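    As a hedged, self-contained sketch (not the authors' code), the following numpy simulation implements a toy version of the stochastic envelope model described above: a sequence of non-overlapping rectangular pulses with random amplitudes and exponentially distributed durations. Its trial-averaged modulation power spectrum is roughly flat at low modulation frequencies and rolls off approximately as 1/f² above a cutoff near the inverse of the mean pulse duration; the sampling rate, duration statistics, and fit range are illustrative assumptions.

```python
# Hedged, self-contained toy version of the stochastic envelope model (not the
# authors' code): rectangular pulses with random amplitudes and exponentially
# distributed durations, separated by random silent gaps. Sampling rate, mean
# duration, and fit range are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
fs = 1000.0                       # envelope sampling rate (Hz)
mean_dur = 0.2                    # mean pulse/gap duration (s)
n_trials, n_samples = 200, 2 ** 14

spectrum = np.zeros(n_samples // 2 + 1)
for _ in range(n_trials):
    env = np.zeros(n_samples)
    t = 0
    while t < n_samples:
        dur = max(1, int(rng.exponential(mean_dur) * fs))        # pulse duration
        env[t:t + dur] = rng.uniform(0.2, 1.0)                   # pulse amplitude
        t += dur + max(1, int(rng.exponential(mean_dur) * fs))   # silent gap
    spectrum += np.abs(np.fft.rfft(env - env.mean())) ** 2
spectrum /= n_trials

freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
# Well above the ~1/mean_dur cutoff, the log-log slope should approach -2
# (the scale-invariant 1/f^2 regime); below the cutoff the spectrum is flat.
hi = (freqs > 5.0 / mean_dur) & (freqs < 20.0 / mean_dur)
slope = np.polyfit(np.log10(freqs[hi]), np.log10(spectrum[hi]), 1)[0]
print(f"high-frequency log-log slope ≈ {slope:.2f} (expected ≈ -2)")
```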

    A neural ensemble correlation code for sound category identification.

    Humans and other animals effortlessly identify natural sounds and categorize them into behaviorally relevant categories. Yet the acoustic features and neural transformations that enable sound recognition and the formation of perceptual categories are largely unknown. Here, using multichannel neural recordings in the auditory midbrain of unanesthetized female rabbits, we first demonstrate that neural ensemble activity in the auditory midbrain displays highly structured correlations that vary with distinct natural sound stimuli. These stimulus-driven correlations can be used to accurately identify individual sounds from single-trial responses, even when the sounds do not differ in their spectral content. Combining neural recordings with an auditory model, we then show how correlations between frequency-organized auditory channels can contribute to the discrimination of not just individual sounds but sound categories. For both the model and the neural data, spectral and temporal correlations achieve similar categorization performance and appear to contribute equally. Moreover, both the neural and model classifiers achieve their best task performance when they accumulate evidence over a time frame of approximately 1-2 seconds, mirroring human perceptual trends. Together, these results suggest that time-frequency correlations in sounds may be reflected in the correlations between auditory midbrain ensembles and that these correlations may play an important role in the identification and categorization of natural sounds.
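    To make the idea concrete, here is a hedged, self-contained Python sketch (not the authors' analysis or data): between-channel correlations are extracted from a frequency-organized response array and fed to a simple nearest-centroid classifier. The array shapes, the synthetic two-category data, and the classifier choice are all illustrative assumptions.

```python
# Hedged illustration (not the authors' analysis or data): classify sounds
# from between-channel ("spectral") correlations of a frequency-organized
# response. Array shapes, synthetic data, and the nearest-centroid classifier
# are illustrative assumptions.
import numpy as np

def spectral_correlation_features(resp):
    """Upper triangle of the channel-by-channel correlation matrix.

    `resp` has shape (n_channels, n_time), e.g. model auditory-channel envelopes.
    """
    c = np.corrcoef(resp)
    return c[np.triu_indices_from(c, k=1)]

def nearest_centroid(train_feats, train_labels, test_feat):
    """Assign the label of the closest class-average feature vector."""
    labels = np.unique(train_labels)
    centroids = np.array([train_feats[train_labels == k].mean(axis=0)
                          for k in labels])
    return labels[np.argmin(np.linalg.norm(centroids - test_feat, axis=1))]

# Tiny synthetic demo: two "categories" differing only in inter-channel coupling.
rng = np.random.default_rng(1)
def make_trial(coupling, n_channels=8, n_time=2000):
    shared = rng.standard_normal(n_time)              # common fluctuation
    return np.stack([coupling * shared + (1 - coupling) * rng.standard_normal(n_time)
                     for _ in range(n_channels)])

trials = [make_trial(c) for c in [0.2] * 20 + [0.7] * 20]
labels = np.array([0] * 20 + [1] * 20)
feats = np.array([spectral_correlation_features(tr) for tr in trials])

test_trial = make_trial(0.7)                          # should be category 1
print("predicted category:",
      nearest_centroid(feats, labels, spectral_correlation_features(test_trial)))
```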

    Ensemble averaging of vocalization pulse spectra predicts the observed vocalization AMPS.

    (a) Three example pulses from the speech ensemble. (b) The AMPS for each pulse consists of a sinc² function with side-lobe peaks and notch locations that depend on the vocalization duration, and with side-lobe amplitudes that drop off in proportion to 1/f² (blue dotted lines). (c) The AMPS is obtained as the ensemble average across all durations, which produces an AMPS with a lowpass structure and a 1/f² trend at high frequencies.
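    In symbols (our notation, not taken from the figure), the sinc² form and the ensemble average described in the caption can be written as follows.

```latex
% Our notation (not from the figure): AMPS of one rectangular pulse of
% amplitude A and duration T, with sinc(x) = sin(pi x)/(pi x):
\[
  |X_T(f)|^2 \;=\; A^2\, T^2\, \operatorname{sinc}^2(fT)
             \;=\; \frac{A^2 \sin^2(\pi f T)}{(\pi f)^2},
\]
% which is flat for f << 1/T and has side-lobe peaks decaying as
% A^2/(pi f)^2, i.e. proportional to 1/f^2. Averaging over the pulse-duration
% distribution p(T) fills in the notches and yields the lowpass ensemble AMPS
% with a cutoff near the inverse of the mean duration:
\[
  \bar{S}(f) \;=\; \int p(T)\, A^2(T)\, T^2\, \operatorname{sinc}^2(fT)\, dT .
\]
```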

    Estimated model parameters for each vocalization sequence.


    Joint correlation statistics between the measured model parameters.


    Envelope extraction, segmentation, and model fitting.

    (a) Acoustic waveform for a speech sample from the BBC reproduction of Hamlet containing the phrase “That’s not my meaning: but breathes his faults so quaintly.” (b) The envelope used for segmentation (blue) was obtained by lowpass filtering the analytic signal amplitude at 30 Hz, whereas the envelope used for data analysis and model fitting was filtered at 250 Hz (red). The optimized model envelope for this example consists of a sequence of non-overlapping rectangular pulses of variable duration and amplitude (green). (c) Zoomed-in view of a short segment of the corresponding envelopes in (b). The model (green) captures the transient onsets and offsets between consecutive speech elements and words, but is unable to capture other envelope features, such as the fast periodic fluctuations created through vocal fold vibration (~190 Hz fundamental in c), that are evident in the original envelope (red).
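    A minimal sketch of the envelope-extraction step described in (b), assuming scipy is available: the analytic-signal amplitude is lowpass filtered at 30 Hz for segmentation or 250 Hz for analysis. The cutoffs follow the caption; the filter order, design, and synthetic test signal are illustrative assumptions.

```python
# Minimal sketch of the envelope extraction in (b), assuming scipy is
# available. Cutoffs (30 Hz and 250 Hz) follow the caption; the filter order,
# design, and synthetic test signal are illustrative assumptions.
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

def extract_envelope(x, fs, cutoff_hz):
    """Analytic-signal amplitude, zero-phase lowpass filtered at `cutoff_hz`."""
    amplitude = np.abs(hilbert(x))                                # Hilbert envelope
    sos = butter(4, cutoff_hz, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, amplitude)

# Stand-in for a speech recording: an amplitude-modulated tone.
fs = 16000
t = np.arange(0, 2.0, 1.0 / fs)
x = (1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)

segmentation_env = extract_envelope(x, fs, 30.0)    # blue trace: segmentation
analysis_env = extract_envelope(x, fs, 250.0)       # red trace: model fitting
```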