
    Learning Mid-Level Auditory Codes from Natural Sound Statistics

    Interaction with the world requires an organism to transform sensory signals into representations in which behaviorally meaningful properties of the environment are made explicit. These representations are derived through cascades of neuronal processing stages in which neurons at each stage recode the output of preceding stages. Explanations of sensory coding may thus involve understanding how low-level patterns are combined into more complex structures. Although models exist in the visual domain to explain how mid-level features such as junctions and curves might be derived from oriented filters in early visual cortex, little is known about analogous grouping principles for mid-level auditory representations. We propose a hierarchical generative model of natural sounds that learns combinations of spectrotemporal features from natural stimulus statistics. In the first layer the model forms a sparse convolutional code of spectrograms using a dictionary of learned spectrotemporal kernels. To generalize from specific kernel activation patterns, the second layer encodes patterns of time-varying magnitude of multiple first layer coefficients. Because second-layer features are sensitive to combinations of spectrotemporal features, the representation they support encodes more complex acoustic patterns than the first layer. When trained on corpora of speech and environmental sounds, some second-layer units learned to group spectrotemporal features that occur together in natural sounds. Others instantiate opponency between dissimilar sets of spectrotemporal features. Such groupings might be instantiated by neurons in the auditory cortex, providing a hypothesis for mid-level neuronal computation. This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.

    The Opponent Channel Population Code of Sound Location Is an Efficient Representation of Natural Binaural Sounds

    In mammalian auditory cortex, sound source position is represented by a population of broadly tuned neurons whose firing is modulated by sounds located at all positions surrounding the animal. Peaks of their tuning curves are concentrated at lateral positions, while their slopes are steepest at the interaural midline, allowing for maximum localization accuracy in that area. These experimental observations contradict initial assumptions that auditory space is represented as a topographic cortical map. It has been suggested that a “panoramic” code has evolved to match specific demands of the sound localization task. This work provides evidence suggesting that the experimentally identified properties of spatial auditory neurons follow from a general design principle: learning a sparse, efficient representation of natural stimuli. Natural binaural sounds were recorded and served as input to a hierarchical sparse-coding model. In the first layer, left and right ear sounds were separately encoded by a population of complex-valued basis functions which separated phase and amplitude. Both parameters are known to carry information relevant for spatial hearing. Monaural input converged in the second layer, which learned a joint representation of amplitude and interaural phase difference. Spatial selectivity of each second-layer unit was measured by exposing the model to natural sound sources recorded at different positions. The resulting tuning curves closely match the tuning characteristics of neurons in the mammalian auditory cortex. This study connects neuronal coding of auditory space with natural stimulus statistics and generates new experimental predictions. Moreover, the results presented here suggest that cortical regions with seemingly different functions may implement the same computational strategy: efficient coding. German Science Foundation (Graduate College "InterNeuro"

    Functional Sensory Representations of Natural Stimuli: the Case of Spatial Hearing

    In this thesis I attempt to explain mechanisms of neuronal coding in the auditory system as a form of adaptation to statistics of natural stereo sounds. To this end I analyse recordings of real-world auditory environments and construct novel statistical models of these data. I further compare regularities present in natural stimuli with known, experimentally observed neuronal mechanisms of spatial hearing. In a more general perspective, I use the binaural auditory system as a starting point to consider the notion of function implemented by sensory neurons. In particular I argue for two closely related tenets: 1. The function of sensory neurons cannot be fully elucidated without understanding the statistics of the natural stimuli they process. 2. The function of sensory representations is determined by redundancies present in the natural sensory environment. I present evidence in support of the first tenet by describing and analysing marginal statistics of natural binaural sound. I compare the observed empirical distributions with knowledge from reductionist experiments. This comparison supports the argument that the complexity of the spatial hearing task in the natural environment is much higher than analytic, physics-based predictions suggest. I discuss the possibility that early brain stem circuits such as LSO and MSO do not "compute sound localization" as is often claimed in the experimental literature. I propose that instead they perform a signal transformation, which constitutes the first step of a complex inference process. To support the second tenet I develop a hierarchical statistical model, which learns a joint sparse representation of amplitude and phase information from natural stereo sounds. I demonstrate that the learned higher-order features reproduce properties of auditory cortical neurons when probed with spatial sounds. The reproduced aspects had been hypothesized to be a manifestation of a fine-tuned computation specific to the sound-localization task. Here it is demonstrated that they rather reflect redundancies present in the natural stimulus. Taken together, the results presented in this thesis suggest that efficient coding is a strategy useful for discovering structures (redundancies) in the input data. Their meaning has to be determined by the organism via environmental feedback.

    Representation of statistical sound properties in human auditory cortex

    The work carried out in this doctoral thesis investigated the representation of statistical sound properties in human auditory cortex. It addressed four key aspects in auditory neuroscience: the representation of different analysis time windows in auditory cortex; mechanisms for the analysis and segregation of auditory objects; information-theoretic constraints on pitch sequence processing; and the analysis of local and global pitch patterns. The majority of the studies employed a parametric design in which the statistical properties of a single acoustic parameter were altered along a continuum, while keeping other sound properties fixed. The thesis is divided into four parts. Part I (Chapter 1) examines principles of anatomical and functional organisation that constrain the problems addressed. Part II (Chapter 2) introduces approaches to digital stimulus design, principles of functional magnetic resonance imaging (fMRI), and the analysis of fMRI data. Part III (Chapters 3-6) reports five experimental studies. Study 1 controlled the spectrotemporal correlation in complex acoustic spectra and showed that activity in auditory association cortex increases as a function of spectrotemporal correlation. Study 2 demonstrated a functional hierarchy of the representation of auditory object boundaries and object salience. Studies 3 and 4 investigated cortical mechanisms for encoding entropy in pitch sequences and showed that the planum temporale acts as a computational hub, requiring more computational resources for sequences with high entropy than for those with high redundancy. Study 5 provided evidence for a hierarchical organisation of local and global pitch pattern processing in neurologically normal participants. Finally, Part IV (Chapter 7) concludes with a general discussion of the results and future perspectives.

    Backwards is the way forward: feedback in the cortical hierarchy predicts the expected future

    Clark offers a powerful description of the brain as a prediction machine, which enables progress on two distinct levels. First, on an abstract conceptual level, it provides a unifying framework for perception, action, and cognition (including subdivisions such as attention, expectation, and imagination). Second, hierarchical prediction offers progress on a concrete descriptive level for testing and constraining conceptual elements and mechanisms of predictive coding models (estimation of predictions, prediction errors, and internal models).

    Towards music perception by redundancy reduction and unsupervised learning in probabilistic models

    PhD thesis. The study of music perception lies at the intersection of several disciplines: perceptual psychology and cognitive science, musicology, psychoacoustics, and acoustical signal processing amongst others. Developments in perceptual theory over the last fifty years have emphasised an approach based on Shannon’s information theory and its basis in probabilistic systems, and in particular, the idea that perceptual systems in animals develop through a process of unsupervised learning in response to natural sensory stimulation, whereby the emerging computational structures are well adapted to the statistical structure of natural scenes. In turn, these ideas are being applied to problems in music perception. This thesis is an investigation of the principle of redundancy reduction through unsupervised learning, as applied to representations of sound and music. In the first part, previous work is reviewed, drawing on literature from some of the fields mentioned above, and an argument is presented in support of the idea that perception in general and music perception in particular can indeed be accommodated within a framework of unsupervised learning in probabilistic models. In the second part, two related methods are applied to two different low-level representations. Firstly, linear redundancy reduction (Independent Component Analysis) is applied to acoustic waveforms of speech and music. Secondly, the related method of sparse coding is applied to a spectral representation of polyphonic music, which proves to be enough both to recognise that the individual notes are the important structural elements, and to recover a rough transcription of the music. Finally, the concepts of distance and similarity are considered, drawing in ideas about noise, phase invariance, and topological maps. Some ecologically and information-theoretically motivated distance measures are suggested, and put into practice in a novel method, using multidimensional scaling (MDS), for visualising geometrically the dependency structure in a distributed representation. Engineering and Physical Science Research Council

    Spectral Integration and Neural Representation of Harmonic Complex Tones in Primate Auditory Cortex

    Many natural and man-made sounds, such as animal vocalizations, human speech, and sounds from many musical instruments, contain rich harmonic structures. Although the peripheral auditory system decomposes these sounds into separate frequency channels, harmonically related frequency components must be grouped together in order to form a single auditory percept. A central neural process is therefore required to accomplish this perceptual grouping and to integrate information across frequency channels in order to compute spectral properties, such as pitch and timbre, which are not explicitly encoded in the auditory periphery. In this dissertation, I investigated whether there are representations of harmonic structures at the single neuron level in auditory cortex beyond pitch and how harmonic sounds are represented by populations of cortical neurons. I systematically tested single neurons in the primary auditory cortex (A1) of awake marmoset monkeys with harmonic and inharmonic complex tones, varying fundamental frequency (f0) and harmonic composition. I found harmonic template neurons, which were strongly driven by harmonic complex tones but showed weak or no response to single harmonics. Harmonic template neurons were selective to f0s and sensitive to harmonic numbers. They also exhibited a reduced firing rate in response to inharmonic complex tones. Other sound features of a harmonic complex tone, such as overall sound level, resolved individual harmonic partials, and temporal envelope, were represented by different subpopulations of neurons in A1. Overall, the findings of this dissertation support the existence of a distributed neural code for harmonic complex tones in A1, which represents an important stage in the auditory pathway for robust feature extraction and sound source recognition. In the study of spectral integration and neural coding of complex tones, searching for preferred stimuli of cortical neurons has also proven challenging because of the high dimensionality of the acoustic space of possible stimuli and limited recording time. In the last part of this dissertation, I presented an online adaptive stimulus design approach based on a neural network model for studying spectral integration in auditory cortex. The models estimated online helped to build a connection between receptive field structures and diverse spectral selectivity of cortical neurons.