Feature extraction based on bio-inspired model for robust emotion recognition
Emotional state identification is an important issue in building more natural speech-interactive systems. Ideally, these systems should also work in real environments, in which some kind of noise is generally present. Several bio-inspired representations have been applied to artificial systems for speech processing under noise conditions. In this work, an auditory signal representation is used to obtain a novel bio-inspired set of features for emotional speech signals. These features, together with other spectral and prosodic features, are used for emotion recognition under noise conditions. Neural models were trained as classifiers, and the results were compared to those obtained with the well-known mel-frequency cepstral coefficients. Results show that with the proposed representations it is possible to significantly improve the robustness of an emotion recognition system. The results were also validated in a speaker-independent scheme and with two emotional speech corpora.
Authors: Enrique Marcelo Albornoz, Diego Humberto Milone, Hugo Leonardo Rufiner. Consejo Nacional de Investigaciones Científicas y Técnicas; Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional, Universidad Nacional del Litoral, Facultad de Ingeniería y Ciencias Hídricas; Argentina.
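The abstract above benchmarks its bio-inspired features against mel-frequency cepstral coefficients (MFCCs). As background, here is a minimal NumPy sketch of the standard MFCC pipeline (framing and windowing, power spectrum, triangular mel filterbank, log, DCT-II). This is only the baseline representation, not the authors' bio-inspired feature set, and all parameter values are illustrative:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # triangular filters with centers evenly spaced on the mel scale
    mels = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    return fb

def mfcc(signal, sr, n_fft=512, hop=256, n_filters=26, n_ceps=13):
    # frame the signal, apply a Hamming window, take the power spectrum
    frames = [signal[i:i + n_fft] * np.hamming(n_fft)
              for i in range(0, len(signal) - n_fft + 1, hop)]
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # mel filterbank energies -> log -> DCT-II gives the cepstral coefficients
    energies = power @ mel_filterbank(n_filters, n_fft, sr).T
    log_e = np.log(energies + 1e-10)
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_filters)))
    return log_e @ dct.T   # shape: (n_frames, n_ceps)

sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)   # 1 s test tone
feats = mfcc(tone, sr)
```

A real system would append the spectral and prosodic features the abstract mentions before training a classifier.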
Non-hexagonal neural dynamics in vowel space
Are the grid cells discovered in rodents relevant to human cognition? Following up on two seminal studies by others, we aimed to check whether an approximate 6-fold, grid-like symmetry shows up in the cortical activity of humans who "navigate" between vowels, given that vowel space can be approximated by a continuous trapezoidal 2D manifold spanned by the first and second formant frequencies. We created 30 vowel trajectories in the assumedly flat central portion of the trapezoid. Each trajectory had a duration of 240 milliseconds, with a steady start and end point on the perimeter of a "wheel". We hypothesized that if the neural representation of this "box" is similar to that of rodent grid units, there should be at least a partial hexagonal (6-fold) symmetry in the EEG response of participants who navigate it. However, we did not find any dominant n-fold symmetry; instead, using PCA, we find indications that the vowel representation may reflect phonetic features, as positioned on the vowel manifold. The suggestion, therefore, is that vowels are encoded in relation to their salient sensory-perceptual variables, and are not assigned to arbitrary grid-like abstract maps. Finally, we explored the relationship between the first PCA eigenvector and putative vowel attractors for native Italian speakers, who served as the subjects in our study.
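The n-fold symmetry test alluded to above is commonly run by regressing the neural signal on cos(nθ) and sin(nθ) of each trajectory's direction θ and comparing the fit across candidate symmetries n. A self-contained sketch on synthetic data (this is not the authors' EEG pipeline; the response here is simulated with a built-in 6-fold modulation so the method has something to find):

```python
import numpy as np

rng = np.random.default_rng(0)

# movement directions of simulated "vowel trajectories" (radians)
theta = rng.uniform(0, 2 * np.pi, 500)

# synthetic response with genuine 6-fold modulation plus noise
# (assumption: in the real analysis this would be an EEG amplitude
# measured for each trajectory)
response = 1.0 + 0.5 * np.cos(6 * (theta - 0.3)) + 0.1 * rng.standard_normal(500)

def nfold_r2(theta, y, n):
    """R^2 of regressing y on cos(n*theta) and sin(n*theta)."""
    X = np.column_stack([np.ones_like(theta), np.cos(n * theta), np.sin(n * theta)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

# compare candidate symmetries; a grid-like code should peak at n = 6
fits = {n: nfold_r2(theta, response, n) for n in range(4, 9)}
best = max(fits, key=fits.get)
```

The study's negative result corresponds to no candidate n standing out in such a comparison on the real data.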
Reconstructing Speech from Human Auditory Cortex
Direct brain recordings from neurosurgical patients listening to speech reveal that acoustic speech signals can be reconstructed from neural activity in auditory cortex.
A Generalized Linear Model for Estimating Spectrotemporal Receptive Fields from Responses to Natural Sounds
In the auditory system, the stimulus-response properties of single neurons are often described in terms of the spectrotemporal receptive field (STRF), a linear kernel relating the spectrogram of the sound stimulus to the instantaneous firing rate of the neuron. Several algorithms have been used to estimate STRFs from responses to natural stimuli; these algorithms differ in their functional models, cost functions, and regularization methods. Here, we characterize the stimulus-response function of auditory neurons using a generalized linear model (GLM). In this model, each cell's input is described by: 1) a stimulus filter (STRF); and 2) a post-spike filter, which captures dependencies on the neuron's spiking history. The output of the model is given by a series of spike trains rather than an instantaneous firing rate, allowing the prediction of spike train responses to novel stimuli. We fit the model by maximum penalized likelihood to the spiking activity of zebra finch auditory midbrain neurons in response to conspecific vocalizations (songs) and modulation-limited (ml) noise. We compare this model to normalized reverse correlation (NRC), the traditional method for STRF estimation, in terms of predictive power and the basic tuning properties of the estimated STRFs. We find that a GLM with a sparse prior predicts novel responses to both stimulus classes significantly better than NRC. Importantly, we find that STRFs derived from the same responses by the two models can differ substantially, and that GLM STRFs are more consistent between stimulus classes than NRC STRFs. These results suggest that a GLM with a sparse prior provides a more accurate characterization of spectrotemporal tuning than does the NRC method when responses to complex sounds are studied in these neurons.
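The model structure described above (a stimulus filter plus a post-spike history filter feeding a Poisson spike generator, fit by maximum penalized likelihood) can be sketched as follows. This is an illustrative toy on simulated data: it uses an L2 penalty rather than the paper's sparse prior, and plain gradient ascent rather than the authors' optimizer; all sizes and constants are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

T, D, H = 2000, 12, 5        # time bins, stimulus lags, spike-history lags
stim = rng.standard_normal((T, D))     # lagged stimulus design (one band)
k_true = np.exp(-np.arange(D) / 3.0)   # "STRF" (temporal profile)
h_true = -np.linspace(1.0, 0.1, H)     # suppressive post-spike filter

# simulate spikes sequentially: the rate depends on the stimulus through
# k_true and on the cell's own recent spiking through h_true
spikes = np.zeros(T)
for t in range(T):
    hist = spikes[max(t - H, 0):t][::-1]      # hist[0] = previous bin
    rate = np.exp(stim[t] @ k_true * 0.3 + hist @ h_true[:len(hist)] - 1.0)
    spikes[t] = rng.poisson(min(rate, 10.0))

# full design matrix: stimulus columns + spike-history columns (lag j+1)
X = np.hstack([stim, np.column_stack(
    [np.concatenate([np.zeros(j + 1), spikes[:T - j - 1]]) for j in range(H)])])

def fit_glm(X, y, lam=1e-3, lr=1e-3, iters=3000):
    """Maximum penalized likelihood for a Poisson GLM (L2 penalty)."""
    w = np.zeros(X.shape[1] + 1)
    Xb = np.hstack([X, np.ones((len(X), 1))])  # bias term
    for _ in range(iters):
        rate = np.exp(np.clip(Xb @ w, -10, 10))
        grad = Xb.T @ (y - rate) - lam * w     # gradient of penalized log-lik.
        w += lr * grad / len(y)
    return w

w = fit_glm(X, spikes)
k_hat, h_hat = w[:D], w[D:D + H]   # recovered stimulus and history filters
```

The Poisson GLM log-likelihood is concave in the weights, which is what makes this simple ascent (and the penalized fitting in the paper) well behaved.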
NeuroGrid: recording action potentials from the surface of the brain.
Recording from neural networks at the resolution of action potentials is critical for understanding how information is processed in the brain. Here, we address this challenge by developing an organic material-based, ultraconformable, biocompatible and scalable neural interface array (the 'NeuroGrid') that can record both local field potentials (LFPs) and action potentials from superficial cortical neurons without penetrating the brain surface. Spikes with features of interneurons and pyramidal cells were simultaneously acquired by multiple neighboring electrodes of the NeuroGrid, allowing for the isolation of putative single neurons in rats. Spiking activity demonstrated consistent phase modulation by ongoing brain oscillations and was stable in recordings exceeding 1 week's duration. We also recorded LFP-modulated spiking activity intraoperatively in patients undergoing epilepsy surgery. The NeuroGrid constitutes an effective method for large-scale, stable recording of neuronal spikes in concert with local population synaptic activity, enhancing comprehension of neural processes across spatiotemporal scales and potentially facilitating diagnosis and therapy for brain disorders.
Spectrotemporal Processing in Spectral Tuning Modules of Cat Primary Auditory Cortex
Spectral integration properties show topographical order in cat primary auditory cortex (AI). Along the iso-frequency domain, regions with predominantly narrowly tuned (NT) neurons are segregated from regions with more broadly tuned (BT) neurons, forming distinct processing modules. Despite their prominent spatial segregation, spectrotemporal processing has not been compared between these regions. We identified NT and BT regions with broad-band ripple stimuli and characterized processing differences between them using both spectrotemporal receptive fields (STRFs) and nonlinear stimulus/firing-rate transformations. The durations of STRF excitatory and inhibitory subfields were shorter, and the best temporal modulation frequencies higher, for BT neurons than for NT neurons. For NT neurons, the bandwidths of excitatory and inhibitory subfields were matched, whereas for BT neurons they were not. Phase locking and feature selectivity were higher for NT neurons. Properties of the nonlinearities showed only slight differences across the bandwidth modules. These results indicate fundamental differences in spectrotemporal preferences, and thus distinct physiological functions, for neurons in BT and NT spectral integration modules. However, some global processing aspects, such as spectrotemporal interactions and nonlinear input/output behavior, appear to be similar for both neuronal subgroups. The findings suggest that spectral integration modules in AI differ in which specific stimulus aspects are processed, but are similar in the manner in which stimulus information is processed.
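STRFs like those discussed above are classically estimated by reverse correlation between the spike train and the stimulus spectrogram. A toy sketch with a white-noise, ripple-like stimulus, where the stimulus autocovariance is approximately the identity, so the spike-triggered average alone recovers the STRF shape (full normalized reverse correlation would additionally divide out the stimulus autocovariance); the stimulus and the two-subfield ground-truth STRF are assumptions of this simulation:

```python
import numpy as np

rng = np.random.default_rng(2)

# white-noise stimulus: F frequency channels over T time bins
T, F, L = 5000, 8, 6          # bins, channels, STRF time lags
stim = rng.standard_normal((T, F))

# ground-truth STRF: brief excitation followed by delayed inhibition,
# mimicking the excitatory/inhibitory subfields described above
strf_true = np.zeros((L, F))
strf_true[1, 3] = 1.0          # excitatory subfield (lag 1, channel 3)
strf_true[3, 3] = -0.6         # inhibitory subfield (lag 3, channel 3)

# linear-nonlinear-Poisson response to the stimulus
drive = np.zeros(T)
for lag in range(L):
    drive[lag:] += stim[:T - lag] @ strf_true[lag]
rate = np.exp(0.5 * drive - 1.0)
spikes = rng.poisson(np.minimum(rate, 20.0))

# spike-triggered average at each lag: average stimulus frame that
# preceded a spike by `lag` bins, weighted by spike count
sta = np.zeros((L, F))
for lag in range(L):
    sta[lag] = spikes[lag:] @ stim[:T - lag] / spikes.sum()

# the largest-magnitude entry should sit at the excitatory subfield
peak = np.unravel_index(np.abs(sta).argmax(), sta.shape)
```

Subfield durations and bandwidths, as compared between NT and BT neurons in the abstract, are then read off from the extent of the positive and negative regions of such an estimate.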
Neural processing of natural sounds
Natural sounds include animal vocalizations, environmental sounds such as wind, water and fire noises, and non-vocal sounds made by animals and humans for communication. These natural sounds have characteristic statistical properties that make them perceptually salient and that drive auditory neurons in optimal regimes for information transmission. Recent advances in statistics and computer science have allowed neurophysiologists to extract the stimulus-response function of complex auditory neurons from responses to natural sounds. These studies have shown a hierarchical processing that leads to the neural detection of progressively more complex natural sound features, and have demonstrated the importance of the acoustical and behavioral contexts for the neural responses. High-level auditory neurons have been shown to be exquisitely selective for conspecific calls. This fine selectivity could play an important role in species recognition, in vocal learning in songbirds and, in the case of bats, in the processing of the sounds used in echolocation. Research that investigates how communication sounds are categorized into behaviorally meaningful groups (e.g. call types in animals, words in human speech) remains in its infancy. Animals and humans also excel at separating communication sounds from each other and from background noise. Neurons that detect communication calls in noise have been found, but the neural computations involved in sound source separation and natural auditory scene analysis remain poorly understood overall.
Thus, future auditory research will have to focus not only on how natural sounds are processed by the auditory system, but also on the computations that allow this processing to occur in natural listening situations. The complexity of the computations needed in natural hearing may require a high-dimensional representation provided by ensembles of neurons, and natural sounds may be the best stimuli for understanding the ensemble neural code.