197 research outputs found

Information-based Analysis and Control of Recurrent Linear Networks and Recurrent Networks with Sigmoidal Nonlinearities

Author: Menolascino Deslin
Publication venue: Washington University Open Scholarship
Publication date: 15/12/2018

Linear dynamical models have served as an analytically tractable approximation for a variety of natural and engineered systems. Recently, such models have been used to describe high-level diffusive interactions in the activation of complex networks, including those in the brain. In this regard, classical tools from control theory, including controllability analysis, have been used to assay the extent to which such networks might respond to their afferent inputs. However, for natural systems such as brain networks, it is not clear whether advantageous control properties necessarily correspond to useful functionality. That is, are systems that are highly controllable (according to certain metrics) also ones that are suited to computational goals such as representing, preserving and categorizing stimuli? This dissertation will introduce analysis methods that link the systems-theoretic properties of linear systems with informational measures that describe these functional characterizations. First, we assess sensitivity of a linear system to input orientation and novelty by deriving a measure of how networks translate input orientation differences into readable state trajectories. Next, we explore the implications of this novelty-sensitivity for endpoint-based input discrimination, wherein stimuli are decoded in terms of their induced representation in the state space. We develop a theoretical framework for the exploration of how networks utilize excess input energy to enhance orientation sensitivity (and thus enhanced discrimination ability). Next, we conduct a theoretical study to reveal how the background or default state of a network with linear dynamics allows it to best promote discrimination over a continuum of stimuli. Specifically, we derive a measure, based on the classical notion of a Fisher discriminant, quantifying the extent to which the state of a network encodes information about its afferent inputs. 
This measure provides an information value quantifying the knowability of an input based on its projection onto the background state. We subsequently optimize this background state, and characterize both the optimal background and the inputs giving rise to it. Finally, we extend this information-based network analysis to include networks with nonlinear dynamics--specifically, ones involving sigmoidal saturating functions. We employ a quasilinear approximation technique, novel here in terms of its multidimensionality and specific application, to approximate the nonlinear dynamics by scaling a corresponding linear system and biasing by an offset term. A Fisher information-based metric is derived for the quasilinear system, with analytical and numerical results showing that Fisher information is better for the quasilinear (hence sigmoidal) system than for an unconstrained linear system. Interestingly, this relation reverses when the noise is placed outside the sigmoid in the model, supporting conclusions extant in the literature that the relative alignment of the state and noise covariance is predictive of Fisher information. We show that there exists a clear trade-off between informational advantage, as conferred by the presence of sigmoidal nonlinearities, and speed of dynamics.

Washington University St. Louis: Open Scholarship

Transformation of a temporal speech cue to a spatial neural code in human auditory cortex

Author: Chang E.
Fox N.
Leonard M.
Sjerps M.
Publication venue: 'eLife Sciences Publications, Ltd'
Publication date: 25/08/2020

In speech, listeners extract continuously-varying spectrotemporal cues from the acoustic signal to perceive discrete phonetic categories. Spectral cues are spatially encoded in the amplitude of responses in phonetically-tuned neural populations in auditory cortex. It remains unknown whether similar neurophysiological mechanisms encode temporal cues like voice-onset time (VOT), which distinguishes sounds like /b/ and /p/. We used direct brain recordings in humans to investigate the neural encoding of temporal speech cues with a VOT continuum from /ba/ to /pa/. We found that distinct neural populations respond preferentially to VOTs from one phonetic category, and are also sensitive to sub-phonetic VOT differences within a population’s preferred category. In a simple neural network model, simulated populations tuned to detect either temporal gaps or coincidences between spectral cues captured encoding patterns observed in real neural data. These results demonstrate that a spatial/amplitude neural code underlies the cortical representation of both spectral and temporal speech cues.

MPG.PuRe

Characterizing the Cortical Contributions to Working Memory-Guided Obstacle Locomotion

Author: Wong Carmen
Publication venue: Scholarship@Western
Publication date: 26/09/2018

While walking in complex environments, the ability to acquire information about objects in our surroundings is essential for successful obstacle negotiation. Furthermore, the ease with which most animals can traverse cluttered terrain while grazing, exploring, or hunting is facilitated by the capacity to store obstacle information in working memory (WM). However, the underlying neural substrates supporting such complex behaviours are poorly understood. Therefore, the goal of this thesis is to examine the neural underpinnings of WM-guided obstacle negotiation in the walking cat. Obstacle locomotion was studied in two main paradigms, characterized by whether obstacle presence was detected via vision or touch. In both paradigms, walking was delayed following foreleg obstacle clearance. When walking resumed, elevated hindleg stepping demonstrated that animals successfully remembered the obstacle beneath them. The tactile paradigm was first examined to assess the ability of animals to remember an unexpected obstacle over which the forelegs had tripped. Such tactile input to the forelegs was capable of producing a robust, long-lasting WM of the obstacle, similar to what has been previously described using the visual paradigm. Next, to assess whether regions of the brain associated with spatial representation and movement planning contribute to these behaviours, parietal area 5 was reversibly deactivated as visual or tactile obstacle WM was tested. Such deactivations resulted in substantial WM deficits precluding successful avoidance in both paradigms. To further characterize this cortical contribution, neural activity was then recorded with multi-electrode arrays implanted in area 5. While diverse patterns of task-related modulation were observed, only a small proportion of neurons demonstrated WM-related activity. 
These neurons exhibited the hallmark property of sustained delay period activity associated with WM maintenance, and were able to reliably discern whether or not the animal had stepped over an obstacle prior to the delay. Therefore, only a specialized subset of area 5 neurons is capable of maintaining stable representations of obstacle information in WM. Altogether, this work extends our understanding of WM-guided obstacle locomotion in the cat. Additionally, these findings provide insight into the neural circuitry within the posterior parietal cortex, which likely supports a variety of WM-guided behaviours.

Scholarship@Western

Seeing sound: a new way to illustrate auditory objects and their neural correlates

Author: Lim Yoon Seob
Publication date: 22/01/2016

This thesis develops a new method for time-frequency signal processing and examines the relevance of the new representation in studies of neural coding in songbirds. The method groups together associated regions of the time-frequency plane into objects defined by time-frequency contours. By combining information about structurally stable contour shapes over multiple time-scales and angles, a signal decomposition is produced that distributes resolution adaptively. As a result, distinct signal components are represented in their own most parsimonious forms. Next, through neural recordings in singing birds, it was found that activity in song premotor cortex is significantly correlated with the objects defined by this new representation of sound. In this process, an automated way of finding sub-syllable acoustic transitions in birdsongs was first developed, and then increased spiking probability was found at the boundaries of these acoustic transitions. Finally, a new approach to study auditory cortical sequence processing more generally is proposed. In this approach, songbirds were trained to discriminate Morse-code-like sequences of clicks, and the neural correlates of this behavior were examined in primary and secondary auditory cortex. It was found that a distinct transformation of auditory responses to the sequences of clicks exists as information is transferred from primary to secondary auditory areas. Neurons in secondary auditory areas respond asynchronously and selectively -- in a manner that depends on the temporal context of the click. This transformation from a temporal to a spatial representation of sound provides a possible basis for the songbird's natural ability to discriminate complex temporal sequences.

Boston University Institutional Repository (OpenBU)

Information processing in visual systems

Author: Saleem Aman
Publication venue: Bio-engineering, Imperial College London
Publication date: 01/01/2010

One of the goals of neuroscience is to understand how animals perceive sensory information. This thesis focuses on visual systems, to unravel how neuronal structures process aspects of the visual environment. To characterise the receptive field of a neuron, we developed spike-triggered independent component analysis. Alongside characterising the receptive field of a neuron, this method provides an insight into its underlying network structure. When applied to recordings from the H1 neuron of blowflies, it accurately recovered the sub-structure of the neuron. This sub-structure was studied further by recording H1's response to plaid stimuli. Based on the response, H1 can be classified as a component cell. We then fitted an anatomically inspired model to the response, and found the critical component to explain H1's response to be a sigmoid non-linearity at the output of elementary movement detectors. The simpler blowfly visual system can help us understand elementary sensory information processing mechanisms. How does the more complex mammalian cortex implement these principles in its network? To study this, we used multi-electrode arrays to characterise the receptive field properties of neurons in the visual cortex of anaesthetised mice. Based on these recordings, we estimated the cortical limits on the performance of a visual task; the behavioural performance observed by Prusky and Douglas (2004) is within these limits. Our recordings were carried out in anaesthetised animals. During anaesthesia, cortical UP states are considered "fragments of wakefulness" and from simultaneous whole-cell and extracellular recordings, we found these states to be revealed in the phase of local field potentials. This finding was used to develop a method of detecting cortical state based on extracellular recordings, which allows us to explore information processing during different cortical states.
Across this thesis, we have developed, tested and applied methods that help improve our understanding of information processing in visual systems.

Spiral - Imperial College Digital Repository

Whole Word Phonetic Displays for Speech Articulation Training

Author: Meng Fansheng
Publication venue: ODU Digital Commons
Publication date: 01/04/2006

The main objective of this dissertation is to investigate and develop speech recognition technologies for speech training for people with hearing impairments. During the course of this work, a computer aided speech training system for articulation speech training was also designed and implemented. The speech training system places emphasis on displays to improve children's pronunciation of isolated Consonant-Vowel-Consonant (CVC) words, with displays at both the phonetic level and whole word level. This dissertation presents two hybrid methods for combining Hidden Markov Models (HMMs) and Neural Networks (NNs) for speech recognition. The first method uses NN outputs as posterior probability estimators for HMMs. The second method uses NNs to transform the original speech features to normalized features with reduced correlation. Based on experimental testing, both of the hybrid methods give higher accuracy than standard HMM methods. The second method, using the NN to create normalized features, outperforms the first method in terms of accuracy. Several graphical displays were developed to provide real time visual feedback to users, to help them to improve and correct their pronunciations.

Old Dominion University

Human Machine Interfaces for Teleoperators and Virtual Environments

Author: Durlach Nathaniel I.
Ellis Stephen R.
Sheridan Thomas B.

In Mar. 1990, a meeting organized around the general theme of teleoperation research into virtual environment display technology was conducted. This is a collection of conference-related fragments that will give a glimpse of the potential of the following fields and how they interplay: sensorimotor performance; human-machine interfaces; teleoperation; virtual environments; performance measurement and evaluation methods; and design principles and predictive models.

NASA Technical Reports Server

Algorithms for Joint Evaluation of Multiple Speech Patterns for Automatic Speech Recognition

Author: Nishanth Ulhas Nair
T.V. Sreenivas
Publication venue: 'IntechOpen'
Publication date: 01/11/2008

IntechOpen

Crossref

Methods of Optimizing Speech Enhancement for Hearing Applications

Author: Liu Fangqi
Publication venue: UCL (University College London)
Publication date: 28/09/2019

Speech intelligibility in hearing applications suffers from background noise. One of the most effective solutions is to develop speech enhancement algorithms based on the biological traits of the auditory system. In humans, the medial olivocochlear (MOC) reflex, which is an auditory neural feedback loop, increases signal-in-noise detection by suppressing cochlear response to noise. The time constant is one of the key attributes of the MOC reflex as it regulates the variation of suppression over time. Different time constants have been measured in nonhuman mammalian and human auditory systems. Physiological studies reported that the time constant of the nonhuman mammalian MOC reflex varies with changes in the properties (e.g. frequency, bandwidth) of the stimulation. A human-based study suggests that the time constant could vary when the bandwidth of the noise is changed. Previous works have developed MOC reflex models and successfully demonstrated the benefits of simulating the MOC reflex for speech-in-noise recognition. However, they often used fixed time constants. The effect of the different time constants on speech perception remains unclear. The main objectives of the present study are (1) to study the effect of the MOC reflex time constant on speech perception in different noise conditions; (2) to develop a speech enhancement algorithm with dynamic time constant optimization to adapt to varying noise conditions for improving speech intelligibility. The first part of this thesis studies the effect of the MOC reflex time constants on speech-in-noise perception. Conventional studies do not consider the relationship between the time constants and speech perception as it is difficult to measure the speech intelligibility changes due to varying time constants in human subjects. We use a model to investigate the relationship by incorporating Meddis’ peripheral auditory model (which includes a MOC reflex) with an automatic speech recognition (ASR) system.
The effect of the MOC reflex time constant is studied by adjusting the time constant parameter of the model and testing the speech recognition accuracy of the ASR. Different time constants derived from human data are evaluated in both speech-like and non-speech-like noise at SNR levels from -10 dB to 20 dB and in the clean speech condition. The results show that the long time constants (≥1000 ms) provide a greater improvement of speech recognition accuracy at SNR levels ≤10 dB. Maximum accuracy improvement of 40% (compared to no MOC condition) is shown in pink noise at the SNR of 10 dB. Short time constants (<1000 ms) show recognition accuracy over 5% higher than the longer ones at SNR levels ≥15 dB. The second part of the thesis develops a novel speech enhancement algorithm based on the MOC reflex with a time constant that is dynamically optimized, according to a lookup table for varying SNRs. The main contributions of this part include: (1) So far, the existing SNR estimation methods are challenged in cases of low SNR, nonstationary noise, and computational complexity. High computational complexity would increase processing delay that causes intelligibility degradation. A variance of spectral entropy (VSE) based SNR estimation method is developed as entropy based features have been shown to be more robust in the cases of low SNR and nonstationary noise. The SNR is estimated according to the estimated VSE-SNR relationship functions by measuring VSE of noisy speech. Our proposed method has an accuracy 5 dB higher than other methods, especially in babble noise with fewer talkers (2 talkers) and low SNR levels (< 0 dB), with an average processing time only about 30% of that of the noise power estimation based method. The proposed SNR estimation method is further improved by implementing a nonlinear filter-bank. The compression of the nonlinear filter-bank is shown to increase the stability of the relationship functions.
As a result, the accuracy is improved by up to 2 dB in all types of tested noise. (2) A modification of Meddis’ MOC reflex model with a time constant dynamically optimized against varying SNRs is developed. The model includes simulated inner hair cell response to reduce the model complexity, and now includes the SNR estimation method. Previous MOC reflex models often have fixed time constants that do not adapt to varying noise conditions, whilst our modified MOC reflex model has a time constant dynamically optimized according to the estimated SNRs. The results show a speech recognition accuracy 8% higher than the model using a fixed time constant of 2000 ms in different types of noise. (3) A speech enhancement algorithm is developed based on the modified MOC reflex model and implemented in an existing hearing aid system. The performance is evaluated by measuring the objective speech intelligibility metric of processed noisy speech. In different types of noise, the proposed algorithm increases intelligibility by at least 20% in comparison to unprocessed noisy speech at SNRs between 0 dB and 20 dB, and by over 15% in comparison to processed noisy speech using the original MOC based algorithm in the hearing aid.

UCL Discovery

Speech recognition in noise using weighted matching algorithms

Author: Becerra Yoma Nestor
Publication venue: The University of Edinburgh
Publication date: 01/01/1998

Edinburgh Research Archive