Massively Parallel Spiking Neural Circuits: Encoding, Decoding and Functional Identification
This thesis presents a class of massively parallel spiking neural circuit architectures in which neurons are modeled by dendritic stimulus processors cascaded with spike generators. We investigate how visual stimuli can be represented by the spike times generated by the massively parallel neural circuits, how the spike times can be used to reconstruct and process visual stimuli, and the conditions under which visual stimuli can be faithfully represented and reconstructed. Functional identification of the massively parallel neural circuits from spike times, and its evaluation, are also investigated. Together, this thesis offers a comprehensive analytic framework for massively parallel spiking neural circuit architectures arising in the study of early visual systems.
In encoding, modeling of visual stimuli in reproducing kernel Hilbert spaces is presented, recognizing the importance of studying visual encoding in a rigorous mathematical framework. For massively parallel neural circuits with biophysical spike generators, I/O characterization of the biophysical spike generators becomes possible by introducing phase response curve manifolds for the biophysical spike generators. I/O characterization of the entire neural circuit can then be interpreted as generalized sampling in the Hilbert space. Multi-component dendritic stimulus processors are introduced to model visual encoding in stereoscopic color vision. It is also shown that encoding of visual stimuli by an ensemble of complex cells has the complexity of Volterra dendritic stimulus processors.
Based on the I/O characterization, reconstruction algorithms are derived to decode, from spike times, visual stimuli encoded by these massively parallel neural circuits. Decoding problems are first formulated as spline interpolation problems. Conditions for faithful reconstruction are presented, allowing one to probe the information content carried by the spikes. Algorithms are developed to qualify the decoding in massively parallel settings. For stereoscopic color visual stimuli, demixing of individual channels from an unlabeled set of spike trains is demonstrated. For encoding with complex cells, decoding problems are formulated as rank minimization problems. It is shown that the decoding algorithm does not suffer from the curse of dimensionality and thereby allows for a visual representation using biologically realistic neural resources.
The study of visual stimuli encoding and decoding enables the functional identification of massively parallel neural circuits. The duality between decoding and functional identification suggests that algorithms for functional identification of the projection of dendritic stimulus processors onto the space of input stimuli can be formulated similarly to the decoding algorithms. Functional identification of dendritic stimulus processors of neurons carrying stereoscopic color information as well as that of energy processing in complex cells is demonstrated. Furthermore, this duality also inspires a novel method to evaluate the quality of functional identification of massively parallel spiking neural circuits. By reconstructing novel stimuli using identified circuit parameters, the evaluation of the entire identified circuit is reduced to intuitive comparisons in stimulus space.
The use of biophysical spike generators advances a methodology for the study of intrinsic noise sources in neurons and their effects on stimulus representation and on the precision of functional identification. These effects are investigated using a class of nonlinear neural circuits consisting of both feedforward and feedback Volterra dendritic stimulus processors and biophysical spike generators. It is shown that encoding with neural circuits with intrinsic noise sources can be interpreted as generalized sampling with noisy measurements. Effects of noise on decoding and functional identification are derived theoretically and systematically investigated by extensive simulations.
Finally, the massively parallel neural circuit architectures are shown to enable the implementation of identity-preserving transformations in the spike domain using a switching matrix regulating the connection between encoding and decoding. Two realizations of the architectures are developed, and extensive examples using continuous visual streams are provided. Implications of this result for the problem of invariant object recognition in the spike domain are discussed.
Advances in Image Processing, Analysis and Recognition Technology
For many decades, researchers have been trying to make computer analysis of images as effective as human vision. For this purpose, many algorithms and systems have been created. The whole process covers various stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life. They improve particular activities and provide handy tools, which are sometimes only for entertainment, but quite often they significantly increase our safety. In fact, the range of practical implementations of image processing algorithms is particularly wide. Moreover, the rapid growth of computing power has allowed for the development of more sophisticated and effective algorithms and tools. Although significant progress has been made so far, many issues remain, resulting in the need for the development of novel approaches.
Identification of Dendritic Processing in Spiking Neural Circuits
A large body of experimental evidence points to sophisticated signal processing taking place at the level of dendritic trees and dendritic branches of neurons. This evidence suggests that, in addition to inferring the connectivity between neurons, identifying analog dendritic processing in individual cells is fundamentally important to understanding the underlying principles of neural computation. In this thesis, we develop a novel theoretical framework for the identification of dendritic processing directly from spike times produced by spiking neurons. The problem setting of spiking neurons is necessary since such neurons make up the majority of electrically excitable cells in most nervous systems and it is often hard or even impossible to directly monitor the activity within dendrites. Thus, action potentials produced by neurons often constitute the only causal and observable correlate of dendritic processing. In order to remain true to the underlying biophysics of electrically excitable cells, we employ well-established mechanistic models of action potential generation to describe the nonlinear mapping of the aggregate current produced by the tree into an asynchronous sequence of spikes. Specific models of spike generation considered include conductance-based models such as Hodgkin-Huxley, Morris-Lecar and FitzHugh-Nagumo, as well as simpler models of the integrate-and-fire and threshold-and-fire type. The aggregate time-varying current driving the spike generator is taken to be produced by a dendritic stimulus processor, which is a nonlinear dynamical system capable of describing arbitrary linear and nonlinear transformations performed on one or more input stimuli. In the case of multiple stimuli, it can also describe the cross-coupling, or interaction, between various stimulus features.
The behavior of the dendritic stimulus processor is fully captured by one or more kernels, which provide a characterization of the signal processing that is consistent with the broader cable theory description of dendritic trees. We prove that the neural identification problem, stated in terms of identifying the kernels of the dendritic stimulus processor, is mathematically dual to the neural population encoding problem. Specifically, we show that the collection of spikes produced by a single neuron in multiple experimental trials can be treated as a single multidimensional spike train of a population of neurons encoding the parameters of the dendritic stimulus processor. Using the theory of sampling in reproducing kernel Hilbert spaces, we then derive precise results demonstrating that, during any experiment, the entire neural circuit is projected onto the space of input stimuli and parameters of this projection are faithfully encoded in the spike train. Spike times are shown to correspond to generalized samples, or measurements, of this projection in a system of coordinates that is not fixed but is both neuron- and stimulus-dependent. We examine the theoretical conditions under which it may be possible to reconstruct the dendritic stimulus processor from these samples and derive corresponding experimental conditions for the minimum number of spikes and stimuli that need to be used. We also provide explicit algorithms for reconstructing the kernel projection and demonstrate that, under natural conditions, this projection converges to the true kernel. The developed methodology is quite general and can be applied to a number of neural circuits. In particular, the methods discussed span all sensory modalities, including vision, audition and olfaction, in which external stimuli are typically continuous functions of time and space. 
The results can also be applied to circuits in higher brain centers that receive multi-dimensional spike trains as input stimuli instead of continuous signals. In addition, the modularity of the approach allows one to extend it to mixed-signal circuits processing both continuous and spiking stimuli, to circuits with extensive lateral connections and feedback, as well as to multisensory circuits concurrently processing multiple stimuli of different dimensions, such as audio and video. Another important extension of the approach can be used to estimate the phase response curves of a neuron. All of the theoretical results are accompanied by detailed examples demonstrating the performance of the proposed identification algorithms. We employ both synthetic and naturalistic stimuli, such as natural video and audio, to highlight the power of the approach. Finally, we consider the implications of our work for problems pertaining to neural encoding and decoding and discuss promising directions for future research.
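The generalized-sampling picture above can be made concrete in the simplest case. For an ideal integrate-and-fire spike generator with bias b, integration constant κ and threshold δ (a standard parameterization, used here purely for illustration), the t-transform ties the aggregate dendritic current v(t) to consecutive spike times t_k:

```latex
\int_{t_k}^{t_{k+1}} v(s)\,ds \;=\; \kappa\delta - b\,(t_{k+1}-t_k) \;\equiv\; q_k ,
\qquad \text{i.e.} \qquad
q_k = \bigl\langle v,\; \mathbf{1}_{[t_k,\,t_{k+1}]} \bigr\rangle .
```

Each q_k is thus a generalized sample of v taken against a window determined by the spike times themselves, which is why the measurement coordinates are not fixed but neuron- and stimulus-dependent.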
Reconstruction, identification and implementation methods for spiking neural circuits
Integrate-and-fire (IF) neurons are time encoding machines (TEMs) that convert the amplitude of an analog signal into a non-uniform, strictly increasing sequence of spike times.
This thesis addresses three major issues in the field of computational neuroscience as well as neuromorphic engineering.
The first problem is concerned with the formulation of the encoding performed by an IF neuron. The encoding mechanism is described mathematically by the t-transform equation, whose standard formulation is given by the projection of the stimulus onto a set of input-dependent frame functions. As a consequence, the standard methods reconstruct the input of an IF neuron in a space spanned by a set of functions that depend on the stimulus. The process becomes computationally demanding when performing reconstruction from long sequences of spike times.
The issue is addressed in this work by developing a new framework in which the IF encoding process is formulated as a problem of uniform sampling on a set of input-independent time points. Based on this formulation, new algorithms are introduced for reconstructing the input of an IF neuron belonging to bandlimited as well as shift-invariant spaces. The algorithms are significantly faster, whilst providing a similar level of accuracy, compared to the standard reconstruction methods.
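The ideal IF encoding described above can be sketched in a few lines; the parameter names (kappa, delta, b) and the test signal below are illustrative assumptions, not the thesis's actual settings.

```python
import numpy as np

def if_encode(u, dt, kappa=1.0, delta=0.05, b=0.2):
    """Ideal integrate-and-fire time encoding machine (TEM):
    integrate bias + input and emit a spike whenever the integral
    reaches kappa * delta, then reset by subtraction.  Between
    consecutive spikes t_k and t_{k+1} the t-transform holds:
        int_{t_k}^{t_{k+1}} (b + u(s)) ds = kappa * delta.
    """
    spikes, v = [], 0.0
    for i, x in enumerate(u):
        v += (b + x) * dt              # forward-Euler integration
        if v >= kappa * delta:         # threshold crossing -> spike
            spikes.append((i + 1) * dt)
            v -= kappa * delta         # reset by subtraction
    return np.array(spikes)

# Encode a toy sinusoidal stimulus (illustrative values).
dt = 1e-4
t = np.arange(0.0, 2.0, dt)
u = 0.1 * np.sin(2 * np.pi * 3 * t)
s = if_encode(u, dt)
# The spike times form a strictly increasing sequence whose
# inter-spike intervals are modulated by the stimulus amplitude.
```

Standard reconstruction treats each inter-spike integral as one measurement of the stimulus in an input-dependent frame; the thesis's contribution is recasting these measurements as uniform samples at input-independent time points.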
Another important issue calls for inferring mathematical models of sensory processing systems directly from input-output observations. This problem was previously addressed by performing identification of sensory circuits consisting of linear filters in series with ideal IF neurons, reformulating the identification problem as one of stimulus reconstruction. The result was extended to circuits in which the ideal IF neuron was replaced by more biophysically realistic models, under the additional assumptions that the spiking neuron parameters are known a priori, or that input-output measurements of the spiking neuron are available.
This thesis develops two new identification methodologies for [Nonlinear Filter]-[Ideal IF] and [Linear Filter]-[Leaky IF] circuits consisting of two steps: the estimation of the spiking neuron parameters and the identification of the filter. The methodologies are based on the reformulation of the circuit as a scaled filter in series with a modified spiking neuron.
The first methodology identifies an unknown [Nonlinear Filter]-[Ideal IF] circuit from input-output data. The scaled nonlinear filter is estimated using the NARMAX identification methodology for the reconstructed filter output.
The [Linear Filter]-[Leaky IF] circuit is identified with the second proposed methodology by first estimating the leaky IF parameters with arbitrary precision using specific stimulus sequences. The filter is subsequently identified using the NARMAX identification methodology.
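The full NARMAX methodology involves structure detection and noise modelling; the following is only a minimal NARX-style polynomial least-squares sketch on simulated data, where the system, lags and candidate terms are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical nonlinear system to be identified from I/O data:
#   y(t) = 0.5*y(t-1) + 0.8*u(t-1) - 0.3*u(t-1)**2
N = 500
u = rng.uniform(-1.0, 1.0, N)
y = np.zeros(N)
for k in range(1, N):
    y[k] = 0.5 * y[k-1] + 0.8 * u[k-1] - 0.3 * u[k-1] ** 2

# Candidate polynomial regressors in lagged input/output (NARX structure);
# the last two terms are deliberately spurious.
X = np.column_stack([y[:-1], u[:-1], u[:-1] ** 2,
                     y[:-1] ** 2, y[:-1] * u[:-1]])
theta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
# In this noiseless case least squares recovers the true coefficients
# (0.5, 0.8, -0.3) and drives the spurious terms to ~0.
```

In the thesis's setting the regression target is not observed directly but reconstructed from the spike train first, which is why the spiking-neuron parameters must be estimated before the filter.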
The third problem addressed in this work is the need to develop neuromorphic engineering circuits that perform mathematical computations in the spike domain. To this end, this thesis develops a new representation of the relationship between the time-encoded input and output of a linear filter, where the TEM is represented by an ideal IF neuron. A new practical algorithm is developed based on this representation. The proposed algorithm is significantly faster than the alternative approach, which involves reconstructing the input, simulating the linear filter, and subsequently encoding the resulting output into a spike train.
3D exemplar-based image inpainting in electron microscopy
In electron microscopy (EM) a common problem is the non-availability of data, which causes artefacts in reconstructions. In this thesis the goal is to generate artificial data where it is missing in EM by using exemplar-based inpainting (EBI). We implement an accelerated 3D version tailored to applications in EM, which reduces reconstruction times from days to minutes. We develop intelligent sampling strategies to find optimal data as input for reconstruction methods. Further, we investigate approaches to reduce electron dose and acquisition time. Sparse sampling followed by inpainting is the most promising approach. As common evaluation measures may lead to misinterpretation of results in EM and falsify a subsequent analysis, we propose to use application-driven metrics and demonstrate this in a segmentation task. A further application of our technique is the artificial generation of projections in tilt-based EM. EBI is used to generate missing projections, such that the full angular range is covered. Subsequent reconstructions are significantly enhanced in terms of resolution, which facilitates further analysis of samples. In conclusion, EBI proves promising when used as an additional data-generation step to tackle the non-availability of data in EM, as evaluated in selected applications. Enhancing adaptive sampling methods and refining EBI, especially considering their mutual influence, promises higher throughput in EM at a lower electron dose without compromising quality.
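The exemplar-based principle can be illustrated with a deliberately simplified greedy 2D sketch: no priority term, a single scale, and a toy striped image. All names and parameters are illustrative, not the thesis's implementation.

```python
import numpy as np

def inpaint(img, mask, p=2):
    """Greedy exemplar-based inpainting sketch.  `mask` is True where
    pixels are missing; assumes the hole stays at least p pixels away
    from the image border."""
    img, mask = img.astype(float).copy(), mask.copy()
    h, w = img.shape
    # Centres of all (2p+1)x(2p+1) patches that are fully known.
    sources = [(i, j) for i in range(p, h - p) for j in range(p, w - p)
               if not mask[i-p:i+p+1, j-p:j+p+1].any()]
    while mask.any():
        # Fill-front pixel whose patch has the most known pixels.
        cand = np.argwhere(mask)
        known = [(~mask[i-p:i+p+1, j-p:j+p+1]).sum() for i, j in cand]
        ti, tj = cand[int(np.argmax(known))]
        tpatch = img[ti-p:ti+p+1, tj-p:tj+p+1]
        tknown = ~mask[ti-p:ti+p+1, tj-p:tj+p+1]
        # Best source exemplar by SSD over the target's known pixels.
        best, best_err = None, np.inf
        for si, sj in sources:
            spatch = img[si-p:si+p+1, sj-p:sj+p+1]
            err = ((spatch - tpatch)[tknown] ** 2).sum()
            if err < best_err:
                best, best_err = (si, sj), err
        si, sj = best
        spatch = img[si-p:si+p+1, sj-p:sj+p+1]
        # Copy the missing pixels from the exemplar and update the mask.
        hole = ~tknown
        img[ti-p:ti+p+1, tj-p:tj+p+1][hole] = spatch[hole]
        mask[ti-p:ti+p+1, tj-p:tj+p+1] = False
    return img

# Usage on a hypothetical test image: periodic vertical stripes
# with a square hole, which EBI can fill from the repeating texture.
orig = np.tile(np.arange(20) % 4, (20, 1)).astype(float)
hole = np.zeros((20, 20), dtype=bool)
hole[8:12, 8:12] = True
corrupted = orig.copy()
corrupted[hole] = 0.0
restored = inpaint(corrupted, hole, p=2)
```

The thesis's 3D-accelerated version applies the same copy-from-best-exemplar idea to volumetric EM data, where the exhaustive source search above is the part that must be made fast.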
Learning Robust Features and Latent Representations for Single View 3D Pose Estimation of Humans and Objects
Estimating the 3D poses of rigid and articulated bodies is one of the fundamental problems of Computer Vision. It has a broad range of applications including augmented reality, surveillance, animation and human-computer interaction. Despite the ever-growing demand driven by these applications, predicting 3D pose from a 2D image is a challenging and ill-posed problem due to the loss of depth information during projection from 3D to 2D. Although the 3D pose estimation problem has been studied for years, it still remains unsolved. In this thesis, we propose a variety of ways to tackle the 3D pose estimation problem, both for articulated human bodies and rigid object bodies, by learning robust features and latent representations.
First, we present a novel video-based approach that exploits spatiotemporal features for 3D human pose estimation in a discriminative regression scheme. While early approaches typically account for motion information by temporally regularizing noisy pose estimates in individual frames, we demonstrate that taking into account motion information very early in the modeling process with spatiotemporal features yields significant performance improvements. We further propose a CNN-based motion compensation approach that stabilizes and centralizes the human body in the bounding boxes of consecutive frames to increase the reliability of spatiotemporal features. This then allows us to effectively overcome ambiguities and improve pose estimation accuracy.
Second, we develop a novel Deep Learning framework for structured prediction of 3D human pose. Our approach relies on an auto-encoder to learn a high-dimensional latent pose representation that accounts for joint dependencies. We combine traditional CNNs for supervised learning with auto-encoders for structured learning and demonstrate that our approach outperforms the existing ones both in terms of structure preservation and prediction accuracy.
Third, we propose a 3D human pose estimation approach that relies on a two-stream neural network architecture to simultaneously exploit 2D joint location heatmaps and image features. We show that 2D pose of a person, predicted in terms of heatmaps by a fully convolutional network, provides valuable cues to disambiguate challenging poses and results in increased pose estimation accuracy. We further introduce a novel and generic trainable fusion scheme, which automatically learns where and how to fuse the features extracted from two different input modalities that a two-stream neural network operates on. Our trainable fusion framework selects the optimal network architecture on-the-fly and improves upon standard hard-coded network architectures.
Fourth, we propose an efficient approach to estimate the 3D pose of objects from a single RGB image. Existing methods typically detect 2D bounding boxes and then predict the object pose in a pipelined fashion. The redundancy between different parts of the architecture makes such methods computationally expensive. Moreover, the final pose estimation accuracy depends on the accuracy of the intermediate 2D object detection step. In our method, the object is classified and its pose is regressed in a single shot from the full image using a single, compact fully convolutional neural network. Our approach achieves state-of-the-art accuracy without requiring any costly pose refinement step and runs in real time at 50 fps on a modern GPU, which is at least 5X faster than the state of the art.