Extensions of independent component analysis for natural image data
An understanding of the statistical properties of natural images is useful for any kind of processing to be performed on them. Natural image statistics are, however, in many ways as complex as the world which they depict. Fortunately, the dominant low-level statistics of images are sufficient for many different image processing goals. A lot of research has been devoted to second order statistics of natural images over the years.
Independent component analysis is a statistical tool for analyzing higher than second order statistics of data sets. It attempts to describe the observed data as a linear combination of independent, latent sources. Despite its simplicity, it has provided valuable insights into many types of natural data. With natural image data, it gives a sparse basis useful for efficient description of the data. Connections between this description and early mammalian visual processing have been noticed.
The main focus of this work is to extend the known results of applying independent component analysis on natural images. We explore different imaging techniques, develop algorithms for overcomplete cases, and study the dependencies between the components by using a model that finds a topographic ordering for the components as well as by conditioning the statistics of a component on the activity of another. An overview is provided of the associated problem field, and it is discussed how these relatively small results may eventually be a part of a more complete solution to the problem of vision.
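The linear mixing model described in this abstract can be illustrated with a minimal two-dimensional sketch. This is not the algorithm used in the work itself: it assumes two Laplacian sources, a hypothetical mixing matrix, Cholesky whitening, and a brute-force search for the rotation that maximizes the summed absolute excess kurtosis. All function names are illustrative.

```python
import math
import random


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (zero for a Gaussian)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def whiten_2d(x0, x1):
    """Center the data and remove second-order correlations (Cholesky whitening)."""
    n = len(x0)
    mu0, mu1 = sum(x0) / n, sum(x1) / n
    c0 = [v - mu0 for v in x0]
    c1 = [v - mu1 for v in x1]
    s00 = sum(a * a for a in c0) / n
    s01 = sum(a * b for a, b in zip(c0, c1)) / n
    s11 = sum(b * b for b in c1) / n
    # Lower-triangular factor L with L L^T equal to the covariance matrix.
    l00 = math.sqrt(s00)
    l10 = s01 / l00
    l11 = math.sqrt(s11 - l10 * l10)
    # Solve L z = c by forward substitution.
    z0 = [a / l00 for a in c0]
    z1 = [(b - l10 * a) / l11 for a, b in zip(z0, c1)]
    return z0, z1


def ica_2d(x0, x1, steps=90):
    """After whitening, pick the rotation whose outputs are least Gaussian."""
    z0, z1 = whiten_2d(x0, x1)
    best = None
    for k in range(steps):
        theta = (math.pi / 2) * k / steps
        c, s = math.cos(theta), math.sin(theta)
        y0 = [c * a + s * b for a, b in zip(z0, z1)]
        y1 = [-s * a + c * b for a, b in zip(z0, z1)]
        score = abs(excess_kurtosis(y0)) + abs(excess_kurtosis(y1))
        if best is None or score > best[0]:
            best = (score, y0, y1)
    return best[1], best[2]


def demo(seed=0, n=2000):
    """Mix two Laplacian sources and report how well each one is recovered."""
    rng = random.Random(seed)

    def laplace():
        magnitude = rng.expovariate(1.0)
        return magnitude if rng.random() < 0.5 else -magnitude

    s0 = [laplace() for _ in range(n)]
    s1 = [laplace() for _ in range(n)]
    # Hypothetical mixing matrix [[1.0, 0.6], [0.4, 1.0]].
    x0 = [a + 0.6 * b for a, b in zip(s0, s1)]
    x1 = [0.4 * a + b for a, b in zip(s0, s1)]
    y0, y1 = ica_2d(x0, x1)
    match0 = max(abs(correlation(y0, s0)), abs(correlation(y0, s1)))
    match1 = max(abs(correlation(y1, s0)), abs(correlation(y1, s1)))
    return match0, match1
```

Calling `demo()` returns, for each recovered component, its largest absolute correlation with a true source. The sources are only identifiable up to order and sign, which is why the comparison takes the maximum of absolute values.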
A sparse coding model with synaptically local plasticity and spiking neurons can account for the diverse shapes of V1 simple cell receptive fields
Sparse coding algorithms trained on natural images can accurately predict the
features that excite visual cortical neurons, but it is not known whether such
codes can be learned using biologically realistic plasticity rules. We have
developed a biophysically motivated spiking network, relying solely on
synaptically local information, that can predict the full diversity of V1
simple cell receptive field shapes when trained on natural images. This
represents the first demonstration that sparse coding principles, operating
within the constraints imposed by cortical architecture, can successfully
reproduce these receptive fields. We further prove, mathematically, that
sparseness and decorrelation are the key ingredients that allow for
synaptically local plasticity rules to optimize a cooperative, linear
generative image model formed by the neural representation. Finally, we discuss
several interesting emergent properties of our network, with the intent of
bridging the gap between theoretical and experimental studies of visual cortex.
Comment: 33 pages, 6 figures. To appear in PLoS Computational Biology. Some of
these data were presented by author JZ at the 2011 CoSyNe meeting in Salt
Lake Cit
Natural Image Coding in V1: How Much Use is Orientation Selectivity?
Orientation selectivity is the most striking feature of simple cell coding in
V1 which has been shown to emerge from the reduction of higher-order
correlations in natural images in a large variety of statistical image models.
The most parsimonious one among these models is linear Independent Component
Analysis (ICA), whereas second-order decorrelation transformations such as
Principal Component Analysis (PCA) do not yield oriented filters. Because of
this finding it has been suggested that the emergence of orientation
selectivity may be explained by higher-order redundancy reduction. In order to
assess the tenability of this hypothesis, it is an important empirical question
how much more redundancies can be removed with ICA in comparison to PCA, or
other second-order decorrelation methods. This question has not yet been
settled, as over the last ten years contradicting results have been reported
ranging from less than five to more than a hundred percent extra gain for ICA.
Here, we aim at resolving this conflict by presenting a very careful and
comprehensive analysis using three evaluation criteria related to redundancy
reduction: In addition to the multi-information and the average log-loss we
compute, for the first time, complete rate-distortion curves for ICA in
comparison with PCA. Without exception, we find that the advantage of the ICA
filters is surprisingly small. Furthermore, we show that a simple spherically
symmetric distribution with only two parameters can fit the data even better
than the probabilistic model underlying ICA. Since spherically symmetric models
are agnostic with respect to the specific filter shapes, we conclude that
orientation selectivity is unlikely to play a critical role for redundancy
reduction
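The point about spherically symmetric models can be made concrete with a small sketch. It does not reproduce the two-parameter distribution from the abstract. Instead it assumes a simple Gaussian scale mixture (an isotropic Gaussian multiplied by a random scale of 0.5 or 2.0, both values chosen only for illustration) and shows that the coordinates are linearly uncorrelated yet still dependent, and that this dependence looks about the same after rotating the axes.

```python
import math
import random


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def sample_scale_mixture(n, rng):
    """Spherically symmetric 2-D samples: isotropic Gaussian times a shared scale."""
    xs, ys = [], []
    for _ in range(n):
        scale = rng.choice((0.5, 2.0))  # assumed scale values, illustration only
        xs.append(scale * rng.gauss(0.0, 1.0))
        ys.append(scale * rng.gauss(0.0, 1.0))
    return xs, ys


def rotate(xs, ys, theta):
    """Rotate every sample by the angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    rx = [c * x + s * y for x, y in zip(xs, ys)]
    ry = [-s * x + c * y for x, y in zip(xs, ys)]
    return rx, ry


def energy_correlation(xs, ys):
    """Correlation between squared coordinates, a simple higher-order dependence."""
    return correlation([x * x for x in xs], [y * y for y in ys])


def summary(seed=0, n=20000):
    rng = random.Random(seed)
    xs, ys = sample_scale_mixture(n, rng)
    rx, ry = rotate(xs, ys, math.pi / 4)
    return {
        "linear": correlation(xs, ys),
        "energy": energy_correlation(xs, ys),
        "energy_rotated": energy_correlation(rx, ry),
    }
```

In `summary()`, the `linear` entry should be close to zero while both energy correlations stay clearly positive. Because the density depends only on the radius, no choice of orthogonal filters removes the remaining dependence, which is the sense in which such models are agnostic to filter shape.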
Are v1 simple cells optimized for visual occlusions? : A comparative study
Abstract: Simple cells in primary visual cortex were famously found to respond to low-level image components such as edges. Sparse coding and independent component analysis (ICA) emerged as the standard computational models for simple cell coding because they linked their receptive fields to the statistics of visual stimuli. However, a salient feature of image statistics, occlusions of image components, is not considered by these models. Here we ask if occlusions have an effect on the predicted shapes of simple cell receptive fields. We use a comparative approach to answer this question and investigate two models for simple cells: a standard linear model and an occlusive model. For both models we simultaneously estimate optimal receptive fields, sparsity and stimulus noise. The two models are identical except for their component superposition assumption. We find the image encoding and receptive fields predicted by the models to differ significantly. While both models predict many Gabor-like fields, the occlusive model predicts a much sparser encoding and high percentages of ‘globular’ receptive fields. This relatively new center-surround type of simple cell response has been observed since reverse correlation came into use in experimental studies. While high percentages of ‘globular’ fields can be obtained using specific choices of sparsity and overcompleteness in linear sparse coding, no or only low proportions are reported in the vast majority of studies on linear models (including all ICA models). Likewise, for the linear model investigated here and optimal sparsity, only low proportions of ‘globular’ fields are observed. In comparison, the occlusive model robustly infers high proportions and can match the experimentally observed high proportions of ‘globular’ fields well. Our computational study, therefore, suggests that ‘globular’ fields may be evidence for an optimal encoding of visual occlusions in primary visual cortex.
Author Summary: The statistics of our visual world are dominated by occlusions. Almost every image processed by our brain consists of mutually occluding objects, animals and plants. Our visual cortex is optimized through evolution and throughout our lifespan for such stimuli. Yet, the standard computational models of primary visual processing do not consider occlusions. In this study, we ask what effects visual occlusions may have on predicted response properties of simple cells which are the first cortical processing units for images. Our results suggest that recently observed differences between experiments and predictions of the standard simple cell models can be attributed to occlusions. The most significant consequence of occlusions is the prediction of many cells sensitive to center-surround stimuli. Experimentally, large quantities of such cells have been observed since new techniques (reverse correlation) came into use. Without occlusions, they are only obtained for specific settings and none of the seminal studies (sparse coding, ICA) predicted such fields. In contrast, the new type of response naturally emerges as soon as occlusions are considered. In comparison with recent in vivo experiments we find that occlusive models are consistent with the high percentages of center-surround simple cells observed in macaque monkeys, ferrets and mice
Online Multi-Stage Deep Architectures for Feature Extraction and Object Recognition
Multi-stage visual architectures have recently found success in achieving high classification accuracies over image datasets with large variations in pose, lighting, and scale. Inspired by techniques currently at the forefront of deep learning, such architectures are typically composed of one or more layers of preprocessing, feature encoding, and pooling to extract features from raw images. Training these components traditionally relies on large sets of patches that are extracted from a potentially large image dataset. In this context, high-dimensional feature space representations are often helpful for obtaining the best classification performances and providing a higher degree of invariance to object transformations. Large datasets with high-dimensional features complicate the implementation of visual architectures in memory constrained environments. This dissertation constructs online learning replacements for the components within a multi-stage architecture and demonstrates that the proposed replacements (namely fuzzy competitive clustering, an incremental covariance estimator, and multi-layer neural network) can offer performance competitive with their offline batch counterparts while providing a reduced memory footprint. The online nature of this solution allows for the development of a method for adjusting parameters within the architecture via stochastic gradient descent. Testing over multiple datasets shows the potential benefits of this methodology when appropriate priors on the initial parameters are unknown. Alternatives to batch based decompositions for a whitening preprocessing stage which take advantage of natural image statistics and allow simple dictionary learners to work well in the problem domain are also explored. Expansions of the architecture using additional pooling statistics and multiple layers are presented and indicate that larger codebook sizes are not the only step forward to higher classification accuracies. 
Experimental results from these expansions further indicate the important role of sparsity and appropriate encodings within multi-stage visual feature extraction architectures
Role of homeostasis in learning sparse representations
Neurons in the input layer of primary visual cortex in primates develop
edge-like receptive fields. One approach to understanding the emergence of this
response is to state that neural activity has to efficiently represent sensory
data with respect to the statistics of natural scenes. Furthermore, it is
believed that such an efficient coding is achieved using a competition across
neurons so as to generate a sparse representation, that is, where a relatively
small number of neurons are simultaneously active. Indeed, different models of
sparse coding, coupled with Hebbian learning and homeostasis, have been
proposed that successfully match the observed emergent response. However, the
specific role of homeostasis in learning such sparse representations is still
largely unknown. By quantitatively assessing the efficiency of the neural
representation during learning, we derive a cooperative homeostasis mechanism
that optimally tunes the competition between neurons within the sparse coding
algorithm. We apply this homeostasis while learning small patches taken from
natural images and compare its efficiency with state-of-the-art algorithms.
Results show that while different sparse coding algorithms give similar coding
results, the homeostasis provides an optimal balance for the representation of
natural images within the population of neurons. Competition in sparse coding
is optimized when it is fair. By contributing to optimizing statistical
competition across neurons, homeostasis is crucial in providing a more
efficient solution to the emergence of independent components
Independent Component Analysis in Spiking Neurons
Although models based on independent component analysis (ICA) have been successful in explaining various properties of sensory coding in the cortex, it remains unclear how networks of spiking neurons using realistic plasticity rules can realize such computation. Here, we propose a biologically plausible mechanism for ICA-like learning with spiking neurons. Our model combines spike-timing dependent plasticity and synaptic scaling with an intrinsic plasticity rule that regulates neuronal excitability to maximize information transmission. We show that a stochastically spiking neuron learns one independent component for inputs encoded either as rates or using spike-spike correlations. Furthermore, different independent components can be recovered, when the activity of different neurons is decorrelated by adaptive lateral inhibition
Understanding the effect of correlated colour temperatures on spatio-chromatic properties of natural images
Despite the natural occurrence of global and local daylight changes in natural scenes, the human visual system
typically adapts well to these changes and develops stable colour perception. In a previous study, the influence of
daylight characterized by its Correlated Colour Temperatures (CCT) on different chromatic descriptors was
analysed (Ojeda et al., 2017). The results showed that chromatic information is almost constant for CCT values
above 14,000 K, with local extremes occurring in the range of low CCTs. The aim of this work is to extend the
analysis of the CCT dependence of the illuminant to those that consider the spatio-chromatic structure, including
second order descriptors (gradients, spectral slope, spectral signature, and PCA) and higher order descriptors
(kurtosis, skewness, and number of relevant colours). Our results show that most of the descriptors exhibit
horizontal asymptotic behaviour for CCTs above 15,000 K and local extremes in the range of 3,900 K-9,600 K.
For those descriptors that could be analysed in CIELAB space, sufficient statistical evidence was obtained to
consider skewness, kurtosis, and the independent spectral slopes of the L* channel as equal in the range of CCTs
used. However, the slight variations in spectral signatures and the directions of the principal components when
applying PCA to image patches are not statistically significant and cannot be considered equal under different
illuminants. The number of relevant colours (NRC) exhibits sensitivity to temperature variations and behaves
similarly to the other descriptors, due to its small number.
Computational Colour and Spectral Imaging Erasmus+
master programme (610605-EPP-1-2019-1-NO-EPPKA1-JMD-MOB
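Two of the higher order descriptors named in this abstract, skewness and kurtosis, can be computed from central moments. The sketch below is only an illustration of the definitions, assuming population (biased) moments and excess kurtosis. The abstract does not state which estimator convention the study used, and the sample values are made up.

```python
def central_moment(values, order):
    """Mean of (v - mean) ** order over the sample (population convention)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Third standardized moment: zero for symmetric data, positive for a long right tail."""
    m2 = central_moment(values, 2)
    return central_moment(values, 3) / m2 ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3, so a Gaussian scores zero."""
    m2 = central_moment(values, 2)
    return central_moment(values, 4) / (m2 * m2) - 3.0


# Hypothetical lightness samples, e.g. L* values drawn from one image patch.
patch = [52.0, 55.0, 51.0, 53.0, 90.0, 54.0, 52.0, 53.0]
patch_skewness = skewness(patch)
patch_kurtosis = excess_kurtosis(patch)
```

A single bright outlier such as the value 90.0 in `patch` pulls both descriptors upward, which is why they are sensitive to sparse, high-contrast structure that second order descriptors average away.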
Theory of representation learning in cortical neural networks
Our brain continuously self-organizes to construct and maintain an internal representation of the world based on the information arriving through sensory stimuli. Remarkably, cortical areas related to different sensory modalities appear to share the same functional unit, the neuron, and develop through the same learning mechanism, synaptic plasticity. This motivates the conjecture of a unifying theory to explain cortical representational learning across sensory modalities. In this thesis we present theories and computational models of learning and optimization in neural networks, postulating functional properties of synaptic plasticity that support the apparent universal learning capacity of cortical networks. In the past decades, a variety of theories and models have been proposed to describe receptive field formation in sensory areas. They include normative models such as sparse coding, and bottom-up models such as spike-timing dependent plasticity. We bring together candidate explanations by demonstrating that in fact a single principle is sufficient to explain receptive field development. First, we show that many representative models of sensory development are in fact implementing variations of a common principle: nonlinear Hebbian learning. Second, we reveal that nonlinear Hebbian learning is sufficient for receptive field formation through sensory inputs. A surprising result is that our findings are independent of specific details, and allow for robust predictions of the learned receptive fields. Thus nonlinear Hebbian learning and natural statistics can account for many aspects of receptive field formation across models and sensory modalities. The Hebbian learning theory substantiates that synaptic plasticity can be interpreted as an optimization procedure, implementing stochastic gradient descent. In stochastic gradient descent inputs arrive sequentially, as in sensory streams. 
However, individual data samples have very little information about the correct learning signal, and it becomes a fundamental problem to know how many samples are required for reliable synaptic changes. Through estimation theory, we develop a novel adaptive learning rate model, that adapts the magnitude of synaptic changes based on the statistics of the learning signal, enabling an optimal use of data samples. Our model has a simple implementation and demonstrates improved learning speed, making this a promising candidate for large artificial neural network applications. The model also makes predictions on how cortical plasticity may modulate synaptic plasticity for optimal learning. The optimal sampling size for reliable learning allows us to estimate optimal learning times for a given model. We apply this theory to derive analytical bounds on times for the optimization of synaptic connections. First, we show this optimization problem to have exponentially many saddle-nodes, which lead to small gradients and slow learning. Second, we show that the number of input synapses to a neuron modulates the magnitude of the initial gradient, determining the duration of learning. Our final result reveals that the learning duration increases supra-linearly with the number of synapses, suggesting an effective limit on synaptic connections and receptive field sizes in developing neural networks
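The idea that a nonlinear Hebbian update acts as an optimization procedure can be sketched in a toy setting. The example below is not the model from the thesis. It assumes whitened two-dimensional inputs with one sparse (Laplacian) and one Gaussian coordinate, a cubic nonlinearity f(u) = u^3, weight normalization after every update, and batch-averaged updates in place of sequential samples so that the run is repeatable. The names and parameter values are illustrative.

```python
import math
import random


def laplace_unit(rng):
    """Laplacian sample with unit variance (scale 1/sqrt(2))."""
    magnitude = rng.expovariate(math.sqrt(2.0))
    return magnitude if rng.random() < 0.5 else -magnitude


def make_inputs(n, rng):
    """Whitened inputs: first coordinate sparse, second coordinate Gaussian."""
    return [(laplace_unit(rng), rng.gauss(0.0, 1.0)) for _ in range(n)]


def hebbian_step(w, inputs, rate):
    """One batch update w <- normalize(w + rate * mean(x * f(w . x))) with f(u) = u**3."""
    g0 = g1 = 0.0
    for x0, x1 in inputs:
        u = w[0] * x0 + w[1] * x1   # postsynaptic activation
        f = u ** 3                  # nonlinear function of the activation
        g0 += x0 * f                # Hebbian product: input times f(output)
        g1 += x1 * f
    n = len(inputs)
    w0 = w[0] + rate * g0 / n
    w1 = w[1] + rate * g1 / n
    norm = math.hypot(w0, w1)       # keep the weight vector at unit length
    return (w0 / norm, w1 / norm)


def learn(seed=0, n=4000, steps=100, rate=0.1):
    """Run the rule from a fixed oblique starting weight and return the result."""
    rng = random.Random(seed)
    inputs = make_inputs(n, rng)
    w = (0.6, 0.8)
    for _ in range(steps):
        w = hebbian_step(w, inputs, rate)
    return w
```

With the cubic nonlinearity the update climbs the fourth moment of the output, so the weight vector returned by `learn()` should align with the sparse input coordinate. Other nonlinearities change the objective being optimized, and the sign of the update may need to change with them.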