Sparse neural networks with large learning diversity
Coded recurrent neural networks with three levels of sparsity are introduced.
The first level is related to the size of messages, which is much smaller than the
number of available neurons. The second one is provided by a particular coding
rule, acting as a local constraint in the neural activity. The third one is a
characteristic of the low final connection density of the network after the
learning phase. Though the proposed network is very simple, since it is based on
binary neurons and binary connections, it is able to learn a large number of
messages and recall them, even in the presence of strong erasures. The performance
of the network is assessed as a classifier and as an associative memory.
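A minimal Python sketch of this kind of storage-and-recall scheme is given below, assuming a cluster structure with winner-take-all retrieval; the cluster and message sizes are illustrative, not the authors' exact construction.

# Minimal sketch of a binary, clique-style associative memory in the spirit of the
# abstract (binary neurons, binary connections). The cluster/neuron sizes and the
# winner-take-all retrieval are illustrative assumptions.
import numpy as np

C, L = 8, 16                               # C clusters, L binary neurons per cluster (assumed sizes)
W = np.zeros((C * L, C * L), dtype=bool)   # binary connection matrix

def store(message):
    """Store a message (one symbol in 0..L-1 per cluster) as a clique of connections."""
    units = [c * L + s for c, s in enumerate(message)]
    for i in units:
        for j in units:
            if i != j:
                W[i, j] = True

def recall(partial):
    """Recall from a partially erased message (None marks an erased symbol)."""
    active = [c * L + s for c, s in enumerate(partial) if s is not None]
    recovered = []
    for c, s in enumerate(partial):
        if s is not None:
            recovered.append(s)
            continue
        # score each candidate neuron in the erased cluster by its support from active units
        scores = [W[c * L + cand, active].sum() for cand in range(L)]
        recovered.append(int(np.argmax(scores)))   # winner-take-all within the cluster
    return recovered

store([3, 7, 1, 0, 9, 4, 2, 5])
print(recall([3, None, 1, 0, None, 4, 2, 5]))      # -> [3, 7, 1, 0, 9, 4, 2, 5]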
Entropy of Overcomplete Kernel Dictionaries
In signal analysis and synthesis, linear approximation theory considers a
linear decomposition of any given signal into a set of atoms, collected into a
so-called dictionary. Relevant sparse representations are obtained by relaxing
the orthogonality condition of the atoms, yielding overcomplete dictionaries
with an extended number of atoms. More generally than the linear decomposition,
overcomplete kernel dictionaries provide an elegant nonlinear extension by
defining the atoms through a mapping kernel function (e.g., the Gaussian
kernel). Models based on such kernel dictionaries are used in neural networks,
Gaussian processes, and online learning with kernels.
The quality of an overcomplete dictionary is evaluated with a diversity
measure, such as the distance, the approximation, the coherence, and the Babel measures.
In this paper, we develop a framework to examine overcomplete kernel
dictionaries with the entropy from information theory. Indeed, a higher value
of the entropy is associated with a more uniform spread of the atoms over the
space. For each of the aforementioned diversity measures, we derive lower
bounds on the entropy. Several definitions of the entropy are examined, with an
extensive analysis in both the input space and the mapped feature space.
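As a rough illustration of the entropy viewpoint, the sketch below scores a dictionary of atoms with a quadratic (Renyi-type) entropy estimate built from a Gaussian kernel Gram matrix; this particular estimator and the kernel bandwidth are assumptions for illustration, since the paper examines several entropy definitions.

# Illustrative sketch: an entropy-like score for the "spread" of a kernel dictionary.
# The quadratic (Renyi) estimator and sigma = 1.0 are assumptions, not the paper's
# exact definitions.
import numpy as np

def gaussian_gram(atoms, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)) of the dictionary atoms."""
    sq = np.sum((atoms[:, None, :] - atoms[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def quadratic_entropy(atoms, sigma=1.0):
    """Higher values indicate atoms spread more uniformly over the space."""
    K = gaussian_gram(atoms, sigma)
    return -np.log(K.mean())

rng = np.random.default_rng(0)
clustered = rng.normal(0.0, 0.1, size=(20, 2))     # atoms packed together
spread = rng.uniform(-3.0, 3.0, size=(20, 2))      # atoms covering the space
print(quadratic_entropy(clustered), quadratic_entropy(spread))   # spread > clustered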
Nonlinear Hebbian learning as a unifying principle in receptive field formation
The development of sensory receptive fields has been modeled in the past by a
variety of models including normative models such as sparse coding or
independent component analysis and bottom-up models such as spike-timing
dependent plasticity or the Bienenstock-Cooper-Munro model of synaptic
plasticity. Here we show that the above variety of approaches can all be
unified into a single common principle, namely Nonlinear Hebbian Learning. When
Nonlinear Hebbian Learning is applied to natural images, receptive field shapes
were strongly constrained by the input statistics and preprocessing, but
exhibited only modest variation across different choices of nonlinearities in
neuron models or synaptic plasticity rules. Neither overcompleteness nor sparse
network activity is necessary for the development of localized receptive
fields. The analysis of alternative sensory modalities, such as auditory models
or V2 development, leads to the same conclusions. In all examples, receptive
fields can be predicted a priori by reformulating an abstract model as
nonlinear Hebbian learning. Thus nonlinear Hebbian learning and natural
statistics can account for many aspects of receptive field formation across
models and sensory modalities.
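A minimal sketch of a nonlinear Hebbian update of the form dw proportional to g(w.x) x with a norm constraint is given below; the cubic nonlinearity and the synthetic inputs are illustrative assumptions, standing in for whitened natural stimuli.

# Minimal sketch of a nonlinear Hebbian rule. The cubic nonlinearity g and the
# synthetic Laplace-distributed inputs are illustrative assumptions; the paper's
# point is that many choices of g yield similar receptive fields on natural inputs.
import numpy as np

rng = np.random.default_rng(0)
X = rng.laplace(size=(10000, 16))          # stand-in for whitened sensory inputs
X -= X.mean(axis=0)

w = rng.normal(size=16)
w /= np.linalg.norm(w)
eta = 1e-3
g = lambda y: y ** 3                       # one possible nonlinearity

for x in X:
    y = w @ x
    w += eta * g(y) * x                    # nonlinear Hebbian update
    w /= np.linalg.norm(w)                 # keep the weight vector normalized

print(w)                                   # learned "receptive field" direction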
Data-Driven Sparse Structure Selection for Deep Neural Networks
Deep convolutional neural networks have demonstrated their extraordinary power on
various tasks. However, it is still very challenging to deploy state-of-the-art
models into real-world applications due to their high computational complexity.
How can we design a compact and effective network without massive experiments
and expert knowledge? In this paper, we propose a simple and effective
framework to learn and prune deep models in an end-to-end manner. In our
framework, a new type of parameter -- the scaling factor -- is first introduced to
scale the outputs of specific structures, such as neurons, groups, or residual
blocks. Then we add sparsity regularizations on these factors and solve this
optimization problem by a modified stochastic Accelerated Proximal Gradient
(APG) method. By forcing some of the factors to zero, we can safely remove the
corresponding structures, thus pruning the unimportant parts of a CNN. Compared
with other structure selection methods that may need thousands of trials or
iterative fine-tuning, our method is trained fully end-to-end in one training
pass without bells and whistles. We evaluate our method, Sparse Structure
Selection, with several state-of-the-art CNNs, and demonstrate very promising
results with adaptive depth and width selection.
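The sketch below illustrates the scaling-factor idea on a single residual block, with a soft-thresholding (proximal) step that can drive the factor exactly to zero; the plain proximal-SGD update is a simplification of the accelerated APG scheme described above, and the shapes, loss, and hyperparameters are placeholders.

# Sketch: a learnable scalar gates a block's output, and a soft-thresholding step
# drives some scalars exactly to zero so the corresponding blocks can be pruned.
# A simplified proximal step stands in for the paper's accelerated (APG) update.
import torch
import torch.nn as nn

class GatedBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
        self.scale = nn.Parameter(torch.ones(1))   # the sparsity-regularized scaling factor

    def forward(self, x):
        return x + self.scale * self.body(x)       # residual branch gated by the factor

def prox_l1_(param, lam, lr):
    """In-place soft-thresholding: proximal operator of lam * |.| after a gradient step."""
    with torch.no_grad():
        param.copy_(param.sign() * torch.clamp(param.abs() - lr * lam, min=0.0))

# toy training step (assumed shapes and loss, for illustration only)
block = GatedBlock(8)
opt = torch.optim.SGD(block.parameters(), lr=0.1)
x = torch.randn(4, 8, 16, 16)
loss = block(x).pow(2).mean()
loss.backward()
opt.step()
prox_l1_(block.scale, lam=0.05, lr=0.1)            # sparsify the scaling factor
print(block.scale.item())                          # zero factors mark prunable blocks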
Analyzing sparse dictionaries for online learning with kernels
Many signal processing and machine learning methods share essentially the
same linear-in-the-parameter model, with as many parameters as available
samples, as in kernel-based machines. Sparse approximation is essential in many
disciplines, with new challenges emerging in online learning with kernels. To
this end, several sparsity measures have been proposed in the literature to
quantify sparse dictionaries and to construct relevant ones, the most prominent
being the distance, the approximation, the coherence, and the Babel
measures. In this paper, we analyze sparse dictionaries based on these
measures. By conducting an eigenvalue analysis, we show that these sparsity
measures share many properties, including the linear independence condition and
inducing a well-posed optimization problem. Furthermore, we prove that there
exists a quasi-isometry between the parameter (i.e., dual) space and the
dictionary's induced feature space.
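The sketch below shows a coherence-based rule for growing such a dictionary online: a sample is admitted only if its largest kernel value with the current atoms stays below a threshold; the Gaussian kernel and the threshold value are illustrative assumptions.

# Sketch of coherence-based sparsification for an online kernel dictionary.
# The Gaussian kernel and mu0 = 0.5 are illustrative assumptions; the paper analyzes
# this measure alongside the distance, approximation, and Babel measures.
import numpy as np

def kernel(x, y, sigma=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def build_dictionary(stream, mu0=0.5, sigma=1.0):
    atoms = []
    for x in stream:
        if not atoms or max(kernel(x, a, sigma) for a in atoms) <= mu0:
            atoms.append(x)            # sufficiently incoherent with existing atoms
    return atoms

rng = np.random.default_rng(0)
samples = rng.uniform(-2.0, 2.0, size=(500, 2))
D = build_dictionary(samples)
print(len(D), "atoms retained out of", len(samples))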