Learning to Discriminate Through Long-Term Changes of Dynamical Synaptic Transmission
Short-term synaptic plasticity is modulated by long-term synaptic
changes. There is, however, no general agreement on the computational
role of this interaction. Here, we derive a learning rule for the release
probability and the maximal synaptic conductance in a circuit model
with combined recurrent and feedforward connections that allows learning
to discriminate among natural inputs. Short-term synaptic plasticity
thereby provides a nonlinear expansion of the input space of a linear
classifier, whereas the random recurrent network serves to decorrelate
the expanded input space. Computer simulations reveal that the twofold
increase in the number of input dimensions through short-term synaptic
plasticity improves the performance of a standard perceptron by up to 100%.
The distributions of release probabilities and maximal synaptic conductances
at the capacity limit strongly depend on the balance between excitation
and inhibition. The model also suggests a new computational
interpretation of spikes evoked by stimuli outside the classical receptive
field. These neuronal activities may reflect decorrelation of the expanded
stimulus space by intracortical synaptic connections.
A CASE STUDY ON SUPPORT VECTOR MACHINES VERSUS ARTIFICIAL NEURAL NETWORKS
The capability of artificial neural networks for pattern recognition in real-world problems is well known. In recent years, the support vector machine has been advocated for its structural risk minimization, which leads to tolerance margins around decision boundaries. The structure and performance of these pattern classifiers depend on the feature dimension and the training data size. The objective of this research is to compare these pattern recognition systems based on a case study. The particular case considered is the classification of hypertensive and normotensive right ventricle (RV) shapes obtained from Magnetic Resonance Image (MRI) sequences. In this case, the feature dimension is reasonable but the available training data set is small, and the decision surface is highly nonlinear. For diagnosis of congenital heart defects, especially those associated with pressure and volume overload problems, a reliable pattern classifier for determining right ventricle function is needed. The RV's global and regional surface-to-volume ratios are assessed from an individual's MRI heart images. These are used as features for pattern classifiers. We first considered two linear classification methods: the Fisher linear discriminant and the linear classifier trained by the Ho-Kashyap algorithm. Second, for data that are not linearly separable, artificial neural networks with back-propagation training and radial basis function networks were considered, providing nonlinear decision surfaces. Third, a support vector machine was trained, which gives tolerance margins on both sides of the decision surface. We have found in this case study that the back-propagation training of an artificial neural network depends heavily on the selection of initial weights, even when they are randomized. The support vector machine with radial basis function kernels is easily trained and provides decision tolerance margins, although the margins are small.
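As a rough illustration of the first linear method named in this abstract, the Fisher linear discriminant picks the projection direction w = S_w^{-1}(m_a - m_b), where S_w is the within-class scatter matrix and m_a, m_b are the class means. The sketch below is restricted to two-dimensional features so that the matrix inverse can be written by hand. The function name and the toy points are assumptions made for illustration and are not taken from the study.

```python
def fisher_direction_2d(class_a, class_b):
    """Return the Fisher direction w = S_w^{-1} (mean_a - mean_b) for 2-D points.

    class_a, class_b: lists of (x, y) pairs. Hypothetical helper, 2-D only.
    """
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        # Entries of the (symmetric) scatter matrix of one class.
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        return sxx, sxy, syy

    ma, mb = mean(class_a), mean(class_b)
    axx, axy, ayy = scatter(class_a, ma)
    bxx, bxy, byy = scatter(class_b, mb)

    # Within-class scatter S_w is the sum of the per-class scatter matrices.
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")

    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # Closed-form 2x2 inverse applied to the difference of class means.
    return [(syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det]


def project(point, w):
    """Scalar projection of a 2-D point onto the direction w."""
    return point[0] * w[0] + point[1] * w[1]
```

Classifying a new sample then amounts to thresholding `project(sample, w)`. With a small training set, S_w can be close to singular, which is one reason a regularized or margin-based classifier may be preferred in practice.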
Devising novel performance measures for assessing the behavior of multilayer perceptrons trained on regression tasks
This methodological article is mainly aimed at establishing a bridge between classification and regression tasks, in a frame shaped by performance evaluation. More specifically, a general procedure for calculating performance measures is proposed, which can be applied to both classification and regression models. To this end, a notable change in the policy used to evaluate the confusion matrix is made, with the goal of reporting information about regression performance therein. This policy, called generalized token sharing, allows one to a) assess models trained on both classification and regression tasks, b) evaluate the importance of input features, and c) inspect the behavior of multilayer perceptrons by looking at their hidden layers. The occurrence of success and failure patterns at the hidden layers of multilayer perceptrons trained and tested on selected regression problems, together with the effectiveness of layer-wise training, is also discussed.
Liquid State Machine with Dendritically Enhanced Readout for Low-power, Neuromorphic VLSI Implementations
In this paper, we describe a new neuro-inspired, hardware-friendly readout
stage for the liquid state machine (LSM), a popular model for reservoir
computing. Compared to the parallel perceptron architecture trained by the
p-delta algorithm, which is the state of the art in terms of performance of
readout stages, our readout architecture and learning algorithm can attain
better performance with significantly fewer synaptic resources, making it
attractive for VLSI implementation. Inspired by the nonlinear properties of
dendrites in biological neurons, our readout stage incorporates neurons having
multiple dendrites with a lumped nonlinearity. The number of synaptic
connections on each branch is significantly lower than the total number of
connections from the liquid neurons and the learning algorithm tries to find
the best 'combination' of input connections on each branch to reduce the error.
Hence, the learning involves network rewiring (NRW) of the readout network
similar to structural plasticity observed in its biological counterparts. We
show that compared to a single perceptron using analog weights, this
architecture for the readout can attain, even by using the same number of
binary valued synapses, up to 3.3 times less error for a two-class spike train
classification problem and 2.4 times less error for an input rate approximation
task. Even with 60 times larger synapses, a group of 60 parallel perceptrons
cannot attain the performance of the proposed dendritically enhanced readout.
An additional advantage of this method for hardware implementations is that the
'choice' of connectivity can be easily implemented exploiting address event
representation (AER) protocols commonly used in current neuromorphic systems
where the connection matrix is stored in memory. Also, due to the use of binary
synapses, our proposed method is more robust against statistical variations. Comment: 14 pages, 19 figures, Journa
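The readout idea in this abstract (binary synapses grouped onto dendritic branches, each branch followed by a lumped nonlinearity) can be sketched in a few lines. This is only a minimal sketch under stated assumptions: the squaring nonlinearity, the function name, and the toy inputs are illustrative choices, and the rewiring-based learning rule itself is not shown.

```python
def dendritic_readout(inputs, branches, nonlinearity=lambda z: z * z):
    """Sum of per-branch nonlinear responses.

    inputs:   list of liquid-neuron activities (floats).
    branches: list of branches; each branch is a list of input indices.
              A synapse is binary: an index is either wired to a branch or not.
    nonlinearity: lumped branch nonlinearity (squaring is an assumption here).
    """
    total = 0.0
    for branch in branches:
        # Linear summation of the inputs wired to this branch.
        branch_sum = sum(inputs[i] for i in branch)
        # The branch nonlinearity is applied once, to the lumped branch sum.
        total += nonlinearity(branch_sum)
    return total


# Same inputs and the same number of synapses, but two different wirings.
activities = [1.0, 2.0, 3.0, 4.0]
wiring_one = [[0, 1], [2, 3]]
wiring_two = [[0, 3], [1, 2]]
response_one = dendritic_readout(activities, wiring_one)
response_two = dendritic_readout(activities, wiring_two)
```

Because the nonlinearity acts on each branch sum, the two wirings give different responses even though no weight value differs. This is why choosing which inputs share a branch can serve as the learned quantity.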
ThumbNet: One Thumbnail Image Contains All You Need for Recognition
Although deep convolutional neural networks (CNNs) have achieved great
success in computer vision tasks, their real-world application is still impeded
by their voracious demand for computational resources. Current works mostly seek
to compress the network by reducing its parameters or parameter-incurred
computation, neglecting the influence of the input image on the system
complexity. Based on the fact that input images of a CNN contain substantial
redundancy, in this paper, we propose a unified framework, dubbed ThumbNet,
to simultaneously accelerate and compress CNN models by enabling them to infer
on one thumbnail image. We provide three effective strategies to train
ThumbNet. In doing so, ThumbNet learns an inference network that performs
equally well on small images as the original-input network on large images.
With ThumbNet, not only do we obtain the thumbnail-input inference network that
can drastically reduce computation and memory requirements, but also we obtain
an image downscaler that can generate thumbnail images for generic
classification tasks. Extensive experiments show the effectiveness of ThumbNet,
and demonstrate that the thumbnail-input inference network learned by ThumbNet
can adequately retain the accuracy of the original-input network even when the
input images are downscaled 16 times.
Learning to Find Good Correspondences
We develop a deep architecture to learn to find good correspondences for
wide-baseline stereo. Given a set of putative sparse matches and the camera
intrinsics, we train our network in an end-to-end fashion to label the
correspondences as inliers or outliers, while simultaneously using them to
recover the relative pose, as encoded by the essential matrix. Our architecture
is based on a multi-layer perceptron operating on pixel coordinates rather than
directly on the image, and is thus simple and small. We introduce a novel
normalization technique, called Context Normalization, which allows us to
process each data point separately while imbuing it with global information,
and also makes the network invariant to the order of the correspondences. Our
experiments on multiple challenging datasets demonstrate that our method is
able to drastically improve the state of the art with little training data. Comment: CVPR 2018 (Oral
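One plausible reading of the Context Normalization step described above is a per-channel normalization computed across all correspondences of a single image pair. Each point is then processed on its own while still seeing global statistics. The sketch below assumes features arrive as a list of N rows with C channels each. The function name and the `eps` parameter are assumptions, not details from the paper.

```python
import math


def context_normalize(features, eps=1e-5):
    """Normalize each channel across the N correspondences of one image pair.

    features: list of N rows, each a list of C floats (one row per match).
    eps:      small constant for numerical stability (an assumption).
    Returns a new N x C list with roughly zero mean and unit variance
    per channel.
    """
    n = len(features)
    c = len(features[0])
    # Statistics are taken over the set of correspondences, not over channels.
    means = [sum(row[j] for row in features) / n for j in range(c)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in features) / n for j in range(c)
    ]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(c)]
        for row in features
    ]
```

Since the mean and variance do not depend on the order of the rows, permuting the input correspondences only permutes the output rows. This matches the order-invariance property mentioned in the abstract.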
Support vector machine for functional data classification
In many applications, input data are sampled functions taking their values in
infinite dimensional spaces rather than standard vectors. This fact has complex
consequences on data analysis algorithms that motivate modifications of them.
In fact, most of the traditional data analysis tools for regression,
classification and clustering have been adapted to functional inputs under the
general name of Functional Data Analysis (FDA). In this paper, we investigate
the use of Support Vector Machines (SVMs) for functional data analysis and we
focus on the problem of curve discrimination. SVMs are large-margin classifier
tools based on implicit nonlinear mappings of the considered data into
high-dimensional spaces thanks to kernels. We show how to define simple kernels that
take into account the functional nature of the data and lead to consistent
classification. Experiments conducted on real world data emphasize the benefit
of taking into account some functional aspects of the problems. Comment: 13 page
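To make the idea of a kernel that respects the functional nature of the data concrete, one assumed example is a Gaussian kernel applied to finite-difference derivatives of the sampled curves rather than to the raw sample vectors. This is a hedged sketch and not necessarily one of the kernels defined in the paper. The function names, the uniform sampling step, and the `gamma` parameter are all illustrative assumptions.

```python
import math


def derivative(samples, step):
    """Forward finite-difference derivative of a uniformly sampled curve."""
    return [(samples[i + 1] - samples[i]) / step for i in range(len(samples) - 1)]


def functional_rbf_kernel(u, v, step=1.0, gamma=1.0):
    """Gaussian kernel on the derivatives of two curves sampled on one grid.

    u, v:  lists of function values on a shared, uniformly spaced grid.
    step:  grid spacing (assumed uniform).
    gamma: kernel width parameter (hypothetical default).
    """
    du, dv = derivative(u, step), derivative(v, step)
    # Riemann-sum approximation of the squared L2 distance between derivatives.
    squared_distance = step * sum((a - b) ** 2 for a, b in zip(du, dv))
    return math.exp(-gamma * squared_distance)
```

With this choice, two curves that differ only by a vertical offset are treated as identical (kernel value 1). That is the kind of shape-based invariance a plain vector kernel on the raw samples would not provide.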
Statistical physics of neural systems
The ability to process and store information is considered a characteristic
trait of intelligent systems. In biological neural networks, learning is strongly
believed to take place at the synaptic level, in terms of modulation of synaptic
efficacy. It can thus be interpreted as the expression of a collective phenomenon,
emerging when neurons connect to each other, constituting a complex network of
interactions. In this work, we represent learning as an optimization problem,
implemented as a local search in synaptic space for specific configurations, known
as solutions, that make a neural network able to accomplish a series of different
tasks. For instance, we would like the network to adapt the strength of its synaptic
connections, in order to be capable of classifying a series of objects, by assigning to
each object its corresponding class-label. Supported by a series of experiments, it
has been suggested that synapses may exploit a very small number of synaptic states
for encoding information. It is known that this feature makes learning in neural
networks a challenging task. Extending the large deviation analysis performed in
the extreme case of binary synaptic couplings, in this work, we prove the existence
of regions of the phase space, where solutions are organized in extremely dense
clusters. This picture turns out to be invariant to the tuning of all the parameters of
the model. Solutions within the clusters are more robust to noise, thus enhancing the
learning performance. This has inspired the design of new learning algorithms and
has clarified the effectiveness of previously proposed ones. We further
provide quantitative evidence that the gain achievable when considering a greater
number of available synaptic states for encoding information is consistent only up
to a very small number of bits. This is in line with the above-mentioned experimental
results. Besides the challenging aspect of low precision synaptic connections, it is
also known that the neuronal environment is extremely noisy. Whether stochasticity
can enhance or worsen learning performance is currently a matter of debate. In
this work, we consider a neural network model where the synaptic connections are
random variables, sampled according to a parametrized probability distribution.
We prove that this source of stochasticity naturally drives the network towards regions of the
phase space with a high density of solutions. These regions are directly accessible by
means of gradient descent strategies, over the parameters of the synaptic couplings
distribution. We further set up a statistical physics analysis, through which we
show that solutions in the dense regions are characterized by robustness and good
generalization performance. Stochastic neural networks are also capable of building
abstract representations of input stimuli and then generating new input samples,
according to the inferred statistics of the input signal. In this regard, we propose a
new learning rule, called Delayed Correlation Matching (DCM), that, relying on the
matching between time-delayed activity correlations, makes a neural network able
to store patterns of neuronal activity. When considering hidden neuronal states, the
DCM learning rule is also able to train Restricted Boltzmann Machines as generative
models. In this work, we further require the DCM learning rule to fulfil some
biological constraints, such as locality, sparseness of the neural coding and Dale’s
principle. While retaining all these biological requirements, the DCM learning
rule has been shown to be effective for different network topologies, in both on-line
learning regimes and in the presence of correlated patterns. We further show that it is also
able to prevent the creation of spurious attractor states.