Online Learning with Ensembles
Supervised online learning with an ensemble of students randomized by the
choice of initial conditions is analyzed. For the case of the perceptron
learning rule, asymptotically the same improvement in the generalization error
of the ensemble compared to the performance of a single student is found as in
Gibbs learning. For more optimized learning rules, however, using an ensemble
yields no improvement. This is explained by showing that for any learning rule
a transformed rule exists, such that a single student using the transformed
rule has the same generalization behaviour as an ensemble of students using
the original rule.Comment: 8 pages, 1 figure. Submitted to J.Phys.
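The setting can be sketched in a few lines: a teacher perceptron labels random inputs, and an ensemble of students, randomized only by their initial weights, learns on-line with the perceptron rule. The sizes below and the averaged-weights form of the ensemble prediction are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def perceptron_ensemble(n_dim=50, n_students=10, n_examples=2000, seed=0):
    """Toy on-line perceptron learning with an ensemble of students that
    differ only in their random initial conditions (illustrative sizes)."""
    rng = np.random.default_rng(seed)
    teacher = rng.standard_normal(n_dim)
    students = rng.standard_normal((n_students, n_dim))  # random initial conditions
    for _ in range(n_examples):
        x = rng.standard_normal(n_dim)
        y = np.sign(teacher @ x)
        wrong = np.sign(students @ x) != y
        # perceptron rule: each student updates only on its own mistakes
        students[wrong] += y * x / np.sqrt(n_dim)
    def gen_error(w):
        # generalization error of a perceptron = angle to the teacher / pi
        cos = w @ teacher / (np.linalg.norm(w) * np.linalg.norm(teacher))
        return float(np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi)
    single = float(np.mean([gen_error(w) for w in students]))
    ensemble = gen_error(students.mean(axis=0))  # ensemble as weight average
    return single, ensemble
```

Comparing `single` and `ensemble` for growing example counts illustrates the asymptotic improvement the abstract refers to.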
Dynamical transitions in the evolution of learning algorithms by selection
We study the evolution of artificial learning systems by means of selection.
Genetic programming is used to generate a sequence of populations of algorithms
which can be used by neural networks for supervised learning of a rule that
generates examples. Rather than concentrating on final results, which would
be the natural aim when designing good learning algorithms, we study the
evolution process and pay particular attention to the temporal order of
appearance of functional structures responsible for the improvements in the
learning process, as measured by the generalization capabilities of the
resulting algorithms. The effect of such appearances can be described as
dynamical phase transitions. The concepts of phenotypic and genotypic
entropies, which serve to describe the distribution of fitness in the
population and the distribution of symbols respectively, are used to monitor
the dynamics. In different runs the phase transitions might be present or not,
with the system finding out good solutions, or staying in poor regions of
algorithm space. Whenever phase transitions occur, the sequence of
appearances is the same. We identify combinations of variables and operators
which are useful in measuring experience or performance in rule extraction
and can thus implement useful annealing of the learning schedule.Comment: 11 pages, 11 figures, 2 tables
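The two entropies used to monitor the dynamics can be illustrated with plain Shannon entropies of empirical distributions; the fitness list and symbol string below are hypothetical, and the paper's exact estimators may differ.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Phenotypic entropy: spread of fitness values across the population
# (hypothetical fitness list for illustration).
phenotypic = shannon_entropy([0.1, 0.1, 0.5, 0.5, 0.5, 0.9])

# Genotypic entropy: spread of symbols across the population's programs
# (hypothetical symbol string for illustration).
genotypic = shannon_entropy(list("++*xx+x*+x"))
```

A population collapsing onto a few good solutions shows up as a drop in phenotypic entropy, while genotypic entropy tracks the diversity of the underlying programs.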
Functional Optimisation of Online Algorithms in Multilayer Neural Networks
We study the online dynamics of learning in fully connected soft committee
machines in the student-teacher scenario. The locally optimal modulation
function, which determines the learning algorithm, is obtained from a
variational argument in such a manner as to maximise the average generalisation
error decay per example. Simulation results for the resulting algorithm are
presented for a few cases. The symmetric phase plateaux are found to be vastly
reduced in comparison to those found when online backpropagation algorithms are
used. A discussion of the implementation of these ideas as practical algorithms
is given.
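As a baseline for comparison, plain online backpropagation in the student-teacher scenario can be sketched as below; this is the algorithm whose symmetric-phase plateaus the variational approach shortens. Network sizes, the learning rate, and the activation g(h) = erf(h/sqrt(2)) are standard choices assumed here, not taken from the paper.

```python
import math
import numpy as np

def erf_vec(h):
    # g(h) = erf(h / sqrt(2)), the usual soft-committee activation
    return np.array([math.erf(v / math.sqrt(2)) for v in h])

def online_backprop(n=100, k=2, alpha=100.0, eta=1.0, seed=0):
    """Plain online backpropagation for a soft committee machine student
    learning a matched teacher (illustrative sizes and rates)."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((k, n))
    B /= np.linalg.norm(B, axis=1, keepdims=True)   # teacher weight vectors
    J = 0.01 * rng.standard_normal((k, n))          # student, small random init
    for _ in range(int(alpha * n)):                 # alpha = examples per weight
        x = rng.standard_normal(n)
        h_s = J @ x
        err = erf_vec(B @ x).sum() - erf_vec(h_s).sum()
        # gradient descent on half the squared output error
        gp = np.sqrt(2.0 / np.pi) * np.exp(-h_s ** 2 / 2.0)
        J += (eta / n) * err * gp[:, None] * x[None, :]
    # Monte Carlo estimate of the generalization error
    xs = rng.standard_normal((2000, n))
    errs = [(erf_vec(B @ x).sum() - erf_vec(J @ x).sum()) ** 2 / 2.0 for x in xs]
    return float(np.mean(errs))
```

Tracking the Monte Carlo error along the trajectory, rather than only at the end, makes the plateau visible.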
Gradient descent learning in and out of equilibrium
Relations between the out-of-equilibrium dynamics of on-line learning and
thermally equilibrated off-line learning are studied for potential-based
gradient descent learning. Opper's approach to studying on-line Bayesian
algorithms is extended to potential-based or maximum-likelihood learning. We
look at the on-line learning algorithm that best approximates the off-line
algorithm in the sense of least Kullback-Leibler information loss. It works
by updating the weights along the gradient of an effective potential
different from the parent off-line potential. The interpretation of this
off-equilibrium dynamics holds some similarities to the cavity approach of
Griniasty. We are able to analyze networks with non-smooth transfer functions
and transfer the smoothness requirement to the potential.Comment: 8 pages, submitted to the Journal of Physics
Lobby index as a network centrality measure
We study the lobby index (l-index for short) as a local node centrality
measure for complex networks. The l-index is compared with degree (a local
measure) and with betweenness and eigenvector centralities (two global
measures) in a biological network (a yeast protein-protein interaction
network) and a linguistic network (Moby Thesaurus II). In both networks, the
l-index correlates poorly with betweenness but correlates with degree and
eigenvector centrality. Being a local measure, the l-index has the advantage
that it carries more information about a node's neighbours than degree
centrality does, while requiring less time to compute than eigenvector
centrality. Results suggest that the l-index produces better rankings than
the degree and eigenvector measures, making it a suitable tool for this
task.Comment: 11 pages, 4 figures. arXiv admin note: substantial text overlap with
arXiv:1005.480
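For concreteness, the l-index of a node is the largest l such that the node has at least l neighbours of degree at least l, in the style of the h-index; the toy graph below is purely illustrative.

```python
def lobby_index(adj, v):
    """l-index of node v: the largest l such that v has at least l
    neighbours whose degree is at least l (an h-index for neighbourhoods)."""
    degrees = sorted((len(adj[u]) for u in adj[v]), reverse=True)
    l = 0
    for rank, deg in enumerate(degrees, start=1):
        if deg >= rank:
            l = rank
        else:
            break
    return l

# Toy undirected graph as an adjacency dict (illustrative only).
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
```

Like degree, the l-index needs only a node's immediate neighbourhood (plus the neighbours' degrees), which is why it is cheaper to compute than eigenvector centrality.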
On the random neighbor Olami-Feder-Christensen slip-stick model
We reconsider the treatment of Lise and Jensen (Phys. Rev. Lett. 76, 2326
(1996)) on the random neighbor Olami-Feder-Christensen stick-slip model, and
examine the strong dependence of the results on the approximations used for the
distribution of states p(E).Comment: 6 pages, 3 figures. To be published in PRE as a brief report
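A random-neighbor OFC-style simulation, from which a distribution of states p(E) can be sampled, can be sketched as follows; the drive/topple rules are the standard ones, but all parameters are illustrative assumptions.

```python
import random

def ofc_random_neighbor(n=1000, alpha=0.2, q=4, n_avalanches=200, seed=1):
    """Random-neighbor Olami-Feder-Christensen-style dynamics (illustrative
    parameters): uniform drive until the most loaded site reaches threshold 1,
    then relaxation in which each toppling site resets to 0 and sends alpha*F
    to q randomly chosen sites. q*alpha < 1 keeps the dynamics dissipative,
    so avalanches terminate. Returns the list of avalanche sizes."""
    rng = random.Random(seed)
    F = [rng.random() for _ in range(n)]
    sizes = []
    for _ in range(n_avalanches):
        imax = max(range(n), key=F.__getitem__)
        gap = 1.0 - F[imax]
        F = [f + gap for f in F]          # slow uniform drive
        F[imax] = 1.0                     # guard against float rounding
        active, size = [imax], 0
        while active:
            i = active.pop()
            if F[i] < 1.0:                # may have toppled already
                continue
            df = alpha * F[i]
            F[i] = 0.0                    # slip: the site relaxes
            size += 1
            for _ in range(q):            # load q random "neighbors"
                j = rng.randrange(n)
                F[j] += df
                if F[j] >= 1.0:
                    active.append(j)
        sizes.append(size)
    return sizes
```

Histogramming the values in F between avalanches gives an empirical p(E), whose shape is exactly what the approximations discussed above must capture.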
On the robustness of scale invariance in SOC models
A random neighbor extremal stick-slip model is introduced. In the
thermodynamic limit, the distribution of states has a simple analytical form
and the mean avalanche size, as a function of the coupling parameter, is
exactly calculable. The system is critical only at a special point Jc in the
coupling parameter space. However, the critical region around this point, where
approximate scale invariance holds, is very large, suggesting a mechanism for
explaining the ubiquity of scale invariance in Nature.Comment: 6 pages, 4 figures; submitted to Physical Review E;
http://link.aps.org/doi/10.1103/PhysRevE.59.496
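A generic random-neighbor extremal dynamics in the Bak-Sneppen spirit, not necessarily the exact model of the paper, can illustrate how avalanches are measured in extremal stick-slip models; the threshold 0.45 is an assumption, chosen just below the value 1/2 at which the two-site random-neighbor model is critical.

```python
import random

def extremal_avalanches(n=200, steps=10000, threshold=0.45, seed=2):
    """Random-neighbor extremal stick-slip sketch: at every step the minimal
    variable and one randomly chosen site are redrawn uniformly. Runs of
    consecutive minima below `threshold` are counted as avalanches."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    sizes, size = [], 0
    for _ in range(steps):
        i = min(range(n), key=x.__getitem__)
        if x[i] < threshold:
            size += 1                       # avalanche continues
        elif size:
            sizes.append(size)              # avalanche ended
            size = 0
        x[i] = rng.random()                 # redraw the extremal site
        x[rng.randrange(n)] = rng.random()  # ...and one random neighbor
    return sizes
```

Sweeping the threshold toward the critical point and watching the mean avalanche size grow is the kind of measurement made exact in the paper.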
Cryptography based on neural networks - analytical results
Mutual learning process between two parity feed-forward networks with
discrete and continuous weights is studied analytically, and we find that the
number of steps required to achieve full synchronization between the two
networks in the case of discrete weights is finite. The synchronization process
is shown to be non-self-averaging and the analytical solution is based on
random auxiliary variables. The learning time of an attacker that is trying to
imitate one of the networks is examined analytically and is found to be much
longer than the synchronization time. Analytical results are found to be in
agreement with simulations.
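The mutual-learning setup can be illustrated with the standard tree parity machine protocol for discrete weights; the sizes K, N, L and the Hebbian update below are illustrative assumptions rather than the paper's exact parameters.

```python
import numpy as np

def tpm_mutual_learning(K=3, N=10, L=3, max_steps=100000, seed=0):
    """Mutual learning of two tree parity machines with discrete weights in
    {-L, ..., L}. Both parties see the same random inputs and update only
    when their parity outputs agree; returns the number of steps until the
    weights are fully synchronized (None if the step budget runs out)."""
    rng = np.random.default_rng(seed)
    wA = rng.integers(-L, L + 1, size=(K, N))
    wB = rng.integers(-L, L + 1, size=(K, N))

    def forward(w, x):
        sigma = np.sign((w * x).sum(axis=1))
        sigma[sigma == 0] = 1             # break ties deterministically
        return sigma, int(sigma.prod())   # hidden outputs, parity output

    for step in range(1, max_steps + 1):
        x = rng.choice([-1, 1], size=(K, N))
        sA, tA = forward(wA, x)
        sB, tB = forward(wB, x)
        if tA == tB:
            for w, s in ((wA, sA), (wB, sB)):
                for k in range(K):
                    if s[k] == tA:        # update only units matching the parity
                        w[k] = np.clip(w[k] + tA * x[k], -L, L)
        if np.array_equal(wA, wB):
            return step                   # fully synchronized
    return None
```

An attacker applying the same rule learns from only one side of the exchange and synchronizes much more slowly, which is the asymmetry the abstract analyzes.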