Statistical Mechanics of Support Vector Networks
Using methods of Statistical Physics, we investigate the generalization
performance of support vector machines (SVMs), which have been recently
introduced as a general alternative to neural networks. For nonlinear
classification rules, the generalization error saturates on a plateau when the
number of examples is too small to properly estimate the coefficients of the
nonlinear part. When trained on simple rules, we find that SVMs overfit only
weakly. The performance of SVMs is strongly enhanced when the distribution of
the inputs has a gap in feature space.
Comment: REVTeX, 4 pages, 2 figures, accepted by Phys. Rev. Lett. (typos corrected)
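To make the learning-curve picture concrete, here is a minimal numerical sketch, assuming scikit-learn, Gaussian inputs, and a hypothetical linear teacher rule (none of which are the paper's actual setup): it estimates the generalization error of a polynomial-kernel SVM as the number of training examples grows, the regime where the abstract predicts only weak overfitting for simple rules.

```python
# Minimal sketch (assumed setup: scikit-learn, Gaussian inputs, a random
# linear "teacher" rule) of a learning-curve experiment in the spirit of
# the abstract: generalization error of an SVM vs. training set size.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
dim = 20
teacher = rng.standard_normal(dim)                # hypothetical simple rule

def sample(n):
    X = rng.standard_normal((n, dim))
    return X, np.sign(X @ teacher)

X_test, y_test = sample(5000)
for n in [20, 50, 100, 200, 500, 1000]:
    X, y = sample(n)
    clf = SVC(kernel="poly", degree=2).fit(X, y)  # nonlinear classification rule
    eps = 1.0 - clf.score(X_test, y_test)         # generalization error estimate
    print(f"n = {n:4d}   generalization error ~ {eps:.3f}")
```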
Online Learning with Ensembles
Supervised online learning with an ensemble of students randomized by the
choice of initial conditions is analyzed. For the case of the perceptron
learning rule, asymptotically the same improvement in the generalization error
of the ensemble compared to the performance of a single student is found as in
Gibbs learning. For more optimized learning rules, however, using an ensemble
yields no improvement. This is explained by showing that for any learning rule
$F$ a transform $\tilde{F}$ exists, such that a single student using $\tilde{F}$
has the same generalization behaviour as an ensemble of $F$-students.
Comment: 8 pages, 1 figure. Submitted to J. Phys.
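A toy simulation of the comparison the abstract describes can be written directly; everything below (dimensions, epochs, the majority vote) is an illustrative assumption rather than the paper's calculation:

```python
# Toy sketch: generalization error of a single perceptron student vs. a
# majority vote over students that differ only in their random initial
# conditions. All sizes and the training schedule are assumptions.
import numpy as np

rng = np.random.default_rng(1)
dim, n_train, n_students = 50, 200, 11
teacher = rng.standard_normal(dim)
X = rng.standard_normal((n_train, dim))
y = np.sign(X @ teacher)

def train(w, epochs=20):
    for _ in range(epochs):
        for x, t in zip(X, y):
            if np.sign(w @ x) != t:               # perceptron learning rule
                w = w + t * x
    return w

students = [train(rng.standard_normal(dim)) for _ in range(n_students)]

X_test = rng.standard_normal((20000, dim))
y_test = np.sign(X_test @ teacher)
single = np.mean(np.sign(X_test @ students[0]) != y_test)
vote = np.sign(np.sum([np.sign(X_test @ w) for w in students], axis=0))
print(f"single student: {single:.3f}   ensemble vote: {np.mean(vote != y_test):.3f}")
```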
Phase Transitions of Neural Networks
The cooperative behaviour of interacting neurons and synapses is studied
using models and methods from statistical physics. The competition between
training error and entropy may lead to discontinuous properties of the neural
network. This is demonstrated for a few examples: Perceptron, associative
memory, learning from examples, generalization, multilayer networks, structure
recognition, Bayesian estimate, on-line training, noise estimation and time
series generation.
Comment: Plenary talk for MINERVA workshop on mesoscopics, fractals and neural networks, Eilat, March 1997. Postscript file
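The discontinuous behaviour mentioned here can be illustrated with a generic Landau-type free energy (a textbook toy, assumed purely for illustration, not any specific model from the talk): as a control parameter is lowered, the global minimum jumps discontinuously, the hallmark of a first-order transition.

```python
# Toy free energy f(q) = a q^2 - q^4 + q^6/2 (generic Landau form, an
# assumed illustration). Scanning the control parameter a shows the
# order parameter q* jumping discontinuously near a = 0.5.
import numpy as np

q = np.linspace(0.0, 1.5, 2001)
for a in [0.70, 0.60, 0.55, 0.50, 0.45, 0.40]:
    f = a * q**2 - q**4 + 0.5 * q**6
    print(f"a = {a:.2f}  ->  q* = {q[np.argmin(f)]:.2f}")
```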
Generalization properties of finite size polynomial Support Vector Machines
The learning properties of finite size polynomial Support Vector Machines are
analyzed in the case of realizable classification tasks. The normalization of
the high-order features acts as a squeezing factor, introducing a strong
anisotropy in the distribution of patterns in feature space. As a function of
the training set size, the corresponding generalization error exhibits a
crossover between a fast-decreasing and a slowly decreasing regime, more or
less abrupt depending on the distribution's anisotropy and on the task to be
learned. This behaviour corresponds to the stepwise decrease found by Dietrich
et al. [Phys. Rev. Lett. 82 (1999) 2975-2978] in the thermodynamic limit. The
theoretical results are in excellent agreement with the numerical simulations.
Comment: 12 pages, 7 figures
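The "squeezing" of high-order features can be checked numerically; the normalization convention below (quadratic features scaled by 1/sqrt(N)) is an assumption chosen for illustration:

```python
# Sketch of the anisotropy induced by normalizing high-order features:
# for Gaussian inputs, the quadratic components of a normalized quadratic
# feature map carry far less variance per direction than the linear ones.
# The 1/sqrt(N) scaling is an assumed convention.
import numpy as np

rng = np.random.default_rng(2)
N, n_samples = 20, 10000
X = rng.standard_normal((n_samples, N))

iu = np.triu_indices(N)
quad = np.einsum("ni,nj->nij", X, X)[:, iu[0], iu[1]] / np.sqrt(N)

print("std per linear feature   :", X.std(axis=0).mean().round(3))
print("std per quadratic feature:", quad.std(axis=0).mean().round(3))
```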
Retarded Learning: Rigorous Results from Statistical Mechanics
We study learning of probability distributions characterized by an unknown
symmetry direction. Based on an entropic performance measure and the
variational method of statistical mechanics, we develop exact upper and lower
bounds on the scaled critical number of examples below which learning of the
direction is impossible. The asymptotic tightness of the bounds suggests an
asymptotically optimal method for learning nonsmooth distributions.
Comment: 8 pages, 1 figure
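A standard way to see retarded learning numerically (the Gaussian single-symmetry-direction model and the PCA estimator below are common choices, assumed here rather than taken from the paper): below a critical number of examples per dimension the estimated direction has essentially no overlap with the true one.

```python
# Sketch: data are Gaussian with excess variance along an unknown direction
# B; the top principal component only picks up B once the scaled number of
# examples alpha = p/N exceeds a critical value.
import numpy as np

rng = np.random.default_rng(3)
N = 200
B = rng.standard_normal(N); B /= np.linalg.norm(B)     # unknown direction
sigma2 = 2.0                                           # variance along B

for alpha in [0.2, 0.5, 1.0, 2.0, 4.0]:
    p = int(alpha * N)
    X = rng.standard_normal((p, N))
    X += (np.sqrt(sigma2) - 1.0) * np.outer(X @ B, B)  # stretch along B
    C = X.T @ X / p
    top = np.linalg.eigh(C)[1][:, -1]                  # leading eigenvector
    print(f"alpha = {alpha:.1f}   overlap |R| = {abs(top @ B):.2f}")
```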
Parameter estimation and inference for stochastic reaction-diffusion systems: application to morphogenesis in D. melanogaster
Background: Reaction-diffusion systems are frequently used in systems biology to model developmental and signalling processes. In many applications, count numbers of the diffusing molecular species are very low, leading to the need to explicitly model the inherent variability using stochastic methods. Despite their importance and frequent use, parameter estimation for both deterministic and stochastic reaction-diffusion systems is still a challenging problem.
Results: We present a Bayesian inference approach to solve both the parameter and state estimation problem for stochastic reaction-diffusion systems. This allows a determination of the full posterior distribution of the parameters (expected values and uncertainty). We benchmark the method by illustrating it on a simple synthetic experiment. We then test the method on real data about the diffusion of the morphogen Bicoid in Drosophila melanogaster. The results show how the precision with which parameters can be inferred varies dramatically, indicating that the ability to infer full posterior distributions on the parameters can have important experimental design consequences.
Conclusions: The results obtained demonstrate the feasibility and potential advantages of applying a Bayesian approach to parameter estimation in stochastic reaction-diffusion systems. In particular, the ability to estimate credibility intervals associated with parameter estimates can be valuable for experimental design. Further work, however, will be needed to ensure the method can scale up to larger problems.
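A drastically simplified sketch of the inference step: a deterministic exponential Bicoid-like gradient with Gaussian observation noise (my own toy stand-in, not the authors' stochastic reaction-diffusion treatment), with the decay length sampled by Metropolis MCMC so that a full posterior, not just a point estimate, is obtained.

```python
# Toy Bayesian parameter estimation: posterior over the decay length of an
# exponential morphogen gradient via Metropolis MCMC. Model, prior, and all
# numbers are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(0.0, 1.0, 50)                 # position along the axis
true_lam, noise = 0.2, 0.05                   # decay length ~ sqrt(D/k)
data = np.exp(-x / true_lam) + noise * rng.standard_normal(x.size)

def log_post(lam):                            # flat prior on lam > 0
    if lam <= 0.0:
        return -np.inf
    resid = data - np.exp(-x / lam)
    return -0.5 * np.sum(resid**2) / noise**2

lam, samples = 0.5, []
for _ in range(20000):
    prop = lam + 0.02 * rng.standard_normal() # random-walk proposal
    if np.log(rng.random()) < log_post(prop) - log_post(lam):
        lam = prop
    samples.append(lam)

post = np.array(samples[5000:])               # discard burn-in
print(f"posterior mean {post.mean():.3f}, 95% credibility interval "
      f"[{np.quantile(post, 0.025):.3f}, {np.quantile(post, 0.975):.3f}]")
```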
Statistical Mechanics of Learning in the Presence of Outliers
Using methods of statistical mechanics, we analyse the effect of outliers on
the supervised learning of a classification problem. The learning strategy aims
at selecting informative examples and discarding outliers. We compare two
algorithms which perform the selection either in a soft or a hard way. When the
fraction of outliers grows large, the estimation errors undergo a first-order
phase transition.
Comment: 24 pages, 7 figures (minor extensions added)
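The soft/hard distinction can be illustrated on a deliberately simpler problem than the paper's classification setting (everything below is an assumed stand-in): estimating a location parameter from contaminated data, either by smoothly down-weighting suspicious examples or by discarding them outright.

```python
# Toy comparison of soft vs. hard selection of informative examples when a
# fraction of the data are outliers. The location-estimation setting is an
# illustrative stand-in for the paper's classification problem.
import numpy as np

rng = np.random.default_rng(5)
n = 500
data = rng.normal(0.0, 1.0, n)
out = rng.random(n) < 0.2                     # 20% outliers
data[out] += rng.normal(8.0, 1.0, out.sum())

def weighted_mean(w):
    return np.sum(w * data) / np.sum(w)

mu = np.median(data)                          # robust starting point
for _ in range(10):                           # soft: iteratively reweighted
    mu = weighted_mean(np.exp(-0.5 * (data - mu) ** 2))

hard = np.abs(data - np.median(data)) < 3.0   # hard: keep or discard
print(f"plain mean {data.mean():.2f}   soft {mu:.2f}   hard {weighted_mean(hard):.2f}")
```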
Field Theoretical Analysis of On-line Learning of Probability Distributions
On-line learning of probability distributions is analyzed from the field
theoretical point of view. We can obtain an optimal on-line learning algorithm,
since the renormalization group enables us to control the number of degrees of
freedom of the system according to the number of examples. We do not learn the
parameters of a model, but the probability distributions themselves. Therefore,
the algorithm requires no a priori knowledge of a model.
Comment: 4 pages, 1 figure, RevTeX
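A loose numerical analogy (recursive kernel density estimation with a shrinking bandwidth; the n^(-1/5) schedule is a standard nonparametric choice, not taken from the paper): the resolution of the estimate, i.e. its effective number of degrees of freedom, grows with the number of examples, and the distribution itself is learned on-line with no parametric model.

```python
# On-line, model-free density estimation: a running average of kernels
# whose bandwidth shrinks as more examples arrive. The target mixture and
# the bandwidth schedule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(6)
grid = np.linspace(-4.0, 4.0, 400)
f = np.zeros_like(grid)                        # running density estimate

for n in range(1, 5001):
    x = rng.normal(0.0, 1.0) if rng.random() < 0.5 else rng.normal(2.0, 0.5)
    h = n ** (-0.2)                            # bandwidth shrinks with n
    kernel = np.exp(-0.5 * ((grid - x) / h) ** 2) / (h * np.sqrt(2 * np.pi))
    f += (kernel - f) / n                      # on-line running-average update

print("estimate integrates to", round(float(f.sum() * (grid[1] - grid[0])), 3))
```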
Gradient descent learning in and out of equilibrium
Relations between the out-of-equilibrium dynamical process of on-line
learning and the thermally equilibrated off-line learning are studied for
potential gradient descent learning. The approach of Opper to study on-line
Bayesian algorithms is extended to potential based or maximum likelihood
learning. We look at the on-line learning algorithm that best approximates the
off-line algorithm in the sense of least Kullback-Leibler information loss. It
works by updating the weights along the gradient of an effective potential
different from the parent off-line potential. The interpretation of this
out-of-equilibrium dynamics bears some similarity to the cavity approach of
Griniasty. We are able to analyze networks with non-smooth transfer functions
and transfer the smoothness requirement to the potential.
Comment: 8 pages, submitted to the Journal of Physics
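A bare-bones sketch of on-line versus off-line gradient descent on the same training potential (a quadratic potential and a roughly 1/t step size are my assumptions; the paper's effective-potential construction is not reproduced here):

```python
# On-line gradient descent with a decaying learning rate vs. off-line
# minimization of the full potential sum_t (y_t - w.x_t)^2. Both approach
# the teacher; the whole setup is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(7)
dim, n = 10, 2000
w_star = rng.standard_normal(dim)              # teacher weights
X = rng.standard_normal((n, dim))
y = X @ w_star + 0.1 * rng.standard_normal(n)

w_off = np.linalg.lstsq(X, y, rcond=None)[0]   # off-line: minimize all at once

w_on = np.zeros(dim)
for t, (x, target) in enumerate(zip(X, y), start=1):
    w_on += (target - w_on @ x) * x / (dim + t)  # one step per example, ~1/t rate

print("off-line distance to teacher:", np.linalg.norm(w_off - w_star).round(3))
print("on-line  distance to teacher:", np.linalg.norm(w_on - w_star).round(3))
```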