Training a perceptron in a discrete weight space
On-line and batch learning of a perceptron in a discrete weight space, where each weight can take $2L+1$ different values, are examined analytically and numerically. The learning algorithm is based on training a continuous perceptron and predicting with the clipped weights. The learning is
described by a new set of order parameters, composed of the overlaps between
the teacher and the continuous/clipped students. Several scenarios are examined, among them on-line learning with discrete/continuous transfer functions and off-line Hebb learning. The generalization error of the clipped
weights decays asymptotically as $e^{-K\alpha}$/$e^{-e^{|\lambda|\alpha}}$ in the case of on-line learning with binary/continuous activation functions, respectively, where $\alpha$ is the number of examples divided by $N$, the size of the input vector, and $K$ is a positive constant that decays linearly with $1/L$. For finite $N$ and $L$, perfect agreement between the discrete student and the teacher is obtained for $\alpha \propto \sqrt{L\ln(NL)}$. A crossover to the generalization error $\propto 1/\alpha$, characteristic of continuous weights with binary output, is obtained for synaptic depth $L > O(\sqrt{N})$. Comment: 10 pages, 5 figs., submitted to PR
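The clip-and-predict scheme is easy to state concretely. Below is a minimal sketch, assuming a perceptron teacher with weights on the $2L+1$-value grid, random $\pm 1$ inputs, and the standard perceptron update for the continuous student; the values of N, L, and the learning rate are illustrative, not taken from the paper:

```python
import numpy as np

N, L = 500, 2                      # input size and synaptic depth (illustrative)
rng = np.random.default_rng(0)
grid = np.arange(-L, L + 1) / L    # the 2L+1 allowed weight values
teacher = rng.choice(grid, size=N)
J = np.zeros(N)                    # continuous student weights

def clip(J):
    # Project each continuous weight onto the nearest allowed discrete value.
    return np.clip(np.round(J * L), -L, L) / L

for _ in range(20 * N):            # alpha = examples / N = 20
    x = rng.choice([-1.0, 1.0], size=N)
    y = np.sign(teacher @ x)
    if np.sign(J @ x) != y:        # train the continuous perceptron...
        J += y * x / np.sqrt(N)

W = clip(J)                        # ...but predict with the clipped weights
# Order parameters: normalized overlaps of the teacher with the
# continuous and the clipped student.
R = (J @ teacher) / (np.linalg.norm(J) * np.linalg.norm(teacher))
R_c = (W @ teacher) / (np.linalg.norm(W) * np.linalg.norm(teacher))
print(f"overlap continuous: {R:.3f}, clipped: {R_c:.3f}")
```

The overlap of the clipped student tracks that of the continuous one, which is exactly what the order-parameter description of the dynamics exploits.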
On the role of synaptic stochasticity in training low-precision neural networks
Stochasticity and limited precision of synaptic weights in neural network
models are key aspects of both biological and hardware modeling of learning
processes. Here we show that a neural network model with stochastic binary
weights naturally gives prominence to exponentially rare dense regions of
solutions with a number of desirable properties such as robustness and good
generalization performance, while typical solutions are isolated and hard to
find. Binary solutions of the standard perceptron problem are obtained from a
simple gradient descent procedure on a set of real values parametrizing a
probability distribution over the binary synapses. Both analytical and
numerical results are presented. An algorithmic extension aimed at training
discrete deep neural networks is also investigated. Comment: 7 pages + 14 pages of supplementary material
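The gradient procedure can be sketched in a few lines. The following is a minimal illustration, not the authors' exact algorithm: each binary synapse is parametrized by a real field $h_i$ with mean $m_i = \tanh(h_i)$, the preactivation is treated as Gaussian via the central limit theorem, and plain gradient descent is applied to the resulting expected classification error (the gradient through the variance term is neglected for brevity; all sizes and rates are illustrative):

```python
import numpy as np

def train_stochastic_binary(xi, sigma, lr=0.1, epochs=500, seed=0):
    """xi: (P, N) array of +/-1 patterns, sigma: (P,) array of +/-1 labels."""
    _, N = xi.shape
    h = 0.1 * np.random.default_rng(seed).standard_normal(N)  # real parameters
    for _ in range(epochs):
        m = np.tanh(h)                               # mean of each binary synapse
        M = xi @ m                                   # mean preactivation per pattern
        V = ((1 - m**2) * xi**2).sum(axis=1) + 1e-9  # its variance
        z = sigma * M / np.sqrt(2 * V)
        # Expected error per pattern is 0.5 * (1 - erf(z)); descend its
        # gradient in h, ignoring the (small) dependence of V on h.
        coeff = -np.exp(-z**2) / np.sqrt(np.pi) / np.sqrt(2 * V)
        grad = (coeff * sigma)[:, None] * xi * (1 - m**2)[None, :]
        h -= lr * grad.sum(axis=0)
    return np.sign(h)                                # a concrete binary solution

# Example: store P = 0.3 * N random patterns in N binary synapses.
rng = np.random.default_rng(1)
N, P = 1001, 300
xi = rng.choice([-1.0, 1.0], size=(P, N))
sigma = rng.choice([-1.0, 1.0], size=P)
w = train_stochastic_binary(xi, sigma)
print("training accuracy:", np.mean(np.sign(xi @ w) == sigma))
```

Clipping the real fields at the end yields a binary configuration drawn from the dense region that the distribution concentrates on.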
Theory of Interacting Neural Networks
In this contribution we give an overview over recent work on the theory of
interacting neural networks. The model is defined in Section 2. The typical
teacher/student scenario is considered in Section 3. A static teacher network
is presenting training examples for an adaptive student network. In the case of
multilayer networks, the student shows a transition from a symmetric state to
specialisation. Neural networks can also generate a time series; training on a time series and predicting it are studied in Section 4. When a network is trained on its own output, it is interacting with itself. Such a scenario has implications for the theory of prediction algorithms, as discussed in Section 5.
When a system of networks is trained on its minority decisions, it may be
considered as a model for competition in closed markets, see Section 6. In
Section 7 we consider two mutually interacting networks. A novel phenomenon is
observed: synchronisation by mutual learning. In Section 8 it is shown how this phenomenon can be applied to cryptography: generation of a secret key over a public channel. Comment: Contribution to Networks, ed. by H.G. Schuster and S. Bornholdt, to be published by Wiley-VCH
Herding as a Learning System with Edge-of-Chaos Dynamics
Herding defines a deterministic dynamical system at the edge of chaos. It
generates a sequence of model states and parameters by alternating parameter
perturbations with state maximizations, where the sequence of states can be
interpreted as "samples" from an associated MRF model. Herding differs from
maximum likelihood estimation in that the sequence of parameters does not
converge to a fixed point and differs from an MCMC posterior sampling approach
in that the sequence of states is generated deterministically. Herding may be
interpreted as a "perturb and map" method where the parameter perturbations are
generated using a deterministic nonlinear dynamical system rather than randomly
from a Gumbel distribution. This chapter studies the distinct statistical
characteristics of the herding algorithm and shows that the fast convergence
rate of the controlled moments may be attributed to edge of chaos dynamics. The
herding algorithm can also be generalized to models with latent variables and
to a discriminative learning setting. The perceptron cycling theorem ensures
that the fast moment matching property is preserved in the more general
framework.
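The update rule itself is only two lines. Here is a minimal sketch in the simplest setting, assuming 5 independent binary units with features $\phi(s) = s$ and hand-picked target moments, so the state maximization is exact and componentwise:

```python
import numpy as np

mu = np.array([0.8, 0.6, 0.5, 0.3, 0.1])   # target moments (illustrative)
w = mu.copy()                               # herding parameters, started at mu
avg = np.zeros_like(mu)

T = 10_000
for t in range(1, T + 1):
    s = (w > 0).astype(float)   # state maximization: argmax_s <w, s>, s in {0,1}^5
    w += mu - s                 # parameter perturbation; no step-size decay
    avg += (s - avg) / t        # running average of the generated "samples"

print("max moment error:", np.abs(avg - mu).max())   # shrinks roughly as 1/T
```

The parameters never converge to a fixed point, yet the running moment error of the deterministic "sample" sequence decays as O(1/T), much faster than the O(1/sqrt(T)) rate of i.i.d. sampling.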
Learning by message-passing in networks of discrete synapses
We show that a message-passing process allows one to store in binary "material" synapses a number of random patterns which almost saturates the information-theoretic bounds. We apply the learning algorithm to networks characterized by a wide range of different connection topologies and of size comparable with that of biological systems. The algorithm can be turned into an on-line, fault-tolerant learning protocol of potential interest in modeling aspects of synaptic plasticity and in building neuromorphic devices. Comment: 4 pages, 3 figures; references updated and minor corrections; accepted in PR
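For the fully connected case, the message-passing equations can be written compactly. The sketch below is a dense belief-propagation variant with a Gaussian (central-limit) approximation of the cavity preactivation; it is not the authors' exact protocol, and the damping, initialization scale, and problem sizes are illustrative:

```python
import numpy as np
from scipy.special import erfc

def H(x):
    # Gaussian tail probability P(z > x) for z ~ N(0, 1).
    return 0.5 * erfc(x / np.sqrt(2))

def bp_binary_perceptron(xi, sigma, iters=200, damping=0.5, seed=0):
    """Store patterns xi (P, N) with outputs sigma (P,) in N binary weights."""
    P, N = xi.shape
    u = 1e-3 * np.random.default_rng(seed).standard_normal((P, N))
    for _ in range(iters):
        at = np.arctanh(np.clip(u, -0.999, 0.999))
        m = np.tanh(at.sum(axis=0)[None, :] - at)        # cavity magnetizations m_{i->mu}
        M = (xi * m).sum(axis=1, keepdims=True) - xi * m              # cavity mean
        V = (1 - m**2).sum(axis=1, keepdims=True) - (1 - m**2) + 1e-9  # cavity variance
        s = sigma[:, None]
        Pp = H(-s * (M + xi) / np.sqrt(V))   # P(pattern satisfied | w_i = +1)
        Pm = H(-s * (M - xi) / np.sqrt(V))   # P(pattern satisfied | w_i = -1)
        u = damping * u + (1 - damping) * (Pp - Pm) / (Pp + Pm + 1e-12)
    w = np.sign(np.arctanh(np.clip(u, -0.999, 0.999)).sum(axis=0))
    return np.where(w == 0, 1.0, w)          # clip total fields to binary weights

rng = np.random.default_rng(2)
N, P = 501, 150                              # alpha = 0.3, below binary capacity
xi = rng.choice([-1.0, 1.0], size=(P, N))
sigma = rng.choice([-1.0, 1.0], size=P)
w = bp_binary_perceptron(xi, sigma)
print("patterns stored:", int(np.sum(np.sign(xi @ w) == sigma)), "of", P)
```

Closer to capacity, plain clipping of the BP marginals degrades and reinforcement-style variants of the messages are typically used instead.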
Interacting neural networks and cryptography
Two neural networks which are trained on their mutual output bits are
analysed using methods of statistical physics. The exact solution of the
dynamics of the two weight vectors shows a novel phenomenon: The networks
synchronize to a state with identical time-dependent weights. Extending the
models to multilayer networks with discrete weights, it is shown how
synchronization by mutual learning can be applied to secret key exchange over a
public channel. Comment: Invited talk for the meeting of the German Physical Society
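A minimal sketch of the synchronization mechanism, assuming the tree parity machine setup commonly used in this line of work: K hidden units with integer weights bounded by L, and Hebbian updates applied only when the two public output bits agree (the values of K, N, and L below are illustrative):

```python
import numpy as np

K, N, L = 3, 100, 3      # hidden units, inputs per unit, weight bound (illustrative)
rng = np.random.default_rng(3)
wA = rng.integers(-L, L + 1, size=(K, N))   # party A's secret weights
wB = rng.integers(-L, L + 1, size=(K, N))   # party B's secret weights

def step(w, x):
    sigma = np.sign((w * x).sum(axis=1))
    sigma[sigma == 0] = -1                  # break ties
    return sigma, int(np.prod(sigma))       # hidden outputs and public output bit

steps = 0
while not np.array_equal(wA, wB) and steps < 100_000:
    steps += 1
    x = rng.choice([-1, 1], size=(K, N))    # public random input
    sA, tauA = step(wA, x)
    sB, tauB = step(wB, x)
    if tauA == tauB:                        # learn only when public bits agree
        for w, s in ((wA, sA), (wB, sB)):
            for k in range(K):
                if s[k] == tauA:            # Hebbian update of agreeing units
                    w[k] = np.clip(w[k] + tauA * x[k], -L, L)

print(f"identical weights (the shared key) after {steps} exchanged bits")
```

Once synchronized, the two weight matrices are identical and can serve as the shared secret; a passive eavesdropper sees only the inputs and the public output bits.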