
    Training a perceptron in a discrete weight space

    On-line and batch learning of a perceptron in a discrete weight space, where each weight can take $2L+1$ different values, are examined analytically and numerically. The learning algorithm is based on training the continuous perceptron and predicting with the clipped weights. The learning is described by a new set of order parameters, composed of the overlaps between the teacher and the continuous/clipped students. Different scenarios are examined, among them on-line learning with discrete/continuous transfer functions and off-line Hebb learning. The generalization error of the clipped weights decays asymptotically as $\exp(-K\alpha^2)$ / $\exp(-e^{|\lambda|\alpha})$ in the case of on-line learning with binary/continuous activation functions, respectively, where $\alpha$ is the number of examples divided by $N$, the size of the input vector, and $K$ is a positive constant that decays linearly with $1/L$. For finite $N$ and $L$, perfect agreement between the discrete student and the teacher is obtained for $\alpha \propto \sqrt{L \ln(NL)}$. A crossover to the generalization error $\propto 1/\alpha$, characteristic of continuous weights with binary output, is obtained for synaptic depth $L > O(\sqrt{N})$. Comment: 10 pages, 5 figs., submitted to PR
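    The train-continuous, predict-with-clipped idea in this abstract can be illustrated with a minimal sketch. It assumes Hebb learning, random $\pm 1$ inputs, and a rescale-and-round clipping rule; the function names, the clipping rule, and the parameter values are hypothetical choices for illustration, not the paper's exact procedure.

```python
import math
import random


def sign(x):
    return 1 if x >= 0 else -1


def clip_weights(w, depth):
    """Map continuous weights to integers in [-depth, depth] (assumed rule)."""
    # Rescale so the RMS matches that of integers drawn uniformly from
    # {-depth, ..., depth}, whose variance is depth * (depth + 1) / 3.
    rms = math.sqrt(sum(v * v for v in w) / len(w))
    target = math.sqrt(depth * (depth + 1) / 3.0)
    scale = target / rms if rms > 0 else 0.0
    return [max(-depth, min(depth, round(v * scale))) for v in w]


def generalization_error(a, b):
    """Error of student b against teacher a for isotropic inputs: arccos(R)/pi."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    r = max(-1.0, min(1.0, dot / norm))
    return math.acos(r) / math.pi


def run(n=200, depth=1, alpha=20.0, seed=0):
    rng = random.Random(seed)
    # Teacher with discrete weights taking 2 * depth + 1 values.
    teacher = [rng.randint(-depth, depth) for _ in range(n)]
    student = [0.0] * n  # continuous student
    for _ in range(int(alpha * n)):
        x = [rng.choice((-1, 1)) for _ in range(n)]
        label = sign(sum(t * xi for t, xi in zip(teacher, x)))
        # Hebb rule: add each labelled example to the continuous weights.
        for i in range(n):
            student[i] += label * x[i]
    clipped = clip_weights(student, depth)
    return (generalization_error(teacher, student),
            generalization_error(teacher, clipped),
            clipped)


if __name__ == "__main__":
    err_continuous, err_clipped, _ = run()
    print("continuous student error:", err_continuous)
    print("clipped student error:", err_clipped)
```

    Comparing the two printed errors for different values of `alpha` and `depth` gives a rough feel for how clipping affects the student, without reproducing the paper's asymptotic results.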

    On the role of synaptic stochasticity in training low-precision neural networks

    Stochasticity and limited precision of synaptic weights in neural network models are key aspects of both biological and hardware modeling of learning processes. Here we show that a neural network model with stochastic binary weights naturally gives prominence to exponentially rare dense regions of solutions with a number of desirable properties such as robustness and good generalization performance, while typical solutions are isolated and hard to find. Binary solutions of the standard perceptron problem are obtained from a simple gradient descent procedure on a set of real values parametrizing a probability distribution over the binary synapses. Both analytical and numerical results are presented. An algorithmic extension aimed at training discrete deep neural networks is also investigated. Comment: 7 pages + 14 pages of supplementary material

    Theory of Interacting Neural Networks

    In this contribution we give an overview of recent work on the theory of interacting neural networks. The model is defined in Section 2. The typical teacher/student scenario is considered in Section 3: a static teacher network presents training examples to an adaptive student network. In the case of multilayer networks, the student shows a transition from a symmetric state to specialisation. Neural networks can also generate a time series. Training on time series and predicting them are studied in Section 4. When a network is trained on its own output, it interacts with itself. Such a scenario has implications for the theory of prediction algorithms, as discussed in Section 5. When a system of networks is trained on its minority decisions, it may be considered as a model for competition in closed markets, see Section 6. In Section 7 we consider two mutually interacting networks. A novel phenomenon is observed: synchronisation by mutual learning. In Section 8 it is shown how this phenomenon can be applied to cryptography: generation of a secret key over a public channel. Comment: Contribution to Networks, ed. by H.G. Schuster and S. Bornholdt, to be published by Wiley VC

    Herding as a Learning System with Edge-of-Chaos Dynamics

    Herding defines a deterministic dynamical system at the edge of chaos. It generates a sequence of model states and parameters by alternating parameter perturbations with state maximizations, where the sequence of states can be interpreted as "samples" from an associated MRF model. Herding differs from maximum likelihood estimation in that the sequence of parameters does not converge to a fixed point, and differs from an MCMC posterior sampling approach in that the sequence of states is generated deterministically. Herding may be interpreted as a "perturb and map" method where the parameter perturbations are generated using a deterministic nonlinear dynamical system rather than randomly from a Gumbel distribution. This chapter studies the distinct statistical characteristics of the herding algorithm and shows that the fast convergence rate of the controlled moments may be attributed to edge of chaos dynamics. The herding algorithm can also be generalized to models with latent variables and to a discriminative learning setting. The perceptron cycling theorem ensures that the fast moment matching property is preserved in the more general framework.
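    The alternation of state maximization and parameter perturbation described above can be sketched for a toy case. The sketch assumes a single discrete variable with one-hot features, so the target moments are simply a probability vector; the function name `herd`, the initialization, and the example target are hypothetical and not taken from the chapter.

```python
def herd(target, steps):
    """Deterministic herding over states 0..K-1 with one-hot features.

    target: desired moments (here, state probabilities summing to 1).
    Returns the visited states and their empirical frequencies.
    """
    w = list(target)  # initial parameters (an arbitrary choice for this sketch)
    counts = [0] * len(target)
    states = []
    for _ in range(steps):
        # State maximization: pick the state with the largest parameter.
        s = max(range(len(w)), key=lambda i: w[i])
        states.append(s)
        counts[s] += 1
        # Parameter perturbation: add target moments, subtract observed features.
        for i in range(len(w)):
            w[i] += target[i]
        w[s] -= 1.0
    return states, [c / steps for c in counts]


if __name__ == "__main__":
    target = [0.5, 0.3, 0.2]
    _, freqs = herd(target, 1000)
    print(freqs)
```

    In this toy setting the parameters stay bounded, so the empirical frequencies approach the target at a rate of order $1/T$ in the number of steps $T$, compared with the order $1/\sqrt{T}$ expected from independent random sampling.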

    Learning by message-passing in networks of discrete synapses

    We show that a message-passing process makes it possible to store in binary "material" synapses a number of random patterns which almost saturates the information theoretic bounds. We apply the learning algorithm to networks characterized by a wide range of different connection topologies and of size comparable with that of biological systems (e.g. $n \simeq 10^{5}$-$10^{6}$). The algorithm can be turned into an on-line, fault-tolerant learning protocol of potential interest in modeling aspects of synaptic plasticity and in building neuromorphic devices. Comment: 4 pages, 3 figures; references updated and minor corrections; accepted in PR

    Interacting neural networks and cryptography

    Two neural networks which are trained on their mutual output bits are analysed using methods of statistical physics. The exact solution of the dynamics of the two weight vectors shows a novel phenomenon: the networks synchronize to a state with identical time-dependent weights. Extending the models to multilayer networks with discrete weights, it is shown how synchronization by mutual learning can be applied to secret key exchange over a public channel. Comment: Invited talk for the meeting of the German Physical Society
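    Synchronization by mutual learning with discrete weights can be illustrated with a small simulation. The sketch below assumes a tree-parity-machine-style architecture (several hidden units whose signs are multiplied into one output bit) and a Hebbian update applied only when both outputs agree; the abstract does not specify these details, and the class name, parameter values, and zero-field convention are hypothetical.

```python
import random


def sign(x):
    # Both parties map a zero field to -1 (a convention assumed here).
    return 1 if x > 0 else -1


class TreeParityMachine:
    def __init__(self, hidden, inputs, depth, rng):
        self.depth = depth
        self.w = [[rng.randint(-depth, depth) for _ in range(inputs)]
                  for _ in range(hidden)]
        self.sigma = [1] * hidden

    def output(self, x):
        # Each hidden unit reports the sign of its local field; the
        # network output is the product (parity) of those signs.
        self.sigma = [sign(sum(wi * xi for wi, xi in zip(row, xrow)))
                      for row, xrow in zip(self.w, x)]
        tau = 1
        for s in self.sigma:
            tau *= s
        return tau

    def update(self, x, tau):
        # Hebbian step on hidden units that agree with the output,
        # keeping every weight inside [-depth, depth].
        for row, xrow, s in zip(self.w, x, self.sigma):
            if s == tau:
                for i, xi in enumerate(xrow):
                    row[i] = max(-self.depth, min(self.depth, row[i] + tau * xi))


def synchronize(hidden=3, inputs=10, depth=3, seed=1, max_steps=100000):
    # One shared generator keeps the sketch short; in a real exchange each
    # party would draw its initial weights privately.
    rng = random.Random(seed)
    a = TreeParityMachine(hidden, inputs, depth, rng)
    b = TreeParityMachine(hidden, inputs, depth, rng)
    for step in range(max_steps):
        if a.w == b.w:
            return step, a.w, b.w
        # Public random input, seen by both networks.
        x = [[rng.choice((-1, 1)) for _ in range(inputs)] for _ in range(hidden)]
        tau_a, tau_b = a.output(x), b.output(x)
        if tau_a == tau_b:  # only the output bits are exchanged
            a.update(x, tau_a)
            b.update(x, tau_b)
    return None, a.w, b.w


if __name__ == "__main__":
    steps, wa, wb = synchronize()
    print("synchronized after", steps, "steps:", wa == wb)
```

    Once the two weight sets are equal they receive identical updates and therefore stay equal, which is what would let the shared weights serve as key material; the security analysis itself is outside the scope of this sketch.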