Search CORE

13,575 research outputs found

Ensemble learning of linear perceptron; Online learning theory

Author: Breiman L.
Freund Y.
Hara K.
Hara K.
Krogh A.
Lazarevic A.
Miyoshi S.
Nishimori H.
Urbanczik R.
Publication venue: 'Japan Society of Applied Physics'
Publication date: 03/02/2004
Field of study

Within the framework of on-line learning, we study the generalization error of an ensemble learning machine learning from a linear teacher perceptron. The generalization error achieved by an ensemble of linear perceptrons having homogeneous or inhomogeneous initial weight vectors is precisely calculated at the thermodynamic limit of a large number of input elements and shows rich behavior. Our main findings are as follows. For learning with homogeneous initial weight vectors, the generalization error using an infinite number of linear student perceptrons is equal to only half that of a single linear perceptron, and converges with that of the infinite case with O(1/K) for a finite number of K linear perceptrons. For learning with inhomogeneous initial weight vectors, it is advantageous to use an approach of weighted averaging over the output of the linear perceptrons, and we show the conditions under which the optimal weights are constant during the learning process. The optimal weights depend on only correlation of the initial weight vectors.Comment: 14 pages, 3 figures, submitted to Physical Review

arXiv.org e-Print Archive

Crossref

High Order and Multilayer Perceptron Initialization

Author: Fiesler Emile
Thimm Georg
Publication venue: IDIAP
Publication date: 10/03/2006
Field of study

Proper initialization is one of the most important prerequisites for fast convergence of feed-forward neural networks like high order and multilayer perceptrons. This publication aims at determining the optimal variance (or range) for the initial weights and biases, which is the principal parameter of random initialization methods for both types of neural networks. An overview of random weight initialization methods for multilayer perceptrons is presented. These methods are extensively tested using eight real- world benchmark data sets and a broad range of initial weight variances by means of more than

30,000

simulations, in the aim to find the best weight initialization method for multilayer perceptrons. For high order networks, a large number of experiments (more than

200,000

simulations) was performed, using three weight distributions, three activation functions, several network orders, and the same eight data sets. The results of these experiments are compared to weight initialization techniques for multilayer perceptrons, which leads to the proposal of a suitable initialization method for high order perceptrons. The conclusions on the initialization methods for both types of networks are justified by sufficiently small confidence intervals of the mean convergence times

Infoscience - École polytechnique fédérale de Lausanne

Dynamics of Interacting Neural Networks

Author: Kanter I.
Kinzel W.
Metzler R.
Publication venue
Publication date: 01/01/1999
Field of study

The dynamics of interacting perceptrons is solved analytically. For a directed flow of information the system runs into a state which has a higher symmetry than the topology of the model. A symmetry breaking phase transition is found with increasing learning rate. In addition it is shown that a system of interacting perceptrons which is trained on the history of its minority decisions develops a good strategy for the problem of adaptive competition known as the Bar Problem or Minority Game.Comment: 9 pages, 3 figures; typos corrected, content reorganize

arXiv.org e-Print Archive

CiteSeerX

Optimizing 0/1 Loss for Perceptrons by Random Coordinate Descent

Author: Li Ling
Lin Hsuan-Tien
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007
Field of study

The 0/1 loss is an important cost function for perceptrons. Nevertheless it cannot be easily minimized by most existing perceptron learning algorithms. In this paper, we propose a family of random coordinate descent algorithms to directly minimize the 0/1 loss for perceptrons, and prove their convergence. Our algorithms are computationally efficient, and usually achieve the lowest 0/1 loss compared with other algorithms. Such advantages make them favorable for nonseparable real-world problems. Experiments show that our algorithms are especially useful for ensemble learning, and could achieve the lowest test error for many complex data sets when coupled with AdaBoost

Crossref

Caltech Authors

Liquid State Machine with Dendritically Enhanced Readout for Low-power, Neuromorphic VLSI Implementations

Author: Banerjee Amitava
Basu Arindam
Roy Subhrajit
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014
Field of study

In this paper, we describe a new neuro-inspired, hardware-friendly readout stage for the liquid state machine (LSM), a popular model for reservoir computing. Compared to the parallel perceptron architecture trained by the p-delta algorithm, which is the state of the art in terms of performance of readout stages, our readout architecture and learning algorithm can attain better performance with significantly less synaptic resources making it attractive for VLSI implementation. Inspired by the nonlinear properties of dendrites in biological neurons, our readout stage incorporates neurons having multiple dendrites with a lumped nonlinearity. The number of synaptic connections on each branch is significantly lower than the total number of connections from the liquid neurons and the learning algorithm tries to find the best 'combination' of input connections on each branch to reduce the error. Hence, the learning involves network rewiring (NRW) of the readout network similar to structural plasticity observed in its biological counterparts. We show that compared to a single perceptron using analog weights, this architecture for the readout can attain, even by using the same number of binary valued synapses, up to 3.3 times less error for a two-class spike train classification problem and 2.4 times less error for an input rate approximation task. Even with 60 times larger synapses, a group of 60 parallel perceptrons cannot attain the performance of the proposed dendritically enhanced readout. An additional advantage of this method for hardware implementations is that the 'choice' of connectivity can be easily implemented exploiting address event representation (AER) protocols commonly used in current neuromorphic systems where the connection matrix is stored in memory. Also, due to the use of binary synapses, our proposed method is more robust against statistical variations.Comment: 14 pages, 19 figures, Journa

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)