13,575 research outputs found
Ensemble learning of linear perceptron; Online learning theory
Within the framework of on-line learning, we study the generalization error
of an ensemble learning machine learning from a linear teacher perceptron. The
generalization error achieved by an ensemble of linear perceptrons having
homogeneous or inhomogeneous initial weight vectors is precisely calculated at
the thermodynamic limit of a large number of input elements and shows rich
behavior. Our main findings are as follows. For learning with homogeneous
initial weight vectors, the generalization error using an infinite number of
linear student perceptrons is equal to only half that of a single linear
perceptron, and converges with that of the infinite case with O(1/K) for a
finite number of K linear perceptrons. For learning with inhomogeneous initial
weight vectors, it is advantageous to use an approach of weighted averaging
over the output of the linear perceptrons, and we show the conditions under
which the optimal weights are constant during the learning process. The optimal
weights depend on only correlation of the initial weight vectors.Comment: 14 pages, 3 figures, submitted to Physical Review
High Order and Multilayer Perceptron Initialization
Proper initialization is one of the most important prerequisites for fast convergence of feed-forward neural networks like high order and multilayer perceptrons. This publication aims at determining the optimal variance (or range) for the initial weights and biases, which is the principal parameter of random initialization methods for both types of neural networks. An overview of random weight initialization methods for multilayer perceptrons is presented. These methods are extensively tested using eight real- world benchmark data sets and a broad range of initial weight variances by means of more than simulations, in the aim to find the best weight initialization method for multilayer perceptrons. For high order networks, a large number of experiments (more than simulations) was performed, using three weight distributions, three activation functions, several network orders, and the same eight data sets. The results of these experiments are compared to weight initialization techniques for multilayer perceptrons, which leads to the proposal of a suitable initialization method for high order perceptrons. The conclusions on the initialization methods for both types of networks are justified by sufficiently small confidence intervals of the mean convergence times
Dynamics of Interacting Neural Networks
The dynamics of interacting perceptrons is solved analytically. For a
directed flow of information the system runs into a state which has a higher
symmetry than the topology of the model. A symmetry breaking phase transition
is found with increasing learning rate. In addition it is shown that a system
of interacting perceptrons which is trained on the history of its minority
decisions develops a good strategy for the problem of adaptive competition
known as the Bar Problem or Minority Game.Comment: 9 pages, 3 figures; typos corrected, content reorganize
Optimizing 0/1 Loss for Perceptrons by Random Coordinate Descent
The 0/1 loss is an important cost function for perceptrons. Nevertheless it cannot be easily minimized by most existing perceptron learning algorithms. In this paper, we propose a family of random coordinate descent algorithms to directly minimize the 0/1 loss for perceptrons, and prove their convergence. Our algorithms are computationally efficient, and usually achieve the lowest 0/1 loss compared with other algorithms. Such advantages make them favorable for nonseparable real-world problems. Experiments show that our algorithms are especially useful for ensemble learning, and could achieve the lowest test error for many complex data sets when coupled with AdaBoost
Liquid State Machine with Dendritically Enhanced Readout for Low-power, Neuromorphic VLSI Implementations
In this paper, we describe a new neuro-inspired, hardware-friendly readout
stage for the liquid state machine (LSM), a popular model for reservoir
computing. Compared to the parallel perceptron architecture trained by the
p-delta algorithm, which is the state of the art in terms of performance of
readout stages, our readout architecture and learning algorithm can attain
better performance with significantly less synaptic resources making it
attractive for VLSI implementation. Inspired by the nonlinear properties of
dendrites in biological neurons, our readout stage incorporates neurons having
multiple dendrites with a lumped nonlinearity. The number of synaptic
connections on each branch is significantly lower than the total number of
connections from the liquid neurons and the learning algorithm tries to find
the best 'combination' of input connections on each branch to reduce the error.
Hence, the learning involves network rewiring (NRW) of the readout network
similar to structural plasticity observed in its biological counterparts. We
show that compared to a single perceptron using analog weights, this
architecture for the readout can attain, even by using the same number of
binary valued synapses, up to 3.3 times less error for a two-class spike train
classification problem and 2.4 times less error for an input rate approximation
task. Even with 60 times larger synapses, a group of 60 parallel perceptrons
cannot attain the performance of the proposed dendritically enhanced readout.
An additional advantage of this method for hardware implementations is that the
'choice' of connectivity can be easily implemented exploiting address event
representation (AER) protocols commonly used in current neuromorphic systems
where the connection matrix is stored in memory. Also, due to the use of binary
synapses, our proposed method is more robust against statistical variations.Comment: 14 pages, 19 figures, Journa
- …