Improvements to the backpropagation algorithm
This paper presents simple techniques for improving the backpropagation algorithm. Since learning in neural networks is an NP-complete problem and traditional gradient descent methods are rather slow, many alternatives have been tried to accelerate convergence. Some of the proposed methods are mutually compatible, and a combination of them normally works better than any single method alone.
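The abstract does not specify which improvements the paper combines; as a minimal sketch of one classic, mutually compatible acceleration, the example below adds momentum to plain gradient descent. The ill-conditioned quadratic loss is a hypothetical stand-in for a network's error surface.

```python
# Minimal sketch: gradient descent with momentum, one of the classic
# accelerations commonly combined with plain backpropagation.
import numpy as np

def loss_grad(w):
    # Gradient of f(w) = 0.5 * w^T A w for an ill-conditioned A,
    # which makes plain gradient descent converge slowly.
    A = np.diag([1.0, 50.0])
    return A @ w

w = np.array([1.0, 1.0])
velocity = np.zeros_like(w)
lr, beta = 0.01, 0.9          # learning rate and momentum coefficient (assumed values)

for step in range(500):
    velocity = beta * velocity - lr * loss_grad(w)  # accumulate past gradients
    w = w + velocity                                # update weights

print(w)  # approaches the minimum at the origin far faster than with beta = 0
```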
Shedding light on social learning
Culture involves the origination and transmission of ideas, but the conditions in which culture can emerge and evolve are unclear. We constructed and studied a highly simplified neural-network model of these processes. In this model, ideas originate by individual learning from the environment and are transmitted by communication between individuals. Individuals (or "agents") comprise a single neuron which receives structured data from the environment via plastic synaptic connections. The data are generated in the simplest possible way: linear mixing of independently fluctuating sources, and the goal of learning is to unmix the data. To make this problem tractable, we assume that at least one of the sources fluctuates in a non-Gaussian manner. Linear mixing creates structure in the data, and agents attempt to learn (from the data and possibly from other individuals) synaptic weights that will unmix it, i.e., to "understand" the agent's world. For a variety of reasons, even this goal can be difficult for a single agent to achieve; we studied one particular type of difficulty (created by imperfection in synaptic plasticity), though our conclusions should carry over to many other types of difficulty. We previously studied whether a small population of communicating agents, learning from each other, could learn unmixing coefficients more easily than isolated individuals learning only from their environment. We found, unsurprisingly, that if agents learn indiscriminately from any other agent (whether or not that agent has learned a good solution), communication does not enhance understanding. Here we extend the model slightly, by allowing successful learners to be more effective teachers, and find that a population of agents can now learn more effectively than isolated individuals. We suggest that a key factor in the onset of culture might be the development of selective learning.
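As a rough illustration of the learning task each agent faces (not the paper's exact learning rule), the sketch below linearly mixes a non-Gaussian and a Gaussian source and lets a single linear unit recover an unmixing direction with a one-unit, FastICA-style fixed-point update; the mixing matrix and nonlinearity are assumptions.

```python
# Toy sketch of one "agent": a single linear unit that tries to unmix
# a linear mixture of two independent sources, one of them non-Gaussian.
import numpy as np

rng = np.random.default_rng(0)
n = 10000
s = np.vstack([rng.laplace(size=n),        # non-Gaussian source
               rng.normal(size=n)])        # Gaussian source
A = np.array([[2.0, 1.0], [1.0, 1.5]])     # hypothetical mixing matrix
x = A @ s                                  # observed mixed data

# Whiten the data so unmixing reduces to finding a unit-norm weight vector.
d, E = np.linalg.eigh(np.cov(x))
z = (E / np.sqrt(d)) @ E.T @ x

w = rng.normal(size=2)
w /= np.linalg.norm(w)
for _ in range(50):
    y = w @ z
    # One-unit fixed-point update with the tanh nonlinearity (FastICA-style).
    w = (z * np.tanh(y)).mean(axis=1) - (1 - np.tanh(y) ** 2).mean() * w
    w /= np.linalg.norm(w)

# The unit's output now tracks the Laplacian source, up to sign and scale
# (both are inherently arbitrary in blind unmixing).
print((w @ z)[:5])
```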
Incorporating a priori knowledge into initialized weights for neural classifier
Artificial neural networks (ANNs), especially multilayer perceptrons (MLPs), have been widely used in pattern recognition and classification. Nevertheless, how to incorporate a priori knowledge into the design of ANNs is still an open problem. This paper offers some insight into the topic, emphasizing weight initialization from three perspectives. Theoretical analyses and simulations are offered for validation.
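One simple way such prior knowledge can enter weight initialization, sketched below on assumed two-class toy data, is to point first-layer units along the direction separating the class means rather than drawing them purely at random. This illustrates the general idea only; it is not the paper's specific scheme.

```python
# Hedged sketch: folding a priori class knowledge into an MLP's
# first-layer initialization via the between-class direction.
import numpy as np

rng = np.random.default_rng(1)
X0 = rng.normal(loc=-1.0, size=(100, 5))   # class 0 samples (toy data)
X1 = rng.normal(loc=+1.0, size=(100, 5))   # class 1 samples (toy data)

def informed_init(X0, X1, n_hidden, scale=1.0):
    """Start each hidden unit near the direction that separates the
    class means, with small noise so the units are not identical."""
    direction = X1.mean(axis=0) - X0.mean(axis=0)
    direction /= np.linalg.norm(direction)
    W = np.tile(direction, (n_hidden, 1))
    W += 0.1 * rng.normal(size=W.shape)     # break symmetry between units
    return scale * W

W1 = informed_init(X0, X1, n_hidden=4)
print(W1.shape)  # (4, 5): each hidden unit starts near a discriminative axis
```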
Non-attracting Regions of Local Minima in Deep and Wide Neural Networks
Understanding the loss surface of neural networks is essential for the design of models with predictable performance and for their success in applications. Experimental results suggest that sufficiently deep and wide neural networks are not negatively impacted by suboptimal local minima. Despite recent progress, the reason for this outcome is not fully understood. Could deep networks have very few suboptimal local optima, if any at all? Or could all of them be equally good? We provide a construction to show that suboptimal local minima (i.e., non-global ones), even though degenerate, exist for fully connected neural networks with sigmoid activation functions. The local minima obtained by our construction belong to a connected set of local solutions that can be escaped via a non-increasing path on the loss surface. For extremely wide neural networks whose width decreases after the wide layer, we prove that every suboptimal local minimum belongs to such a connected set. This provides a partial explanation for the successful application of deep neural networks. In addition, we characterize under what conditions the same construction leads to saddle points instead of local minima for deep neural networks.
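The notion of escaping a minimum via a non-increasing path can be probed numerically. The sketch below (an illustration, not the paper's construction) evaluates the loss of a tiny fully connected sigmoid network along a straight line between a candidate point and a perturbed endpoint; the architecture, data, and teacher weights are all assumptions.

```python
# Illustrative probe: is the loss non-increasing along a straight path
# between two parameter vectors of a small sigmoid network?
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))
y = sigmoid(X @ np.array([1.0, -2.0, 0.5]))   # targets from a hypothetical teacher

def loss(params):
    W1 = params[:6].reshape(2, 3)    # 3 inputs -> 2 hidden sigmoid units
    w2 = params[6:]                  # 2 hidden units -> 1 linear output
    return np.mean((sigmoid(X @ W1.T) @ w2 - y) ** 2)

theta_a = rng.normal(size=8)             # candidate point
theta_b = theta_a + rng.normal(size=8)   # perturbed endpoint

losses = [loss((1 - t) * theta_a + t * theta_b) for t in np.linspace(0, 1, 101)]
print(max(np.diff(losses)) <= 0)  # True only if this path never increases the loss
```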
Solving the linear interval tolerance problem for weight initialization of neural networks
Determining good initial conditions for an algorithm used to train a neural network can be treated as a parameter estimation problem dealing with uncertainty about the initial weights. Interval analysis approaches model uncertainty in parameter estimation problems using intervals and by formulating tolerance problems. Solving a tolerance problem means defining lower and upper bounds on the intervals such that the system's functionality is guaranteed within predefined limits. The aim of this paper is to show how the problem of determining the initial weight intervals of a neural network can be defined as a linear interval tolerance problem. The proposed Linear Interval Tolerance Approach copes with uncertainty about the initial weights without any previous knowledge or specific assumptions about the input data, as required by approaches such as fuzzy sets or rough sets. The proposed method is tested on a number of well-known benchmarks for neural networks trained with the back-propagation family of algorithms. Its efficiency is evaluated with regard to standard performance measures, and the results are compared against those of a number of well-known and established initialization methods. These results provide credible evidence that the proposed method outperforms classical weight initialization methods.
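A much-simplified sketch of the interval idea (not the paper's full Linear Interval Tolerance Approach): choose a symmetric weight interval [-a, a] so that, for inputs known to lie in a box, every first-layer preactivation stays inside the sigmoid's active region. The saturation bound c = 4, roughly where the logistic sigmoid flattens out, is an assumption.

```python
# Simplified interval-style bound for initial weights: pick the largest a
# such that |w . x| <= c whenever |w_i| <= a and x stays in [x_lo, x_hi].
import numpy as np

def weight_interval(x_lo, x_hi, c=4.0):
    """Worst case of |w . x| is a times the sum of per-input magnitude
    bounds, so a = c / sum(max(|x_lo|, |x_hi|)) keeps preactivations in [-c, c]."""
    worst = np.maximum(np.abs(x_lo), np.abs(x_hi)).sum()
    return c / worst

x_lo = np.array([-1.0, 0.0, -0.5])   # assumed input lower bounds
x_hi = np.array([+1.0, 2.0, +0.5])   # assumed input upper bounds
a = weight_interval(x_lo, x_hi)

rng = np.random.default_rng(3)
W1 = rng.uniform(-a, a, size=(8, 3))  # 8 hidden units, weights drawn from [-a, a]
print(a, np.abs(W1 @ x_hi).max() <= 4.0)  # preactivations stay in the active region
```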
- ā¦