    Improvements to the backpropagation algorithm

    This paper presents some simple techniques to improve the backpropagation algorithm. Since learning in neural networks is an NP-complete problem and since traditional gradient descent methods are rather slow, many alternatives have been tried in order to accelerate convergence. Some of the proposed methods are mutually compatible, and a combination of them normally works better than each method alone.
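
    The abstract does not name the specific techniques, so the sketch below is only an illustrative example of one widely used acceleration method, gradient descent with momentum, of the kind such papers typically combine with others (adaptive learning rates, initialization heuristics); it is not claimed to be among this paper's proposals.

```python
# Illustrative sketch: plain gradient descent vs. gradient descent with momentum.
# Momentum is used here only as a generic example of an acceleration technique,
# not as the specific method proposed in the paper above.
import numpy as np

def loss_and_grad(w):
    # Ill-conditioned quadratic, a stand-in for a slowly converging error surface.
    H = np.diag([1.0, 50.0])
    return 0.5 * w @ H @ w, H @ w

def train(momentum=0.0, lr=0.01, steps=200):
    w = np.array([5.0, 5.0])
    v = np.zeros_like(w)
    for _ in range(steps):
        _, g = loss_and_grad(w)
        v = momentum * v - lr * g   # velocity accumulates past gradients
        w = w + v                   # momentum=0.0 reduces to plain gradient descent
    return loss_and_grad(w)[0]

print("plain gradient descent:", train(momentum=0.0))
print("with momentum         :", train(momentum=0.9))
```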

    Shedding light on social learning

    Culture involves the origination and transmission of ideas, but the conditions in which culture can emerge and evolve are unclear. We constructed and studied a highly simplified neural-network model of these processes. In this model ideas originate by individual learning from the environment and are transmitted by communication between individuals. Individuals (or "agents") each consist of a single neuron which receives structured data from the environment via plastic synaptic connections. The data are generated in the simplest possible way: linear mixing of independently fluctuating sources, and the goal of learning is to unmix the data. To make this problem tractable we assume that at least one of the sources fluctuates in a non-Gaussian manner. Linear mixing creates structure in the data, and agents attempt to learn (from the data and possibly from other individuals) synaptic weights that will unmix the data, i.e., to "understand" the agent's world. For a variety of reasons even this goal can be difficult for a single agent to achieve; we studied one particular type of difficulty (created by imperfection in synaptic plasticity), though our conclusions should carry over to many other types of difficulty. We previously studied whether a small population of communicating agents, learning from each other, could more easily learn unmixing coefficients than isolated individuals, learning only from their environment. We found, unsurprisingly, that if agents learn indiscriminately from any other agent (whether or not they have learned good solutions), communication does not enhance understanding. Here we extend the model slightly, by allowing successful learners to be more effective teachers, and find that now a population of agents can learn more effectively than isolated individuals. We suggest that a key factor in the onset of culture might be the development of selective learning. Comment: 11 pages, 8 figures.
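
    The model sketched in this abstract is, in effect, a population of one-unit unmixing learners that can also copy weights from each other. The code below is a loose, hypothetical reconstruction under assumptions of our own (a Laplacian non-Gaussian source, a kurtosis-style learning rule, Gaussian weight noise standing in for "imperfect plasticity", and success-weighted copying standing in for "selective learning"); it is not the authors' model or code.

```python
# Hypothetical sketch of the kind of model described: single-neuron "agents" learn
# weights that unmix linearly mixed sources, with noisy (imperfect) plasticity and
# optional copying of weights from more successful agents. All modelling choices
# here are our own assumptions for illustration, not the paper's.
import numpy as np

rng = np.random.default_rng(0)

def make_data(n=2000):
    s1 = rng.laplace(size=n)                    # non-Gaussian source (to be recovered)
    s2 = rng.normal(size=n)                     # Gaussian source
    A = np.array([[1.0, 0.7], [0.4, 1.0]])      # mixing matrix
    x = np.stack([s1, s2]).T @ A.T              # observed mixtures, shape (n, 2)
    x -= x.mean(axis=0)
    cov = np.cov(x.T)                           # whiten so unmixing = finding a direction
    d, E = np.linalg.eigh(cov)
    return x @ E @ np.diag(d ** -0.5) @ E.T, s1

def performance(w, x, s1):
    return abs(np.corrcoef(x @ w, s1)[0, 1])    # |correlation| with the true source

def learn_step(w, x, noise):
    y = x @ w
    g = (x * (y ** 3)[:, None]).mean(axis=0) - 3 * w   # kurtosis-style update
    w = w + 0.1 * g + noise * rng.normal(size=2)       # imperfect plasticity
    return w / np.linalg.norm(w)

def simulate(n_agents=10, steps=200, noise=0.1, copy_rate=0.2, selective=True):
    x, s1 = make_data()
    W = rng.normal(size=(n_agents, 2))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    for _ in range(steps):
        W = np.array([learn_step(w, x, noise) for w in W])
        if copy_rate > 0:
            perf = np.array([performance(w, x, s1) for w in W])
            p = perf / perf.sum() if selective else None   # better agents teach more often
            for i in range(n_agents):
                if rng.random() < copy_rate:
                    teacher = rng.choice(n_agents, p=p)
                    W[i] = 0.5 * (W[i] + W[teacher])       # partial copying
                    W[i] /= np.linalg.norm(W[i])
    return np.mean([performance(w, x, s1) for w in W])

print("isolated learners     :", simulate(copy_rate=0.0))
print("indiscriminate copying:", simulate(selective=False))
print("selective copying     :", simulate(selective=True))
```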

    Incorporating a priori knowledge into initialized weights for neural classifier

    Artificial neural networks (ANNs), especially multilayer perceptrons (MLPs), have been widely used in pattern recognition and classification. Nevertheless, how to incorporate a priori knowledge into the design of ANNs is still an open problem. This paper tries to give some insight into this topic, emphasizing weight initialization from three perspectives. Theoretical analyses and simulations are offered for validation.
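
    The abstract does not spell out the three perspectives, so the following is only a generic, hypothetical illustration of the overall idea: seeding some first-layer weights of an MLP with class-prototype directions estimated from labelled data, instead of purely random values.

```python
# Hypothetical illustration of one generic way to fold prior knowledge into initial
# weights: point some first-layer units at class prototypes computed from labelled
# data. This is NOT the scheme analysed in the paper, just an example of the idea.
import numpy as np

rng = np.random.default_rng(0)

def init_first_layer(X, y, n_hidden, scale=0.5):
    """Return an (n_hidden, n_features) weight matrix for the first MLP layer."""
    n_features = X.shape[1]
    W = rng.normal(scale=scale / np.sqrt(n_features), size=(n_hidden, n_features))
    for k, c in enumerate(np.unique(y)[:n_hidden]):
        proto = X[y == c].mean(axis=0)                 # class prototype = prior knowledge
        W[k] = scale * proto / (np.linalg.norm(proto) + 1e-12)
    return W

# Toy usage: two Gaussian blobs; the first two hidden units point at the class means.
X = np.vstack([rng.normal(loc=+2.0, size=(50, 3)), rng.normal(loc=-2.0, size=(50, 3))])
y = np.array([0] * 50 + [1] * 50)
W1 = init_first_layer(X, y, n_hidden=6)
print(W1.shape)
print(W1[:2])
```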

    Non-attracting Regions of Local Minima in Deep and Wide Neural Networks

    Understanding the loss surface of neural networks is essential for the design of models with predictable performance and their success in applications. Experimental results suggest that sufficiently deep and wide neural networks are not negatively impacted by suboptimal local minima. Despite recent progress, the reason for this outcome is not fully understood. Could deep networks have very few, if any, suboptimal local optima? Or could all of them be equally good? We provide a construction to show that suboptimal local minima (i.e., non-global ones), even though degenerate, exist for fully connected neural networks with sigmoid activation functions. The local minima obtained by our construction belong to a connected set of local solutions that can be escaped via a non-increasing path on the loss curve. For extremely wide neural networks of decreasing width after the wide layer, we prove that every suboptimal local minimum belongs to such a connected set. This provides a partial explanation for the successful application of deep neural networks. In addition, we characterize under what conditions the same construction leads to saddle points instead of local minima for deep neural networks.
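
    The paper's construction is analytical, but a toy numerical probe conveys the flavour of the question of whether a given parameter vector is a suboptimal local minimum. The sketch below (a small fully connected sigmoid network plus random-perturbation probing) is our own illustration, not the paper's construction.

```python
# Toy numerical probe (our own illustration, not the paper's construction): given the
# parameters of a small fully connected sigmoid network, sample random small
# perturbations and check whether any of them lowers the loss. Finding a descent
# direction proves the point is not a local minimum; finding none is only weak
# evidence that it might be one.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))              # toy inputs
t = np.sin(X[:, 0]) * np.cos(X[:, 1])     # toy regression targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(theta):
    # Unpack a flat 13-dimensional parameter vector into a 2-3-1 sigmoid network.
    W1, b1 = theta[:6].reshape(3, 2), theta[6:9]
    w2, b2 = theta[9:12], theta[12]
    y = sigmoid(X @ W1.T + b1) @ w2 + b2
    return np.mean((y - t) ** 2)

def probe(theta, radius=1e-3, n_samples=2000):
    base = loss(theta)
    best = 0.0
    for _ in range(n_samples):
        d = rng.normal(size=theta.size)
        d *= radius / np.linalg.norm(d)
        best = min(best, loss(theta + d) - base)
    return base, best                     # best < 0  =>  a descent direction was found

theta = rng.normal(size=13)
print(probe(theta))                       # a random point almost always has a descent direction
```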

    Solving the linear interval tolerance problem for weight initialization of neural networks

    Determining good initial conditions for an algorithm used to train a neural network can be treated as a parameter estimation problem dealing with uncertainty about the initial weights. Interval analysis approaches model uncertainty in parameter estimation problems using intervals and by formulating tolerance problems. Solving a tolerance problem means defining lower and upper bounds of the intervals so that the system's functionality is guaranteed within predefined limits. The aim of this paper is to show how the problem of determining the initial weight intervals of a neural network can be defined in terms of solving a linear interval tolerance problem. The proposed Linear Interval Tolerance Approach copes with uncertainty about the initial weights without any previous knowledge of, or specific assumptions about, the input data, as required by approaches such as fuzzy sets or rough sets. The proposed method is tested on a number of well-known benchmarks for neural networks trained with the back-propagation family of algorithms. Its efficiency is evaluated with regard to standard performance measures, and the results obtained are compared against those of a number of well-known and established initialization methods. These results provide credible evidence that the proposed method outperforms classical weight initialization methods.
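
    The abstract does not reproduce the linear interval tolerance formulation itself, so the following is only a crude sketch of the underlying idea under our own assumptions: choose a symmetric weight interval [-w, w] per first-layer neuron so that, by interval arithmetic, the pre-activation of every training input stays inside a non-saturating band of the sigmoid.

```python
# Crude sketch of the underlying idea under our own assumptions (not the paper's
# algorithm): pick a symmetric weight interval [-w, w] for each first-layer neuron
# so that, by interval arithmetic, pre-activations of all training inputs stay
# inside a non-saturating band of the sigmoid, e.g. [-4, 4].
import numpy as np

rng = np.random.default_rng(0)

def weight_interval(X, active_bound=4.0):
    """Half-width w of the initial weight interval [-w, w] for one neuron.

    With weights w_i in [-w, w] and zero bias, interval arithmetic bounds the
    pre-activation by |sum_i w_i * x_i| <= w * max_over_samples(sum_i |x_i|),
    so choosing w = active_bound / max_over_samples(sum_i |x_i|) keeps every
    training sample out of the sigmoid's saturated region.
    """
    return active_bound / np.abs(X).sum(axis=1).max()

# Toy usage: draw the first-layer weights uniformly from the computed interval.
X = rng.normal(size=(100, 8))                 # toy training inputs
w = weight_interval(X)
W1 = rng.uniform(-w, w, size=(16, 8))         # 16 hidden neurons, 8 inputs
pre = X @ W1.T
print(f"w = {w:.4f}, max |pre-activation| = {np.abs(pre).max():.3f}")  # <= 4 by construction
```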