Improvements to the backpropagation algorithm
This paper presents simple techniques for improving the backpropagation algorithm. Since learning in neural networks is an NP-complete problem and traditional gradient descent methods are rather slow, many alternatives have been tried to accelerate convergence. Some of the proposed methods are mutually compatible, and a combination of them normally works better than any single method alone.
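The abstract does not specify which improvements the paper combines; as a minimal sketch of one classic, mutually compatible acceleration, the example below adds momentum to plain gradient descent. The ill-conditioned quadratic loss is a hypothetical stand-in for a network's error surface.

```python
# Minimal sketch: gradient descent with momentum, one of the classic
# accelerations commonly combined with plain backpropagation.
import numpy as np

def loss_grad(w):
    # Gradient of f(w) = 0.5 * w^T A w for an ill-conditioned A,
    # which makes plain gradient descent converge slowly.
    A = np.diag([1.0, 50.0])
    return A @ w

w = np.array([1.0, 1.0])
velocity = np.zeros_like(w)
lr, beta = 0.01, 0.9          # learning rate and momentum coefficient (assumed values)

for step in range(500):
    velocity = beta * velocity - lr * loss_grad(w)  # accumulate past gradients
    w = w + velocity                                # update weights

print(w)  # approaches the minimum at the origin far faster than with beta = 0
```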
Shedding light on social learning
Culture involves the origination and transmission of ideas, but the conditions in which culture can emerge and evolve are unclear. We constructed and studied a highly simplified neural-network model of these processes. In this model, ideas originate by individual learning from the environment and are transmitted by communication between individuals. Individuals (or "agents") comprise a single neuron which receives structured data from the environment via plastic synaptic connections. The data are generated in the simplest possible way: linear mixing of independently fluctuating sources, and the goal of learning is to unmix the data. To make this problem tractable, we assume that at least one of the sources fluctuates in a non-Gaussian manner. Linear mixing creates structure in the data, and agents attempt to learn (from the data and possibly from other individuals) synaptic weights that will unmix it, i.e., to "understand" the agent's world. For a variety of reasons, even this goal can be difficult for a single agent to achieve; we studied one particular type of difficulty (created by imperfection in synaptic plasticity), though our conclusions should carry over to many other types of difficulty. We previously studied whether a small population of communicating agents, learning from each other, could learn unmixing coefficients more easily than isolated individuals learning only from their environment. We found, unsurprisingly, that if agents learn indiscriminately from any other agent (whether or not that agent has learned a good solution), communication does not enhance understanding. Here we extend the model slightly, by allowing successful learners to be more effective teachers, and find that a population of agents can now learn more effectively than isolated individuals. We suggest that a key factor in the onset of culture might be the development of selective learning.
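As a rough illustration of the learning task each agent faces (not the paper's exact learning rule), the sketch below linearly mixes a non-Gaussian and a Gaussian source and lets a single linear unit recover an unmixing direction with a one-unit, FastICA-style fixed-point update; the mixing matrix and nonlinearity are assumptions.

```python
# Toy sketch of one "agent": a single linear unit that tries to unmix
# a linear mixture of two independent sources, one of them non-Gaussian.
import numpy as np

rng = np.random.default_rng(0)
n = 10000
s = np.vstack([rng.laplace(size=n),        # non-Gaussian source
               rng.normal(size=n)])        # Gaussian source
A = np.array([[2.0, 1.0], [1.0, 1.5]])     # hypothetical mixing matrix
x = A @ s                                  # observed mixed data

# Whiten the data so unmixing reduces to finding a unit-norm weight vector.
d, E = np.linalg.eigh(np.cov(x))
z = (E / np.sqrt(d)) @ E.T @ x

w = rng.normal(size=2)
w /= np.linalg.norm(w)
for _ in range(50):
    y = w @ z
    # One-unit fixed-point update with the tanh nonlinearity (FastICA-style).
    w = (z * np.tanh(y)).mean(axis=1) - (1 - np.tanh(y) ** 2).mean() * w
    w /= np.linalg.norm(w)

# The unit's output now tracks the Laplacian source, up to sign and scale
# (both are inherently arbitrary in blind unmixing).
print((w @ z)[:5])
```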
Incorporating a priori knowledge into initialized weights for neural classifier
Artificial neural networks (ANNs), especially multilayer perceptrons (MLPs), have been widely used in pattern recognition and classification. Nevertheless, how to incorporate a priori knowledge into the design of ANNs is still an open problem. This paper offers some insight into the topic, emphasizing weight initialization from three perspectives. Theoretical analyses and simulations are offered for validation.
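One simple way such prior knowledge can enter weight initialization, sketched below on assumed two-class toy data, is to point first-layer units along the direction separating the class means rather than drawing them purely at random. This illustrates the general idea only; it is not the paper's specific scheme.

```python
# Hedged sketch: folding a priori class knowledge into an MLP's
# first-layer initialization via the between-class direction.
import numpy as np

rng = np.random.default_rng(1)
X0 = rng.normal(loc=-1.0, size=(100, 5))   # class 0 samples (toy data)
X1 = rng.normal(loc=+1.0, size=(100, 5))   # class 1 samples (toy data)

def informed_init(X0, X1, n_hidden, scale=1.0):
    """Start each hidden unit near the direction that separates the
    class means, with small noise so the units are not identical."""
    direction = X1.mean(axis=0) - X0.mean(axis=0)
    direction /= np.linalg.norm(direction)
    W = np.tile(direction, (n_hidden, 1))
    W += 0.1 * rng.normal(size=W.shape)     # break symmetry between units
    return scale * W

W1 = informed_init(X0, X1, n_hidden=4)
print(W1.shape)  # (4, 5): each hidden unit starts near a discriminative axis
```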
Non-attracting Regions of Local Minima in Deep and Wide Neural Networks
Understanding the loss surface of neural networks is essential for the design of models with predictable performance and for their success in applications. Experimental results suggest that sufficiently deep and wide neural networks are not negatively impacted by suboptimal local minima. Despite recent progress, the reason for this outcome is not fully understood. Could deep networks have very few suboptimal local optima, if any at all? Or could all of them be equally good? We provide a construction to show that suboptimal local minima (i.e., non-global ones), even though degenerate, exist for fully connected neural networks with sigmoid activation functions. The local minima obtained by our construction belong to a connected set of local solutions that can be escaped via a non-increasing path on the loss surface. For extremely wide neural networks whose width decreases after the wide layer, we prove that every suboptimal local minimum belongs to such a connected set. This provides a partial explanation for the successful application of deep neural networks. In addition, we characterize under what conditions the same construction leads to saddle points instead of local minima for deep neural networks.
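The notion of escaping a minimum via a non-increasing path can be probed numerically. The sketch below (an illustration, not the paper's construction) evaluates the loss of a tiny fully connected sigmoid network along a straight line between a candidate point and a perturbed endpoint; the architecture, data, and teacher weights are all assumptions.

```python
# Illustrative probe: is the loss non-increasing along a straight path
# between two parameter vectors of a small sigmoid network?
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))
y = sigmoid(X @ np.array([1.0, -2.0, 0.5]))   # targets from a hypothetical teacher

def loss(params):
    W1 = params[:6].reshape(2, 3)    # 3 inputs -> 2 hidden sigmoid units
    w2 = params[6:]                  # 2 hidden units -> 1 linear output
    return np.mean((sigmoid(X @ W1.T) @ w2 - y) ** 2)

theta_a = rng.normal(size=8)             # candidate point
theta_b = theta_a + rng.normal(size=8)   # perturbed endpoint

losses = [loss((1 - t) * theta_a + t * theta_b) for t in np.linspace(0, 1, 101)]
print(max(np.diff(losses)) <= 0)  # True only if this path never increases the loss
```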
Solving the linear interval tolerance problem for weight initialization of neural networks
Determining good initial conditions for an algorithm used to train a neural network can be treated as a parameter estimation problem dealing with uncertainty about the initial weights. Interval analysis approaches model uncertainty in parameter estimation problems using intervals and by formulating tolerance problems. Solving a tolerance problem means defining lower and upper bounds on the intervals such that the system's functionality is guaranteed within predefined limits. The aim of this paper is to show how the problem of determining the initial weight intervals of a neural network can be defined as a linear interval tolerance problem. The proposed Linear Interval Tolerance Approach copes with uncertainty about the initial weights without any previous knowledge or specific assumptions about the input data, as required by approaches such as fuzzy sets or rough sets. The proposed method is tested on a number of well-known benchmarks for neural networks trained with the back-propagation family of algorithms. Its efficiency is evaluated with regard to standard performance measures, and the results are compared against those of a number of well-known and established initialization methods. These results provide credible evidence that the proposed method outperforms classical weight initialization methods.
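A much-simplified sketch of the interval idea (not the paper's full Linear Interval Tolerance Approach): choose a symmetric weight interval [-a, a] so that, for inputs known to lie in a box, every first-layer preactivation stays inside the sigmoid's active region. The saturation bound c = 4, roughly where the logistic sigmoid flattens out, is an assumption.

```python
# Simplified interval-style bound for initial weights: pick the largest a
# such that |w . x| <= c whenever |w_i| <= a and x stays in [x_lo, x_hi].
import numpy as np

def weight_interval(x_lo, x_hi, c=4.0):
    """Worst case of |w . x| is a times the sum of per-input magnitude
    bounds, so a = c / sum(max(|x_lo|, |x_hi|)) keeps preactivations in [-c, c]."""
    worst = np.maximum(np.abs(x_lo), np.abs(x_hi)).sum()
    return c / worst

x_lo = np.array([-1.0, 0.0, -0.5])   # assumed input lower bounds
x_hi = np.array([+1.0, 2.0, +0.5])   # assumed input upper bounds
a = weight_interval(x_lo, x_hi)

rng = np.random.default_rng(3)
W1 = rng.uniform(-a, a, size=(8, 3))  # 8 hidden units, weights drawn from [-a, a]
print(a, np.abs(W1 @ x_hi).max() <= 4.0)  # preactivations stay in the active region
```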
- ā¦