187 research outputs found
Constructive Neural Networks
This master's thesis deals with constructive neural networks, i.e. neural networks with a variable topology. The first (theoretical) part describes neural networks and the corresponding mathematical models, presents the basic algorithms for training neural networks, and discusses several basic constructive algorithms and their extensions. The second (practical) part deals with the implementation of selected constructive algorithms and provides their comparison; the algorithms are further compared with the backpropagation learning algorithm.
Constructive neural networks : generalisation, convergence and architectures
Feedforward neural networks trained via supervised learning have proven to be
successful in the field of pattern recognition. The most important feature of a
pattern recognition technique is its ability to successfully classify future data.
This is known as generalisation. A more practical aspect of pattern recognition
methods is how quickly they can be trained and how reliably a good solution is
found. Feedforward neural networks have been shown to provide good generali-
sation on a variety of problems. A number of training techniques also exist that
provide fast convergence.
Two problems often addressed within the field of feedforward neural networks are
how to improve the generalisation and convergence of these pattern recognition
techniques. These two problems are addressed in this thesis through the framework
of constructive neural network algorithms. Constructive neural networks
are a type of feedforward neural network in which the network architecture is
built during the training process. The type of architecture built can affect both
generalisation and convergence speed.
Convergence speed and reliability are important properties of feedforward neural
networks. These properties are studied by examining different training
algorithms and the effect of using a constructive process. A new gradient based
training algorithm, SARPROP, is introduced. This algorithm addresses the
problems of poor convergence speed and reliability when using a gradient based
training method. SARPROP is shown to increase both convergence speed and
the chance of convergence to a good solution. This is achieved through the
combination of gradient based and Simulated Annealing methods.
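The combination described above (RPROP-style adaptive step sizes plus noise that decays under a simulated-annealing temperature schedule) can be sketched as follows. This is a minimal illustration of the idea, not the thesis's actual SARPROP formulation; all constants and names are assumptions.

```python
import numpy as np

def sarprop_step(w, grad, prev_grad, step, epoch,
                 eta_plus=1.2, eta_minus=0.5,
                 step_min=1e-6, step_max=1.0,
                 noise_scale=0.01, temp_decay=0.01, rng=None):
    """One SARPROP-style update (simplified sketch).

    RPROP adapts a per-weight step size from gradient sign agreement;
    SARPROP additionally injects simulated-annealing noise whose
    magnitude decays with training time, helping escape poor minima.
    """
    rng = np.random.default_rng() if rng is None else rng
    temperature = 2.0 ** (-temp_decay * epoch)   # annealing schedule
    sign_change = grad * prev_grad
    # Grow the step where the gradient sign is stable, shrink where it flips.
    step = np.where(sign_change > 0, np.minimum(step * eta_plus, step_max), step)
    step = np.where(sign_change < 0, np.maximum(step * eta_minus, step_min), step)
    # Annealed noise mainly perturbs the update early in training.
    noise = noise_scale * temperature * rng.standard_normal(w.shape)
    w_new = w - np.sign(grad) * step + noise
    return w_new, step

# Minimise f(w) = ||w||^2 with the sketched update.
w = np.array([2.0, -3.0])
step = np.full_like(w, 0.1)
prev_grad = np.zeros_like(w)
for epoch in range(200):
    grad = 2 * w
    w, step = sarprop_step(w, grad, prev_grad, step, epoch,
                           rng=np.random.default_rng(epoch))
    prev_grad = grad
```

On this toy quadratic the step sizes collapse once the iterate oscillates around the minimum, while the annealed noise dies away, so the weights settle near zero.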
The convergence properties of various constructive algorithms are examined
through a series of empirical studies. The results of these studies demonstrate
that the cascade architecture allows for faster, more reliable convergence using a
gradient based method than a single layer architecture with a comparable
number of weights. It is shown that constructive algorithms that bias the search
direction of the gradient based training algorithm for the newly added hidden
neurons produce smaller networks and more rapid convergence. A constructive
algorithm using search direction biasing is shown to converge to solutions with
networks that are unreliable and inefficient to train using a non-constructive
gradient based algorithm. The technique of weight freezing is shown to result in
larger architectures than those obtained from training the whole network.
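The cascade architecture studied here can be pictured as a forward pass in which each newly added hidden neuron receives the original inputs plus the outputs of all earlier hidden neurons. A minimal sketch, with an assumed tanh activation and illustrative names:

```python
import numpy as np

def cascade_forward(x, hidden_weights, output_weights):
    """Forward pass through a cascade network (illustrative sketch).

    Each hidden neuron sees the inputs plus the outputs of all
    previously added hidden neurons, so every addition deepens the
    network by one layer.
    """
    activations = list(x)
    for w in hidden_weights:  # one weight vector per added neuron, growing in length
        z = np.dot(w, np.array(activations + [1.0]))  # trailing 1.0 is a bias input
        activations.append(np.tanh(z))
    return np.dot(output_weights, np.array(activations + [1.0]))

# Two inputs, two cascaded hidden neurons; with all-zero hidden weights
# the output reduces to the output bias term.
x = np.zeros(2)
y = cascade_forward(x, [np.zeros(3), np.zeros(4)],
                    np.array([0.0, 0.0, 0.0, 0.0, 1.0]))
```

Note that the weight vector of the n-th hidden neuron is longer than that of the (n-1)-th, which is the source of the fan-in and depth issues discussed later in the abstract.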
Improving the generalisation ability of constructive neural networks is an
important area of investigation. A series of empirical studies is performed to
examine the effect of regularisation on generalisation in constructive cascade
algorithms. It is found that the combination of early stopping and regularisation
results in better generalisation than the use of early stopping alone. A cubic
regularisation term that greatly penalises large weights is shown to be beneficial
for generalisation in cascade networks. An adaptive method of setting the
regularisation magnitude in constructive networks is introduced and is shown
to produce generalisation results similar to those obtained with a fixed, user-optimised
regularisation setting. This adaptive method also often results in the
construction of smaller networks for more complex problems.
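A cubic regularisation term of the kind described can be written as lam * sum(|w|^3), whose gradient grows with the square of each weight, so large weights are penalised far more heavily than under quadratic weight decay. A small sketch (the coefficient lam is illustrative):

```python
import numpy as np

def cubic_penalty(weights, lam=1e-3):
    """Cubic regularisation term lam * sum(|w|^3) and its gradient.

    Cubing the magnitude punishes large weights much more strongly
    than quadratic weight decay while leaving small weights almost
    untouched.
    """
    penalty = lam * np.sum(np.abs(weights) ** 3)
    grad = 3 * lam * weights ** 2 * np.sign(weights)  # d/dw |w|^3 = 3 w^2 sign(w)
    return penalty, grad

penalty, grad = cubic_penalty(np.array([2.0, -1.0]))
```

For the example weights, the weight of magnitude 2 contributes eight times the penalty of the unit weight, illustrating the steep growth of the term.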
The insights obtained from the SARPROP algorithm and from the convergence
and generalisation empirical studies are used to create a new constructive cascade
algorithm, acasper. This algorithm is extensively benchmarked and is shown to
obtain good generalisation results in comparison to a number of well-respected
and successful neural network algorithms. A technique of incorporating the
validation data into the training set after network construction is introduced
and is shown to generally result in similar or improved generalisation.
The difficulties of implementing a cascade architecture in VLSI are described
and results are given on the effect of the cascade architecture on such attributes
as weight growth, fan-in, network depth, and propagation delay. Two variants
of the cascade architecture are proposed. These new architectures are shown
to produce similar generalisation results to the cascade architecture, while also
addressing the problems of VLSI implementation of cascade networks.
Constructive spiking neural networks for simulations of neuroplasticity
Artificial neural networks are important tools in machine learning and neuroscience;
however, a difficult step in their implementation is the selection of the neural network size and
structure. This thesis develops fundamental theory on algorithms for constructing neurons in
spiking neural networks and simulations of neuroplasticity. This theory is applied in the
development of a constructive algorithm based on spike-timing-dependent plasticity (STDP) that
achieves continual one-shot learning of hidden spike patterns through neuron construction.
The theoretical developments in this thesis begin with the proposal of a set of definitions of
the fundamental components of constructive neural networks. Disagreement in terminology across the
literature and a lack of clear definitions and requirements for constructive neural networks are
factors in the poor visibility and fragmentation of research. The proposed definitions are used as
the basis for a generalised methodology for decomposing constructive neural networks into
components to perform comparisons, design and analysis.
Spiking neuron models are uncommon in constructive neural network literature; however, spiking
neurons are common in simulated studies in neuroscience. Spike-timing-dependent construction is
proposed as a distinct class of constructive algorithm for spiking neural networks. Past algorithms
that perform spike-timing-dependent construction are decomposed into defined components for a
detailed critical comparison and found to have limited applicability in simulations of biological
neural networks.
This thesis develops concepts and principles for designing constructive algorithms that are
compatible with simulations of biological neural networks. Simulations often have orders of
magnitude fewer neurons than related biological neural systems; therefore, the neurons in a
simulation may be assumed to be a selection or subset of a larger neural system with many neurons
not simulated. Neuron construction and pruning may therefore be reinterpreted as the transfer of
neurons between sets of simulated neurons and hypothetical neurons in the neural system.
Constructive algorithms with a functional equivalence to transferring neurons between sets allow
simulated neural networks to maintain biological plausibility while changing size.
The components of a novel constructive algorithm are incrementally developed from the principles
for biological plausibility. First, processes for calculating new synapse weights from observed
simulation activity and estimates of past STDP are developed and analysed. Second, a method for
predicting postsynaptic spike times for synapse weight calculations through the simulation of a proxy for hypothetical neurons is developed. Finally, spike-dependent conditions for neuron construction and pruning are developed and
the processes are combined in a constructive algorithm for simulations of STDP.
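The STDP rule underlying these constructions can be sketched with the standard pair-based update, in which a presynaptic spike arriving before a postsynaptic spike potentiates the synapse and the reverse ordering depresses it. The time constants and amplitudes below are generic textbook values, not taken from the thesis:

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012,
                tau=20.0, w_max=1.0):
    """Pair-based STDP weight change for one pre/post spike pair.

    Pre-before-post (dt > 0) potentiates, post-before-pre depresses,
    both decaying exponentially in the spike-time difference; the
    weight is clipped to [0, w_max].
    """
    dt = t_post - t_pre
    if dt > 0:
        dw = a_plus * np.exp(-dt / tau)
    else:
        dw = -a_minus * np.exp(dt / tau)
    return float(np.clip(w + dw, 0.0, w_max))

w_pot = stdp_update(0.5, 10.0, 15.0)  # pre before post: potentiation
w_dep = stdp_update(0.5, 15.0, 10.0)  # post before pre: depression
```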
Repeating hidden spike patterns can be detected by neurons tuned through STDP; this result is
reproduced in STDP simulations with neuron construction. Tuned neurons become unresponsive to other
activity, preventing detuning but also preventing neurons from learning new spike patterns.
Continual learning is demonstrated through neuron construction with immediate detection of new
spike patterns from one-shot predictions of STDP convergence.
Future research may investigate applications of the developed constructive algorithm in
neuroscience and machine learning. The developed theory on constructive neural networks and
concepts of selective simulation of neurons also provide new directions for future research.
Thesis (Ph.D.) -- University of Adelaide, School of Mechanical Engineering, 201
A study of early stopping, ensembling, and patchworking for cascade correlation neural networks
The constructive topology of the cascade correlation algorithm makes it a popular choice for many researchers wishing to utilize neural networks. However, for multimodal problems, the mean squared error of the approximation increases significantly as the number of modes increases. The components of this error will comprise both bias and variance, and we provide formulae for estimating these values from mean squared errors alone. We achieve a near threefold reduction in the overall error by using early stopping and ensembling. Also described is a new subdivision technique that we call patchworking. Patchworking, when used in combination with early stopping and ensembling, can achieve an order of magnitude improvement in the error. Also presented is an approach for validating the quality of a neural network’s training, without the explicit use of a testing dataset.
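The paper's formulae are not reproduced in the abstract, but the general idea of recovering a variance-like term from mean squared errors alone can be illustrated with the standard ambiguity decomposition for an averaging ensemble: the mean member MSE equals the ensemble MSE plus the mean squared deviation of members from the ensemble mean. A sketch under that stand-in assumption:

```python
import numpy as np

def bias_variance_from_mse(member_preds, targets):
    """Split ensemble error using the ambiguity decomposition (sketch).

    For an averaging ensemble:
        mean member MSE = ensemble MSE + mean (member - ensemble mean)^2
    so a variance-like "ambiguity" term follows from MSE values alone.
    """
    member_preds = np.asarray(member_preds)      # shape (n_members, n_points)
    ensemble_pred = member_preds.mean(axis=0)
    ensemble_mse = np.mean((ensemble_pred - targets) ** 2)
    mean_member_mse = np.mean((member_preds - targets) ** 2)
    variance_term = mean_member_mse - ensemble_mse  # ambiguity, always >= 0
    return ensemble_mse, variance_term

# Two members that straddle the target exactly: the ensemble is perfect,
# so all of the member error shows up as the variance-like term.
e_mse, var_term = bias_variance_from_mse([[1.0, 1.0], [3.0, 3.0]],
                                         np.array([2.0, 2.0]))
```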
Incremental learning with respect to new incoming input attributes
Neural networks are generally exposed to a dynamic environment where the training patterns or the input attributes (features) will likely be introduced into the current domain incrementally. This paper considers the situation where a new set of input attributes must be considered and added into the existing neural network. The conventional method is to discard the existing network and redesign one from scratch. This approach wastes the old knowledge and the previous effort. In order to reduce computational time, improve generalization accuracy, and enhance the intelligence of the learned models, we present ILIA algorithms (namely ILIA1, ILIA2, ILIA3, ILIA4 and ILIA5) capable of Incremental Learning in terms of Input Attributes. Using the ILIA algorithms, when new input attributes are introduced into the original problem, the existing neural network can be retained, and a new sub-network is constructed and trained incrementally. The new sub-network and the old one are merged later to form a new network for the changed problem. In addition, the ILIA algorithms can decide whether the new incoming input attributes are relevant to the output and consistent with the existing input attributes, and suggest accepting or rejecting them. Experimental results show that the ILIA algorithms are efficient and effective for both classification and regression problems.
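The retain-and-merge idea can be sketched at a high level: the trained original network is frozen, a sub-network is trained on the newly arrived attributes only, and their outputs are combined. The class below is an illustration of this scheme, not any specific ILIA variant; the additive merge and all names are assumptions.

```python
import numpy as np

class MergedIncrementalNet:
    """Sketch of incremental learning over new input attributes.

    The pre-trained original network handles the old attributes and is
    kept frozen; a small sub-network handles the new attributes; the
    two outputs are summed to predict for the extended problem.
    """
    def __init__(self, old_net, new_net, n_old_attrs):
        self.old_net = old_net      # frozen, already trained
        self.new_net = new_net      # trained incrementally on new attributes
        self.n_old = n_old_attrs

    def predict(self, x):
        x = np.asarray(x)
        return self.old_net(x[: self.n_old]) + self.new_net(x[self.n_old:])

# Toy stand-ins for the two trained networks.
old = lambda v: float(v.sum())
new = lambda v: 0.5 * float(v.sum())
net = MergedIncrementalNet(old, new, n_old_attrs=2)
y = net.predict([1.0, 1.0, 2.0])
```

The actual ILIA1 through ILIA5 variants differ in how the merge is performed and in how irrelevant or inconsistent new attributes are rejected.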
Constructive neural networks with applications to image compression and pattern recognition
The theory of Neural Networks (NNs) has witnessed striking progress in the past fifteen years. The basic issues, such as determining the structure and size of the network and developing efficient training/learning strategies, have been extensively investigated. This thesis is mainly focused on constructive neural networks and their applications to regression, image compression and pattern recognition problems. The contributions of this work are as follows. First, two new strategies are proposed for a constructive One-Hidden-Layer Feedforward NN (OHL-FNN) that grows from a small initial network with a few hidden units to one that has a sufficient number of hidden units as required by the underlying mapping problem. The first strategy, denoted error scaling, is designed to improve the training efficiency and generalization performance of the OHL-FNN. The second strategy is a pruning criterion that produces a smaller network while not degrading the generalization capability of the network. Second, a novel strategy for structure-level adaptation is proposed for constructing multi-hidden-layer FNNs. By utilizing the proposed scheme, an FNN is obtained that has the number of hidden layers and hidden units required by the complexity of the mapping being considered. Third, a new constructive OHL-FNN with functional-level adaptation is developed. According to this scheme, each hidden unit uses a polynomial as its activation function that is different from those of the other units. This permits the growing network to employ different activation functions so that it can represent and capture the underlying map more efficiently than fixed-activation-function networks. Finally, the proposed error scaling and input-side pruning techniques are applied to regression, still and moving image compression, and facial expression recognition problems.
The proposed constructive algorithm for creating multilayer FNNs is applied to a range of regression problems. The proposed polynomial OHL-FNN is utilized to solve both regression and classification problems. It has been shown through extensive simulations that all the proposed techniques and networks produce very promising results.
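A hidden layer with per-unit polynomial activations, as described above, can be sketched as follows: unit i computes poly_i(w_i . x) with its own coefficient vector. Names, shapes and the coefficient convention are illustrative assumptions.

```python
import numpy as np

def polynomial_hidden_layer(x, weights, coeff_sets):
    """One-hidden-layer forward pass where each hidden unit has its own
    polynomial activation (functional-level adaptation, sketched).

    coeff_sets[i] holds the coefficients [c0, c1, c2, ...] of unit i's
    activation, applied to the unit's net input z = w_i . x.
    """
    outputs = []
    for w, coeffs in zip(weights, coeff_sets):
        z = float(np.dot(w, x))
        outputs.append(sum(c * z ** k for k, c in enumerate(coeffs)))
    return np.array(outputs)

# Two units: the first uses the identity polynomial z, the second 1 + 2 z^2.
out = polynomial_hidden_layer([1.0, 1.0],
                              [[1.0, 0.0], [0.0, 1.0]],
                              [[0.0, 1.0], [1.0, 0.0, 2.0]])
```

Because each unit's polynomial degree and coefficients can differ, the growing network is free to add units whose activation shape matches the residual error, which is the motivation given in the abstract.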
Enhanced robotic hand-eye coordination inspired from human-like behavioral patterns
Robotic hand-eye coordination is recognized as an important skill for dealing with complex real environments. Conventional robotic hand-eye coordination methods merely transfer stimulus signals from robotic visual space to hand actuator space. This paper introduces a reverse method: build another channel that transfers stimulus signals from robotic hand space to visual space. Based on the reverse channel, a human-like behavior pattern, “Stop-to-Fixate”, is imparted to the robot, thereby giving the robot an enhanced reaching ability. A visual processing system inspired by the human retina structure is used to compress visual information so as to reduce the robot’s learning complexity. In addition, two constructive neural networks establish the two sensory delivery channels. The experimental results demonstrate that the robotic system gradually obtains a reaching ability. In particular, when the robotic hand touches an unseen object, the reverse channel successfully drives the visual system to notice the unseen object.
Lattice dynamical wavelet neural networks implemented using particle swarm optimization for spatio-temporal system identification
In this brief, by combining an efficient wavelet representation with a coupled map lattice model, a new family of adaptive wavelet neural networks, called lattice dynamical wavelet neural networks (LDWNNs), is introduced for spatio-temporal system identification. A new orthogonal projection pursuit (OPP) method, coupled with a particle swarm optimization (PSO) algorithm, is proposed for augmenting the network. A novel two-stage hybrid training scheme is developed for constructing a parsimonious network model. In the first stage, by applying the OPP algorithm, significant wavelet neurons are adaptively and successively recruited into the network, where the adjustable parameters of the associated wavelet neurons are optimized using a particle swarm optimizer. The resultant network model obtained in the first stage, however, may be redundant. In the second stage, an orthogonal least squares algorithm is then applied to refine and improve the initially trained network by removing redundant wavelet neurons. An example for a real spatio-temporal system identification problem is presented to demonstrate the performance of the proposed new modeling framework.
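The two-stage grow-then-prune scheme can be illustrated with a generic least-squares term-selection sketch: stage one greedily recruits the candidate neuron that most reduces the residual error, and stage two drops recruited neurons whose removal barely changes the fit. The wavelet parameterisation and PSO tuning are omitted, and all names are illustrative.

```python
import numpy as np

def two_stage_selection(candidates, y, tol=1e-6):
    """Greedy grow-then-prune selection over candidate neuron outputs.

    candidates: matrix whose columns are candidate neuron outputs.
    Stage 1 (recruitment) adds the column that most reduces the
    least-squares residual; stage 2 (redundancy removal) discards
    columns whose removal does not worsen the fit beyond tol.
    """
    def lsq_err(cols):
        if not cols:
            return float(np.dot(y, y))
        A = candidates[:, cols]
        resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
        return float(np.dot(resid, resid))

    selected = []
    improved = True
    while improved:                       # Stage 1: forward recruitment
        improved = False
        best, best_err = None, lsq_err(selected)
        for j in range(candidates.shape[1]):
            if j in selected:
                continue
            e = lsq_err(selected + [j])
            if e < best_err - tol:
                best, best_err = j, e
        if best is not None:
            selected.append(best)
            improved = True
    for j in list(selected):              # Stage 2: prune redundant terms
        rest = [k for k in selected if k != j]
        if lsq_err(rest) <= lsq_err(selected) + tol:
            selected = rest
    return selected

# The first candidate reproduces the target exactly; the constant
# column is never recruited because it adds no improvement.
y = np.array([1.0, 2.0, 3.0])
candidates = np.column_stack([y, np.ones(3)])
chosen = two_stage_selection(candidates, y)
```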