187 research outputs found

    Constructive Neural Networks

    This master's thesis deals with neural networks, specifically networks with variable topology. The theoretical part describes neural networks and their mathematical models. It then presents basic algorithms for training neural networks and discusses several basic constructive algorithms and their extensions. The practical part covers the implementation of selected constructive algorithms and compares them with one another. The algorithms are also compared with the backpropagation learning algorithm.

    Constructive neural networks : generalisation, convergence and architectures

    Feedforward neural networks trained via supervised learning have proven to be successful in the field of pattern recognition. The most important feature of a pattern recognition technique is its ability to successfully classify future data. This is known as generalisation. A more practical aspect of pattern recognition methods is how quickly they can be trained and how reliably a good solution is found. Feedforward neural networks have been shown to provide good generalisation on a variety of problems. A number of training techniques also exist that provide fast convergence. Two problems often addressed within the field of feedforward neural networks are how to improve the generalisation and convergence of these pattern recognition techniques. These two problems are addressed in this thesis through the framework of constructive neural network algorithms. Constructive neural networks are a type of feedforward neural network in which the network architecture is built during the training process. The type of architecture built can affect both generalisation and convergence speed. Convergence speed and reliability are important properties of feedforward neural networks. These properties are studied by examining different training algorithms and the effect of using a constructive process. A new gradient based training algorithm, SARPROP, is introduced. This algorithm addresses the problems of poor convergence speed and reliability when using a gradient based training method. SARPROP is shown to increase both convergence speed and the chance of convergence to a good solution. This is achieved through the combination of gradient based and Simulated Annealing methods. The convergence properties of various constructive algorithms are examined through a series of empirical studies.
The results of these studies demonstrate that the cascade architecture allows for faster, more reliable convergence using a gradient based method than a single layer architecture with a comparable number of weights. It is shown that constructive algorithms that bias the search direction of the gradient based training algorithm for the newly added hidden neurons, produce smaller networks and more rapid convergence. A constructive algorithm using search direction biasing is shown to converge to solutions with networks that are unreliable and inefficient to train using a non-constructive gradient based algorithm. The technique of weight freezing is shown to result in larger architectures than those obtained from training the whole network. Improving the generalisation ability of constructive neural networks is an important area of investigation. A series of empirical studies are performed to examine the effect of regularisation on generalisation in constructive cascade algorithms. It is found that the combination of early stopping and regularisation results in better generalisation than the use of early stopping alone. A cubic regularisation term that greatly penalises large weights is shown to be beneficial for generalisation in cascade networks. An adaptive method of setting the regularisation magnitude in constructive networks is introduced and is shown to produce generalisation results similar to those obtained with a fixed, user-optimised regularisation setting. This adaptive method also often results in the construction of smaller networks for more complex problems. The insights obtained from the SARPROP algorithm and from the convergence and generalisation empirical studies are used to create a new constructive cascade algorithm, acasper. This algorithm is extensively benchmarked and is shown to obtain good generalisation results in comparison to a number of well-respected and successful neural network algorithms.
A technique of incorporating the validation data into the training set after network construction is introduced and is shown to generally result in similar or improved generalisation. The difficulties of implementing a cascade architecture in VLSI are described and results are given on the effect of the cascade architecture on such attributes as weight growth, fan-in, network depth, and propagation delay. Two variants of the cascade architecture are proposed. These new architectures are shown to produce similar generalisation results to the cascade architecture, while also addressing the problems of VLSI implementation of cascade networks.
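
    The abstract mentions a cubic regularisation term that penalises large weights more heavily than ordinary weight decay. The sketch below is only an illustration under an assumed form, a penalty of lam * sum(|w|^3); the exact term used in the thesis is not given here, and the function names and the parameter `lam` are hypothetical.

```python
def quadratic_penalty(weights, lam):
    """Ordinary weight decay: lam * sum(w^2)."""
    return lam * sum(w * w for w in weights)


def cubic_penalty(weights, lam):
    """Assumed cubic form: lam * sum(|w|^3). Grows faster for large weights."""
    return lam * sum(abs(w) ** 3 for w in weights)


def cubic_penalty_grad(weights, lam):
    """Gradient of the assumed cubic penalty: d/dw |w|^3 = 3 * w * |w|."""
    return [3.0 * lam * w * abs(w) for w in weights]
```

    With this form, a weight of magnitude 2 contributes 8 units to the cubic penalty but only 4 to the quadratic one, while weights below magnitude 1 are penalised less than under weight decay.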

    Constructive spiking neural networks for simulations of neuroplasticity

    Artificial neural networks are important tools in machine learning and neuroscience; however, a difficult step in their implementation is the selection of the neural network size and structure. This thesis develops fundamental theory on algorithms for constructing neurons in spiking neural networks and simulations of neuroplasticity. This theory is applied in the development of a constructive algorithm based on spike-timing-dependent plasticity (STDP) that achieves continual one-shot learning of hidden spike patterns through neuron construction. The theoretical developments in this thesis begin with the proposal of a set of definitions of the fundamental components of constructive neural networks. Disagreement in terminology across the literature and a lack of clear definitions and requirements for constructive neural networks is a factor in the poor visibility and fragmentation of research. The proposed definitions are used as the basis for a generalised methodology for decomposing constructive neural networks into components to perform comparisons, design and analysis. Spiking neuron models are uncommon in constructive neural network literature; however, spiking neurons are common in simulated studies in neuroscience. Spike-timing-dependent construction is proposed as a distinct class of constructive algorithm for spiking neural networks. Past algorithms that perform spike-timing-dependent construction are decomposed into defined components for a detailed critical comparison and found to have limited applicability in simulations of biological neural networks. This thesis develops concepts and principles for designing constructive algorithms that are compatible with simulations of biological neural networks. Simulations often have orders of magnitude fewer neurons than related biological neural systems; therefore, the neurons in a simulation may be assumed to be a selection or subset of a larger neural system with many neurons not simulated.
Neuron construction and pruning may therefore be reinterpreted as the transfer of neurons between sets of simulated neurons and hypothetical neurons in the neural system. Constructive algorithms with a functional equivalence to transferring neurons between sets allow simulated neural networks to maintain biological plausibility while changing size. The components of a novel constructive algorithm are incrementally developed from the principles for biological plausibility. First, processes for calculating new synapse weights from observed simulation activity and estimates of past STDP are developed and analysed. Second, a method for predicting postsynaptic spike times for synapse weight calculations through the simulation of a proxy for hypothetical neurons is developed. Finally, spike-dependent conditions for neuron construction and pruning are developed and the processes are combined in a constructive algorithm for simulations of STDP. Repeating hidden spike patterns can be detected by neurons tuned through STDP; this result is reproduced in STDP simulations with neuron construction. Tuned neurons become unresponsive to other activity, preventing detuning but also preventing neurons from learning new spike patterns. Continual learning is demonstrated through neuron construction with immediate detection of new spike patterns from one-shot predictions of STDP convergence. Future research may investigate applications of the developed constructive algorithm in neuroscience and machine learning. The developed theory on constructive neural networks and concepts of selective simulation of neurons also provide new directions for future research. Thesis (Ph.D.) -- University of Adelaide, School of Mechanical Engineering, 201
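
    For readers unfamiliar with STDP, the weight change it applies can be sketched with the common pair-based exponential rule. This is a generic textbook form, not the weight calculation developed in the thesis; the function name and all parameter values (`a_plus`, `a_minus`, `tau_plus`, `tau_minus`) are assumptions chosen for illustration.

```python
import math


def stdp_delta_w(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP weight change for dt = t_post - t_pre (in ms).

    A presynaptic spike shortly before a postsynaptic spike (dt > 0)
    strengthens the synapse; the reverse order (dt < 0) weakens it.
    All constants here are illustrative assumptions.
    """
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0
```

    The magnitude of the change decays exponentially as the two spikes move further apart in time, so only near-coincident spike pairs affect the synapse noticeably.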

    A study of early stopping, ensembling, and patchworking for cascade correlation neural networks

    The constructive topology of the cascade correlation algorithm makes it a popular choice for many researchers wishing to utilize neural networks. However, for multimodal problems, the mean squared error of the approximation increases significantly as the number of modes increases. The components of this error will comprise both bias and variance and we provide formulae for estimating these values from mean squared errors alone. We achieve a near threefold reduction in the overall error by using early stopping and ensembling. Also described is a new subdivision technique that we call patchworking. Patchworking, when used in combination with early stopping and ensembling, can achieve an order of magnitude improvement in the error. Also presented is an approach for validating the quality of a neural network’s training, without the explicit use of a testing dataset
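
    The bias and variance components mentioned above follow the standard decomposition of mean squared error over repeated training runs: MSE = bias^2 + variance. The sketch below checks that identity at a single query point with a known target. It is not the estimator derived in the paper, which works from mean squared errors alone; the function name is hypothetical.

```python
from statistics import fmean, pvariance


def decompose_mse(predictions, target):
    """Split the MSE of several models' predictions at one point.

    predictions: outputs of independently trained networks for the same input.
    target: the true value at that input.
    Returns (mse, bias_squared, variance), where mse == bias_squared + variance.
    """
    mean_pred = fmean(predictions)
    bias_squared = (mean_pred - target) ** 2
    variance = pvariance(predictions)
    mse = fmean([(p - target) ** 2 for p in predictions])
    return mse, bias_squared, variance
```

    An ensemble that averages its members outputs `mean_pred`, so its squared error at this point equals the bias term alone. That is one way to see why ensembling helps most when variance dominates the error.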

    Incremental learning with respect to new incoming input attributes

    Neural networks are generally exposed to a dynamic environment where the training patterns or the input attributes (features) will likely be introduced into the current domain incrementally. This paper considers the situation where a new set of input attributes must be considered and added into the existing neural network. The conventional method is to discard the existing network and redesign one from scratch. This approach wastes the old knowledge and the previous effort. In order to reduce computational time, improve generalization accuracy, and enhance intelligence of the learned models, we present ILIA algorithms (namely ILIA1, ILIA2, ILIA3, ILIA4 and ILIA5) capable of Incremental Learning in terms of Input Attributes. Using the ILIA algorithms, when new input attributes are introduced into the original problem, the existing neural network can be retained and a new sub-network is constructed and trained incrementally. The new sub-network and the old one are later merged to form a new network for the changed problem. In addition, ILIA algorithms can decide whether the new incoming input attributes are relevant to the output and consistent with the existing input attributes, and suggest whether to accept or reject them. Experimental results show that the ILIA algorithms are efficient and effective for both classification and regression problems.
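
    The retain, extend and merge idea can be illustrated with a deliberately tiny example. The sketch below does not reproduce any of the ILIA1 to ILIA5 variants: it assumes the old model is kept frozen, a new linear sub-model is trained on the residual error using only the new attribute, and the two outputs are summed. All names (`train_linear`, `old_model`, `merged_predict`) and the toy data are hypothetical.

```python
def train_linear(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def old_model(x1):
    """Stand-in for the existing, already trained network (kept frozen)."""
    return 2.0 * x1


# Toy data: the target depends on the old attribute x1 and a new attribute x2.
x1_data = [1.0, 0.0, 2.0, 1.0]
x2_data = [0.0, 1.0, 2.0, 3.0]
y_data = [2.0 * a + 3.0 * b for a, b in zip(x1_data, x2_data)]

# Train the new sub-model only on what the old model cannot explain.
residuals = [y - old_model(a) for y, a in zip(y_data, x1_data)]
new_w, new_b = train_linear(x2_data, residuals)


def merged_predict(x1, x2):
    """Merged network: frozen old output plus the new sub-model's output."""
    return old_model(x1) + new_w * x2 + new_b
```

    In this toy setting the old model's effort is fully reused, and only the two parameters of the new sub-model are trained.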

    Constructive neural networks with applications to image compression and pattern recognition

    The theory of Neural Networks (NNs) has witnessed striking progress in the past fifteen years. The basic issues, such as determining the structure and size of the network and developing efficient training/learning strategies, have been extensively investigated. This thesis is mainly focused on constructive neural networks and their applications to regression, image compression and pattern recognition problems. The contributions of this work are as follows. First, two new strategies are proposed for a constructive One-Hidden-Layer Feedforward NN (OHL-FNN) that grows from a small initial network with a few hidden units to one that has a sufficient number of hidden units as required by the underlying mapping problem. The first strategy, denoted as error scaling, is designed to improve the training efficiency and generalization performance of the OHL-FNN. The second strategy is a pruning criterion that produces a smaller network while not degrading the generalization capability of the network. Second, a novel strategy at the structure level adaptation is proposed for constructing multi-hidden-layer FNNs. By utilizing the proposed scheme, an FNN is obtained that has a sufficient number of hidden layers and hidden units as required by the complexity of the mapping being considered. Third, a new constructive OHL-FNN at the functional level adaptation is developed. According to this scheme, each hidden unit uses a polynomial as its activation function that is different from those of the other units. This permits the growing network to employ different activation functions so that the network can represent and capture the underlying map more efficiently than fixed activation function networks. Finally, the proposed error scaling and input-side pruning techniques are applied to regression, still and moving image compression, and facial expression recognition problems.
The proposed constructive algorithm for creating multilayer FNNs is applied to a range of regression problems. The proposed polynomial OHL-FNN is utilized to solve both regression and classification problems. It has been shown through extensive simulations that all the proposed techniques and networks produce very promising results.
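
    The idea of growing a one-hidden-layer network in which every hidden unit has its own polynomial activation can be sketched roughly as follows. This is a strong simplification and not the thesis's algorithm: it assumes a single input, fixes each unit's input weight to 1, gives unit k the activation z**k, and fits only the output weight to the current residual. The names `grow_polynomial_network` and `predict` are hypothetical.

```python
def grow_polynomial_network(xs, ys, max_units=6, tol=1e-3):
    """Add hidden units one at a time until the training MSE drops below tol.

    Unit k uses the activation z**k (an assumed choice). Its output weight
    is the least-squares fit of that unit alone to the current residual,
    so each added unit can only lower (never raise) the training error.
    """
    residual = list(ys)
    units = []  # list of (degree, output_weight)
    n = len(xs)
    mse = sum(r * r for r in residual) / n
    history = [mse]
    for degree in range(max_units):
        if mse <= tol:
            break  # network is large enough for this mapping
        phi = [x ** degree for x in xs]
        energy = sum(p * p for p in phi)
        if energy == 0.0:
            continue
        weight = sum(r * p for r, p in zip(residual, phi)) / energy
        residual = [r - weight * p for r, p in zip(residual, phi)]
        units.append((degree, weight))
        mse = sum(r * r for r in residual) / n
        history.append(mse)
    return units, history


def predict(units, x):
    """Network output: weighted sum of the polynomial hidden units."""
    return sum(weight * x ** degree for degree, weight in units)
```

    For a linear target such as y = 2x + 1 on a symmetric grid, the loop stops after two units, since the constant and linear units already explain the data.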

    Enhanced robotic hand-eye coordination inspired from human-like behavioral patterns

    Robotic hand-eye coordination is recognized as an important skill to deal with complex real environments. Conventional robotic hand-eye coordination methods merely transfer stimulus signals from robotic visual space to hand actuator space. This paper introduces a reverse method: Build another channel that transfers stimulus signals from robotic hand space to visual space. Based on the reverse channel, a human-like behavior pattern: “Stop-to-Fixate”, is imparted to the robot, thereby giving the robot an enhanced reaching ability. A visual processing system inspired by the human retina structure is used to compress visual information so as to reduce the robot’s learning complexity. In addition, two constructive neural networks establish the two sensory delivery channels. The experimental results demonstrate that the robotic system gradually obtains a reaching ability. In particular, when the robotic hand touches an unseen object, the reverse channel successfully drives the visual system to notice the unseen object

    Lattice dynamical wavelet neural networks implemented using particle swarm optimization for spatio-temporal system identification

    In this brief, by combining an efficient wavelet representation with a coupled map lattice model, a new family of adaptive wavelet neural networks, called lattice dynamical wavelet neural networks (LDWNNs), is introduced for spatio-temporal system identification. A new orthogonal projection pursuit (OPP) method, coupled with a particle swarm optimization (PSO) algorithm, is proposed for augmenting the proposed network. A novel two-stage hybrid training scheme is developed for constructing a parsimonious network model. In the first stage, by applying the OPP algorithm, significant wavelet neurons are adaptively and successively recruited into the network, where adjustable parameters of the associated wavelet neurons are optimized using a particle swarm optimizer. The resultant network model, obtained in the first stage, however, may be redundant. In the second stage, an orthogonal least squares algorithm is then applied to refine and improve the initially trained network by removing redundant wavelet neurons from the network. An example for a real spatio-temporal system identification problem is presented to demonstrate the performance of the proposed new modeling framework
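
    The two-stage scheme (recruit neurons, then remove redundant ones) can be outlined in a few lines. The sketch below is a simplified stand-in, not the method of the brief: a fixed grid of candidate (translation, dilation) pairs replaces the particle swarm optimizer, a greedy residual projection replaces orthogonal projection pursuit, and plain backward elimination replaces the orthogonal least squares refinement. All function names and the Mexican hat wavelet choice are assumptions.

```python
import math


def mexican_hat(x, translation, dilation):
    """Mexican hat wavelet (an assumed choice of mother wavelet)."""
    z = (x - translation) / dilation
    return (1.0 - z * z) * math.exp(-0.5 * z * z)


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def recruit(xs, ys, candidates, n_terms):
    """Stage 1: repeatedly add the candidate that most reduces the residual."""
    residual = list(ys)
    model = []  # list of (translation, dilation, coefficient)
    for _ in range(n_terms):
        best = None
        for t, d in candidates:
            phi = [mexican_hat(x, t, d) for x in xs]
            energy = dot(phi, phi)
            if energy == 0.0:
                continue
            coef = dot(residual, phi) / energy
            gain = coef * coef * energy  # drop in squared residual norm
            if best is None or gain > best[0]:
                best = (gain, t, d, coef, phi)
        if best is None:
            break
        _, t, d, coef, phi = best
        model.append((t, d, coef))
        residual = [r - coef * p for r, p in zip(residual, phi)]
    return model


def sse(xs, ys, model):
    """Sum of squared errors of the wavelet model on the data."""
    total = 0.0
    for x, y in zip(xs, ys):
        pred = sum(c * mexican_hat(x, t, d) for t, d, c in model)
        total += (y - pred) ** 2
    return total


def prune(xs, ys, model, tolerance):
    """Stage 2: drop neurons whose removal raises the SSE by less than tolerance."""
    kept = list(model)
    changed = True
    while changed and len(kept) > 1:
        changed = False
        base = sse(xs, ys, kept)
        for i in range(len(kept)):
            trial = kept[:i] + kept[i + 1:]
            if sse(xs, ys, trial) - base < tolerance:
                kept = trial
                changed = True
                break
    return kept
```

    Stage 1 may over-recruit, as the abstract notes, which is why the second pass is needed to arrive at a parsimonious model.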