7 research outputs found

    Parallel growing and training of neural networks using output parallelism

    In order to find an appropriate architecture for a large-scale real-world application automatically and efficiently, a natural method is to divide the original problem into a set of sub-problems. In this paper, we propose a simple neural network task decomposition method based on output parallelism. By using this method, a problem can be divided flexibly into several sub-problems as chosen, each of which is composed of the whole input vector and a fraction of the output vector. Each module (for one sub-problem) is responsible for producing a fraction of the output vector of the original problem. The hidden structures for the original problem’s output units are decoupled. These modules can be grown and trained in parallel on parallel processing elements. Incorporated with a constructive learning algorithm, our method does not require excessive computation or any prior knowledge concerning decomposition. The feasibility of output parallelism is analyzed and proved. Some benchmarks are implemented to test the validity of this method. Their results show that this method can reduce computational time, increase learning speed and improve generalization accuracy for both classification and regression problems.
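
    The decomposition described in this abstract (every module sees the whole input vector but only a fraction of the output vector) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `split_outputs`, `decompose`, and `merge` are hypothetical, and the contiguous, near-equal grouping of outputs is an assumption, since the abstract only says the division is chosen flexibly.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def split_outputs(n_outputs: int, n_modules: int) -> List[range]:
    """Partition output indices 0..n_outputs-1 into contiguous groups.

    Assumption: groups are contiguous and as equal in size as possible.
    """
    if not 1 <= n_modules <= n_outputs:
        raise ValueError("need 1 <= n_modules <= n_outputs")
    base, extra = divmod(n_outputs, n_modules)
    groups, start = [], 0
    for m in range(n_modules):
        size = base + (1 if m < extra else 0)
        groups.append(range(start, start + size))
        start += size
    return groups


def decompose(inputs: Matrix, targets: Matrix, n_modules: int) -> List[Tuple[Matrix, Matrix]]:
    """Build one sub-problem per module: full inputs, a slice of the targets."""
    groups = split_outputs(len(targets[0]), n_modules)
    return [(inputs, [[row[j] for j in g] for row in targets]) for g in groups]


def merge(module_outputs: List[List[float]]) -> List[float]:
    """Concatenate per-module outputs back into the full output vector."""
    merged: List[float] = []
    for part in module_outputs:
        merged.extend(part)
    return merged


# Two samples, two inputs, three outputs, split across two modules.
X = [[0.0, 1.0], [1.0, 0.0]]
Y = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
subproblems = decompose(X, Y, n_modules=2)
```

    Each entry of `subproblems` could then be handed to an independent learner, for example on a separate processing element, and the per-module predictions recombined with `merge`.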

    Input Partitioning Based on Correlation for Neural Network Learning


    Investigation of the CasCor family of learning algorithms


    Flexibility and accuracy enhancement techniques for neural networks

    Master's, Master of Engineering

    Constructive neural networks : generalisation, convergence and architectures

    Feedforward neural networks trained via supervised learning have proven to be successful in the field of pattern recognition. The most important feature of a pattern recognition technique is its ability to successfully classify future data. This is known as generalisation. A more practical aspect of pattern recognition methods is how quickly they can be trained and how reliably a good solution is found. Feedforward neural networks have been shown to provide good generalisation on a variety of problems. A number of training techniques also exist that provide fast convergence. Two problems often addressed within the field of feedforward neural networks are how to improve the generalisation and convergence of these pattern recognition techniques. These two problems are addressed in this thesis through the framework of constructive neural network algorithms. Constructive neural networks are a type of feedforward neural network in which the network architecture is built during the training process. The type of architecture built can affect both generalisation and convergence speed. Convergence speed and reliability are important properties of feedforward neural networks. These properties are studied by examining different training algorithms and the effect of using a constructive process. A new gradient based training algorithm, SARPROP, is introduced. This algorithm addresses the problems of poor convergence speed and reliability when using a gradient based training method. SARPROP is shown to increase both convergence speed and the chance of convergence to a good solution. This is achieved through the combination of gradient based and Simulated Annealing methods. The convergence properties of various constructive algorithms are examined through a series of empirical studies.
The results of these studies demonstrate that the cascade architecture allows for faster, more reliable convergence using a gradient based method than a single layer architecture with a comparable number of weights. It is shown that constructive algorithms that bias the search direction of the gradient based training algorithm for the newly added hidden neurons produce smaller networks and more rapid convergence. A constructive algorithm using search direction biasing is shown to converge to solutions with networks that are unreliable and inefficient to train using a non-constructive gradient based algorithm. The technique of weight freezing is shown to result in larger architectures than those obtained from training the whole network. Improving the generalisation ability of constructive neural networks is an important area of investigation. A series of empirical studies are performed to examine the effect of regularisation on generalisation in constructive cascade algorithms. It is found that the combination of early stopping and regularisation results in better generalisation than the use of early stopping alone. A cubic regularisation term that greatly penalises large weights is shown to be beneficial for generalisation in cascade networks. An adaptive method of setting the regularisation magnitude in constructive networks is introduced and is shown to produce generalisation results similar to those obtained with a fixed, user-optimised regularisation setting. This adaptive method also often results in the construction of smaller networks for more complex problems. The insights obtained from the SARPROP algorithm and from the convergence and generalisation empirical studies are used to create a new constructive cascade algorithm, acasper. This algorithm is extensively benchmarked and is shown to obtain good generalisation results in comparison to a number of well-respected and successful neural network algorithms.
A technique of incorporating the validation data into the training set after network construction is introduced and is shown to generally result in similar or improved generalisation. The difficulties of implementing a cascade architecture in VLSI are described and results are given on the effect of the cascade architecture on such attributes as weight growth, fan-in, network depth, and propagation delay. Two variants of the cascade architecture are proposed. These new architectures are shown to produce similar generalisation results to the cascade architecture, while also addressing the problems of VLSI implementation of cascade networks.
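
    The abstract above mentions a cubic regularisation term that greatly penalises large weights but does not give its form. The sketch below assumes the simplest reading, a penalty of `lam * sum(|w|^3)`, and compares it with the familiar quadratic weight-decay penalty. The names `cubic_penalty`, `cubic_penalty_grad`, `quadratic_penalty`, and `lam` are hypothetical, and the thesis may define the term differently.

```python
from typing import List


def cubic_penalty(weights: List[float], lam: float) -> float:
    """Assumed cubic penalty: lam * sum(|w|^3)."""
    return lam * sum(abs(w) ** 3 for w in weights)


def cubic_penalty_grad(weights: List[float], lam: float) -> List[float]:
    """Gradient of the assumed penalty: d/dw |w|^3 = 3 * w * |w|."""
    return [3.0 * lam * w * abs(w) for w in weights]


def quadratic_penalty(weights: List[float], lam: float) -> float:
    """Standard weight decay for comparison: lam * sum(w^2)."""
    return lam * sum(w * w for w in weights)


small = [0.5, -0.5]
large = [4.0, -4.0]
# Ratio of cubic to quadratic penalty grows with weight magnitude,
# so large weights are punished disproportionately.
ratio_small = cubic_penalty(small, 0.5) / quadratic_penalty(small, 0.5)
ratio_large = cubic_penalty(large, 0.5) / quadratic_penalty(large, 0.5)
```

    Under this assumed form, weights with magnitude below 1 are penalised less than under quadratic decay and weights above 1 are penalised more, which matches the abstract's description of a term that targets large weights.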

    A Neural Network Approach to Constructive Induction
