Optimization of Evolutionary Neural Networks Using Hybrid Learning Algorithms
Evolutionary artificial neural networks (EANNs) refer to a special class of
artificial neural networks (ANNs) in which evolution is another fundamental
form of adaptation in addition to learning. Evolutionary algorithms are used to
adapt the connection weights, network architecture and learning algorithms
according to the problem environment. Even though evolutionary algorithms are
well known as efficient global search algorithms, they often miss the best
local solutions in a complex solution space. In this paper, we propose a
hybrid meta-heuristic learning approach that combines evolutionary learning
with local search methods (using first- and second-order error information)
to improve learning and achieve faster convergence than a direct
evolutionary approach.
The proposed technique is tested on three different chaotic time series and the
test results are compared with some popular neuro-fuzzy systems and a recently
developed cutting angle method of global optimization. Empirical results reveal
that the proposed technique is efficient in spite of its computational
complexity.
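A minimal sketch of the hybrid idea described above, assuming a toy regression task and finite-difference gradients in place of the paper's first- and second-order local search; the population sizes, mutation scale, and step size are illustrative choices, not the authors' settings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: fit y = sin(x) with a tiny MLP whose weights are evolved.
X = np.linspace(-3, 3, 64)[:, None]
y = np.sin(X)

def mlp(w, X, hidden=8):
    # Unpack a flat weight vector into a 1-hidden-layer tanh MLP.
    W1 = w[:hidden].reshape(1, hidden)
    b1 = w[hidden:2 * hidden]
    W2 = w[2 * hidden:3 * hidden].reshape(hidden, 1)
    b2 = w[-1]
    return np.tanh(X @ W1 + b1) @ W2 + b2

def loss(w):
    return float(np.mean((mlp(w, X) - y) ** 2))

def num_grad(w, eps=1e-5):
    # First-order error information via central finite differences.
    g = np.zeros_like(w)
    for i in range(w.size):
        d = np.zeros_like(w); d[i] = eps
        g[i] = (loss(w + d) - loss(w - d)) / (2 * eps)
    return g

dim = 3 * 8 + 1
pop = rng.normal(0, 1, (30, dim))          # initial population
for gen in range(50):
    fit = np.array([loss(p) for p in pop])
    elite = pop[np.argsort(fit)[:10]]      # global search: selection
    # Local search: refine each elite individual with a few gradient steps.
    for e in elite:
        for _ in range(5):
            e -= 0.1 * num_grad(e)
    children = elite[rng.integers(0, 10, 20)] + rng.normal(0, 0.1, (20, dim))
    pop = np.vstack([elite, children])     # mutation produces offspring

best = min(pop, key=loss)
print("final MSE:", loss(best))
```

Selection supplies the global exploration while the inner gradient loop refines elite individuals locally, mirroring the division of labour the abstract describes.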
Newton's Method Backpropagation for Complex-Valued Holomorphic Neural Networks: Algebraic and Analytic Properties
The study of Newton's method in complex-valued neural networks (CVNNs) faces many difficulties. In this dissertation, we derive Newton's method backpropagation algorithms for complex-valued holomorphic multilayer perceptrons (MLPs), and we investigate the convergence of the one-step Newton steplength algorithm for the minimization of real-valued complex functions via Newton's method. The problem of singular Hessian matrices provides an obstacle to the use of Newton's method backpropagation to train CVNNs. We approach this problem by developing an adaptive underrelaxation factor algorithm that avoids singularity of the Hessian matrices for the minimization of real-valued complex polynomial functions.
To provide experimental support for our algorithms, we compare sigmoidal activation functions with their Taylor polynomial approximations, using the Newton and pseudo-Newton backpropagation algorithms developed here and the known gradient descent backpropagation algorithm. Our experiments indicate that the Newton's-method-based algorithms, combined with polynomial activation functions, significantly reduce the number of training iterations required compared with the existing algorithms. We also test our underrelaxation factor algorithm using a small-scale polynomial neuron and a polynomial MLP. Finally, we investigate the application of an algebraic root-finding technique to the case of a polynomial MLP to develop a theoretical framework for the location of initial weight vectors that will guarantee successful training.
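As a rough illustration of the underrelaxation idea, the following one-variable sketch damps a Newton/Gauss-Newton step on E(w) = |p(w)|^2 for a holomorphic polynomial p when the Hessian approximation is nearly singular; the adaptive damping rule shown is our simplification, not the dissertation's algorithm:

```python
import numpy as np

# Minimal sketch: underrelaxed Newton iteration on E(w) = |p(w)|^2 for a
# holomorphic polynomial p, in one complex variable.

p = np.polynomial.Polynomial([1.0, 0.0, 1.0])   # p(w) = w^2 + 1, roots +-i
dp = p.deriv()

def E(w):
    return abs(p(w)) ** 2

w = 0.5 + 0.1j                 # initial weight (must leave the real axis)
alpha = 1.0                    # underrelaxation factor
for it in range(50):
    g = p(w) * np.conj(dp(w))  # Wirtinger gradient of E w.r.t. conj(w)
    h = abs(dp(w)) ** 2        # Gauss-Newton approximation of the Hessian
    if h < 1e-12:              # Hessian (nearly) singular: damp the step
        alpha *= 0.5
        h = 1e-12
    step = alpha * g / h       # underrelaxed Newton-type update
    if E(w - step) < E(w):     # accept only descent steps, else shrink alpha
        w, alpha = w - step, min(1.0, 2 * alpha)
    else:
        alpha *= 0.5
print("minimizer ~", w, "E =", E(w))
```

With alpha = 1 and h = |p'(w)|^2 the update reduces to the classical complex Newton step w - p(w)/p'(w), so the factor alpha only intervenes near singular Hessians.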
A neural network based policy iteration algorithm with global H^2-superlinear convergence for stochastic games on domains
In this work, we propose a class of numerical schemes for solving semilinear
Hamilton-Jacobi-Bellman-Isaacs (HJBI) boundary value problems which arise
naturally from exit time problems of diffusion processes with controlled drift.
We exploit policy iteration to reduce the semilinear problem into a sequence of
linear Dirichlet problems, which are subsequently approximated by a multilayer
feedforward neural network ansatz. We establish that the numerical solutions
converge globally in the H^2-norm, and further demonstrate that this
convergence is superlinear, by interpreting the algorithm as an inexact Newton
iteration for the HJBI equation. Moreover, we construct the optimal feedback
controls from the numerical value functions and deduce convergence. The
numerical schemes and convergence results are then extended to HJBI boundary
value problems corresponding to controlled diffusion processes with oblique
boundary reflection. Numerical experiments on the stochastic Zermelo navigation
problem are presented to illustrate the theoretical results and to demonstrate
the effectiveness of the method.
Comment: Additional numerical experiments have been included (on Pages 27-31) to show that the proposed algorithm achieves more stable and more rapid convergence than existing neural network based methods within similar computational time.
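The outer structure of the scheme, policy iteration reducing the semilinear problem to a sequence of linear Dirichlet problems, can be sketched in one spatial dimension with a finite-difference solver standing in for the paper's neural network ansatz; the specific equation and source term below are illustrative choices:

```python
import numpy as np

# Minimal sketch of policy iteration for a 1D semilinear HJB-type problem
#   u'' + min_{a in {-1,1}} (a u') - u + f = 0,   u(0) = u(1) = 0.
# Each linearized Dirichlet problem is solved exactly here; the paper
# instead approximates it with a neural network ansatz.

N = 101
x = np.linspace(0.0, 1.0, N)
h = x[1] - x[0]
f = np.sin(np.pi * x)          # illustrative source term
a = np.ones(N)                 # initial policy (drift)
u = np.zeros(N)

for it in range(20):
    # Policy evaluation: solve u'' + a u' - u = -f with the frozen policy a.
    A = np.zeros((N, N)); b = -f.copy()
    A[0, 0] = A[-1, -1] = 1.0; b[0] = b[-1] = 0.0
    for i in range(1, N - 1):
        A[i, i - 1] = 1 / h**2 - a[i] / (2 * h)
        A[i, i]     = -2 / h**2 - 1.0
        A[i, i + 1] = 1 / h**2 + a[i] / (2 * h)
    u_new = np.linalg.solve(A, b)
    # Policy improvement: a(x) = argmin_{a in {-1,1}} a * u'(x).
    du = np.gradient(u_new, h)
    a_new = np.where(du > 0, -1.0, 1.0)
    if np.allclose(a_new, a) and np.allclose(u_new, u):
        break
    u, a = u_new, a_new
print("policy iteration stopped after", it + 1, "sweeps")
```

Freezing the policy linearizes the problem, and updating it from the current value function is exactly the step the paper interprets as an inexact Newton iteration for the HJBI equation.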
HINT: Hierarchical Invertible Neural Transport for Density Estimation and Bayesian Inference
A large proportion of recent invertible neural architectures is based on a
coupling block design. It operates by dividing incoming variables into two
sub-spaces, one of which parameterizes an easily invertible (usually affine)
transformation that is applied to the other. While the Jacobian of such a
transformation is triangular, it is very sparse and thus may lack
expressiveness. This work presents a simple remedy by noting that (affine)
coupling can be repeated recursively within the resulting sub-spaces, leading
to an efficiently invertible block with dense triangular Jacobian. By
formulating our recursive coupling scheme via a hierarchical architecture, HINT
allows sampling from a joint distribution p(y,x) and the corresponding
posterior p(x|y) using a single invertible network. We demonstrate the power of
our method for density estimation and Bayesian inference on a novel data set of
2D shapes in Fourier parameterization, which enables consistent visualization
of samples for different dimensionalities.
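A minimal sketch of the recursive coupling idea, with tiny fixed random maps standing in for learned subnetworks and our own choice of recursion depth; it checks that the composed map remains exactly invertible:

```python
import numpy as np

rng = np.random.default_rng(0)

# Plain coupling transforms only one half per block; recursing into both
# sub-spaces yields a dense triangular Jacobian, as in the abstract.

def subnet(dim_in, dim_out):
    W = rng.normal(0, 0.3, (dim_in, dim_out))
    return lambda h: np.tanh(h @ W)        # produces (log-scale, shift)

def make_block(dim, depth):
    if dim < 2 or depth == 0:
        return None
    d = dim // 2
    return {"net": subnet(d, 2 * (dim - d)),   # conditions on first half
            "left": make_block(d, depth - 1),  # recurse into both halves
            "right": make_block(dim - d, depth - 1),
            "d": d}

def forward(block, z):
    if block is None:
        return z
    d = block["d"]
    z1, z2 = z[:d], z[d:]
    z1 = forward(block["left"], z1)
    params = block["net"](z1)
    s, t = params[:len(z2)], params[len(z2):]
    z2 = z2 * np.exp(s) + t                # easily invertible affine map
    z2 = forward(block["right"], z2)
    return np.concatenate([z1, z2])

def inverse(block, y):
    if block is None:
        return y
    d = block["d"]
    y1, y2 = y[:d], y[d:]
    params = block["net"](y1)              # depends on the transformed half
    s, t = params[:len(y2)], params[len(y2):]
    y2 = inverse(block["right"], y2)
    y2 = (y2 - t) * np.exp(-s)             # undo the affine map
    y1 = inverse(block["left"], y1)
    return np.concatenate([y1, y2])

blk = make_block(8, depth=3)
z = rng.normal(size=8)
print(np.allclose(inverse(blk, forward(blk, z)), z))   # True: invertible
```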
A representer theorem for deep kernel learning
In this paper we provide a finite-sample and an infinite-sample representer
theorem for the concatenation of (linear combinations of) kernel functions of
reproducing kernel Hilbert spaces. These results serve as mathematical
foundation for the analysis of machine learning algorithms based on
compositions of functions. As a direct consequence in the finite-sample case,
the corresponding infinite-dimensional minimization problems can be recast into
(nonlinear) finite-dimensional minimization problems, which can be tackled with
nonlinear optimization algorithms. Moreover, we show how concatenated machine
learning problems can be reformulated as neural networks and how our
representer theorem applies to a broad class of state-of-the-art deep learning
methods.
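A sketch of the kind of finite-sample statement the abstract describes, written in our own notation (the spaces H_l, kernels k_l, and coefficients are assumptions, not necessarily the paper's):

```latex
% For data (x_i, y_i), i = 1,...,n, minimize a regularized loss over
% concatenations f = f_L \circ \cdots \circ f_1 with each layer f_l in an
% RKHS H_l with kernel k_l. A deep representer theorem asserts that some
% minimizer admits, in every layer, the finite representation
\[
  f_l(\cdot) \;=\; \sum_{i=1}^{n} \alpha_i^{(l)}\,
      k_l\!\big((f_{l-1} \circ \cdots \circ f_1)(x_i),\, \cdot\big),
\]
% so the infinite-dimensional problem collapses to a (nonlinear)
% finite-dimensional optimization over the coefficients \alpha_i^{(l)},
% which is the reduction the abstract refers to.
```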
Bayesian Quadrature for Multiple Related Integrals
Bayesian probabilistic numerical methods are a set of tools providing
posterior distributions on the output of numerical methods. The use of these
methods is usually motivated by the fact that they can represent our
uncertainty due to incomplete/finite information about the continuous
mathematical problem being approximated. In this paper, we demonstrate that
this paradigm can provide additional advantages, such as the possibility of
transferring information between several numerical methods. This allows users
to represent uncertainty in a more faithful manner and, as a by-product,
provide increased numerical efficiency. We propose the first such numerical
method by extending the well-known Bayesian quadrature algorithm to the case
where we are interested in computing the integral of several related functions.
We then prove convergence rates for the method in the well-specified and
misspecified cases, and demonstrate its efficiency in the context of
multi-fidelity models for complex engineering systems and a problem of global
illumination in computer graphics.
Comment: Proceedings of the 35th International Conference on Machine Learning (ICML), PMLR 80:5369-5378, 2018.
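The transfer idea can be sketched with an intrinsic-coregionalization GP prior over two related integrands, so that many evaluations of one function inform the integral of the other; the kernel, output-correlation matrix B, and test functions below are our assumptions, not the paper's setup:

```python
import numpy as np
from scipy.special import erf

rng = np.random.default_rng(0)

# Bayesian quadrature for two related integrands under the joint prior
# K((x, i), (x', j)) = B[i, j] * k(x, x'), integrating against Uniform[0,1].

ell = 0.3                                  # RBF lengthscale
def k(x, xp):
    return np.exp(-(x[:, None] - xp[None, :]) ** 2 / (2 * ell**2))

def kmean(x):
    # Kernel mean of Uniform[0,1]: integral of k(., x_i) over [0,1].
    c = ell * np.sqrt(np.pi / 2)
    return c * (erf((1 - x) / (np.sqrt(2) * ell))
                + erf(x / (np.sqrt(2) * ell)))

f1 = lambda x: np.sin(np.pi * x)           # related integrands
f2 = lambda x: np.sin(np.pi * x + 0.3)

B = np.array([[1.0, 0.9], [0.9, 1.0]])     # assumed output correlation

x1 = rng.uniform(0, 1, 12)                 # many evaluations of f1 ...
x2 = rng.uniform(0, 1, 3)                  # ... but only a few of f2
X = [x1, x2]
y = np.concatenate([f1(x1), f2(x2)])

# Joint Gram matrix over both outputs, and the cross-covariance vector z
# between the integral of output 2 and all observations.
K = np.block([[B[i, j] * k(X[i], X[j]) for j in range(2)]
              for i in range(2)])
z = np.concatenate([B[1, 0] * kmean(x1), B[1, 1] * kmean(x2)])

w = np.linalg.solve(K + 1e-8 * np.eye(len(y)), y)
print("BQ estimate of integral of f2:", z @ w)
print("true value:", 2 * np.cos(0.3) / np.pi)
```

Because the prior correlates the two outputs through B, the abundant samples of f1 tighten the posterior on the integral of the sparsely sampled f2, which is the information transfer the abstract highlights.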