4,956 research outputs found

Solving the linear interval tolerance problem for weight initialization of neural networks

Authors: Adam S.P., Karras D.A., Magoulas George D., Vrahatis M.N.
Publication venue: Elsevier BV
Publication date: 01/06/2014

Determining good initial conditions for an algorithm used to train a neural network is considered a parameter estimation problem dealing with uncertainty about the initial weights. Interval Analysis approaches model uncertainty in parameter estimation problems using intervals and formulating tolerance problems. Solving a tolerance problem means defining lower and upper bounds of the intervals so that the system functionality is guaranteed within predefined limits. The aim of this paper is to show how the problem of determining the initial weight intervals of a neural network can be defined in terms of solving a linear interval tolerance problem. The proposed Linear Interval Tolerance Approach copes with uncertainty about the initial weights without any previous knowledge or specific assumptions on the input data, as required by approaches such as fuzzy sets or rough sets. The proposed method is tested on a number of well-known benchmarks for neural networks trained with the back-propagation family of algorithms. Its efficiency is evaluated with regard to standard performance measures, and the results obtained are compared against those of a number of well-known and established initialization methods. These results provide credible evidence that the proposed method outperforms classical weight initialization methods.

Crossref

Birkbeck Institutional Research Online

Bounding the search space for global optimization of neural networks learning error: an interval analysis approach

Authors: Adam S.P., Karras D.A., Magoulas George D., Vrahatis M.N.
Publication venue: JMLR
Publication date: 01/09/2016

Training a multilayer perceptron (MLP) with algorithms employing global search strategies has been an important research direction in the field of neural networks. Despite a number of significant results, an important matter seems to be unresolved: the bounds of the search region, typically defined as a box, where a global optimization method has to search for a potential global minimizer. The approach presented in this paper builds on interval analysis and attempts to define guaranteed bounds in the search space prior to applying a global search algorithm for training an MLP. These bounds depend on the machine precision, and the term "guaranteed" denotes that the region defined surely encloses weight sets that are global minimizers of the neural network's error function. Although the solution set to the bounding problem for an MLP is in general non-convex, the paper presents theoretical results that help derive a box, which is a convex set. This box is an outer approximation of the algebraic solutions to the interval equations resulting from the function implemented by the network nodes. An experimental study using well-known benchmarks is presented in accordance with the theoretical results.

Birkbeck Institutional Research Online

Hyperparameter optimization with approximate gradient

Author: Pedregosa Fabian
Publication date: 25/06/2016

Most models in machine learning contain at least one hyperparameter to control for model complexity. Choosing an appropriate set of hyperparameters is both crucial in terms of model accuracy and computationally challenging. In this work we propose an algorithm for the optimization of continuous hyperparameters using inexact gradient information. An advantage of this method is that hyperparameters can be updated before model parameters have fully converged. We also give sufficient conditions for the global convergence of this method, based on regularity conditions of the involved functions and summability of errors. Finally, we validate the empirical performance of this method on the estimation of regularization constants of L2-regularized logistic regression and kernel ridge regression. Empirical benchmarks indicate that our approach is highly competitive with respect to state-of-the-art methods. Comment: Proceedings of the International Conference on Machine Learning (ICML

arXiv.org e-Print Archive

Reservoir Computing Approach to Robust Computation using Unreliable Nanoscale Networks

Authors: A. Atiya, A. Rodan, A.Z. Stieg, D. Verstraeten, F. Wyffels, G. Snider, H. Jaeger, H.O. Sillin, J.W. Lawson, K. Terabe, L. Žaloudek, M. Haselman, M. Hermans, M. Lukoševičius, M.C. Ozturk, P. Erdös, P. Xu, R. Penrose, S. Dasgupta, S. Sarangi, W. Maass, Y. Chen
Publication date: 01/01/2014

As we approach the physical limits of CMOS technology, advances in materials science and nanotechnology are making available a variety of unconventional computing substrates that can potentially replace top-down-designed silicon-based computing devices. Inherent stochasticity in the fabrication process and the nanometer scale of these substrates inevitably lead to design variations, defects, faults, and noise in the resulting devices. A key challenge is how to harness such devices to perform robust computation. We propose reservoir computing as a solution. In reservoir computing, computation takes place by translating the dynamics of an excited medium, called a reservoir, into a desired output. This approach eliminates the need for external control and redundancy, and the programming is done using a closed-form regression problem on the output, which also allows concurrent programming using a single device. Using a theoretical model, we show that both regular and irregular reservoirs are intrinsically robust to structural noise as they perform computation.

arXiv.org e-Print Archive

Crossref

A Bayesian approach for initialization of weights in backpropagation neural net with application to character recognition

Authors: Murru Nadir, Rossini Rosaria
Publication venue: Elsevier BV
Publication date: 01/01/2016

Institutional Research Information System University of Turin

Training Support Vector Machines Using Frank-Wolfe Optimization Methods

Authors: Frandi Emanuele, Gasparo Maria Grazia, Lodi Stefano, Nanculef Ricardo, Sartori Claudio
Publication venue: World Scientific Pub Co Pte Lt
Publication date: 04/12/2012

Training a Support Vector Machine (SVM) requires the solution of a quadratic programming problem (QP) whose computational cost becomes prohibitive for large-scale datasets. Traditional optimization methods cannot be directly applied in these cases, mainly due to memory restrictions. By adopting a slightly different objective function and under mild conditions on the kernel used within the model, efficient algorithms to train SVMs have been devised under the name of Core Vector Machines (CVMs). This framework exploits the equivalence of the resulting learning problem with the task of building a Minimal Enclosing Ball (MEB) in a feature space, where data is implicitly embedded by a kernel function. In this paper, we improve on the CVM approach by proposing two novel methods to build SVMs based on the Frank-Wolfe algorithm, recently revisited as a fast method to approximate the solution of a MEB problem. In contrast to CVMs, our algorithms do not require computing the solutions of a sequence of increasingly complex QPs and are defined using only analytic optimization steps. Experiments on a large collection of datasets show that our methods scale better than CVMs in most cases, sometimes at the price of a slightly lower accuracy. Like CVMs, the proposed methods can be easily extended to machine learning problems other than binary classification. However, effective classifiers are also obtained using kernels which do not satisfy the condition required by CVMs, and the methods can thus be used for a wider set of problems.
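
As a rough illustration of the MEB idea mentioned in the abstract above, the following sketch approximates a minimal enclosing ball in plain Euclidean space with a Frank-Wolfe-style iteration (the Bădoiu-Clarkson update with step size 1/(k+1)). It is a toy version under stated assumptions, not the kernelized algorithm of the paper: the function name `approx_meb`, the step-size rule, the iteration count, and the sample points are all chosen here for illustration only.

```python
import math


def approx_meb(points, iterations=1000):
    """Approximate the minimal enclosing ball of `points` (tuples of floats).

    Hypothetical helper for illustration; works in input space, no kernel.
    """
    center = tuple(points[0])  # start from an arbitrary data point
    for k in range(1, iterations + 1):
        # Linear-minimization step: pick the point farthest from the center.
        farthest = max(points, key=lambda p: math.dist(p, center))
        # Analytic update: move toward it with the diminishing step 1 / (k + 1).
        step = 1.0 / (k + 1)
        center = tuple(c + step * (f - c) for c, f in zip(center, farthest))
    # The radius is the largest distance from the final center to any point.
    radius = max(math.dist(p, center) for p in points)
    return center, radius


# Assumed sample data: the four corners of a square centered at the origin.
corners = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
center, radius = approx_meb(corners)
print(center, radius)
```

Each iteration needs only a farthest-point search and a closed-form update, which is the sense in which such methods avoid solving a sequence of QPs. For the square above, the exact ball is centered at the origin with radius equal to the square root of 2, and the sketch should land close to that.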

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna