ADINE: An Adaptive Momentum Method for Stochastic Gradient Descent
Two major momentum-based techniques that have achieved tremendous success in
optimization are Polyak's heavy ball method and Nesterov's accelerated
gradient. A crucial step in all momentum-based methods is the choice of the
momentum parameter m, which is always suggested to be set to less than 1.
Although the choice of m < 1 is justified only under very strong theoretical
assumptions, it works well in practice even when the assumptions do not
necessarily hold. In this paper, we propose a new momentum-based method,
ADINE, which relaxes the constraint m < 1 and allows the
learning algorithm to use adaptive higher momentum. We motivate our hypothesis
on m by experimentally verifying that a higher momentum (≥ 1) can help
escape saddles much faster. Using this motivation, we propose our method
ADINE, which weighs the previous updates more heavily (by setting the
momentum parameter > 1). We evaluate the proposed algorithm on deep neural
networks and show that ADINE helps the learning algorithm
converge much faster without compromising the generalization error.
Comment: 8 + 1 pages, 12 figures, accepted at CoDS-COMAD 201
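To make the role of the momentum parameter concrete, here is a minimal sketch of the plain Polyak heavy-ball update on a toy saddle, f(x, y) = 0.5*x^2 - 0.5*y^2. It is not the ADINE algorithm. The function name `steps_to_escape`, the starting point, the learning rate, and the escape threshold are all assumptions chosen for illustration.

```python
def steps_to_escape(momentum, lr=0.01, max_steps=100_000):
    """Count heavy-ball steps on f(x, y) = 0.5*x**2 - 0.5*y**2 until |y| > 1.

    Hypothetical helper for illustration only; this is the classical
    heavy-ball update, not the adaptive scheme from the paper.
    """
    x, y = 1.0, 1e-6       # start almost exactly on the saddle's stable axis
    vx, vy = 0.0, 0.0      # velocity (accumulated previous updates)
    for step in range(1, max_steps + 1):
        gx, gy = x, -y     # gradient of f at (x, y)
        # Heavy-ball update: keep a fraction `momentum` of the last step.
        vx = momentum * vx - lr * gx
        vy = momentum * vy - lr * gy
        x, y = x + vx, y + vy
        if abs(y) > 1.0:   # moved away from the saddle along the descent direction
            return step
    return max_steps


if __name__ == "__main__":
    for m in (0.0, 0.9, 1.0):
        print(f"momentum={m}: escaped after {steps_to_escape(m)} steps")
```

On this toy problem, a larger momentum amplifies movement along the escape direction y, so fewer steps are needed. With momentum equal to 1 the x coordinate oscillates instead of decaying, which hints at why a momentum of 1 or more presumably has to be applied adaptively rather than at every step.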
Metaheuristic design of feedforward neural networks: a review of two decades of research
Over the past two decades, feedforward neural network (FNN) optimization has been a key interest among researchers and practitioners in multiple disciplines. FNN optimization is often viewed from various perspectives: the optimization of weights, network architecture, activation nodes, learning parameters, learning environment, etc. Researchers have adopted these different viewpoints mainly to improve the FNN's generalization ability. Gradient-descent algorithms such as backpropagation have been widely applied to optimize FNNs, and their success is evident from the FNN's application to numerous real-world problems. However, due to the limitations of gradient-based optimization methods, metaheuristic algorithms, including evolutionary algorithms, swarm intelligence, etc., are still being widely explored by researchers aiming to obtain a generalized FNN for a given problem. This article attempts to summarize a broad spectrum of FNN optimization methodologies, including conventional and metaheuristic approaches. It also tries to connect the various research directions that have emerged from FNN optimization practices, such as evolving neural networks (NN), cooperative coevolution NN, complex-valued NN, deep learning, extreme learning machines, quantum NN, etc. Additionally, it presents interesting research challenges for future research to cope with the present information processing era.
DANTE: Deep AlterNations for Training nEural networks
We present DANTE, a novel method for training neural networks using the
alternating minimization principle. DANTE provides an alternate perspective to
traditional gradient-based backpropagation techniques commonly used to train
deep networks. It utilizes an adaptation of quasi-convexity to cast training a
neural network as a bi-quasi-convex optimization problem. We show that for
neural network configurations with both differentiable (e.g. sigmoid) and
non-differentiable (e.g. ReLU) activation functions, we can perform the
alternations effectively in this formulation. DANTE can also be extended to
networks with multiple hidden layers. In experiments on standard datasets,
neural networks trained using the proposed method were found to be promising
and competitive with traditional backpropagation techniques, both in terms of
quality of the solution and training speed.
Comment: 19 page
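The alternating minimization principle mentioned in the abstract above can be illustrated on a much simpler problem than neural network training. The sketch below is not DANTE. It fits a rank-1 factorization M ≈ u vᵀ by alternately solving for u with v fixed and for v with u fixed, each of which is an ordinary least-squares problem. The function names and the all-ones initialization are assumptions made for illustration, and the code assumes neither factor collapses to zero.

```python
def rank1_alternating(M, rounds=20):
    """Alternating least squares for M ~ outer(u, v) (illustrative sketch)."""
    rows, cols = len(M), len(M[0])
    u = [0.0] * rows
    v = [1.0] * cols  # assumed initialization; any nonzero start works here
    for _ in range(rounds):
        # Step 1: v fixed, minimize over u (closed-form least squares).
        vv = sum(x * x for x in v)
        u = [sum(M[i][j] * v[j] for j in range(cols)) / vv for i in range(rows)]
        # Step 2: u fixed, minimize over v (closed-form least squares).
        uu = sum(x * x for x in u)
        v = [sum(M[i][j] * u[i] for i in range(rows)) / uu for j in range(cols)]
    return u, v


def squared_error(M, u, v):
    """Sum of squared residuals between M and outer(u, v)."""
    return sum(
        (M[i][j] - u[i] * v[j]) ** 2
        for i in range(len(M))
        for j in range(len(M[0]))
    )


if __name__ == "__main__":
    M = [[3.0, 4.0, 5.0], [6.0, 8.0, 10.0]]  # exactly rank 1
    u, v = rank1_alternating(M)
    print(squared_error(M, u, v))
```

The objective is hard to minimize jointly in (u, v) but easy in each block separately. The abstract describes exploiting the same kind of structure, in a bi-quasi-convex rather than bi-convex setting.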