Two major momentum-based techniques that have achieved tremendous success in
optimization are Polyak's heavy ball method and Nesterov's accelerated
gradient. A crucial step in all momentum-based methods is the choice of the
momentum parameter m, which is almost always recommended to be set below 1.
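For concreteness, Polyak's heavy ball update can be written (with a step size
$\eta$, notation assumed here rather than taken from the paper) as
$$x_{t+1} = x_t - \eta \nabla f(x_t) + m\,(x_t - x_{t-1}),$$
so m scales how strongly the previous update is carried into the current step.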
Although the choice of m<1 is justified only under very strong theoretical
assumptions, it works well in practice even when the assumptions do not
necessarily hold. In this paper, we propose a new momentum-based method,
ADINE, which relaxes the constraint of m<1 and allows the
learning algorithm to use an adaptive, higher momentum. We motivate our hypothesis
on m by experimentally verifying that a higher momentum (≥1) can help
escape saddle points much faster. Building on this observation, ADINE weighs
the previous updates more heavily by setting the momentum parameter above 1. We
evaluate the proposed algorithm on deep neural networks and show that ADINE
helps the learning algorithm converge much faster without compromising
generalization error.
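As a rough illustration of the kind of update this enables, the sketch below
applies a heavy-ball-style step whose momentum parameter may be set to 1 or
above. The function name, hyperparameter values, and toy objective are
assumptions made for illustration; the adaptive rule ADINE uses to choose the
momentum value is not reproduced here.

```python
import numpy as np

def heavy_ball_step(w, grad, velocity, lr=0.1, momentum=1.05):
    """One heavy-ball-style step; momentum >= 1 weighs past updates more heavily.

    Illustrative only: ADINE's adaptive schedule for the momentum value is not
    shown, and a fixed momentum >= 1 keeps amplifying the accumulated update.
    """
    velocity = momentum * velocity - lr * grad  # carry forward (amplified) history
    return w + velocity, velocity

# Toy usage on a hypothetical quadratic f(w) = 0.5 * ||w||^2, whose gradient is w.
w = np.array([5.0, -3.0])
v = np.zeros_like(w)
for _ in range(5):
    w, v = heavy_ball_step(w, grad=w, velocity=v)
```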