2 research outputs found

    Dual Stochastic Natural Gradient Descent

    Although theoretically appealing, Stochastic Natural Gradient Descent (SNGD) is computationally expensive, has been shown to be highly sensitive to the learning rate, and is not guaranteed to converge. Convergent Stochastic Natural Gradient Descent (CSNGD) addresses the last two problems, but its computational cost is still unacceptable when the number of parameters is large. In this paper we introduce Dual Stochastic Natural Gradient Descent (DSNGD), which exploits dually flat manifolds to obtain a robust alternative to SNGD that is also computationally feasible.
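
    As background for both abstracts, here is the standard natural-gradient setup due to Amari, given as context rather than as either paper's own algorithm: the SNGD update preconditions the stochastic gradient with the inverse Fisher information matrix, and on a dually flat manifold with potential \psi(\theta) and dual coordinates \eta = \nabla\psi(\theta), the same preconditioned gradient equals the ordinary gradient taken with respect to \eta, which is the kind of identity that makes avoiding the explicit matrix inverse plausible (how DSNGD actually uses it is not stated in the abstract).

        \theta_{t+1} = \theta_t - \alpha_t \, F(\theta_t)^{-1} \nabla_\theta L(\theta_t),
        \qquad F(\theta) = \mathbb{E}_{x \sim p_\theta}\!\left[ \nabla_\theta \log p_\theta(x) \, \nabla_\theta \log p_\theta(x)^{\top} \right]

        % Dually flat case: F(\theta) = \nabla^2 \psi(\theta) and \eta = \nabla \psi(\theta), so by the chain rule
        F(\theta)^{-1} \nabla_\theta L(\theta) = \nabla_\eta L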

    Convergent Stochastic Almost Natural Gradient Descent

    Stochastic Gradient Descent (SGD) is the workhorse beneath the deep learning revolution. However, SGD is known to suffer from slow convergence due to the plateau phenomenon. Stochastic Natural Gradient Descent (SNGD) was proposed by Amari to resolve this problem by exploiting the geometry of the parameter space. Nevertheless, the convergence of SNGD is not guaranteed. The aim of this article is to modify SNGD to obtain a convergent variant, which we name Convergent SNGD (CSNGD), and to test it on a specific toy optimization problem. In particular, we concentrate on the problem of learning a discrete probability distribution. Based on the variable-metric convergence results presented by Sunehag et al. [13], we prove the convergence of CSNGD. Furthermore, we provide experimental results showing that it significantly improves over SGD. We claim that the approach developed in this paper could be extended to more complex optimization problems, making it a promising research line.
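
    The toy problem mentioned above (learning a discrete probability distribution) can be illustrated with a plain SNGD loop. The sketch below is a minimal reconstruction under assumed choices (softmax parameterisation, damped Fisher inverse, hand-picked step size and iteration count), not the CSNGD variant whose convergence the article proves.

        import numpy as np

        rng = np.random.default_rng(0)

        # Assumed toy setup: fit a categorical distribution p = softmax(theta)
        # to samples from an unknown target by stochastic natural gradient
        # descent on the negative log-likelihood.
        target = np.array([0.7, 0.2, 0.05, 0.05])   # sampling distribution (unknown to the learner)
        theta = np.zeros_like(target)                # softmax parameters, initialised to uniform
        step, damping = 0.1, 1e-3                    # assumed learning rate and Fisher damping

        def softmax(z):
            e = np.exp(z - z.max())
            return e / e.sum()

        for t in range(2000):
            p = softmax(theta)
            k = rng.choice(len(target), p=target)    # one observation per step (stochastic setting)
            grad = p - np.eye(len(target))[k]        # gradient of -log p_k with respect to theta
            F = np.diag(p) - np.outer(p, p)          # Fisher information of the softmax model
            # F is singular (softmax is shift-invariant), so use a damped solve instead of F^{-1}.
            nat_grad = np.linalg.solve(F + damping * np.eye(len(p)), grad)
            theta -= step * nat_grad                 # stochastic natural-gradient step

        print(np.round(softmax(theta), 3))           # learned distribution, close to `target`

    Replacing nat_grad with grad in the update recovers plain SGD, the baseline both abstracts argue against.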