Online Natural Gradient as a Kalman Filter
We cast Amari's natural gradient in statistical learning as a specific case
of Kalman filtering. Namely, applying an extended Kalman filter to estimate a
fixed unknown parameter of a probabilistic model from a series of observations
is rigorously equivalent to estimating this parameter via an online stochastic
natural gradient descent on the log-likelihood of the observations.
In the i.i.d. case, this relation is a consequence of the "information
filter" phrasing of the extended Kalman filter. In the recurrent (state space,
non-i.i.d.) case, we prove that the joint Kalman filter over states and
parameters is a natural gradient on top of real-time recurrent learning (RTRL),
a classical algorithm to train recurrent models.
This exact algebraic correspondence provides relevant interpretations for
natural gradient hyperparameters such as learning rates or initialization and
regularization of the Fisher information matrix.
Comment: 3rd version: expanded intr
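The i.i.d. correspondence described in this abstract can be illustrated in the simplest linear-Gaussian setting. The sketch below is an assumption-laden toy, not the paper's general construction: it assumes a scalar model y = x * theta + noise with known noise variance, and the function names (`kalman_step`, `natural_gradient_step`, `run`) are hypothetical. In this special case the Kalman filter for a static parameter and an online natural gradient step preconditioned by the accumulated Fisher information should produce the same iterates up to floating-point error.

```python
import random


def kalman_step(theta, P, x, y, noise_var):
    """One Kalman update for a static scalar parameter (covariance form)."""
    gain = P * x / (x * P * x + noise_var)   # Kalman gain
    theta = theta + gain * (y - x * theta)   # correct with the innovation
    P = P - gain * x * P                     # posterior variance
    return theta, P


def natural_gradient_step(theta, J, x, y, noise_var):
    """One online natural gradient ascent step on the log-likelihood."""
    J = J + x * x / noise_var                # accumulate Fisher information
    grad = x * (y - x * theta) / noise_var   # d/dtheta of log p(y | theta)
    theta = theta + grad / J                 # precondition by inverse Fisher
    return theta, J


def run(n_steps=200, true_theta=2.0, noise_var=0.25, prior_var=10.0, seed=0):
    """Run both recursions on the same data and track their largest gap."""
    rng = random.Random(seed)
    theta_kf, P = 0.0, prior_var
    # Assumption: the Fisher matrix is initialized to the inverse prior variance.
    theta_ng, J = 0.0, 1.0 / prior_var
    max_gap = 0.0
    for _ in range(n_steps):
        x = rng.uniform(-1.0, 1.0)
        y = x * true_theta + rng.gauss(0.0, noise_var ** 0.5)
        theta_kf, P = kalman_step(theta_kf, P, x, y, noise_var)
        theta_ng, J = natural_gradient_step(theta_ng, J, x, y, noise_var)
        max_gap = max(max_gap, abs(theta_kf - theta_ng))
    return theta_kf, theta_ng, max_gap


if __name__ == "__main__":
    kf, ng, gap = run()
    print("Kalman estimate:", kf)
    print("Natural gradient estimate:", ng)
    print("Largest gap between the two:", gap)
```

In this toy, the prior variance of the filter plays the role of the Fisher matrix initialization, which is one instance of the hyperparameter interpretation the abstract mentions.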
Towards Query-Efficient Black-Box Adversary with Zeroth-Order Natural Gradient Descent
Despite the great achievements of modern deep neural networks (DNNs), the
vulnerability/robustness of state-of-the-art DNNs raises security concerns in
many application domains requiring high reliability. Various adversarial
attacks have been proposed to sabotage the learning performance of DNN models. Among
those, black-box adversarial attack methods have received special
attention owing to their practicality and simplicity. Black-box attacks
usually prefer fewer queries in order to remain stealthy and keep costs low.
However, most of the current black-box attack methods adopt the first-order
gradient descent method, which may come with certain deficiencies such as
relatively slow convergence and high sensitivity to hyper-parameter settings.
In this paper, we propose a zeroth-order natural gradient descent (ZO-NGD)
method to design the adversarial attacks, which incorporates the zeroth-order
gradient estimation technique catering to the black-box attack scenario and the
second-order natural gradient descent to achieve higher query efficiency. The
empirical evaluations on image classification datasets demonstrate that ZO-NGD
can obtain significantly lower model query complexities compared with
state-of-the-art attack methods.
Comment: accepted by AAAI 202
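The zeroth-order gradient estimation mentioned in this abstract can be sketched generically. The example below is not the ZO-NGD estimator from the paper, whose details are not given here; it is a plain coordinate-wise central finite-difference estimate that uses only function evaluations (queries), and it omits the natural gradient preconditioning step entirely. The names `zeroth_order_gradient` and `toy_loss` are hypothetical, and `toy_loss` stands in for a black-box model's loss.

```python
def zeroth_order_gradient(f, x, step=1e-4):
    """Estimate the gradient of f at x from function values only.

    Uses a symmetric finite difference along each coordinate, which costs
    two queries to f per dimension. Returns the estimate and the query count.
    """
    grad = []
    queries = 0
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += step
        minus[i] -= step
        grad.append((f(plus) - f(minus)) / (2.0 * step))
        queries += 2
    return grad, queries


def toy_loss(x):
    """Hypothetical black-box objective: squared distance to a fixed target."""
    target = [1.0, -2.0, 0.5]
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


if __name__ == "__main__":
    estimate, n_queries = zeroth_order_gradient(toy_loss, [0.0, 0.0, 0.0])
    print("Estimated gradient:", estimate)
    print("Queries used:", n_queries)
```

The query count grows linearly with the input dimension in this naive scheme, which is the kind of cost that motivates the query-efficiency focus of the abstract.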