Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query Complexity
Zeroth-order methods are powerful optimization tools for solving many machine
learning problems because they only need function values (not gradients) during
optimization. Although many zeroth-order methods have been developed recently,
these approaches still have two main drawbacks: 1) high function query
complexity; 2) poor suitability for problems with complex penalties and
constraints. To address these drawbacks, in this paper we propose a class of
faster zeroth-order stochastic alternating direction method of multipliers
(ADMM) methods (ZO-SPIDER-ADMM) to solve nonconvex finite-sum problems with
multiple nonsmooth penalties. Moreover, we prove that the ZO-SPIDER-ADMM
methods achieve a lower function query complexity for finding an ε-stationary
point, improving on the best existing nonconvex zeroth-order ADMM methods by a
factor that depends on the sample size n and the data dimension d.
At the same time, we propose a class of faster zeroth-order online ADMM methods
(ZOO-ADMM+) to solve nonconvex online problems with multiple nonsmooth
penalties. We also prove that the proposed ZOO-ADMM+ methods achieve a lower
function query complexity, which improves on the best existing result.
Extensive experimental results on structured adversarial attacks against
black-box deep neural networks demonstrate the efficiency of our new algorithms.
Comment: 34 pages
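A minimal sketch of the two ingredients this abstract combines: a coordinate-wise zeroth-order gradient estimate built from function values only, and a SPIDER-style variance-reduced recursion that reuses the estimate at the previous iterate. The function names, step size, and smoothing radius are illustrative assumptions, not the paper's implementation, and the ADMM subproblems for the nonsmooth penalties are omitted.

```python
import numpy as np

def zo_coord_grad(f, x, mu=1e-4):
    """Coordinate-wise zeroth-order gradient estimate of f at x (2*d queries)."""
    d = x.size
    g = np.zeros(d)
    for j in range(d):
        e = np.zeros(d)
        e[j] = 1.0
        g[j] = (f(x + mu * e) - f(x - mu * e)) / (2.0 * mu)
    return g

def zo_spider_step(f_batch, x, x_prev, v_prev, full_refresh, mu=1e-4, eta=0.01):
    """One SPIDER-style variance-reduced zeroth-order update.

    f_batch: callable returning the average loss over a sampled mini-batch.
    full_refresh: recompute the estimator from scratch at epoch anchors;
    otherwise apply the recursive SPIDER correction between x and x_prev.
    """
    if full_refresh:
        v = zo_coord_grad(f_batch, x, mu)
    else:
        v = v_prev + zo_coord_grad(f_batch, x, mu) - zo_coord_grad(f_batch, x_prev, mu)
    return x - eta * v, v
```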
Accelerated Stochastic Gradient-free and Projection-free Methods
In this paper, we propose a class of accelerated stochastic gradient-free and
projection-free (a.k.a. zeroth-order Frank-Wolfe) methods to solve constrained
stochastic and finite-sum nonconvex optimization problems. Specifically, we
propose an accelerated stochastic zeroth-order Frank-Wolfe (Acc-SZOFW) method
based on the variance-reduction technique of SPIDER/SpiderBoost and a novel
momentum acceleration technique. Moreover, under some mild conditions, we prove
that Acc-SZOFW attains a lower function query complexity for finding an
ε-stationary point in the finite-sum problem, improving on the existing best
result, and likewise attains a lower function query complexity in the
stochastic problem, again improving on the existing best result. To relax the
large batches required by Acc-SZOFW, we further propose a novel accelerated
stochastic zeroth-order Frank-Wolfe method (Acc-SZOFW*) based on the new
variance-reduction technique of STORM, which retains the same function query
complexity in the stochastic problem without relying on any large batches. In
particular, we present an accelerated framework for Frank-Wolfe methods based
on the proposed momentum acceleration technique. Extensive experimental results
on black-box adversarial attacks and robust black-box classification
demonstrate the efficiency of our algorithms.
Comment: Accepted to ICML 2020, 34 pages
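A minimal sketch of a plain zeroth-order Frank-Wolfe loop over an L1-ball constraint, using a two-point Gaussian-smoothing gradient estimate and the standard linear minimization oracle. The SPIDER/STORM variance reduction and the momentum acceleration that give Acc-SZOFW and Acc-SZOFW* their improved query complexity are omitted; all names and parameter choices below are assumptions for illustration.

```python
import numpy as np

def zo_grad_gaussian(f, x, mu=1e-3, num_dirs=10, rng=None):
    """Two-point Gaussian-smoothing gradient estimate (2*num_dirs function queries)."""
    if rng is None:
        rng = np.random.default_rng(0)
    g = np.zeros_like(x, dtype=float)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.size)
        g += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return g / num_dirs

def l1_ball_lmo(g, radius=1.0):
    """Linear minimization oracle over the L1 ball: argmin_{||s||_1 <= r} <g, s>."""
    i = int(np.argmax(np.abs(g)))
    s = np.zeros_like(g)
    s[i] = -radius * np.sign(g[i])
    return s

def zo_frank_wolfe(f, x0, radius=1.0, iters=100, mu=1e-3):
    """Projection-free zeroth-order loop; iterates stay feasible if x0 is in the ball."""
    x = x0.astype(float).copy()
    for t in range(iters):
        g = zo_grad_gaussian(f, x, mu)
        s = l1_ball_lmo(g, radius)
        gamma = 2.0 / (t + 2.0)             # classical Frank-Wolfe step size
        x = (1.0 - gamma) * x + gamma * s   # convex combination, no projection needed
    return x
```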
Zeroth-Order Algorithms for Stochastic Distributed Nonconvex Optimization
In this paper, we consider a stochastic distributed nonconvex optimization
problem in which the cost function is distributed over a group of agents that
have access only to zeroth-order (ZO) information of the cost. This problem has
various machine learning applications. As a solution, we propose two
distributed ZO algorithms, in which at each iteration each agent samples the
local stochastic ZO oracle at two points with an adaptive smoothing parameter.
We show that the proposed algorithms achieve a linear speedup convergence rate
for smooth cost functions and a faster convergence rate when the global cost
function additionally satisfies the Polyak-Lojasiewicz (P-L) condition; both
rates depend on the dimension of the decision variable and the total number of
iterations. To the best of our knowledge, this is the first linear speedup
result for distributed ZO algorithms; it enables systematic improvements in
processing performance by adding more agents. We also show that
the proposed algorithms converge linearly when considering deterministic
centralized optimization problems under the P-L condition. Through numerical
experiments on generating adversarial examples from deep neural networks, we
demonstrate the efficiency of our algorithms in comparison with baseline and
recently proposed centralized and distributed ZO algorithms.
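A minimal sketch of one round of the kind of scheme the abstract describes: each agent queries its local stochastic ZO oracle at two points with a smoothing parameter that decays over iterations, and the agents then average over a doubly stochastic mixing matrix. The mixing matrix, step size, and decay schedule are illustrative assumptions rather than the paper's algorithm.

```python
import numpy as np

def two_point_zo_grad(f, x, mu, rng):
    """Two-point stochastic zeroth-order gradient estimate (2 oracle queries)."""
    u = rng.standard_normal(x.size)
    return (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u

def distributed_zo_round(local_fs, xs, W, t, eta=0.05, mu0=1.0, rng=None):
    """One consensus + local zeroth-order step for all agents.

    local_fs: per-agent stochastic cost callables.
    xs: (num_agents, d) array of local decision variables.
    W: doubly stochastic mixing matrix of the communication graph.
    t: iteration counter; the smoothing parameter shrinks as t grows.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    mu = mu0 / np.sqrt(t + 1.0)                     # adaptive smoothing parameter
    grads = np.stack([two_point_zo_grad(f, x, mu, rng)
                      for f, x in zip(local_fs, xs)])
    return W @ xs - eta * grads                     # mix with neighbours, then descend
```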
Faster Stochastic Quasi-Newton Methods
Stochastic optimization methods have become a popular class of optimization
tools in machine learning. In particular, stochastic gradient descent (SGD) has
been widely used for machine learning problems such as training neural networks
due to its low per-iteration computational cost. Newton and quasi-Newton
methods, which leverage second-order information, are able to reach better
solutions than first-order methods. Thus, stochastic quasi-Newton (SQN) methods
have been developed to obtain such better solutions more efficiently than
stochastic first-order methods by utilizing approximate second-order
information. However, the existing SQN methods still do not reach the best
known stochastic first-order oracle (SFO) complexity. To fill this gap, we
propose a novel faster stochastic quasi-Newton method (SpiderSQN) based on the
variance-reduction technique of SPIDER. We prove that our SpiderSQN method
matches the best known SFO complexity in the finite-sum setting for obtaining
an ε-first-order stationary point.
To further improve its practical performance, we incorporate SpiderSQN with
different momentum schemes. Moreover, the proposed algorithms are generalized
to the online setting, and the corresponding SFO complexity is established,
which also matches the existing best result. Extensive experiments on benchmark
datasets demonstrate that our new algorithms outperform state-of-the-art
approaches for nonconvex optimization.
Comment: 11 pages, accepted for publication in TNNLS. arXiv admin note: text
overlap with arXiv:1902.02715 by other authors
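A minimal sketch of the combination this abstract points to: a SPIDER variance-reduced stochastic gradient estimate fed into a quasi-Newton direction, here with a plain BFGS update of the inverse-Hessian approximation. The curvature-pair construction, step size, and refresh rule are simplifying assumptions, not the SpiderSQN algorithm itself.

```python
import numpy as np

def bfgs_update(H, s, y, eps=1e-10):
    """Standard BFGS update of the inverse-Hessian approximation H."""
    sy = float(s @ y)
    if sy <= eps:                      # skip the update if curvature is not positive
        return H
    rho = 1.0 / sy
    I = np.eye(H.shape[0])
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)

def spider_sqn_step(grad_batch, x, x_prev, v_prev, H, full_refresh, eta=0.1):
    """One SPIDER-style stochastic quasi-Newton update.

    grad_batch(x): stochastic gradient over a freshly sampled mini-batch.
    v_prev: previous SPIDER gradient estimate; H: inverse-Hessian approximation.
    """
    g = grad_batch(x)
    v = g if full_refresh else v_prev + g - grad_batch(x_prev)
    x_new = x - eta * (H @ v)          # quasi-Newton direction on the SPIDER estimate
    H_new = bfgs_update(H, x_new - x, grad_batch(x_new) - g)
    return x_new, v, H_new
```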
Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization
In this paper, we propose a class of accelerated zeroth-order and first-order
momentum methods for both nonconvex mini-optimization and minimax optimization.
Specifically, we propose a new accelerated zeroth-order momentum (Acc-ZOM)
method to solve stochastic mini-optimization problems. We prove that the
Acc-ZOM method achieves a lower query complexity for finding an ε-stationary
point, improving on the best known result by a factor that depends on the
parameter dimension d. In particular, Acc-ZOM does not require the large
batches used in existing zeroth-order stochastic algorithms. At the same time,
we propose an accelerated zeroth-order momentum descent ascent (Acc-ZOMDA)
method for black-box minimax optimization. We prove that the Acc-ZOMDA method
reaches a lower query complexity for finding an ε-stationary point, improving
on the best known result by a factor that depends on the dimensions of the
optimization parameters and the condition number. Moreover, we
propose an accelerated first-order momentum descent ascent (Acc-MDA) method for
solving white-box minimax problems, and prove that it achieves a lower gradient
complexity for finding an ε-stationary point, which improves on the best known
result. Extensive experimental results on black-box adversarial attacks against
deep neural networks (DNNs) and on poisoning attacks demonstrate the efficiency
of our algorithms.
Comment: 66 pages. In this version, we change the Lyapunov functions for our
Acc-ZOMDA and Acc-MDA methods in the convergence analysis. As a result, our
Acc-ZOMDA method obtains a lower query complexity and our Acc-MDA method
achieves a lower gradient complexity.
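A minimal sketch of a STORM-style momentum variance-reduced zeroth-order loop for the mini-optimization case, in the spirit of Acc-ZOM: a single random direction per iteration is reused to estimate the gradient at both the new and the old iterate, so no large batches are needed. The estimator, step sizes, and momentum weight are illustrative assumptions; the minimax variants (Acc-ZOMDA, Acc-MDA) add a descent-ascent structure not shown here.

```python
import numpy as np

def acc_zo_momentum(f, x0, iters=200, eta=0.05, mu=1e-3, alpha=0.1, seed=0):
    """Momentum-based variance-reduced zeroth-order descent (4 queries per iteration)."""
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    u = rng.standard_normal(x.size)
    v = (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u   # initial gradient estimate
    for _ in range(iters):
        x_new = x - eta * v
        u = rng.standard_normal(x.size)
        g_new = (f(x_new + mu * u) - f(x_new - mu * u)) / (2.0 * mu) * u
        g_old = (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
        # STORM-style correction: reuse the same direction u at both iterates
        v = g_new + (1.0 - alpha) * (v - g_old)
        x = x_new
    return x
```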