Stochastic Training of Neural Networks via Successive Convex Approximations
This paper proposes a new family of algorithms for training neural networks
(NNs). These are based on recent developments in the field of non-convex
optimization, going under the general name of successive convex approximation
(SCA) techniques. The basic idea is to iteratively replace the original
(non-convex, high-dimensional) learning problem with a sequence of (strongly
convex) approximations, which are both accurate and simple to optimize.
Unlike similar approaches (e.g., quasi-Newton algorithms), the
approximations can be constructed using only first-order information of the
neural network function, in a stochastic fashion, while exploiting the overall
structure of the learning problem for a faster convergence. We discuss several
use cases, based on different choices for the loss function (e.g., squared loss
and cross-entropy loss), and for the regularization of the NN's weights. We
experiment on several medium-sized benchmark problems, and on a large-scale
dataset involving simulated physical data. The results show how the algorithm
outperforms state-of-the-art techniques, providing faster convergence to a
better minimum. Additionally, we show how the algorithm can be easily
parallelized over multiple computational units without hindering its
performance. In particular, each computational unit can optimize a tailored
surrogate function defined on a randomly assigned subset of the input
variables, whose dimension can be selected depending entirely on the available
computational power.
Comment: Preprint submitted to IEEE Transactions on Neural Networks and
Learning Systems
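As a toy illustration of the SCA idea described above (not the paper's full NN training algorithm), each iteration replaces the non-convex objective with a strongly convex first-order surrogate, f(x_k) + f'(x_k)(x - x_k) + (tau/2)(x - x_k)^2, whose minimizer has a closed form, and then moves toward that minimizer with a diminishing step. All names and parameter values below are illustrative assumptions:

```python
def sca_minimize(grad, x0, tau=1.0, gamma=0.5, iters=100):
    """Successive convex approximation (sketch, scalar case).

    At each iterate x_k, minimize the strongly convex surrogate
        f(x_k) + grad(x_k) * (x - x_k) + (tau / 2) * (x - x_k) ** 2,
    whose closed-form minimizer is x_k - grad(x_k) / tau, then take a
    diminishing convex-combination step toward it.
    """
    x = x0
    for k in range(iters):
        x_hat = x - grad(x) / tau          # minimizer of the surrogate at x
        step = gamma / (1 + 0.01 * k)      # diminishing step-size rule
        x = x + step * (x_hat - x)         # move toward the surrogate minimizer
    return x

# Toy non-convex objective f(x) = x**4 / 4 - x**2 / 2, with minima at x = +/-1.
x_star = sca_minimize(grad=lambda x: x**3 - x, x0=2.0)
```

With only the quadratic proximal term, this surrogate reduces to a gradient-style update; the paper's surrogates additionally exploit the structure of the loss (e.g., keeping a squared loss or a convex regularizer intact inside the approximation) for faster convergence.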
Energy efficiency optimization in MIMO interference channels: A successive pseudoconvex approximation approach
In this paper, we consider the (global and sum) energy efficiency
optimization problem in downlink multi-input multi-output multi-cell systems,
where all users suffer from multi-user interference. This is a challenging
problem due to several reasons: 1) it is a nonconvex fractional programming
problem, 2) the transmission rate functions are characterized by
(complex-valued) transmit covariance matrices, and 3) the processing-related
power consumption may depend on the transmission rate. We tackle this problem
by the successive pseudoconvex approximation approach, and we argue that
pseudoconvex optimization plays a fundamental role in designing novel iterative
algorithms, not only because every locally optimal point of a pseudoconvex
optimization problem is also globally optimal, but also because a descent
direction is easily obtained from every optimal point of a pseudoconvex
optimization problem. The proposed algorithms have the following advantages: 1)
fast convergence as the structure of the original optimization problem is
preserved as much as possible in the approximate problem solved in each
iteration, 2) easy implementation as each approximate problem is suitable for
parallel computation and its solution has a closed-form expression, and 3)
guaranteed convergence to a stationary point or a Karush-Kuhn-Tucker point. The
advantages of the proposed algorithm are also illustrated numerically.
Comment: submitted to IEEE Transactions on Signal Processing
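The pseudoconvexity argument above can be illustrated on a scalar energy-efficiency problem: maximize log(1 + p) / (pc + p) over 0 <= p <= p_max, a concave-over-affine ratio, hence pseudoconcave, so every local optimum is global. The sketch below uses a Dinkelbach-style fixed-point iteration on this toy problem; it is not the paper's MIMO algorithm, and the function name and parameter values are illustrative assumptions:

```python
import math

def dinkelbach_ee(pc=1.0, p_max=10.0, iters=30):
    """Energy-efficiency maximization for the scalar fractional program
        maximize  log(1 + p) / (pc + p)   subject to  0 <= p <= p_max,
    where pc models a constant circuit power. At each iteration the ratio
    is evaluated at the current point and the parametric subproblem
        maximize  log(1 + p) - lam * (pc + p)
    is solved in closed form (a water-filling-type expression), echoing
    the closed-form per-iteration solutions highlighted in the abstract.
    """
    p = p_max
    for _ in range(iters):
        lam = math.log1p(p) / (pc + p)                 # current EE value
        p = min(max(1.0 / lam - 1.0, 0.0), p_max)      # subproblem maximizer
    return p, math.log1p(p) / (pc + p)

p_opt, ee_opt = dinkelbach_ee()
```

For pc = 1 the fixed point satisfies log(1 + p) = 1, i.e. p = e - 1 with efficiency 1/e, which the iteration reaches to high accuracy in a handful of steps.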