
    Stochastic Training of Neural Networks via Successive Convex Approximations

    This paper proposes a new family of algorithms for training neural networks (NNs), based on recent developments in non-convex optimization that go under the general name of successive convex approximation (SCA) techniques. The basic idea is to iteratively replace the original (non-convex, high-dimensional) learning problem with a sequence of (strongly convex) approximations, which are both accurate and simple to optimize. Unlike similar ideas (e.g., quasi-Newton algorithms), the approximations can be constructed using only first-order information about the neural network function, in a stochastic fashion, while exploiting the overall structure of the learning problem for faster convergence. We discuss several use cases, based on different choices for the loss function (e.g., squared loss and cross-entropy loss) and for the regularization of the NN's weights. We experiment on several medium-sized benchmark problems and on a large-scale dataset involving simulated physical data. The results show that the algorithm outperforms state-of-the-art techniques, providing faster convergence to a better minimum. Additionally, we show how the algorithm can be easily parallelized over multiple computational units without hindering its performance. In particular, each computational unit can optimize a tailored surrogate function defined on a randomly assigned subset of the input variables, whose dimension can be selected depending entirely on the available computational power.
    Comment: Preprint submitted to IEEE Transactions on Neural Networks and Learning Systems
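    To make the SCA idea concrete, here is a minimal sketch, not the paper's algorithm: the quadratic surrogate, the step-size rule, and all names (sca_minimize, grad_f, tau) are illustrative assumptions. At each iteration, the non-convex objective is replaced by a strongly convex first-order surrogate built around the current iterate; the surrogate's minimizer gives a descent direction, and a diminishing step size produces the next iterate.

    import numpy as np

    def sca_minimize(grad_f, w0, tau=10.0, n_iters=200):
        # Minimal successive convex approximation (SCA) sketch.
        # At iterate w_k, the non-convex objective f is replaced by the
        # strongly convex surrogate
        #   u(w; w_k) = f(w_k) + grad_f(w_k)^T (w - w_k) + (tau/2) ||w - w_k||^2,
        # whose unique minimizer is w_k - grad_f(w_k) / tau.
        w = np.asarray(w0, dtype=float)
        for k in range(n_iters):
            w_hat = w - grad_f(w) / tau    # minimizer of the convex surrogate
            gamma = 2.0 / (k + 2.0)        # diminishing step size
            w = w + gamma * (w_hat - w)    # convex-combination update
        return w

    # Toy non-convex example: f(w) = sum(w^4 - 3 w^2).
    grad_f = lambda w: 4 * w**3 - 6 * w
    w_star = sca_minimize(grad_f, w0=np.array([0.5, -0.3]))

    The paper's surrogates additionally exploit the structure of the loss and regularizer and use stochastic first-order information; the quadratic choice above is only the simplest member of that family.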

    Energy efficiency optimization in MIMO interference channels: A successive pseudoconvex approximation approach

    In this paper, we consider the (global and sum) energy efficiency optimization problem in downlink multi-input multi-output multi-cell systems, where all users suffer from multi-user interference. This is a challenging problem for several reasons: 1) it is a nonconvex fractional programming problem, 2) the transmission rate functions are characterized by (complex-valued) transmit covariance matrices, and 3) the processing-related power consumption may depend on the transmission rate. We tackle this problem with the successive pseudoconvex approximation approach, and we argue that pseudoconvex optimization plays a fundamental role in designing novel iterative algorithms, not only because every locally optimal point of a pseudoconvex optimization problem is also globally optimal, but also because a descent direction is easily obtained from every optimal point of a pseudoconvex optimization problem. The proposed algorithms have the following advantages: 1) fast convergence, as the structure of the original optimization problem is preserved as much as possible in the approximate problem solved in each iteration; 2) easy implementation, as each approximate problem is suitable for parallel computation and its solution has a closed-form expression; and 3) guaranteed convergence to a stationary point or a Karush-Kuhn-Tucker point. The advantages of the proposed algorithm are also illustrated numerically.
    Comment: Submitted to IEEE Transactions on Signal Processing
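    The role of pseudoconvexity can be illustrated on a toy single-link energy-efficiency problem, max over 0 <= p <= p_max of log(1 + g*p) / (p_c + p): a concave rate divided by a positive affine power cost is pseudoconcave, so every stationary point is globally optimal. The sketch below uses the classical Dinkelbach iteration for this fractional program; it is not the paper's MIMO multi-cell algorithm, and g, p_c, p_max, and the name dinkelbach_ee are illustrative assumptions.

    import numpy as np
    from scipy.optimize import minimize_scalar

    def dinkelbach_ee(g=2.0, p_c=1.0, p_max=10.0, tol=1e-8, max_iters=100):
        # Dinkelbach iteration for the toy fractional program
        #   max_{0 <= p <= p_max}  log(1 + g*p) / (p_c + p).
        # Each subproblem max_p log(1 + g*p) - lam*(p_c + p) is concave in p.
        lam = 0.0
        p = 0.0
        for _ in range(max_iters):
            res = minimize_scalar(
                lambda p: -(np.log(1.0 + g * p) - lam * (p_c + p)),
                bounds=(0.0, p_max), method="bounded")
            p = res.x
            val = np.log(1.0 + g * p) - lam * (p_c + p)
            if abs(val) < tol:        # lam now equals the optimal EE ratio
                break
            lam = np.log(1.0 + g * p) / (p_c + p)   # update the EE estimate
        return p, lam

    p_opt, ee_opt = dinkelbach_ee()

    Each concave subproblem here even admits a closed-form solution, p = max(0, min(p_max, 1/lam - 1/g)); the bounded scalar solver is used only to keep the sketch short, echoing the paper's point that the per-iteration approximate problems remain cheap to solve.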