
    Distributed Optimization with Application to Power Systems and Control

    In many engineering domains, systems are composed of partially independent subsystems—power systems are composed of distribution and transmission systems, teams of robots are composed of individual robots, and chemical process systems are composed of vessels, heat exchangers and reactors. Often, these subsystems should reach a common goal such as satisfying a power demand with minimum cost, flying in a formation, or reaching an optimal set-point. At the same time, limited information exchange is desirable—for confidentiality reasons but also due to communication constraints. Moreover, a fast and reliable decision process is key as applications might be safety-critical. Mathematical optimization techniques are among the most successful tools for controlling systems optimally with feasibility guarantees. Yet, they are often centralized—all data has to be collected in one central and computationally powerful entity. Methods from distributed optimization control the subsystems in a distributed or decentralized fashion, reducing or avoiding central coordination. These methods have a long and successful history. Classical distributed optimization algorithms, however, are typically designed for convex problems. Hence, they are only partially applicable in the above domains since many of them lead to optimization problems with non-convex constraints. This thesis develops one of the first frameworks for distributed and decentralized optimization with non-convex constraints. Based on the Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN) algorithm, a bi-level distributed ALADIN framework is presented, solving the coordination step of ALADIN in a decentralized fashion. This framework can handle various decentralized inner algorithms, two of which we develop here: a decentralized variant of the Alternating Direction Method of Multipliers (ADMM) and a novel decentralized Conjugate Gradient algorithm. 
Decentralized conjugate gradient is, to the best of our knowledge, the first decentralized algorithm with a guarantee of convergence to the exact solution in a finite number of iterations. Sufficient conditions for fast local convergence of bi-level ALADIN are derived. Bi-level ALADIN strongly reduces the communication and coordination effort of ALADIN while preserving its fast convergence guarantees. We illustrate these properties on challenging problems from power systems and control, and compare performance to the widely used ADMM. The developed methods are implemented in the open-source MATLAB toolbox ALADIN-α, one of the first toolboxes for decentralized non-convex optimization. ALADIN-α comes with a rich set of application examples from different domains, showing its broad applicability. As an additional contribution, this thesis provides new insights into why state-of-the-art distributed algorithms may encounter issues for constrained problems.
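The finite-termination claim above rests on classical conjugate gradient theory: in exact arithmetic, CG solves an n-dimensional symmetric positive-definite linear system in at most n iterations. A minimal centralized sketch of that classical property (the thesis's decentralized variant distributes these updates across agents; the matrix and right-hand side below are hypothetical illustration data, not from the thesis):

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-12):
    """Classical CG for A x = b with A symmetric positive definite.

    In exact arithmetic it terminates with the exact solution in at
    most len(b) iterations -- the finite-convergence property the
    decentralized variant in the thesis also guarantees.
    """
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x          # residual
    p = r.copy()           # search direction
    rs = r @ r
    for _ in range(len(b)):
        Ap = A @ p
        alpha = rs / (p @ Ap)       # exact line search along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:   # residual (numerically) zero
            break
        p = r + (rs_new / rs) * p   # next A-conjugate direction
        rs = rs_new
    return x

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])  # hypothetical SPD coordination matrix
b = np.array([1.0, 2.0, 3.0])
x = conjugate_gradient(A, b)
```

In bi-level ALADIN, a system of this form arises in the coordination step; running CG-style updates over the communication graph is what removes the need for a central solver.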

    Low-Order Optimization Algorithms: Iteration Complexity and Applications

    University of Minnesota Ph.D. dissertation. May 2018. Major: Industrial Engineering. Advisor: Shuzhong Zhang. 1 computer file (PDF); ix, 208 pages.

    Efficiency and scalability have become the new norms for evaluating optimization algorithms in the modern era of big data analytics. Despite their superior local convergence properties, second- or higher-order methods are often at a disadvantage when dealing with large-scale problems arising from machine learning: the amount of information they require, and the cost of computing the relevant quantities (e.g., Newton's direction), are exceedingly large. Hence, they are not scalable, at least not in a naive way. For exactly the same reason, lower-order (first-order and zeroth-order) methods, with their substantially lower computational overhead per iteration, have received much attention and become popular in recent years. In this thesis, we present a systematic study of lower-order algorithms for solving a wide range of optimization models. As a starting point, the alternating direction method of multipliers (ADMM) is studied and shown to be an efficient approach for solving large-scale separable optimization with linear constraints. However, the ADMM was originally designed for two-block optimization models, and its subproblems are not always easy to solve. There are two possible ways to broaden the ADMM's scope of application: (1) simplify its subroutines so that it fits a broader scheme of lower-order algorithms; (2) extend it to a more general framework of multi-block problems. Depending on the informational structure of the underlying problem, we develop a suite of first-order and zeroth-order variants of the ADMM, where the trade-offs between the required information and the computational complexity are made explicit.
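The two-block scheme referenced above alternates two small subproblem solves with a dual update. A toy sketch in scaled form, on the hypothetical problem min (x-1)² + (z-3)² subject to x = z (this example is for illustration only, not taken from the thesis):

```python
# Two-block ADMM (scaled form) for:
#   minimize (x - 1)^2 + (z - 3)^2   subject to   x - z = 0
# Optimum: x = z = 2. Hypothetical toy problem for illustration.
rho = 1.0          # penalty parameter
x = z = u = 0.0    # primal blocks and scaled dual variable

for _ in range(200):
    # x-update: argmin_x (x - 1)^2 + (rho/2)(x - z + u)^2
    x = (2.0 * 1.0 + rho * (z - u)) / (2.0 + rho)
    # z-update: argmin_z (z - 3)^2 + (rho/2)(x - z + u)^2
    z = (2.0 * 3.0 + rho * (x + u)) / (2.0 + rho)
    # dual update: accumulate the coupling-constraint residual
    u += x - z
```

Each block's subproblem here has a closed form; the first-order and zeroth-order ADMM variants in the thesis address exactly the case where such subproblems are not easy to solve.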
The new variants make the method applicable to a much broader class of problems in which only noisy estimates of the gradient or of the function values are accessible. Moreover, we extend the ADMM framework to a general multi-block convex optimization model with a coupled objective function and linear constraints. Based on a linearization scheme that decouples the objective function, several deterministic first-order algorithms are developed for both two-block and multi-block problems. We show that, under suitable conditions, a sublinear convergence rate can be established for these methods. It is well known that the original ADMM may fail to converge when the number of blocks exceeds two. To overcome this difficulty, we propose a randomized primal-dual proximal block coordinate updating framework that includes several existing ADMM-type algorithms as special cases. Our results show that if an appropriate randomization procedure is used, then a sublinear rate of convergence in expectation can be guaranteed for multi-block ADMM, without assuming strong convexity or any additional conditions. The new approach is also extended to problems where only a stochastic approximation of the (sub-)gradient of the objective is available. Furthermore, we study various zeroth-order algorithms for both black-box optimization and online learning problems. In particular, for black-box optimization we consider three different settings: (1) stochastic programming with the restriction that only one random sample can be drawn at any given decision point; (2) a general nonconvex optimization framework with what we call the weakly pseudo-convex property; (3) a setting in which an estimate of the objective value with controllable noise is available. We further extend these ideas to the stochastic bandit online learning problem, where the nonsmoothness of the loss function and the one-random-sample scheme are discussed.
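To make the zeroth-order idea concrete: when only function values are observable, a gradient can be approximated from finite differences of those values and plugged into an otherwise standard descent loop. A minimal sketch on a hypothetical quadratic black box (coordinate-wise central differences; the one-sample and randomized smoothing estimators analyzed in the thesis are more refined than this):

```python
import numpy as np

def f(x):
    """Black-box objective: only values are observable, no gradients.
    Hypothetical quadratic with minimizer at (1, 2, 3)."""
    c = np.array([1.0, 2.0, 3.0])
    return float(np.sum((x - c) ** 2))

def zo_gradient(f, x, mu=1e-4):
    """Coordinate-wise central-difference gradient estimate,
    built from 2 * dim function evaluations per call."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = mu
        g[i] = (f(x + e) - f(x - e)) / (2.0 * mu)
    return g

x = np.zeros(3)
for _ in range(100):
    x -= 0.1 * zo_gradient(f, x)   # plain descent with the ZO estimate
```

The iteration-complexity question the thesis studies is precisely how the cost of such value-only estimates (and their noise) enters the convergence rate.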