347 research outputs found

    Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies

    Gradient-based methods have been widely used for system design and optimization in diverse application domains. Recently, there has been renewed interest in studying the theoretical properties of these methods in the context of control and reinforcement learning. This article surveys some of the recent developments on policy optimization, a gradient-based iterative approach for feedback control synthesis, popularized by successes of reinforcement learning. We take an interdisciplinary perspective in our exposition that connects control theory, reinforcement learning, and large-scale optimization. We review a number of recently developed theoretical results on the optimization landscape, global convergence, and sample complexity of gradient-based methods for various continuous control problems such as the linear quadratic regulator (LQR), H∞ control, risk-sensitive control, linear quadratic Gaussian (LQG) control, and output feedback synthesis. In conjunction with these optimization results, we also discuss how direct policy optimization handles stability and robustness concerns in learning-based control, two main desiderata in control engineering. We conclude the survey by pointing out several challenges and opportunities at the intersection of learning and control. Comment: To appear in Annual Review of Control, Robotics, and Autonomous Systems.

    A Quasi-Newton Interior Point Method for Low Order H-Infinity Controller Synthesis


    An Optimal Control Model for Human Postural Regulation

    Human upright stance is inherently unstable without a balance control scheme. Many biological behaviors are likely to be optimal with respect to some performance measure that involves energy. It is reasonable to believe that the human is (unconsciously) optimizing some performance measure as he regulates his balance posture. In experimental studies, a notable feature of postural control is a small constant sway. Specifically, there is greater sway than would occur with a linear feedback control without delay. A second notable feature of human postural control is that the response to perturbations varies with their amplitude. Small disturbances produce motion only at the ankles, with the hip and knee angles unchanging. Larger perturbations evoke ankle and hip angular movement only. Still larger perturbations result in movement of all three joint angles. Inspired by these features, a biomechanical model resembling human balance control is proposed. The proposed model consists of three main components: the body dynamics, a sensory estimator for delay and disturbance, and an optimal nonlinear control scheme providing the minimum required corrective response. The human body is modeled as a multiple-segment inverted pendulum in the sagittal plane, controlled by ankle and hip joint torques. A series of nonlinear optimal control problems is devised as mathematical models of human postural control during quiet standing. Several performance criteria are used that are of high even order in the body states or in functions of these states (such as joint angle, center of pressure (COP), or center of mass (COM)) and quadratic in the joint control. This objective function provides a trade-off between the allowed deviations of the position from its nominal value and the neuromuscular energy required to correct for these deviations. Note that this performance measure reduces the actuator energy used by penalizing small postural errors very lightly. By using the Model Predictive Control (MPC) technique, the discrete-time approximation to each of these problems can be converted into a nonlinear programming problem and then solved by optimization methods. The solution gives a control scheme that agrees with the main features of the joint kinematics and its coordination process. The derived model is simulated for different scenarios to validate and test the performance of the proposed postural control architecture.

    High Performance, Robust Control of Flexible Space Structures: MSFC Center Director's Discretionary Fund

    Many spacecraft systems have ambitious objectives that place stringent requirements on control systems. Achievable performance is often limited by the difficulty of obtaining accurate models for flexible space structures. Achieving sufficiently high performance to accomplish mission objectives may require the ability to refine the control design model based on closed-loop test data and to tune the controller based on the refined model. A control system design procedure is developed based on mixed H2/H∞ optimization to synthesize a set of controllers that explicitly trade off nominal performance against robust stability. A homotopy algorithm is presented which generates a trajectory of gains that may be implemented to determine the maximum achievable performance for a given model error bound. Examples show that a better balance between robustness and performance is obtained using the mixed H2/H∞ design method than with either H2 or mu-synthesis control design. A second contribution is a new procedure for closed-loop system identification which refines the parameters of a control design model in a canonical realization. Examples demonstrate convergence of the parameter estimation and the improved performance realized by using the refined model for controller redesign. These developments result in an effective mechanism for achieving high-performance control of flexible space structures.

    Fourth NASA Workshop on Computational Control of Flexible Aerospace Systems, part 1

    The proceedings of the workshop are presented. Areas of discussion include modeling, system identification, and control of flexible aircraft, spacecraft, and robotic systems.

    Gradient Methods for Large-Scale and Distributed Linear Quadratic Control

    This thesis considers methods for the synthesis of linear quadratic controllers for large-scale, interconnected systems. Conventional methods that solve the linear quadratic control problem are only applicable to systems of moderate size, due to the rapid increase in both computational time and memory requirements as the system size increases. The methods presented in this thesis show a much slower increase in these requirements when faced with system matrices with a sparse structure. Hence, they are useful for control design for systems of large order, since such systems usually have sparse system matrices. An equally important feature of the methods is that the controllers are restricted to have a distributed nature, meaning that they respect a potential interconnection structure of the system. The controllers considered in the thesis have the same structure as the centralized LQG solution, that is, they consist of a state predictor and feedback from the estimated states. Strategies for determining the feedback matrix and the predictor matrix separately are suggested. The strategies use gradient directions of the cost function to iteratively approach a locally optimal solution in either problem. A scheme to determine bounds on the degree of suboptimality of the partial solution in every iteration is presented. It is also shown that these bounds can be combined to give a bound on the degree of suboptimality of the full output feedback controller. Another method that treats the synthesis of the feedback matrix and the predictor matrix simultaneously is also presented. The functionality of the developed methods is illustrated by an application in which the methods are used to compute controllers for a large deformable mirror, found in a telescope to compensate for atmospheric disturbances. The model of the mirror is obtained by discretizing a partial differential equation. This gives a linear, sparse representation of the mirror with a very large state space, which is suitable for the methods presented in the thesis. The performance of the controllers is evaluated using performance measures from the adaptive optics community.

    Machine-In-The-Loop control optimization: a literature survey
