905 research outputs found

    On the Approximation of Constrained Linear Quadratic Regulator Problems and their Application to Model Predictive Control - Supplementary Notes

    Full text link
    By parametrizing input and state trajectories with basis functions, different approximations to the constrained linear quadratic regulator problem are obtained. These notes present and discuss technical results that are intended to supplement a corresponding journal article. The results can be applied in a model predictive control context. Comment: 19 pages, 1 figure
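    As a rough, self-contained illustration of the parametrization idea (the system, horizon, and exponential basis below are assumptions for the example, not the paper's construction), one can write the input trajectory of a small finite-horizon LQ problem as a linear combination of basis functions and solve for the weights in closed form:

```python
import numpy as np

# Illustrative discretized double integrator and LQ weights (state weight Q = I).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.005],
              [0.1]])
N, n_b = 50, 4                       # horizon length and number of basis functions
x0 = np.array([1.0, 0.0])
r = 0.1                              # input weight

# Basis: u_k = Phi[k, :] @ eta, shrinking the decision variable from N inputs to n_b weights.
ks = np.arange(N)
Phi = np.stack([np.exp(-0.1 * (j + 1) * ks) for j in range(n_b)], axis=1)

# The stacked state trajectory is linear in eta: x_traj = G @ eta + f,
# where f carries the free response of x0.
G = np.zeros((2 * N, n_b))
f = np.zeros(2 * N)
x_free = x0.copy()
impulse = []                         # holds A^{k-1-i} B for i = 0..k-1
for k in range(N):
    x_free = A @ x_free
    f[2 * k:2 * k + 2] = x_free
    impulse = [A @ M for M in impulse] + [B]
    for i, M in enumerate(impulse):
        G[2 * k:2 * k + 2, :] += M @ Phi[i:i + 1, :]

def cost(eta):
    x = G @ eta + f
    u = Phi @ eta
    return x @ x + r * (u @ u)

# Unconstrained optimum in closed form.
H = G.T @ G + r * Phi.T @ Phi
eta_opt = np.linalg.solve(H, -G.T @ f)
print(cost(eta_opt) < cost(np.zeros(n_b)))   # True: the parametrized input improves on u = 0
```

    With input and state constraints added, the same parametrization yields a small QP over the basis weights instead of a closed-form solve.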

    Introduction to the functional RG and applications to gauge theories

    Get PDF
    These lectures contain an introduction to modern renormalization group (RG) methods as well as functional RG approaches to gauge theories. In the first lecture, the functional renormalization group is introduced with a focus on the flow equation for the effective average action. The second lecture is devoted to a discussion of flow equations and symmetries in general, and flow equations and gauge symmetries in particular. The third lecture deals with the flow equation in the background formalism which is particularly convenient for analytical computations of truncated flows. The fourth lecture concentrates on the transition from microscopic to macroscopic degrees of freedom; even though this is discussed here in the language and the context of QCD, the developed formalism is much more general and will be useful also for other systems. Comment: 60 pages, 14 figures, Lectures held at the 2006 ECT* School "Renormalization Group and Effective Field Theory Approaches to Many-Body Systems", Trento, Italy
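    The flow equation for the effective average action referred to in the first lecture is the Wetterich equation,

```latex
\partial_t \Gamma_k \;=\; \frac{1}{2}\,\mathrm{STr}\!\left[\left(\Gamma_k^{(2)} + R_k\right)^{-1} \partial_t R_k\right],
\qquad t = \ln(k/\Lambda),
```

    where \Gamma_k^{(2)} denotes the second functional derivative of \Gamma_k and R_k is the infrared regulator function.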

    Control Strategies for Multi-Evaporator Vapor Compression Cycles

    Get PDF
    Next-generation military aircraft must be able to handle highly transient thermal loads that exceed the capability of current aircraft thermal subsystems. Vapor compression cycle systems are a refrigeration technology that offers an attractive solution to this challenge, due primarily to their high efficiency. However, there are several barriers to realizing the benefits of vapor cycle systems for controlling thermal loads in military aircraft. This thesis focuses on the challenge of controlling vapor cycles in the presence of highly transient evaporator heat loads. Specifically, a linear quadratic regulator (LQR) is designed for a simple vapor cycle system, and closed-loop performance is compared with a set of proportional-integral (PI) controllers. Simulation results show significant advantages of the LQR method, and the same approach is repeated for a larger dual-evaporator vapor cycle system. The LQR method retains some of its benefits, but several issues associated with relying on a single linear model for the full nonlinear system are identified, and recommendations for future work are made at the end.
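    For readers unfamiliar with the LQR step, the sketch below shows a minimal discrete-time LQR design on a hypothetical two-state plant (the matrices and weights are illustrative stand-ins, not the thesis's vapor-cycle model): iterate the Riccati recursion to a fixed point, then verify that the closed loop is stable.

```python
import numpy as np

# Illustrative two-state plant and LQ weights (assumed, not from the thesis).
A = np.array([[0.98, 0.05],
              [0.00, 0.90]])
B = np.array([[0.0],
              [0.1]])
Q = np.diag([10.0, 1.0])
R = np.array([[1.0]])

# Iterate the discrete-time Riccati recursion to its fixed point.
P = Q.copy()
for _ in range(500):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A - B @ K)

# State-feedback gain u = -K x and the resulting closed-loop matrix.
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
closed_loop = A - B @ K
print(np.max(np.abs(np.linalg.eigvals(closed_loop))))  # spectral radius < 1: stabilizing
```

    In the multi-evaporator setting, the same recipe applies to the larger linearized model; the thesis's comparison is essentially this gain against hand-tuned PI loops.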

    Reachability-based Identification, Analysis, and Control Synthesis of Robot Systems

    Full text link
    We introduce reachability analysis for the formal examination of robots. We propose a novel identification method which preserves reachset conformance of linear systems. We additionally propose a simultaneous identification and control synthesis scheme to obtain optimal controllers with formal guarantees. In a case study, we examine the effectiveness of using reachability analysis to synthesize a state-feedback controller, a velocity observer, and an output feedback controller. Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
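    As a toy illustration of set-based reachability (the paper works with richer set representations and reachset-conformant models; the dynamics and sets here are assumptions), one can propagate an axis-aligned box of initial states through stable linear dynamics with interval arithmetic:

```python
import numpy as np

# Illustrative stable linear dynamics x_{k+1} = A x_k.
A = np.array([[0.8, 0.1],
              [-0.1, 0.8]])
lo = np.array([-0.1, -0.1])     # initial set: the box lo <= x0 <= hi
hi = np.array([0.1, 0.1])

def step_box(lo, hi, A):
    """Tight interval image of a box under x -> A x (row-wise interval arithmetic)."""
    Ap, An = np.maximum(A, 0.0), np.minimum(A, 0.0)
    return Ap @ lo + An @ hi, Ap @ hi + An @ lo

for _ in range(50):
    lo, hi = step_box(lo, hi, A)

print(hi)  # the box contracts toward the origin for this stable A
```

    Reachset conformance then amounts to checking that every measured trajectory of the real system stays inside the model's propagated sets.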

    Bayesian Optimization in Robot Learning - Automatic Controller Tuning and Sample-Efficient Methods

    Get PDF
    In reference to IEEE copyrighted material which is used with permission in this thesis, the IEEE does not endorse any of Eberhard Karls Universität Tübingen’s products or services. Internal or personal use of this material is permitted. If interested in reprinting/republishing IEEE copyrighted material for advertising or promotional purposes or for creating new collective works for resale or redistribution, please go to http://www.ieee.org/publications_standards/publications/rights/rights_link.html to learn how to obtain a License from RightsLink.
    The problem of designing controllers to regulate dynamical systems has been studied by engineers during the past millennia. Ever since, suboptimal performance lingers in many closed loops as an unavoidable side effect of manually tuning the parameters of the controllers. Nowadays, industrial settings remain skeptical about data-driven methods that allow one to automatically learn controller parameters. In the context of robotics, machine learning (ML) keeps growing its influence on increasing autonomy and adaptability, for example to aid in automating controller tuning. However, data-hungry ML methods, such as standard reinforcement learning, require a large number of experimental samples, which is prohibitive in robotics, as hardware can deteriorate and break.
    This brings about the following question: Can manual controller tuning, in robotics, be automated by using data-efficient machine learning techniques? In this thesis, we tackle the question above by exploring Bayesian optimization (BO), a data-efficient ML framework, to buffer the human effort and side effects of manual controller tuning, while retaining a low number of experimental samples. We focus this work on robotic systems, providing thorough theoretical results that aim to increase data-efficiency, as well as demonstrations on real robots. Specifically, we present four main contributions. We first consider using BO to replace manual tuning in robotic platforms. To this end, we parametrize the design weights of a linear quadratic regulator (LQR) and learn its parameters using an information-efficient BO algorithm. This algorithm uses Gaussian processes (GPs) to model the unknown performance objective. The GP model is used by BO to suggest controller parameters that are expected to increase the information about the optimal parameters, measured as a gain in entropy. The resulting “automatic LQR tuning” framework is demonstrated on two robotic platforms: a robot arm balancing an inverted pole and a humanoid robot performing a squatting task. In both cases, an existing controller is automatically improved in a handful of experiments without human intervention. BO compensates for data scarcity by means of the GP, a probabilistic model that encodes prior assumptions about the unknown performance objective. Usually, incorrect or non-informed assumptions have negative consequences, such as a higher number of robot experiments, poor tuning performance, or reduced sample-efficiency. The second to fourth contributions presented herein attempt to alleviate this issue. The second contribution proposes to include the robot simulator in the learning loop as an additional information source for automatic controller tuning.
    While a real robot experiment generally entails high associated costs (e.g., it requires preparation and takes time), simulations are cheaper to obtain (e.g., they can be computed faster). However, because the simulator is an imperfect model of the robot, its information is biased and could have negative repercussions on learning performance. To address this problem, we propose “simu-vs-real”, a principled multi-fidelity BO algorithm that trades off cheap but inaccurate information from simulations against expensive and accurate physical experiments in a cost-effective manner. The resulting algorithm is demonstrated on a cart-pole system, where simulations and real experiments are alternated, thus sparing many real evaluations. The third contribution explores how to adapt the expressiveness of the probabilistic prior to the control problem at hand. To this end, the mathematical structure of LQR controllers is leveraged and embedded into the GP by means of the kernel function. Specifically, we propose two different “LQR kernel” designs that retain the flexibility of Bayesian nonparametric learning. Simulated results indicate that the LQR kernel outperforms non-informed kernel choices when used for controller learning with BO. Finally, the fourth contribution specifically addresses the problem of handling controller failures, which are typically unavoidable in practice while learning from data, especially if non-conservative solutions are expected. Although controller failures are generally problematic (e.g., the robot has to be emergency-stopped), they are also a rich information source about what should be avoided. We propose “failures-aware excursion search”, a novel algorithm for Bayesian optimization under black-box constraints, where failures are limited in number. Our results in numerical benchmarks indicate that, by allowing a confined number of failures, better optima are revealed as compared with state-of-the-art methods.
    The first contribution of this thesis, “automatic LQR tuning”, is among the first works to apply BO to real robots. While it demonstrated automatic controller learning from few experimental samples, it also revealed several important challenges, such as the need for higher sample-efficiency, which opened relevant research directions that we addressed through several methodological contributions. Summarizing, we proposed “simu-vs-real”, a novel BO algorithm that includes the simulator as an additional information source; an “LQR kernel” design that learns faster than standard choices; and “failures-aware excursion search”, a new BO algorithm for constrained black-box optimization problems, where the number of failures is limited.
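    A minimal sketch of the Bayesian-optimization loop underlying “automatic LQR tuning”: a GP with an RBF kernel models an unknown cost over a scalar tuning parameter, and an acquisition function picks the next experiment. The cost function, kernel, and all hyperparameters below are illustrative assumptions, and this sketch uses expected improvement rather than the thesis's entropy-based criterion.

```python
import numpy as np
from math import erf

def cost(theta):                      # stand-in for an expensive robot experiment
    return (theta - 0.3) ** 2 + 0.05 * np.sin(8.0 * theta)

def rbf(a, b, ell=0.2):               # squared-exponential kernel, unit variance
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def norm_cdf(z):
    return 0.5 * (1.0 + np.vectorize(erf)(z / np.sqrt(2.0)))

def norm_pdf(z):
    return np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)

rng = np.random.default_rng(0)
X = list(rng.uniform(0.0, 1.0, 3))    # three initial experiments
y = [cost(x) for x in X]
grid = np.linspace(0.0, 1.0, 201)     # candidate parameters

for _ in range(12):
    Xa, ya = np.array(X), np.array(y)
    K = rbf(Xa, Xa) + 1e-6 * np.eye(len(Xa))
    Kinv = np.linalg.inv(K)
    ks = rbf(grid, Xa)
    mu = ks @ Kinv @ ya               # GP posterior mean on the grid
    var = 1.0 - np.einsum('ij,ij->i', ks @ Kinv, ks)
    sd = np.sqrt(np.maximum(var, 1e-12))
    z = (ya.min() - mu) / sd          # expected improvement (minimization)
    ei = (ya.min() - mu) * norm_cdf(z) + sd * norm_pdf(z)
    x_next = grid[int(np.argmax(ei))]
    X.append(x_next)                  # run the next "experiment"
    y.append(cost(x_next))

print(X[int(np.argmin(y))])           # best tuning parameter found
```

    The entropy-based criterion in the thesis replaces the expected-improvement line with a measure of information gained about the location of the optimum; the surrounding loop is the same.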

    Distributed Control of Electric Vehicle Charging: Privacy, Performance, and Processing Tradeoffs

    Get PDF
    As global climate change concerns, technological advancements, and economic shifts increase the adoption of electric vehicles, it is vital to study how best to integrate these into our existing energy systems. Electric vehicles (EVs) are on track to quickly become a large factor in the energy grid. If left uncoordinated, the charging of EVs will burden the grid by increasing peak demand and overloading transformers. However, with proper charging control strategies, these problems can be mitigated without the need for expensive capital investments. Distributed control methods are a powerful tool to coordinate the charging, but it is important to assess the trade-offs in performance, information privacy, and computational speed across different control strategies. This work presents a comprehensive comparison of four distributed control algorithms in two simulated case studies constrained by dynamic transformer temperature and current limits. The transformer temperature dynamics are inherently nonlinear, and this implementation is contrasted with a piecewise-linear convex relaxation. The more commonly used distributed control methods, Dual Decomposition and the Alternating Direction Method of Multipliers (ADMM), are compared against a relatively new algorithm, the Augmented Lagrangian based Alternating Direction Inexact Newton (ALADIN) method, as well as against a low-information packetized energy management (PEM) control scheme. These algorithms are implemented with a receding horizon in two distinct case studies: a local neighborhood scenario with EVs at each network node and a hub scenario where each node represents a collection of EVs. Finally, these simulation results are compared and analyzed to assess the methods’ performance, privacy, and processing metrics for each case study, as no algorithm is found to be optimal for all applications.
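    The price-coordination mechanism behind dual decomposition can be sketched in a few lines (a static, single-period toy with assumed numbers; the thesis's formulations include transformer dynamics, temperature limits, and receding horizons):

```python
import numpy as np

desired = np.array([6.6, 7.2, 3.3, 6.6, 7.2])  # kW each EV would draw if uncoordinated
limit = 20.0                                    # shared transformer capacity, kW
lam, step = 0.0, 0.05                           # dual price and subgradient step size

for _ in range(2000):
    # Each EV solves its local problem: min_p (p - desired)^2 + lam * p, with p >= 0.
    p = np.maximum(desired - lam / 2.0, 0.0)
    # The coordinator only sees total power and raises/lowers the price accordingly.
    lam = max(lam + step * (p.sum() - limit), 0.0)

print(round(p.sum(), 3))  # ~20.0: aggregate charging is shaved to the transformer limit
```

    The privacy angle is visible even in this toy: the coordinator never sees individual preferences, only aggregate power, which is precisely the kind of trade-off the comparison above quantifies.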

    Summary of research in applied mathematics, numerical analysis, and computer sciences

    Get PDF
    The major categories of current ICASE research programs addressed include: numerical methods, with particular emphasis on the development and analysis of basic numerical algorithms; control and parameter identification problems, with emphasis on effective numerical methods; computational problems in engineering and physical sciences, particularly fluid dynamics, acoustics, and structural analysis; and computer systems and software, especially vector and parallel computers

    Concurrent design and motion planning in robotics using differentiable optimal control

    Get PDF
    Robot design optimization (what the robot is) and motion planning (how the robot moves) are two connected problems. Robots are limited by their design in terms of what motions they can execute: for instance, a robot with a heavy base has less payload capacity than the same robot with a lighter base. On the other hand, the motions that the robot executes guide which design is best for the task. Concurrent design (co-design) is the process of performing robot design and motion planning together. Although traditionally co-design has been viewed as an offline process that can take hours or days, we view interactive co-design tools as the next step, as they enable quick prototyping and evaluation of designs across different tasks and environments. In this thesis we adopt a gradient-based approach to co-design. Our baseline approach embeds motion planning in a bi-level optimization and uses gradient information, obtained via finite differences from the lower motion-planning level, to optimize the design in the upper level. Our approach uses the full rigid-body dynamics of the robot and allows for arbitrary upper-level design constraints, which is key for finding physically realizable designs. Our approach is also between 1.8 and 8.4 times faster on a quadruped trotting and jumping co-design task compared to the popular evolutionary algorithm, the covariance matrix adaptation evolution strategy (CMA-ES). We further demonstrate the speed of our approach by building an interactive co-design tool that allows for optimization over uneven terrain with varying height. Furthermore, we propose an algorithm to compute the derivatives of nonlinear optimal control problems analytically via differential dynamic programming (DDP). Analytical derivatives are a step towards addressing the scalability and accuracy issues of finite differences. We further compare our approach with a simultaneous approach to co-design that optimizes both motion and design in one nonlinear program.
    On a co-design task for the Kinova robotic arm we observed a 54-fold improvement in computational speed. We additionally carry out hardware validation experiments on the quadruped robot Solo. We designed longer lower legs for the robot, which minimize the peak torque used during trotting. Although we always observed an improvement in peak torque, it was less than in simulation (7.609% versus 28.271%). We discuss some of the sim-to-real issues, including the structural stability of joints and slipping of feet, that need to be considered, and how they can be addressed using our framework. In the second part of this thesis we propose solutions to some open problems in motion planning. First, our co-design approach assumed fixed contact locations and timings; ideally, the motion planner would choose the contacts instead. We solve a related but simpler problem: the control of satellite thrusters, which are similar to robot feet but do not need to be in contact with the ground to exert force on the robot. We introduce a sparse L1 cost on the control inputs (thrusters) and optimize via DDP-style solvers. We use full rigid-body dynamics and achieve bang-bang control via optimization, which is a difficult problem due to the discrete switching nature of the thrusters. Lastly, we present a method for planning and control of a hybrid, wheel-legged robot. This is a difficult problem, as the robot must actively balance on the wheel at all times, even when not driving or jumping forward. We propose the variable-length wheeled inverted pendulum (VL-WIP) template model, which captures only the necessary dynamic interactions between wheels and base. We embedded this into a model predictive controller (MPC) and demonstrated highly dynamic behaviors, including swing-up and jumping over a gap.
    Both of these motion planning problems extend our planning tools to new domains, which also benefits the co-design algorithms, since co-design aims to optimize design and motion together.
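    The baseline bi-level idea (design in the upper level, optimal control in the lower level, coupled through finite-difference gradients) can be sketched on a toy system; the model, weights, actuator-size penalty, and bounds below are assumptions for illustration, not the thesis's rigid-body setup:

```python
import numpy as np

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])            # illustrative double-integrator-like plant
Q, R = np.eye(2), np.array([[1.0]])
x0 = np.array([1.0, 0.0])

def motion_cost(d):
    """Lower level: 100-step LQ optimal cost for actuator design d (backward Riccati)."""
    B = np.array([[0.0], [d]])
    P = Q.copy()
    for _ in range(100):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return x0 @ P @ x0

def total_cost(d):
    return motion_cost(d) + 2.0 * d ** 2   # task cost plus an actuator-size penalty

d, eps = 0.5, 1e-4
for _ in range(50):
    # Upper level: finite-difference gradient of the full bi-level objective.
    grad = (total_cost(d + eps) - total_cost(d - eps)) / (2.0 * eps)
    step = 0.02
    while step > 1e-6:                     # backtracking: only accept improving steps
        d_new = min(max(d - step * grad, 0.05), 2.0)
        if total_cost(d_new) < total_cost(d):
            d = d_new
            break
        step *= 0.5

print(d)  # optimized actuator-strength design
```

    The DDP-based analytical derivatives proposed in the thesis replace the finite-difference line, which is exactly where the scalability and accuracy gains come from.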

    Coordinating Dispatch of Distributed Energy Resources with Model Predictive Control and Q-Learning

    Get PDF
    Distributed energy resources such as renewable generators (wind, solar), energy storage, and demand response can be used to complement fossil-fueled generators. The uncertainty and variability due to high penetration of renewable resources make power system operations and controls challenging. This work addresses the coordinated operation of these distributed resources to meet economic, reliability, and environmental objectives. Recent research proposes Model Predictive Control (MPC) to solve the problem. However, MPC may yield poor performance if the terminal penalty function is not chosen correctly. In this work, a parameterized Q-learning algorithm is devised to approximate the optimal terminal penalty function. This approximate penalty function is then used in MPC, thus effectively combining the two techniques. It is argued that this combined approach leads to the best solution in terms of computation and adaptability to a changing environment. Simulation studies demonstrating the efficacy of the proposed methodology for power system dispatch problems are presented. National Science Foundation / CPS-0931416. Department of Energy / DE-OE0000097 and DE-SC0003879. Pacific Northwest National Laboratory.
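    The role of the terminal penalty can be made concrete with a toy linear-quadratic example (an illustrative model, not the paper's dispatch problem): with the exact infinite-horizon Riccati solution as terminal penalty, a short-horizon MPC reproduces the optimal LQR policy, while a zero terminal penalty does not. Q-learning's role in the paper is to approximate that ideal terminal penalty from data.

```python
import numpy as np

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q, R = np.eye(2), np.array([[1.0]])

def riccati_step(P):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return Q + A.T @ P @ (A - B @ K), K

# Infinite-horizon solution: the ideal terminal penalty.
P_inf = Q.copy()
for _ in range(2000):
    P_inf, K_inf = riccati_step(P_inf)

def mpc_first_gain(P_terminal, horizon=3):
    """Backward Riccati over a short horizon; return the gain applied at the first step."""
    P = P_terminal
    for _ in range(horizon):
        P, K = riccati_step(P)
    return K

K_good = mpc_first_gain(P_inf)
K_bad = mpc_first_gain(np.zeros((2, 2)))
print(np.allclose(K_good, K_inf, atol=1e-6))   # True: ideal terminal cost recovers LQR
print(np.allclose(K_bad, K_inf, atol=1e-2))    # False: zero terminal cost distorts the policy
```

    In the paper the terminal penalty is not known in closed form, so a parameterized Q-function learned from operational data stands in for P_inf.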