271 research outputs found

    EduBal: An open balancing robot platform for teaching control and system theory

    Full text link
In this work we present EduBal, an educational open-source hardware and software platform for a balancing robot. The robot is designed to be low-cost, safe, and easy for students to use in control education. Along with the robot we present example tasks from system identification as well as SISO and MIMO control. Using Simulink, students can quickly implement their control algorithms on the robot. Individual control parameters can be tuned online while the resulting behavior is analyzed in live signal plots. At RWTH Aachen University and ETH Zurich, 28 units have so far been built and used in control classes. In first laboratory sessions, students showed high intrinsic motivation and creativity in applying the studied concepts of control theory to the real system. Comment: Accepted for publication at the 21st IFAC World Congress 2020
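    A minimal sketch of the kind of system-identification exercise such a platform supports: fitting a discrete-time first-order model to logged input/output data by least squares. The data, signal names, and model structure below are illustrative assumptions, not EduBal's actual course material.

    import numpy as np

    # Hypothetical logged data from a balancing-robot experiment:
    # u = motor voltage command, y = measured wheel velocity.
    rng = np.random.default_rng(0)
    u = rng.uniform(-1.0, 1.0, 500)
    y = np.zeros(500)
    for k in range(499):                     # "true" plant used only to fake a log
        y[k + 1] = 0.9 * y[k] + 0.2 * u[k] + 0.01 * rng.standard_normal()

    # Least-squares fit of y[k+1] = a*y[k] + b*u[k] (ARX model structure).
    Phi = np.column_stack([y[:-1], u[:-1]])  # regressor matrix
    theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
    a_hat, b_hat = theta
    print(f"identified model: y[k+1] = {a_hat:.3f}*y[k] + {b_hat:.3f}*u[k]")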

    A Survey on Physics Informed Reinforcement Learning: Review and Open Problems

    Full text link
The inclusion of physical information in machine learning frameworks has revolutionized many application areas. This involves enhancing the learning process by incorporating physical constraints and adhering to physical laws. In this work we explore their utility for reinforcement learning applications. We present a thorough review of the literature on incorporating physics information, also known as physics priors, into reinforcement learning approaches, commonly referred to as physics-informed reinforcement learning (PIRL). We introduce a novel taxonomy with the reinforcement learning pipeline as the backbone to classify existing works, compare and contrast them, and derive crucial insights. Existing works are analyzed with regard to the representation/form of the governing physics modeled for integration, their specific contribution to the typical reinforcement learning architecture, and their connection to the underlying reinforcement learning pipeline stages. We also identify core learning architectures and physics incorporation biases (i.e., observational, inductive, and learning) of existing PIRL approaches and use them to further categorize the works for better understanding and adaptation. By providing a comprehensive perspective on the implementation of the physics-informed capability, the taxonomy presents a cohesive approach to PIRL. It identifies the areas where this approach has been applied, as well as the gaps and opportunities that exist. Additionally, the taxonomy sheds light on unresolved issues and challenges, which can guide future research. This nascent field holds great potential for enhancing reinforcement learning algorithms by increasing their physical plausibility, precision, data efficiency, and applicability in real-world scenarios.
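    As one hypothetical instance of the "learning" bias in such a taxonomy, a reward can be shaped with a penalty for violating a known physical invariant. The pendulum energy model, function names, and weighting below are illustrative assumptions, not an example drawn from the survey.

    import numpy as np

    def pendulum_energy(theta: float, omega: float, m=1.0, g=9.81, l=1.0) -> float:
        """Total mechanical energy of an idealized pendulum (illustrative model)."""
        return 0.5 * m * (l * omega) ** 2 + m * g * l * (1.0 - np.cos(theta))

    def shaped_reward(task_reward: float, state, next_state, work_input: float,
                      weight: float = 0.1) -> float:
        """Penalize transitions whose energy change disagrees with the work input.

        A physics prior of the learning-bias flavor: the agent is discouraged
        from exploiting transitions that violate the energy balance.
        """
        dE = pendulum_energy(*next_state) - pendulum_energy(*state)
        violation = abs(dE - work_input)
        return task_reward - weight * violation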

    From Flies to Robots: Inverted Landing in Small Quadcopters with Dynamic Perching

    Full text link
Inverted landing is a routine behavior among a number of animal fliers. However, mastering this feat poses a considerable challenge for robotic fliers, especially when performing dynamic perching with rapid body rotations (or flips) and landing against gravity. Studies of inverted landing in flies suggest that optical-flow sensing is closely linked to the precise triggering and control of body flips that lead to a variety of successful landing behaviors. Building upon this knowledge, we aimed to replicate the flies' landing behaviors in small quadcopters by developing a control policy general to arbitrary ceiling-approach conditions. First, we employed reinforcement learning in simulation to optimize discrete sensory-motor pairs across a broad spectrum of ceiling-approach velocities and directions. Next, we converted the sensory-motor pairs to a two-stage control policy in a continuous augmented-optical-flow space. The control policy consists of a first-stage Flip-Trigger Policy, which employs a one-class support vector machine, and a second-stage Flip-Action Policy, implemented as a feed-forward neural network. To transfer the inverted-landing policy to physical systems, we utilized domain randomization and system identification techniques for a zero-shot sim-to-real transfer. As a result, we successfully achieved a range of robust inverted-landing behaviors in small quadcopters, emulating those observed in flies. Comment: 17 pages, 19 figures, journal paper currently under review
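    A rough sketch of that two-stage architecture, assuming a small hand-rolled feature vector of optical-flow cues; the feature names, synthetic training data, and scikit-learn models here are illustrative stand-ins for the paper's actual pipeline.

    import numpy as np
    from sklearn.svm import OneClassSVM
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(1)

    # Hypothetical training set: rows are augmented optical-flow features at
    # moments where simulated flips succeeded; targets are flip commands.
    X_trigger = rng.normal(size=(200, 3))    # e.g. [expansion rate, flow_x, flow_y]
    y_action = rng.normal(size=(200, 2))     # e.g. [pitch rate, trigger delay]

    # Stage 1: flip-trigger policy as a one-class SVM over successful states.
    trigger = OneClassSVM(kernel="rbf", nu=0.1).fit(X_trigger)

    # Stage 2: flip-action policy as a small feed-forward network.
    action = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000).fit(
        X_trigger, y_action)

    features = rng.normal(size=(1, 3))       # current sensory reading
    if trigger.predict(features)[0] == 1:    # inside the learned trigger region
        cmd = action.predict(features)[0]
        print("flip command:", cmd)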

    Viability in State-Action Space: Connecting Morphology, Control, and Learning

    Get PDF
How can we enable robots to learn control model-free and directly on hardware? Machine learning is taking its place as a standard tool in the roboticist's arsenal. However, there are several open questions on how to learn control for physical systems. This thesis provides two answers to this motivating question. The first is a formal means to quantify the inherent robustness of a given system design, prior to designing the controller or learning agent. This emphasizes the need to consider both the hardware and software design of a robot, which are inseparably intertwined in the system dynamics. The second is the formalization of a safety measure, which can be learned model-free. Intuitively, this measure indicates how easily a robot can avoid failure, and enables robots to explore unknown environments while avoiding failures. The main contributions of this dissertation are based on viability theory. Viability theory provides a slightly unconventional view of dynamical systems: instead of focusing on a system's convergence properties towards equilibria, the focus is shifted towards sets of failure states and the system's ability to avoid these sets. This view is particularly well suited to studying learning control in robots, since stability in the sense of convergence can rarely be guaranteed during the learning process. The notion of viability is formally extended to state-action space, with viable sets of state-action pairs. A measure defined over these sets allows a quantified evaluation of robustness, valid for the family of all failure-avoiding control policies, and also paves the way for enabling safe model-free learning. The thesis also includes two minor contributions. The first minor contribution is an empirical demonstration of shaping by exclusively modifying the system dynamics. This demonstration highlights the importance of robustness to failures for learning control: not only can failures cause damage, but they typically do not provide useful gradient information for the learning process. The second minor contribution is a study on the choice of state initializations. Counter to intuition and common practice, this study shows it can be more reliable to occasionally initialize the system from a state that is known to be uncontrollable.
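    A plausible formalization of the central objects, written in assumed discrete-time notation $s_{k+1} = f(s_k, a_k)$; the symbols are chosen for illustration and need not match the thesis's exact notation.

    % Failure set, viable set of state-action pairs, and safety measure
    \begin{align*}
      \mathcal{X}_F &\subset \mathcal{S}
        && \text{set of failure states to be avoided} \\
      \mathcal{Q}_V &= \bigl\{(s,a) \in \mathcal{S}\times\mathcal{A} \;:\;
        \exists\, (a_1, a_2, \dots)\ \text{s.t.}\ s_k \notin \mathcal{X}_F
        \ \forall k \ge 0,\ s_0 = s,\ a_0 = a\bigr\} \\
      \Lambda(s,a) &= \mu\bigl(\{a' \in \mathcal{A} :
        (f(s,a), a') \in \mathcal{Q}_V\}\bigr)
        && \text{safety measure: mass of viable actions after } (s,a)
    \end{align*}

    Intuitively, $\Lambda(s,a)$ is large when many actions keep the system viable after taking $a$ in $s$, which is what makes a quantity of this kind estimable from failure/no-failure experience alone.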

    Lab experiences for teaching undergraduate dynamics

    Get PDF
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2003. Includes bibliographical references (p. 443-466). This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. This thesis describes several projects developed to teach undergraduate dynamics and controls. The materials were developed primarily for the class 2.003 Modeling Dynamics and Control I. These include (1) a set of ActivLab modular experiments that illustrate the dynamics of linear time-invariant (LTI) systems and (2) a two-wheeled mobile inverted pendulum. The ActivLab equipment has been designed as shareware, and plans for it are available on the web. The inverted pendulum robot developed here is largely inspired by the iBOT and Segway transportation devices invented by Dean Kamen. by Katherine A. Lilienkamp. S.M.

    OBSERVER-BASED-CONTROLLER FOR INVERTED PENDULUM MODEL

    Get PDF
This paper presents a state-space control technique for an inverted pendulum system. The system is a common classical control problem that has been widely used to test multiple control algorithms because of its nonlinear and unstable behavior. Full state feedback based on pole placement and optimal control is applied to the inverted pendulum system to achieve the desired design specifications: a settling time of 4 seconds and 5% overshoot. The simulation and optimization of the full state feedback controller based on pole placement and optimal control techniques, as well as the performance comparison between these techniques, is described comprehensively. The comparison is made to choose the technique most suitable for the system, i.e. the one with the best trade-off between settling time and overshoot. Besides that, the observer design is analyzed to see the effect of pole location and of noise present in the system.
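    As a worked sketch of translating those specifications into pole locations, the snippet below uses the standard second-order formulas; the 4-state plant matrices are placeholders for illustration, not the paper's model.

    import numpy as np
    from scipy.signal import place_poles

    # Map 5% overshoot and a 4 s (2%) settling time to dominant poles.
    OS, Ts = 0.05, 4.0
    zeta = -np.log(OS) / np.sqrt(np.pi**2 + np.log(OS)**2)  # damping ratio ~0.69
    wn = 4.0 / (zeta * Ts)                                  # natural freq ~1.45 rad/s
    dominant = -zeta * wn + 1j * wn * np.sqrt(1 - zeta**2)
    poles = [dominant, dominant.conjugate(), -10.0, -11.0]  # extra fast real poles

    # Placeholder linearized pendulum-style plant (NOT the paper's model).
    A = np.array([[0, 1, 0, 0],
                  [0, 0, -1.0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 11.0, 0]], dtype=float)
    B = np.array([[0], [1.0], [0], [-1.0]])

    K = place_poles(A, B, poles).gain_matrix
    print("state-feedback gain K =", K)

    The two extra real poles sit well to the left of the dominant pair so that the second-order approximation behind the settling-time and overshoot formulas still holds for the closed loop.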

    A Review of Resonant Converter Control Techniques and The Performances

    Get PDF
This paper first discusses each control technique and then gives experimental results and/or performance figures to highlight their merits. The resonant converter used as a case study is not restricted to a single topology; instead, several topologies are used, such as the series-parallel resonant converter (SPRC), the LCC resonant converter, and the parallel resonant converter (PRC). The control techniques presented in this paper are self-sustained phase shift modulation (SSPSM) control, self-oscillating power factor control, magnetic control, and the H-∞ robust control technique.

    State-Feedback Controller Based on Pole Placement Technique for Inverted Pendulum System

    Get PDF
This paper presents a state-space control technique for an inverted pendulum system, using both simulation and real experiments via MATLAB/Simulink. The inverted pendulum is a difficult system to control and one of the most important classical control problems because of its nonlinear characteristics and unstable dynamics. It exhibits three properties that frequently complicate control applications: it is nonlinear, unstable, and non-minimum-phase. This project applies a state-feedback controller based on the pole placement technique, which is capable of stabilizing the practical inverted pendulum in the vertical position. The desired design specifications, a settling time of 4 seconds and 5% overshoot, are imposed on the full state-feedback controller. First, the mathematical model of the inverted pendulum system is derived to obtain a state-space representation. The state-feedback controller is then designed after the nonlinear equations are linearized, with the aid of mathematical software such as Mathcad. After that, the design is simulated using MATLAB/Simulink. The controller design is verified in both simulation and experimental tests. Finally, the controller design is compared with a PID controller for benchmarking purposes.
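    A minimal sketch of the linearize-then-place workflow the abstract describes, using a textbook cart-pole model; the parameter values and pole choices are illustrative assumptions, not the paper's.

    import numpy as np
    from scipy.signal import place_poles
    from scipy.integrate import solve_ivp

    # Textbook cart-pole linearized about the upright equilibrium.
    # States: [cart pos, cart vel, pole angle, pole ang. vel]; input: cart force.
    M, m, l, g = 0.5, 0.2, 0.3, 9.81           # illustrative parameters
    A = np.array([[0, 1, 0, 0],
                  [0, 0, -m * g / M, 0],
                  [0, 0, 0, 1],
                  [0, 0, (M + m) * g / (M * l), 0]])
    B = np.array([[0], [1 / M], [0], [-1 / (M * l)]])

    # Place closed-loop poles (dominant pair chosen for a fast, damped response).
    K = place_poles(A, B, [-2 + 2j, -2 - 2j, -8, -9]).gain_matrix

    def closed_loop(t, x):
        u = -K @ x                             # full state feedback u = -Kx
        return (A @ x + B @ u).ravel()

    sol = solve_ivp(closed_loop, (0, 5), [0, 0, 0.1, 0], max_step=0.01)
    print("final state:", sol.y[:, -1])        # should approach the origin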

    Learning Control Policies for Fall Prevention and Safety in Bipedal Locomotion

    Get PDF
The ability to recover from an unexpected external perturbation is a fundamental motor skill in bipedal locomotion. An effective response includes the ability not just to recover balance and maintain stability but also to fall in a safe manner when balance recovery is physically infeasible. For robots associated with bipedal locomotion, such as humanoid robots and assistive robotic devices that aid humans in walking, designing controllers that provide this stability and safety can prevent damage to the robot and avoid injury-related medical costs. This is a challenging task because it involves generating highly dynamic motion for a high-dimensional, nonlinear, and under-actuated system with contacts. Despite prior advancements in model-based and optimization methods, challenges such as the need for extensive domain knowledge, relatively long computation times, and limited robustness to changes in dynamics still make this an open problem. In this thesis, to address these issues, we develop learning-based algorithms capable of synthesizing push-recovery control policies for two different kinds of robots: humanoid robots and assistive robotic devices that assist in bipedal locomotion. Our work branches into two closely related directions: (1) learning safe falling and fall-prevention strategies for humanoid robots, and (2) learning fall-prevention strategies for humans using robotic assistive devices. To achieve this, we introduce a set of Deep Reinforcement Learning (DRL) algorithms to learn control policies that improve safety while using these robots. To enable efficient learning, we present techniques that incorporate abstract dynamical models, curriculum learning, and a novel method of building a graph of policies into the learning framework. We also propose an approach to create virtual human walking agents which exhibit gait characteristics similar to those of real-world human subjects, with which we learn an assistive-device controller that helps the virtual human return to steady-state walking after an external push is applied. Finally, we extend our work on assistive devices and address the challenge of transferring a push-recovery policy to different individuals. As walking and recovery characteristics differ significantly between individuals, exoskeleton policies have to be fine-tuned for each person, which is a tedious, time-consuming, and potentially unsafe process. We propose to solve this by posing it as a transfer learning problem, where a policy trained for one individual can adapt to another without fine-tuning. Ph.D.
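    An illustrative sketch of the curriculum idea mentioned above; the schedule, class name, and thresholds are assumptions for illustration, not the thesis's actual method. The curriculum widens the range of sampled push magnitudes as the policy's recent success rate improves.

    import random

    class PushCurriculum:
        """Grow the sampled push magnitude as the policy becomes more reliable.

        Hypothetical schedule: widen the force range by 20% whenever the
        success rate over the last `window` episodes exceeds `threshold`.
        """
        def __init__(self, f_max=50.0, grow=1.2, threshold=0.8, window=100):
            self.f_hi, self.f_cap = 10.0, f_max
            self.grow, self.threshold = grow, threshold
            self.window = window
            self.results = []

        def sample_push(self):
            # Random horizontal force (N) applied to the pelvis in training.
            return random.uniform(0.0, self.f_hi)

        def report(self, success: bool):
            self.results.append(success)
            recent = self.results[-self.window:]
            if len(recent) == self.window and sum(recent) / self.window > self.threshold:
                self.f_hi = min(self.f_hi * self.grow, self.f_cap)
                self.results.clear()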