271 research outputs found
EduBal: An open balancing robot platform for teaching control and system theory
In this work we present EduBal, an educational open-source hardware and
software platform for a balancing robot. The robot is designed to be low-cost,
safe and easy to use by students for control education. Along with the robot we
present example tasks from system identification as well as SISO and MIMO
control. Using Simulink, students can quickly implement their control
algorithms on the robot. Individual control parameters can be tuned online
while analyzing the resulting behavior in live signal plots. At RWTH Aachen
University and ETH Zurich, 28 units have so far been built and used in control
classes. In first laboratory sessions, students have shown high intrinsic
motivation and creativity in applying the studied concepts of control theory to
the real system.
Comment: Accepted for publication at the 21st IFAC World Congress 2020
A Survey on Physics Informed Reinforcement Learning: Review and Open Problems
The inclusion of physical information in machine learning frameworks has
revolutionized many application areas. This involves enhancing the learning
process by incorporating physical constraints and adhering to physical laws. In
this work we explore their utility for reinforcement learning applications. We
present a thorough review of the literature on incorporating physics
information, also known as physics priors, in reinforcement learning approaches,
commonly referred to as physics-informed reinforcement learning (PIRL). We
introduce a novel taxonomy with the reinforcement learning pipeline as the
backbone to classify existing works, compare and contrast them, and derive
crucial insights. Existing works are analyzed with regard to the
representation/form of the governing physics modeled for integration, their
specific contribution to the typical reinforcement learning architecture, and
their connection to the underlying reinforcement learning pipeline stages. We
also identify core learning architectures and physics incorporation biases
(i.e., observational, inductive and learning) of existing PIRL approaches and
use them to further categorize the works for better understanding and
adaptation. By providing a comprehensive perspective on the implementation of
the physics-informed capability, the taxonomy presents a cohesive approach to
PIRL. It identifies the areas where this approach has been applied, as well as
the gaps and opportunities that exist. Additionally, the taxonomy sheds light
on unresolved issues and challenges, which can guide future research. This
nascent field holds great potential for enhancing reinforcement learning
algorithms by increasing their physical plausibility, precision, data
efficiency, and applicability in real-world scenarios.
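One of the physics-incorporation biases the survey names is the "learning" bias, where a known physical law is folded into the training signal itself. A minimal sketch of this idea, under illustrative assumptions (an undamped unit pendulum and a hypothetical reward-shaping term; these details are not from the survey):

```python
import numpy as np

def physics_shaped_reward(base_reward, state, next_state, penalty_weight=1.0):
    """Penalize transitions that violate a known physics constraint --
    here, energy conservation of an undamped pendulum with m = l = g = 1.
    The constraint and weighting are illustrative assumptions."""
    def energy(s):
        theta, omega = s
        return 0.5 * omega**2 + (1.0 - np.cos(theta))  # kinetic + potential

    violation = abs(energy(next_state) - energy(state))
    return base_reward - penalty_weight * violation

# A transition that conserves energy is left unpenalized...
r_same = physics_shaped_reward(1.0, (0.0, 1.0), (0.0, 1.0))  # -> 1.0
# ...while a physically implausible jump in energy is discouraged.
r_jump = physics_shaped_reward(1.0, (0.0, 1.0), (0.0, 2.0))
```

Any standard RL algorithm can then be trained on the shaped reward unchanged, which is what makes this bias easy to retrofit onto an existing pipeline.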
From Flies to Robots: Inverted Landing in Small Quadcopters with Dynamic Perching
Inverted landing is a routine behavior among a number of animal fliers.
However, mastering this feat poses a considerable challenge for robotic fliers,
especially when it requires dynamic perching with rapid body rotations (or flips)
and landing against gravity. Studies of inverted landing in flies suggest that
optical-flow sensing is closely linked to the precise triggering and control of the
body flips that lead to a variety of successful landing behaviors. Building upon
this knowledge, we aimed to replicate the flies' landing behaviors in small
quadcopters by developing a control policy that generalizes to arbitrary
ceiling-approach conditions. First, we employed reinforcement learning in
simulation to optimize discrete sensory-motor pairs across a broad spectrum of
ceiling-approach velocities and directions. Next, we converted the
sensory-motor pairs to a two-stage control policy in a continuous
augmented-optical flow space. The control policy consists of a first-stage
Flip-Trigger Policy, which employs a one-class support vector machine, and a
second-stage Flip-Action Policy, implemented as a feed-forward neural network.
To transfer the inverted-landing policy to physical systems, we utilized domain
randomization and system identification techniques for a zero-shot sim-to-real
transfer. As a result, we successfully achieved a range of robust
inverted-landing behaviors in small quadcopters, emulating those observed in
flies.
Comment: 17 pages, 19 figures; journal paper currently under review
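The two-stage structure described above can be sketched with off-the-shelf components: a one-class SVM fit only on states where a flip succeeded, and a small feed-forward network regressing from the triggering state to the flip command. The data, state dimensions, and action dimensions below are synthetic stand-ins, not the paper's:

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic stand-ins for augmented optical-flow states at which simulated
# flips succeeded, and the motor commands that were executed there.
flip_states = rng.normal(loc=[0.2, 0.0, 0.5], scale=0.05, size=(200, 3))
flip_actions = flip_states @ rng.normal(size=(3, 2))  # fake 2-D flip commands

# Stage 1: flip-trigger policy -- a one-class SVM trained only on successful
# triggering states; predict() == +1 means "trigger the flip now".
trigger = OneClassSVM(nu=0.1, gamma="scale").fit(flip_states)

# Stage 2: flip-action policy -- a feed-forward network mapping the
# triggering state to the flip maneuver parameters.
action = MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=2000,
                      random_state=0).fit(flip_states, flip_actions)

# At runtime, stage 2 is only consulted when stage 1 fires.
state = flip_states[0]
if trigger.predict([state])[0] == 1:
    cmd = action.predict([state])[0]
```

Splitting "when to flip" from "how to flip" lets the trigger be trained as a boundary around successful states, which matches the one-class formulation.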
Viability in State-Action Space: Connecting Morphology, Control, and Learning
How can we enable robots to learn control model-free and directly on hardware? Machine learning is taking its place as a standard tool in the roboticist's arsenal. However, there are several open questions on how to learn control for physical systems. This thesis provides two answers to this motivating question. The first is a formal means to quantify the inherent robustness of a given system design, prior to designing the controller or learning agent. This emphasizes the need to consider both the hardware and software design of a robot, which are inseparably intertwined in the system dynamics. The second is the formalization of a safety measure, which can be learned model-free. Intuitively, this measure indicates how easily a robot can avoid failure, and it enables robots to explore unknown environments while avoiding failures. The main contributions of this dissertation are based on viability theory. Viability theory provides a slightly unconventional view of dynamical systems: instead of focusing on a system's convergence properties towards equilibria, the focus is shifted towards sets of failure states and the system's ability to avoid these sets.
This view is particularly well suited to studying learning control in robots, since stability in the sense of convergence can rarely be guaranteed during the learning process. The notion of viability is formally extended to state-action space, with viable sets of state-action pairs. A measure defined over these sets allows a quantified evaluation of robustness valid for the family of all failure-avoiding control policies, and also paves the way for safe model-free learning. The thesis also includes two minor contributions. The first is an empirical demonstration of shaping by exclusively modifying the system dynamics. This demonstration highlights the importance of robustness to failures for learning control: not only can failures cause damage, but they typically do not provide useful gradient information for the learning process. The second is a study on the choice of state initializations. Counter to intuition and common practice, this study shows it can be more reliable to occasionally initialize the system from a state that is known to be uncontrollable.
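The fixed-point structure of viable sets of state-action pairs can be illustrated on a toy discretization. The sketch below assumes a hypothetical 1-D grid system (the thesis treats real continuous dynamics); a pair (x, a) is viable iff taking a in x lands in a state that still has at least one viable action:

```python
import numpy as np

# Toy system: x_next = x + a with x in {0,...,9}, a in {-1, 0, +1},
# and "failure" = leaving the grid. Purely illustrative assumptions.
states = np.arange(10)
actions = np.array([-1, 0, 1])

def step(x, a):
    return x + a

# Q_V[i, j] is True iff (states[i], actions[j]) is a viable state-action pair.
Q_V = np.ones((len(states), len(actions)), dtype=bool)
for _ in range(20):  # iterate the viability condition to its fixed point
    viable_state = Q_V.any(axis=1)  # a state is viable if some action is
    new_Q = np.zeros_like(Q_V)
    for i, x in enumerate(states):
        for j, a in enumerate(actions):
            xn = step(x, a)
            new_Q[i, j] = (0 <= xn < len(states)) and viable_state[xn]
    if (new_Q == Q_V).all():
        break
    Q_V = new_Q

# A simple safety measure over the viable set: the fraction of viable
# actions in each state -- boundary states score lower than interior ones.
measure = Q_V.mean(axis=1)
```

Here the measure is lowest at the grid edges, where fewer failure-avoiding actions remain, which mirrors the intuition that the measure indicates how easily failure can be avoided.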
Lab experiences for teaching undergraduate dynamics
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2003. Includes bibliographical references (p. 443-466). This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
This thesis describes several projects developed to teach undergraduate dynamics and controls. The materials were developed primarily for the class 2.003 Modeling Dynamics and Control I. They include (1) a set of ActivLab modular experiments that illustrate the dynamics of linear time-invariant (LTI) systems and (2) a two-wheeled mobile inverted pendulum. The ActivLab equipment has been designed as shareware, and plans for it are available on the web. The inverted pendulum robot developed here is largely inspired by the iBOT and Segway transportation devices invented by Dean Kamen.
by Katherine A. Lilienkamp. S.M
OBSERVER-BASED-CONTROLLER FOR INVERTED PENDULUM MODEL
This paper presents a state-space control technique for the inverted pendulum system. The system is a classical control problem that has been widely used to test control algorithms because of its nonlinear and unstable behavior. Full state feedback based on pole placement and optimal control is applied to the inverted pendulum system to achieve the desired design specifications of a 4-second settling time and 5% overshoot. The simulation and optimization of the full state feedback controller based on pole placement and optimal control, as well as a performance comparison between the two techniques, are described comprehensively. The comparison is made to choose the technique with the best trade-off between settling time and overshoot. In addition, the observer design is analyzed to examine the effect of pole location and of noise present in the system
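The three ingredients the abstract names (pole-placement state feedback, optimal control, and an observer) can be sketched with SciPy on a generic linearized pendulum. The plant matrices and pole choices below are illustrative assumptions, not the paper's model:

```python
import numpy as np
from scipy.signal import place_poles
from scipy.linalg import solve_continuous_are

# Assumed plant: a pendulum linearized about the upright equilibrium with
# g/l = 9.81 and unit control effectiveness (not the paper's exact model).
A = np.array([[0.0, 1.0],
              [9.81, 0.0]])   # open-loop unstable: eigenvalues +/- sqrt(9.81)
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])    # only the angle is measured

# Full state feedback by pole placement.
K = place_poles(A, B, [-2.0, -3.0]).gain_matrix

# The same problem via optimal control (LQR): solve the Riccati equation.
Q, R = np.eye(2), np.array([[1.0]])
P = solve_continuous_are(A, B, Q, R)
K_lqr = np.linalg.solve(R, B.T @ P)

# Luenberger observer for the unmeasured rate: place the estimator poles
# faster than the controller's, using the duality (A, C) <-> (A^T, C^T).
L = place_poles(A.T, C.T, [-8.0, -9.0]).gain_matrix.T

# Closed-loop eigenvalues confirm the placement.
eigs = np.linalg.eigvals(A - B @ K)
```

Placing the observer poles well to the left of the controller poles is what lets the estimated state stand in for the true state without degrading the transient, which is the "effect of pole location" the paper analyzes.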
A Review of Resonant Converter Control Techniques and The Performances
This paper first discusses each control technique and then gives experimental results and/or performance data to highlight their merits. The resonant converter used as a case study is not restricted to a single topology; instead, several topologies are used, including the series-parallel resonant converter (SPRC), the LCC resonant converter, and the parallel resonant converter (PRC). The control techniques presented in this paper are self-sustained phase shift modulation (SSPSM) control, self-oscillating power factor control, magnetic control, and the H-∞ robust control technique
State-Feedback Controller Based on Pole Placement Technique for Inverted Pendulum System
This paper presents a state-space control technique for an inverted pendulum system using simulation and real experiments via MATLAB/Simulink. The inverted pendulum is a difficult system to control in the field of control engineering, and one of the most important classical control problems because of its nonlinear characteristics and instability. It exhibits three difficulties that often appear in control applications: it is nonlinear, unstable, and a non-minimum-phase system. This project applies a state feedback controller based on the pole placement technique, which is capable of stabilizing the practical inverted pendulum in the vertical position. The desired design specifications, a 4-second settling time and 5% overshoot, are applied to the full state feedback controller based on pole placement. First, the mathematical model of the inverted pendulum system is derived to obtain a state-space representation of the system. Then, the State-Feedback Controller is designed after the nonlinear equations are linearized with the aid of mathematical software such as Mathcad. After that, the design is simulated using MATLAB/Simulink. The controller design for the inverted pendulum system is verified in both simulation and experiment. Finally, the controller design is compared with a PID controller for benchmarking purposes
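The 4-second settling time and 5% overshoot specifications quoted above translate into dominant second-order pole locations via the standard formulas; a design sketch (the formulas are textbook-standard, the resulting numbers are not claimed to be the paper's):

```python
import numpy as np

overshoot = 0.05   # 5 % peak overshoot
t_s = 4.0          # settling time, 2 % criterion: t_s ~= 4 / (zeta * wn)

# Damping ratio from percent overshoot, natural frequency from settling time.
log_os = np.log(overshoot)
zeta = -log_os / np.sqrt(np.pi**2 + log_os**2)   # ~ 0.69
wn = 4.0 / (zeta * t_s)

# Desired dominant closed-loop poles: -zeta*wn +/- j*wn*sqrt(1 - zeta^2).
poles = [complex(-zeta * wn,  wn * np.sqrt(1 - zeta**2)),
         complex(-zeta * wn, -wn * np.sqrt(1 - zeta**2))]
```

These dominant poles (real part exactly -1.0 for this spec pair) would then be handed to a pole-placement routine, with any remaining closed-loop poles placed well to their left so the second-order approximation holds.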
Learning Control Policies for Fall Prevention and Safety in Bipedal Locomotion
The ability to recover from an unexpected external perturbation is a fundamental motor skill in bipedal locomotion. An effective response includes the ability not just to recover balance and maintain stability but also to fall in a safe manner when balance recovery is physically infeasible. For robots associated with bipedal locomotion, such as humanoid robots and assistive robotic devices that aid humans in walking, designing controllers which can provide this stability and safety can prevent damage to robots or injury-related medical costs. This is a challenging task because it involves generating highly dynamic motion for a high-dimensional, non-linear and under-actuated system with contacts. Despite prior advancements in model-based and optimization methods, challenges such as the requirement of extensive domain knowledge, relatively long computation times, and limited robustness to changes in dynamics still make this an open problem.
In this thesis, to address these issues we develop learning-based algorithms capable of synthesizing push-recovery control policies for two different kinds of robots: humanoid robots and assistive robotic devices that assist in bipedal locomotion. Our work can be branched into two closely related directions: 1) learning safe falling and fall prevention strategies for humanoid robots, and 2) learning fall prevention strategies for humans using robotic assistive devices. To achieve this, we introduce a set of Deep Reinforcement Learning (DRL) algorithms to learn control policies that improve safety while using these robots. To enable efficient learning, we present techniques to incorporate abstract dynamical models, curriculum learning, and a novel method of building a graph of policies into the learning framework. We also propose an approach to create virtual human walking agents which exhibit gait characteristics similar to real-world human subjects, with which we learn an assistive device controller that helps a virtual human return to steady-state walking after an external push is applied.
Finally, we extend our work on assistive devices and address the challenge of transferring a push-recovery policy to different individuals. As walking and recovery characteristics differ significantly between individuals, exoskeleton policies have to be fine-tuned for each person, which is a tedious, time-consuming, and potentially unsafe process. We propose to solve this by posing it as a transfer learning problem, where a policy trained for one individual can adapt to another without fine-tuning.
Ph.D