1,433 research outputs found

    Shimyureta to jikki o mochiita haiburiddo-gata kikai gakushuho ni kansuru kenkyu

    Get PDF
    制度:新 ; 報告番号:甲2816号 ; 学位の種類:博士(工学) ; 授与年月日:2009/2/25 ; 早大学位記番号:新503

    Efficient Reinforcement Learning for Motor Control

    No full text
    Abstract — Artificial learners often require many more trials than humans or animals when learning motor control tasks in the absence of expert knowledge. We implement two key ingredients of biological learning systems, generalization and incorporation of uncertainty into the decision-making process, to speed up artificial learning. We present a coherent and fully Bayesian framework that allows for efficient artificial learning in the absence of expert knowledge. The success of our learning framework is demonstrated on challenging nonlinear control problems in simulation and in hardware. I

    Resolved Motion Control for 3D Underactuated Bipedal Walking using Linear Inverted Pendulum Dynamics and Neural Adaptation

    Full text link
    We present a framework to generate periodic trajectory references for a 3D under-actuated bipedal robot, using a linear inverted pendulum (LIP) based controller with adaptive neural regulation. We use the LIP template model to estimate the robot's center of mass (CoM) position and velocity at the end of the current step, and formulate a discrete controller that determines the next footstep location to achieve a desired walking profile. This controller is equipped on the frontal plane with a Neural-Network-based adaptive term that reduces the model mismatch between the template and physical robot that particularly affects the lateral motion. Then, the foot placement location computed for the LIP model is used to generate task space trajectories (CoM and swing foot trajectories) for the actual robot to realize stable walking. We use a fast, real-time QP-based inverse kinematics algorithm that produces joint references from the task space trajectories, which makes the formulation independent of the knowledge of the robot dynamics. Finally, we implemented and evaluated the proposed approach in simulation and hardware experiments with a Digit robot obtaining stable periodic locomotion for both cases.Comment: 7 pages, to appear in IROS 202

    ROS Based High Performance Control Architecture for an Aerial Robotic Testbed

    Get PDF
    The purpose of this thesis is to show the development of an aerial testbed based on the Robot Operating System (ROS). Such a testbed provides flexibility to control heterogenous vehicles, since the robots are able to simply communication with each other on the High Level (HL) control side. ROS runs on an embedded computer on-board each quadrotor. This eliminates the need of a Ground Base Station, since the complete HL control runs on-board the Unmanned Aerial Vehicle (UAV). The architecture of the system is explained throughout the thesis with detailed explanations of the specific hardware and software used for the system. The implementation on two different quadrotor models is documented and shows that even though they have different components, they can be controlled similarly by the framework. The user is able to control every unit of the testbed with position, velocity and/or acceleration data. To show this independency, control architectures are shown and implemented. Extensive tests verify their effectiveness. The flexibility of the proposed aerial testbed is demonstrated by implementing several applications that require high-performance control. Additionally, a framework for a flying inverted pendulum on a quadrotor using robust hybrid control is presented. The goal is to have a universal controller which is able to swing-up and balance an off-centered pendulum that is attached to the UAV linearly and rotationally. The complete dynamic model is derived and a control strategy is presented. The performance of the controller is demonstrated using realistic simulation studies. The realization in the testbed is documented with modifications that were made to the quadrotor to attach the pendulum. First flight tests are conducted and are presented. The possibilities of using a ROS based framework is shown at every step. It has many advantages for implementation purposes, especially in a heterogeneous robotic environment with many agents. Real-time data of the robot is provided by ROS topics and can be used at any point in the system. The control architecture has been validated and verified with different practical tests, which also allowed improving the system by tuning the specific control parameters

    Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing

    Full text link
    Within the context of autonomous driving a model-based reinforcement learning algorithm is proposed for the design of neural network-parameterized controllers. Classical model-based control methods, which include sampling- and lattice-based algorithms and model predictive control, suffer from the trade-off between model complexity and computational burden required for the online solution of expensive optimization or search problems at every short sampling time. To circumvent this trade-off, a 2-step procedure is motivated: first learning of a controller during offline training based on an arbitrarily complicated mathematical system model, before online fast feedforward evaluation of the trained controller. The contribution of this paper is the proposition of a simple gradient-free and model-based algorithm for deep reinforcement learning using task separation with hill climbing (TSHC). In particular, (i) simultaneous training on separate deterministic tasks with the purpose of encoding many motion primitives in a neural network, and (ii) the employment of maximally sparse rewards in combination with virtual velocity constraints (VVCs) in setpoint proximity are advocated.Comment: 10 pages, 6 figures, 1 tabl

    Stair Climbing using the Angular Momentum Linear Inverted Pendulum Model and Model Predictive Control

    Full text link
    A new control paradigm using angular momentum and foot placement as state variables in the linear inverted pendulum model has expanded the realm of possibilities for the control of bipedal robots. This new paradigm, known as the ALIP model, has shown effectiveness in cases where a robot's center of mass height can be assumed to be constant or near constant as well as in cases where there are no non-kinematic restrictions on foot placement. Walking up and down stairs violates both of these assumptions, where center of mass height varies significantly within a step and the geometry of the stairs restrict the effectiveness of foot placement. In this paper, we explore a variation of the ALIP model that allows the length of the virtual pendulum formed by the robot's stance foot and center of mass to follow smooth trajectories during a step. We couple this model with a control strategy constructed from a novel combination of virtual constraint-based control and a model predictive control algorithm to stabilize a stair climbing gait that does not soley rely on foot placement. Simulations on a 20-degree of freedom model of the Cassie biped in the SimMechanics simulation environment show that the controller is able to achieve periodic gait

    Nonlinear Model Predictive Control for Motion Generation of Humanoids

    Get PDF
    Das Ziel dieser Arbeit ist die Untersuchung und Entwicklung numerischer Methoden zur Bewegungserzeugung von humanoiden Robotern basierend auf nichtlinearer modell-prädiktiver Regelung. Ausgehend von der Modellierung der Humanoiden als komplexe Mehrkörpermodelle, die sowohl durch unilaterale Kontaktbedingungen beschränkt als auch durch die Formulierung unteraktuiert sind, wird die Bewegungserzeugung als Optimalsteuerungsproblem formuliert. In dieser Arbeit werden numerische Erweiterungen basierend auf den Prinzipien der Automatischen Differentiation für rekursive Algorithmen, die eine effiziente Auswertung der dynamischen Größen der oben genannten Mehrkörperformulierung erlauben, hergeleitet, sodass sowohl die nominellen Größen als auch deren ersten Ableitungen effizient ausgewertet werden können. Basierend auf diesen Ideen werden Erweiterungen für die Auswertung der Kontaktdynamik und der Berechnung des Kontaktimpulses vorgeschlagen. Die Echtzeitfähigkeit der Berechnung von Regelantworten hängt stark von der Komplexität der für die Bewegungerzeugung gewählten Mehrkörperformulierung und der zur Verfügung stehenden Rechenleistung ab. Um einen optimalen Trade-Off zu ermöglichen, untersucht diese Arbeit einerseits die mögliche Reduktion der Mehrkörperdynamik und andererseits werden maßgeschneiderte numerische Methoden entwickelt, um die Echtzeitfähigkeit der Regelung zu realisieren. Im Rahmen dieser Arbeit werden hierfür zwei reduzierte Modelle hergeleitet: eine nichtlineare Erweiterung des linearen inversen Pendelmodells sowie eine reduzierte Modellvariante basierend auf der centroidalen Mehrkörperdynamik. Ferner wird ein Regelaufbau zur GanzkörperBewegungserzeugung vorgestellt, deren Hauptbestandteil jeweils aus einem speziell diskretisierten Problem der nichtlinearen modell-prädiktiven Regelung sowie einer maßgeschneiderter Optimierungsmethode besteht. Die Echtzeitfähigkeit des Ansatzes wird durch Experimente mit den Robotern HRP-2 und HeiCub verifiziert. Diese Arbeit schlägt eine Methode der nichtlinear modell-prädiktiven Regelung vor, die trotz der Komplexität der vollen Mehrkörperformulierung eine Berechnung der Regelungsantwort in Echtzeit ermöglicht. Dies wird durch die geschickte Kombination von linearer und nichtlinearer modell-prädiktiver Regelung auf der aktuellen beziehungsweise der letzten Linearisierung des Problems in einer parallelen Regelstrategie realisiert. Experimente mit dem humanoiden Roboter Leo zeigen, dass, im Vergleich zur nominellen Strategie, erst durch den Einsatz dieser Methode eine Bewegungserzeugung auf dem Roboter möglich ist. Neben Methoden der modell-basierten Optimalsteuerung werden auch modell-freie Methoden des verstärkenden Lernens (Reinforcement Learning) für die Bewegungserzeugung untersucht, mit dem Fokus auf den schwierig zu modellierenden Modellunsicherheiten der Roboter. Im Rahmen dieser Arbeit werden eine allgemeine vergleichende Studie sowie Leistungskennzahlen entwickelt, die es erlauben, modell-basierte und -freie Methoden quantitativ bezüglich ihres Lösungsverhaltens zu vergleichen. Die Anwendung der Studie auf ein akademisches Beispiel zeigt Unterschiede und Kompromisse sowie Break-Even-Punkte zwischen den Problemformulierungen. Diese Arbeit schlägt basierend auf dieser Grundlage zwei mögliche Kombinationen vor, deren Eigenschaften bewiesen und in Simulation untersucht werden. Außerdem wird die besser abschneidende Variante auf dem humanoiden Roboter Leo implementiert und mit einem nominellen modell-basierten Regler verglichen
    corecore