566 research outputs found

    Continuous-Time Reinforcement Learning: New Design Algorithms with Theoretical Insights and Performance Guarantees

    Continuous-time nonlinear optimal control problems hold great promise in real-world applications. After decades of development, reinforcement learning (RL) has achieved some of the greatest successes as a general nonlinear control design method. However, a recent comprehensive analysis of state-of-the-art continuous-time RL (CT-RL) methods, namely, adaptive dynamic programming (ADP)-based CT-RL algorithms, reveals they face significant design challenges due to their complexity, numerical conditioning, and dimensional scaling issues. Despite advanced theoretical results, existing ADP CT-RL synthesis methods are inadequate for solving even small, academic problems. The goal of this work is thus to introduce a suite of new CT-RL algorithms for control of affine nonlinear systems. Our design approach relies on two important factors. First, our methods are applicable to physical systems that can be partitioned into smaller subproblems. This constructive consideration results in reduced dimensionality and greatly improved intuitiveness of design. Second, we introduce a new excitation framework to improve persistence of excitation (PE) and numerical conditioning performance via classical input/output insights. Such a design-centric approach is the first of its kind in the ADP CT-RL community. In this paper, we progressively introduce a suite of (decentralized) excitable integral reinforcement learning (EIRL) algorithms. We provide convergence and closed-loop stability guarantees, and we demonstrate these guarantees on a significant application problem of controlling an unstable, nonminimum phase hypersonic vehicle (HSV).
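    The link between persistence of excitation and numerical conditioning mentioned in this abstract can be illustrated with a generic textbook check. The sketch below is not the EIRL excitation framework; it assumes a hypothetical FIR-type regression with four unknown parameters and compares a single-sinusoid probe with a two-sinusoid probe. The function names, frequencies and sample count are illustrative choices only.

```python
import numpy as np

def regressor_matrix(u, order):
    """Stack rows [u(k), u(k-1), ..., u(k-order+1)] for an FIR-type regression."""
    return np.array([u[k - order + 1:k + 1][::-1] for k in range(order - 1, len(u))])

def excitation_condition_number(u, order):
    """2-norm condition number of the regressor; large values signal poor excitation."""
    return np.linalg.cond(regressor_matrix(u, order))

# Hypothetical probing signals (assumed for illustration, not taken from the paper).
k = np.arange(400)
single_tone = np.sin(0.5 * k)                   # one sinusoid: excites only 2 directions
two_tone = np.sin(0.5 * k) + np.sin(2.0 * k)    # two sinusoids: excites 4 directions

if __name__ == "__main__":
    print("single tone:", excitation_condition_number(single_tone, 4))
    print("two tones:  ", excitation_condition_number(two_tone, 4))
```

    A single sinusoid satisfies a second-order recurrence, so its four-column regressor is rank deficient and the condition number is enormous, while the two-tone signal keeps the regressor well conditioned.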

    Nonlinear Model Predictive Control for Motion Generation of Humanoids

    The goal of this thesis is to investigate and develop numerical methods for motion generation of humanoid robots based on nonlinear model predictive control. Starting from the modeling of humanoids as complex multibody systems that are both constrained by unilateral contact conditions and underactuated by formulation, motion generation is formulated as an optimal control problem. The thesis derives numerical extensions, based on the principles of automatic differentiation, for recursive algorithms that allow efficient evaluation of the dynamic quantities of this multibody formulation, so that both the nominal quantities and their first derivatives can be evaluated efficiently. Building on these ideas, extensions are proposed for evaluating the contact dynamics and computing the contact impulse. The real-time capability of computing control responses depends strongly on the complexity of the multibody formulation chosen for motion generation and on the available computing power. To enable an optimal trade-off, the thesis investigates the possible reduction of the multibody dynamics on the one hand and develops tailored numerical methods on the other, in order to achieve real-time control. To this end, two reduced models are derived: a nonlinear extension of the linear inverted pendulum model and a reduced model variant based on centroidal multibody dynamics. Furthermore, a control architecture for whole-body motion generation is presented, whose main component in each case consists of a specially discretized nonlinear model predictive control problem and a tailored optimization method. The real-time capability of the approach is verified by experiments with the robots HRP-2 and HeiCub. The thesis proposes a nonlinear model predictive control method that, despite the complexity of the full multibody formulation, allows the control response to be computed in real time. This is achieved by combining linear and nonlinear model predictive control, acting on the current and the previous linearization of the problem respectively, in a parallel control strategy. Experiments with the humanoid robot Leo show that, compared with the nominal strategy, motion generation on the robot becomes possible only when this method is used. In addition to model-based optimal control methods, model-free reinforcement learning methods for motion generation are also investigated, with a focus on the robots' model uncertainties, which are difficult to model. A general comparative study and performance metrics are developed that allow model-based and model-free methods to be compared quantitatively with respect to their solution behavior. Applying the study to an academic example reveals differences and trade-offs as well as break-even points between the problem formulations. On this basis, the thesis proposes two possible combinations, whose properties are proven and examined in simulation. In addition, the better-performing variant is implemented on the humanoid robot Leo and compared with a nominal model-based controller.

    Socially Cognizant Robotics for a Technology Enhanced Society

    Emerging applications of robotics, and concerns about their impact, require the research community to put human-centric objectives front-and-center. To meet this challenge, we advocate an interdisciplinary approach, socially cognizant robotics, which synthesizes technical and social science methods. We argue that this approach follows from the need to empower stakeholder participation (from synchronous human feedback to asynchronous societal assessment) in shaping AI-driven robot behavior at all levels, and leads to a range of novel research perspectives and problems both for improving robots' interactions with individuals and impacts on society. Drawing on these arguments, we develop best practices for socially cognizant robot design that balance traditional technology-based metrics (e.g., efficiency, precision and accuracy) with critically important, albeit challenging to measure, human and society-based metrics.

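    Superando la brecha de la realidad: Algoritmos de aprendizaje por imitación y por refuerzos para problemas de locomoción robótica bípeda (Overcoming the Reality Gap: Imitation and Reinforcement Learning Algorithms for Bipedal Robot Locomotion Problems)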

    Illustrations, diagrams, photographs. This thesis introduces a robot training framework that uses machine learning techniques to optimize robot performance in complex tasks. Motivated by recent achievements in machine learning, particularly in games and virtual scenarios, the project explores the potential of these techniques to improve robot capabilities beyond traditional human programming, despite the limitations imposed by the reality gap. The case study selected is bipedal locomotion, because it exposes the key challenges and advantages of using machine learning methods for robot learning. The thesis identifies four primary challenges in this context: the variability of results obtained from learning algorithms, the high cost and risk of conducting experiments on real robots, the reality gap between simulation and real-world behavior, and the need to adapt human motion patterns to robotic systems. The proposed approach consists of three main modules to address these challenges: Non-linear Control Approaches, Imitation Learning, and Reinforcement Learning. The Non-linear Control module establishes a foundation by modeling robots and employing well-established control techniques. The Imitation Learning module generates initial policies from reference motion capture data or preliminary policy results in order to create feasible, human-like gait patterns. The Reinforcement Learning module complements the process by iteratively improving parametric policies, primarily in simulation but with real-world performance as the final goal. The thesis emphasizes the modularity of the approach: the modules can be used individually or in combination to determine the most effective strategy for different robot training scenarios. By combining established control techniques, imitation learning, and reinforcement learning, the framework seeks to enable robots to achieve optimized performance in complex tasks, contributing to the advancement of artificial intelligence in robotics not only in virtual systems but also in real ones. Degree: Doctorate (Doctor en Ingeniería Mecánica y Mecatrónica).

    Dynamically Stable 3D Quadrupedal Walking with Multi-Domain Hybrid System Models and Virtual Constraint Controllers

    Hybrid systems theory has become a powerful approach for designing feedback controllers that achieve dynamically stable bipedal locomotion, both formally and in practice. This paper presents an analytical framework 1) to address multi-domain hybrid models of quadruped robots with high degrees of freedom, and 2) to systematically design nonlinear controllers that asymptotically stabilize periodic orbits of these sophisticated models. A family of parameterized virtual constraint controllers is proposed for continuous-time domains of quadruped locomotion to regulate holonomic and nonholonomic outputs. The properties of the Poincaré return map for the full-order and closed-loop hybrid system are studied to investigate the asymptotic stabilization problem of dynamic gaits. An iterative optimization algorithm involving linear and bilinear matrix inequalities is then employed to choose stabilizing virtual constraint parameters. The paper numerically evaluates the analytical results on a simulation model of an advanced 3D quadruped robot, called GR Vision 60, with 36 state variables and 12 control inputs. An optimal amble gait of the robot is designed utilizing the FROST toolkit. The power of the analytical framework is finally illustrated through designing a set of stabilizing virtual constraint controllers with 180 controller parameters. Comment: American Control Conference 201
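    The stability test behind a Poincaré return map can be sketched numerically: a periodic orbit corresponds to a fixed point of the map, and the orbit is locally exponentially stable when every eigenvalue of the map's Jacobian at that fixed point lies strictly inside the unit circle. The example below is a minimal illustration under stated assumptions. The two-dimensional maps `stable_map` and `unstable_map` are hypothetical toys, not the quadruped model of the paper, and the finite-difference Jacobian stands in for whatever linearization the authors use.

```python
import numpy as np

def return_map_jacobian(P, x_star, eps=1e-6):
    """Central-difference Jacobian of the return map P at the fixed point x_star."""
    n = x_star.size
    J = np.zeros((n, n))
    for i in range(n):
        d = np.zeros(n)
        d[i] = eps
        J[:, i] = (P(x_star + d) - P(x_star - d)) / (2.0 * eps)
    return J

def is_orbit_stable(P, x_star, eps=1e-6):
    """True if the spectral radius of the linearized return map is below one."""
    eigenvalues = np.linalg.eigvals(return_map_jacobian(P, x_star, eps))
    return bool(np.max(np.abs(eigenvalues)) < 1.0)

# Hypothetical toy return maps with a fixed point at the origin.
A_STABLE = np.array([[0.5, 0.2], [0.0, 0.8]])
A_UNSTABLE = np.array([[1.2, 0.0], [0.0, 0.3]])

def stable_map(x):
    return A_STABLE @ x + 0.1 * x**2

def unstable_map(x):
    return A_UNSTABLE @ x + 0.1 * x**2

if __name__ == "__main__":
    origin = np.zeros(2)
    print(is_orbit_stable(stable_map, origin))
    print(is_orbit_stable(unstable_map, origin))
```

    In a controller design loop of the kind the abstract describes, the controller parameters would change the map, and this eigenvalue condition is what the optimization tries to enforce.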

    구조로봇을 위한 강건한 계층적 동작 계획 및 제어 (Robust Hierarchical Motion Planning and Control for Rescue Robots)

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Mechanical and Aerospace Engineering, 2021.8. 박종우.
    Over the last several years, robotics has developed strikingly, and a new generation of robots has emerged that shows great promise in accomplishing complex tasks associated with human behavior. The objectives of robots are no longer restricted to automation in industrial processes; robots are becoming explorers of hazardous, harsh, uncooperative, and extreme environments. Because these robots usually operate in dynamic and unstructured environments, they should be robust, adaptive, and reactive under changing operating conditions. We propose online hierarchical optimization-based planning and control methodologies for a rescue robot to execute a given mission in such a highly unstructured environment. Robots are given a large number of degrees of freedom in order to achieve diverse kinematic and dynamic tasks. However, pursuing multiple objectives makes online reactive motion planning and control harder to solve because the tasks can be incompatible. To address this problem, we exploit a hierarchical structure that resolves conflicts by assigning priorities, so that every task is achieved as far as possible according to its level. In particular, we concentrate on task regularization to ensure the convergence and robustness of a solution in the face of singularities. Because robotic systems with real-time motion planners or controllers often execute unrehearsed missions, a desired task cannot always be driven to a singularity-free configuration. We develop a generic solver for regularized hierarchical quadratic programming that does not resort to any off-the-shelf QP solver, in order to take advantage of null-space projections for computational efficiency. Therefore, the underlying principles are thoroughly investigated. The robust optimal solution is obtained under both equality and inequality tasks or constraints while addressing all problems resulting from the regularization. Because an approach centered on singular value decomposition is used, all hierarchical solutions and the Lagrange multipliers needed to handle the inequality constraints are obtained analytically in a recursive procedure. The proposed algorithm is fast enough for practical real-time control, so it can be used for online motion planning, motion control, and interaction force control in a single hierarchical optimization. Core system design concepts of the rescue robot are presented. The goals of the robot are to safely extract a patient and to dispose of a dangerous object in place of humans. The upper body is humanoid in form with replaceable, modular dual arms. The lower body is a hybrid tracked and legged mobile platform that provides both versatile manipulability and all-terrain mobility. Thus, the robot can execute driving, dangerous-object manipulation, and casualty extraction missions by changing its pose and modular equipment in an optimized manner. Throughout the dissertation, all proposed methods are validated through extensive numerical simulations and experimental tests. We highlight how the rescue robot can execute casualty extraction and dangerous-object disposal missions in both indoor and outdoor environments, which none of the existing robots has performed.
    Korean abstract (translated): The new generation of robots that has recently emerged has shown that robots, too, can perform complex work that previously only humans could do.
    This is well illustrated by the DARPA Robotics Challenge, and these robots are evolving from repeatedly performing automated work in structured environments such as factories toward carrying out dangerous missions in place of humans in extreme environments. People have therefore come to regard robots as a highly feasible option among the various alternatives for responding safely and in a timely manner in disaster environments. However, because such robots must be able to carry out missions in dynamically changing, unstructured environments, they must be robust to uncertainty and able to react actively under diverse environmental conditions. This dissertation proposes real-time optimization-based motion planning and control methods that allow a robot to operate robustly and adaptively in unstructured environments, together with design concepts for a rescue robot. Humans have many degrees of freedom and, when generating a single whole-body motion, define sub-motions or tasks with various kinematic or dynamic characteristics and combine them effectively. Through learning, they not only optimize each motion element but also assign priorities to the elements according to the situation, combining or separating them to generate and control optimal motions in real time. That is, depending on the situation, they perform important motion elements first and partially or entirely give up lower-priority elements, generating and optimizing the overall motion very flexibly.
    Robots with many degrees of freedom, like humans, can likewise define various sub-motions or tasks with kinematic and dynamic characteristics in task space or configuration space, and can combine them according to priority to generate and control the overall motion. Methods that resolve mutually incompatible motion objectives by assigning priorities among motions to form a hierarchy, and then realize the robot's whole-body motion accordingly, have long been studied. With such hierarchical optimization, motions are executed sequentially from the highest priority, yet an optimal solution can be found that also satisfies lower-priority motion elements as far as possible. However, for hierarchical optimization problems that include inequality conditions such as joint range limits, much remains unknown about how to also secure robustness to singularities. This dissertation therefore concentrates on including constraints or motion elements expressed as equalities and inequalities in the hierarchical optimization at the same time, and on obtaining an optimal solution in configuration space that guarantees robustness and convergence even in the presence of singularities. The reason is that a robot performing unstructured missions does not execute preplanned motions but must plan and control its motions in real time as environmental conditions change, so it is difficult to keep the robot in singularity-free postures at all times. Moreover, controlling the robot so as to avoid singularities can seriously impair its operability.
    If robustness of the solution near singularities is not guaranteed, excessive joint velocities or torques can arise, making it impossible for the robot to carry out its mission or causing damage to the environment and the robot, and possibly injuring people working alongside it. To secure robustness to singularities, priority-based hierarchical optimization is integrated with regularization, and the resulting regularized hierarchical optimization problem (RHQP: Regularized Hierarchical Quadratic Program) is addressed. A method is proposed that resolves the many issues arising from considering regularization together with hierarchical optimization involving inequalities, and that secures the optimality and robustness of the solution. In particular, without using an external optimization program, a quadratic programming method is proposed that maximizes computational efficiency by drawing on numerical optimization theory and priority-based analysis techniques for redundant robots. At the same time, the theoretical structure of the regularized hierarchical optimization problem is analyzed thoroughly. Through singular value decomposition, the optimal solution and the Lagrange multipliers needed to handle the inequality conditions are obtained recursively in analytical form, which increases computational efficiency while allowing the inequality conditions to be handled accurately. The regularized hierarchical optimization is also extended to force control, ensuring safe interaction between the robot and its environment so that the robot can make contact with appropriate force. Core design concepts are presented for a rescue robot capable of performing unstructured missions in uncertain, unstructured environments.
    The robot is designed with a form that secures both manipulation and mobility performance in unstructured environments, so that the rescue robot can effectively carry out its ultimate purpose of rescuing casualties and disposing of hazardous objects in place of humans. The manipulators are designed as replaceable modules, so that a manipulator optimized for either casualty rescue or hazardous-object disposal can be mounted for each mission. The lower body takes a hybrid form combining tracks and joints and can change its configuration according to driving and manipulation tasks. Through configuration changes and modular manipulators, both manipulation performance and the driving performance needed to traverse rough terrain are secured. Finally, using the rescue robot design and real-time hierarchical control, analysis and experiments demonstrate that the rescue robot can successfully perform driving, hazardous-object manipulation, and casualty rescue missions in unstructured indoor and outdoor environments, verifying the usefulness of the design and the regularized hierarchical optimization-based control strategy proposed in this dissertation.
    Contents:
    1 Introduction: 1.1 Motivations; 1.2 Related Works and Research Problems for Hierarchical Control (1.2.1 Classical Approaches; 1.2.2 State-of-the-Art Strategies; 1.2.3 Research Problems); 1.3 Robust Rescue Robots; 1.4 Research Goals; 1.5 Contributions of This Thesis (1.5.1 Robust Hierarchical Task-Priority Control; 1.5.2 Design Concepts of Robust Rescue Robot; 1.5.3 Hierarchical Motion and Force Control); 1.6 Dissertation Preview
    2 Preliminaries for Task-Priority Control Framework: 2.1 Introduction; 2.2 Task-Priority Inverse Kinematics; 2.3 Recursive Formulation of Null Space Projector; 2.4 Conclusion
    3 Robust Hierarchical Task-Priority Control: 3.1 Introduction (3.1.1 Motivations; 3.1.2 Objectives); 3.2 Task Function Approach; 3.3 Regularized Hierarchical Optimization with Equality Tasks (3.3.1 Regularized Hierarchical Optimization; 3.3.2 Optimal Solution; 3.3.3 Task Error and Hierarchical Matrix Decomposition; 3.3.4 Illustrative Examples for Regularized Hierarchical Optimization); 3.4 Regularized Hierarchical Optimization with Inequality Constraints (3.4.1 Lagrange Multipliers; 3.4.2 Modified Active Set Method; 3.4.3 Illustrative Examples of Modified Active Set Method; 3.4.4 Examples for Hierarchical Optimization with Inequality Constraint); 3.5 DLS-HQP Algorithm; 3.6 Concluding Remarks
    4 Rescue Robot Design and Experimental Results: 4.1 Introduction; 4.2 Rescue Robot Design (4.2.1 System Design; 4.2.2 Variable Configuration Mobile Platform; 4.2.3 Dual Arm Manipulators; 4.2.4 Software Architecture); 4.3 Performance Verification for Hierarchical Motion Control (4.3.1 Real-Time Motion Generation; 4.3.2 Task Specifications; 4.3.3 Singularity Robust Task Priority; 4.3.4 Inequality Constraint Handling and Computation Time); 4.4 Singularity Robustness and Inequality Handling for Rescue Mission; 4.5 Field Tests; 4.6 Concluding Remarks
    5 Hierarchical Motion and Force Control: 5.1 Introduction; 5.2 Operational Space Control; 5.3 Acceleration-Based Hierarchical Motion Control; 5.4 Force Control (5.4.1 Force Control with Inner Position Loop; 5.4.2 Force Control with Inner Velocity Loop); 5.5 Motion and Force Control; 5.6 Numerical Results for Acceleration-Based Motion and Force Control (5.6.1 Task Specifications; 5.6.2 Force Control Performance; 5.6.3 Singularity Robustness and Inequality Constraint Handling); 5.7 Velocity Resolved Motion and Force Control (5.7.1 Velocity-Based Motion and Force Control; 5.7.2 Experimental Results); 5.8 Concluding Remarks
    6 Conclusion: 6.1 Summary; 6.2 Concluding Remarks
    A Appendix: A.1 Introduction to PID Control; A.2 Inverse Optimal Control; A.3 Experimental Results and Conclusion
    Bibliography
    Abstract
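    The ideas of task priority, null-space projection and regularization named in this record can be illustrated with the classical two-level task-priority scheme using a damped least-squares pseudoinverse. This is only a minimal sketch under stated assumptions: it is not the thesis's RHQP or DLS-HQP solver, it handles equality tasks only, and the function names, the damping value and the toy Jacobians are all hypothetical.

```python
import numpy as np

def damped_pinv(J, damping=1e-3):
    """Damped least-squares pseudoinverse J^T (J J^T + damping^2 I)^-1.

    The damping term keeps the result bounded when J loses rank (a singularity).
    """
    m = J.shape[0]
    return J.T @ np.linalg.inv(J @ J.T + (damping ** 2) * np.eye(m))

def two_level_priority(J1, dx1, J2, dx2, damping=1e-3):
    """Joint velocity that serves task 1 first and task 2 in task 1's null space."""
    n = J1.shape[1]
    dq1 = damped_pinv(J1, damping) @ dx1
    # Exact null-space projector of the high-priority task.
    N1 = np.eye(n) - np.linalg.pinv(J1) @ J1
    # Low-priority correction, restricted to motions that do not disturb task 1.
    dq2 = damped_pinv(J2 @ N1, damping) @ (dx2 - J2 @ dq1)
    return dq1 + N1 @ dq2

if __name__ == "__main__":
    # Toy 3-joint example (assumed values): the tasks are compatible here.
    J1 = np.array([[1.0, 0.0, 0.0]])
    J2 = np.array([[1.0, 1.0, 0.0]])
    dq = two_level_priority(J1, np.array([1.0]), J2, np.array([0.0]))
    print(dq, J1 @ dq, J2 @ dq)
```

    When the second task conflicts with the first (for example `J2` equal to `J1` with an opposite target), the projected Jacobian `J2 @ N1` is zero, the damped inverse returns a zero correction instead of blowing up, and the high-priority task is still met.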