11 research outputs found

    Complex Locomotion Skill Learning via Differentiable Physics

    Differentiable physics enables efficient gradient-based optimization of neural network (NN) controllers. However, existing work typically delivers NN controllers with limited capability and generalizability. We present a practical learning framework that outputs unified NN controllers capable of tasks with significantly improved complexity and diversity. To systematically improve training robustness and efficiency, we investigate a suite of improvements over the baseline approach, including periodic activation functions and tailored loss functions. In addition, we find that batching and the Adam optimizer are effective for training complex locomotion tasks. We evaluate our framework on differentiable mass-spring and material point method (MPM) simulations with challenging locomotion tasks and multiple robot designs. Experiments show that our learning framework, based on differentiable physics, delivers better results than reinforcement learning and converges much faster. We demonstrate that users can interactively control soft-robot locomotion and switch among multiple goals, with specified velocity, height, and direction instructions, using a unified NN controller trained in our system. Code is available at https://github.com/erizmr/Complex-locomotion-skill-learning-via-differentiable-physics
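
    As a rough, hedged illustration of the training recipe this abstract describes (periodic activations, batching, tailored losses, and Adam updates through a differentiable simulation), the sketch below trains a toy sine-activated controller through a differentiable mass-spring step in PyTorch. It is not the authors' code; the network, the toy physics, and the loss are illustrative placeholders.

```python
# Minimal sketch, not the authors' implementation: a sine-activated MLP controller
# trained with Adam through a toy differentiable mass-spring step in PyTorch.
import torch
import torch.nn as nn

class SineMLP(nn.Module):
    """Small controller with periodic (sine) activations."""
    def __init__(self, in_dim, hidden, out_dim, omega=30.0):
        super().__init__()
        self.l1 = nn.Linear(in_dim, hidden)
        self.l2 = nn.Linear(hidden, out_dim)
        self.omega = omega

    def forward(self, x):
        return torch.tanh(self.l2(torch.sin(self.omega * self.l1(x))))

def spring_step(pos, vel, act, dt=0.01, k=100.0, rest=1.0):
    """Toy differentiable mass-spring update; actuation modulates the rest length."""
    force = -k * (pos - rest * (1.0 + 0.1 * act))
    vel = vel + dt * force
    return pos + dt * vel, vel

controller = SineMLP(in_dim=2, hidden=64, out_dim=1)
opt = torch.optim.Adam(controller.parameters(), lr=1e-3)

for it in range(200):                      # batched trajectories, Adam updates
    pos = torch.rand(32, 1) * 0.5          # batch of 32 initial states
    vel = torch.zeros(32, 1)
    loss = 0.0
    for t in range(100):                   # unroll the differentiable simulation
        act = controller(torch.cat([pos, vel], dim=1))
        pos, vel = spring_step(pos, vel, act)
        loss = loss + ((pos - 1.5) ** 2).mean()  # toy task loss: reach a target
    opt.zero_grad()
    loss.backward()                        # gradients flow through the physics
    opt.step()
```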

    Optimization of Muscle Activity for Task-Level Goals Predicts Complex Changes in Limb Forces across Biomechanical Contexts

    Optimality principles have been proposed as a general framework for understanding motor control in animals and humans, largely based on their ability to predict general features of movement in idealized motor tasks. However, generalizing these concepts beyond proof-of-principle to understand the neuromechanical transformation from task-level control to detailed execution-level muscle activity and forces during behaviorally relevant motor tasks has proved difficult. In an unrestrained balance task in cats, we demonstrate that achieving task-level constraints on center-of-mass forces and moments while minimizing control effort predicts detailed patterns of muscle activity and ground reaction forces in an anatomically realistic musculoskeletal model. Whereas optimization is typically used to resolve redundancy at a single level of the motor hierarchy, we simultaneously resolved redundancy across both muscles and limbs and directly compared predictions to experimental measures across multiple perturbation directions that elicit different intra- and interlimb coordination patterns. Further, although some candidate task-level variables and cost functions generated indistinguishable predictions in a single biomechanical context, we identified a common optimization framework that could predict up to 48 experimental conditions per animal (n = 3) across both perturbation directions and different biomechanical contexts created by altering the animals' postural configuration. Predictions were further improved by imposing experimentally derived muscle synergy constraints, suggesting additional task variables or costs that may be relevant to the neural control of balance. These results suggest that reduced-dimension neural control mechanisms such as muscle synergies can achieve kinetics similar to the optimal solution, but with increased control effort (≈2×) compared to individual muscle control. Our results are consistent with the idea that hierarchical, task-level neural control mechanisms previously associated with voluntary tasks may also be used in automatic brainstem-mediated pathways for balance.
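
    As a hedged sketch of the kind of optimization the abstract refers to (not the paper's musculoskeletal model), the snippet below minimizes a squared-activation "control effort" cost subject to equality constraints on a task-level force/moment vector. The mapping matrix R and the task wrench b_task are made-up toy quantities.

```python
# Illustrative only: resolve muscle redundancy by minimizing squared activation
# subject to task-level force/moment constraints, with bounded activations.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n_muscles, n_task = 8, 3                       # 8 muscles, 3 task-level constraints
R = rng.normal(size=(n_task, n_muscles))       # maps activations to task forces/moments
b_task = R @ rng.uniform(0.2, 0.8, n_muscles)  # a feasible task-level wrench (toy)

effort = lambda a: np.sum(a ** 2)              # "control effort" cost
cons = {"type": "eq", "fun": lambda a: R @ a - b_task}
bounds = [(0.0, 1.0)] * n_muscles              # activations bounded in [0, 1]

res = minimize(effort, x0=np.full(n_muscles, 0.5),
               bounds=bounds, constraints=cons, method="SLSQP")
print("optimal activations:", np.round(res.x, 3))
```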

    Adaptive motion synthesis and motor invariant theory.

    Generating natural-looking motion for virtual characters is a challenging research topic. It becomes even harder when adapting synthesized motion to interact with the environment. Current methods are tedious to use, computationally expensive, and fail to capture natural-looking features. These difficulties seem to suggest that artificial control techniques are inferior to their natural counterparts. Recent advances in biology research point to a new motor control principle: utilizing the natural dynamics. The interaction of body and environment forms patterns that work as primary elements of the motion repertoire: motion primitives. These elements serve as templates, tweaked by the neural system to satisfy environmental constraints or motion purposes. Complex motions are synthesized by connecting motion primitives together, just as letters are combined to form sentences. Based on these ideas, this thesis proposes a new dynamic motion synthesis method. A key contribution is the insight into the dynamic rationale behind motion primitives: template motions are stable and energy efficient. When synthesizing motions from templates, valuable properties such as stability and efficiency should be preserved. The mathematical formalization of this idea is the motor invariant theory, and the preserved properties are motor invariants. In the process of conceptualization, new mathematical tools are introduced to the research topic. Invariant theory, especially the mathematical concepts of equivalence and symmetry, plays a crucial role. Motion adaptation is mathematically modelled as topological conjugacy: a transformation which maintains the topology and results in an analogous system. The neural oscillator and symmetry-preserving transformations are proposed for their computational efficiency. Even without reference motion data, this approach produces natural-looking motion in real time. The new motor invariant theory may also shed light on the long-standing perception problem in biological research.
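
    As an illustrative sketch only (not the thesis implementation), the following code integrates a two-neuron Matsuoka oscillator, a standard neural-oscillator model of the kind mentioned above, to produce a rhythmic signal that could serve as a motion-primitive template. All parameters are arbitrary but chosen inside the model's oscillatory regime.

```python
# Minimal Matsuoka oscillator sketch: two mutually inhibiting neurons with
# adaptation, integrated with explicit Euler.
import numpy as np

def matsuoka(steps=4000, dt=0.001, tau=0.05, tau_p=0.3, beta=2.5, w=2.5, s=1.0):
    u = np.array([0.1, 0.0])   # membrane potentials (asymmetric start)
    v = np.zeros(2)            # adaptation (fatigue) states
    out = []
    for _ in range(steps):
        y = np.maximum(u, 0.0)                        # firing rates
        du = (-u - beta * v - w * y[::-1] + s) / tau  # mutual inhibition + tonic input
        dv = (-v + y) / tau_p
        u, v = u + dt * du, v + dt * dv
        out.append(y[0] - y[1])                       # oscillator output
    return np.array(out)

signal = matsuoka()   # rhythmic pattern usable as a template trajectory
```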

    ์‚ฌ๋žŒ์˜ ์ž์—ฐ์Šค๋Ÿฌ์šด ๋ณดํ–‰ ๋™์ž‘ ์ƒ์„ฑ์„ ์œ„ํ•œ ๋ฌผ๋ฆฌ ์‹œ๋ฎฌ๋ ˆ์ด์…˜ ๊ธฐ๋ฐ˜ ํœด๋จธ๋…ธ์ด๋“œ ์ œ์–ด ๋ฐฉ๋ฒ•

    ํ•™์œ„๋…ผ๋ฌธ (๋ฐ•์‚ฌ)-- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ์ „๊ธฐยท์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€, 2014. 8. ์ด์ œํฌ.ํœด๋จธ๋…ธ์ด๋“œ๋ฅผ ์ œ์–ดํ•˜์—ฌ ์‚ฌ๋žŒ์˜ ์ž์—ฐ์Šค๋Ÿฌ์šด ์ด๋™ ๋™์ž‘์„ ๋งŒ๋“ค์–ด๋‚ด๋Š” ๊ฒƒ์€ ์ปดํ“จํ„ฐ๊ทธ๋ž˜ํ”ฝ์Šค ๋ฐ ๋กœ๋ด‡๊ณตํ•™ ๋ถ„์•ผ์—์„œ ์ค‘์š”ํ•œ ๋ฌธ์ œ๋กœ ์ƒ๊ฐ๋˜์–ด ์™”๋‹ค. ํ•˜์ง€๋งŒ, ์ด๋Š” ์‚ฌ๋žŒ์˜ ์ด๋™์—์„œ ๊ตฌ๋™๊ธฐ๊ฐ€ ๋ถ€์กฑํ•œ (underactuated) ํŠน์„ฑ๊ณผ ์‚ฌ๋žŒ์˜ ๋ชธ์˜ ๋ณต์žกํ•œ ๊ตฌ์กฐ๋ฅผ ๋ชจ๋ฐฉํ•˜๊ณ  ์‹œ๋ฎฌ๋ ˆ์ด์…˜ํ•ด์•ผ ํ•œ๋‹ค๋Š” ์  ๋•Œ๋ฌธ์— ๋งค์šฐ ์–ด๋ ค์šด ๋ฌธ์ œ๋กœ ์•Œ๋ ค์ ธ์™”๋‹ค. ๋ณธ ํ•™์œ„๋…ผ๋ฌธ์€ ๋ฌผ๋ฆฌ ์‹œ๋ฎฌ๋ ˆ์ด์…˜ ๊ธฐ๋ฐ˜ ํœด๋จธ๋…ธ์ด๋“œ๊ฐ€ ์™ธ๋ถ€์˜ ๋ณ€ํ™”์— ์•ˆ์ •์ ์œผ๋กœ ๋Œ€์‘ํ•˜๊ณ  ์‹ค์ œ ์‚ฌ๋žŒ์ฒ˜๋Ÿผ ์ž์—ฐ์Šค๋Ÿฝ๊ณ  ๋‹ค์–‘ํ•œ ์ด๋™ ๋™์ž‘์„ ๋งŒ๋“ค์–ด๋‚ด๋„๋ก ํ•˜๋Š” ์ œ์–ด ๋ฐฉ๋ฒ•์„ ์ œ์•ˆํ•œ๋‹ค. ์šฐ๋ฆฌ๋Š” ์‹ค์ œ ์‚ฌ๋žŒ์œผ๋กœ๋ถ€ํ„ฐ ์–ป์„ ์ˆ˜ ์žˆ๋Š” ๊ด€์ฐฐ ๊ฐ€๋Šฅํ•˜๊ณ  ์ธก์ • ๊ฐ€๋Šฅํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ์ตœ๋Œ€ํ•œ์œผ๋กœ ํ™œ์šฉํ•˜์—ฌ ๋ฌธ์ œ์˜ ์–ด๋ ค์›€์„ ๊ทน๋ณตํ–ˆ๋‹ค. ์šฐ๋ฆฌ์˜ ์ ‘๊ทผ ๋ฐฉ๋ฒ•์€ ๋ชจ์…˜ ์บก์ฒ˜ ์‹œ์Šคํ…œ์œผ๋กœ๋ถ€ํ„ฐ ํš๋“ํ•œ ์‚ฌ๋žŒ์˜ ๋ชจ์…˜ ๋ฐ์ดํ„ฐ๋ฅผ ํ™œ์šฉํ•˜๋ฉฐ, ์‹ค์ œ ์‚ฌ๋žŒ์˜ ์ธก์ • ๊ฐ€๋Šฅํ•œ ๋ฌผ๋ฆฌ์ , ์ƒ๋ฆฌํ•™์  ํŠน์„ฑ์„ ๋ณต์›ํ•˜์—ฌ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ์ด๋‹ค. ์šฐ๋ฆฌ๋Š” ํ† ํฌ๋กœ ๊ตฌ๋™๋˜๋Š” ์ด์กฑ ๋ณดํ–‰ ๋ชจ๋ธ์ด ๋‹ค์–‘ํ•œ ์Šคํƒ€์ผ๋กœ ๊ฑธ์„ ์ˆ˜ ์žˆ๋„๋ก ์ œ์–ดํ•˜๋Š” ๋ฐ์ดํ„ฐ ๊ธฐ๋ฐ˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ œ์•ˆํ•œ๋‹ค. ์šฐ๋ฆฌ์˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ๋ชจ์…˜ ์บก์ฒ˜ ๋ฐ์ดํ„ฐ์— ๋‚ด์žฌ๋œ ์ด๋™ ๋™์ž‘ ์ž์ฒด์˜ ๊ฐ•๊ฑด์„ฑ์„ ํ™œ์šฉํ•˜์—ฌ ์‹ค์ œ ์‚ฌ๋žŒ๊ณผ ๊ฐ™์€ ์‚ฌ์‹ค์ ์ธ ์ด๋™ ์ œ์–ด๋ฅผ ๊ตฌํ˜„ํ•œ๋‹ค. ๊ตฌ์ฒด์ ์œผ๋กœ๋Š”, ์ฐธ์กฐ ๋ชจ์…˜ ๋ฐ์ดํ„ฐ๋ฅผ ์žฌํ˜„ํ•˜๋Š” ์ž์—ฐ์Šค๋Ÿฌ์šด ๋ณดํ–‰ ์‹œ๋ฎฌ๋ ˆ์ด์…˜์„ ์œ„ํ•œ ๊ด€์ ˆ ํ† ํฌ๋ฅผ ๊ณ„์‚ฐํ•˜๊ฒŒ ๋œ๋‹ค. ์•Œ๊ณ ๋ฆฌ์ฆ˜์—์„œ ๊ฐ€์žฅ ํ•ต์‹ฌ์ ์ธ ์•„์ด๋””์–ด๋Š” ๊ฐ„๋‹จํ•œ ์ถ”์ข… ์ œ์–ด๊ธฐ๋งŒ์œผ๋กœ๋„ ์ฐธ์กฐ ๋ชจ์…˜์„ ์žฌํ˜„ํ•  ์ˆ˜ ์žˆ๋„๋ก ์ฐธ์กฐ ๋ชจ์…˜์„ ์—ฐ์†์ ์œผ๋กœ ์กฐ์ ˆํ•˜๋Š” ๊ฒƒ์ด๋‹ค. ์šฐ๋ฆฌ์˜ ๋ฐฉ๋ฒ•์€ ๋ชจ์…˜ ๋ธ”๋ Œ๋”ฉ, ๋ชจ์…˜ ์™€ํ•‘, ๋ชจ์…˜ ๊ทธ๋ž˜ํ”„์™€ ๊ฐ™์€ ๊ธฐ์กด์— ์กด์žฌํ•˜๋Š” ๋ฐ์ดํ„ฐ ๊ธฐ๋ฐ˜ ๊ธฐ๋ฒ•๋“ค์„ ์ด์กฑ ๋ณดํ–‰ ์ œ์–ด์— ํ™œ์šฉํ•  ์ˆ˜ ์žˆ๊ฒŒ ํ•œ๋‹ค. ์šฐ๋ฆฌ๋Š” ๋ณด๋‹ค ์‚ฌ์‹ค์ ์ธ ์ด๋™ ๋™์ž‘์„ ์ƒ์„ฑํ•˜๊ธฐ ์œ„ํ•ด ์‚ฌ๋žŒ์˜ ๋ชธ์„ ์„ธ๋ถ€์ ์œผ๋กœ ๋ชจ๋ธ๋งํ•œ, ๊ทผ์œก์— ์˜ํ•ด ๊ด€์ ˆ์ด ๊ตฌ๋™๋˜๋Š” ์ธ์ฒด ๋ชจ๋ธ์„ ์ œ์–ดํ•˜๋Š” ์ด๋™ ์ œ์–ด ์‹œ์Šคํ…œ์„ ์ œ์•ˆํ•œ๋‹ค. ์‹œ๋ฎฌ๋ ˆ์ด์…˜์— ์‚ฌ์šฉ๋˜๋Š” ํœด๋จธ๋…ธ์ด๋“œ๋Š” ์‹ค์ œ ์‚ฌ๋žŒ์˜ ๋ชธ์—์„œ ์ธก์ •๋œ ์ˆ˜์น˜๋“ค์— ๊ธฐ๋ฐ˜ํ•˜๊ณ  ์žˆ์œผ๋ฉฐ ์ตœ๋Œ€ 120๊ฐœ์˜ ๊ทผ์œก์„ ๊ฐ€์ง„๋‹ค. ์šฐ๋ฆฌ์˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ์ตœ์ ์˜ ๊ทผ์œก ํ™œ์„ฑํ™” ์ •๋„๋ฅผ ๊ณ„์‚ฐํ•˜์—ฌ ์‹œ๋ฎฌ๋ ˆ์ด์…˜์„ ์ˆ˜ํ–‰ํ•˜๋ฉฐ, ์ฐธ์กฐ ๋ชจ์…˜์„ ์ถฉ์‹คํžˆ ์žฌํ˜„ํ•˜๊ฑฐ๋‚˜ ํ˜น์€ ์ƒˆ๋กœ์šด ์ƒํ™ฉ์— ๋งž๊ฒŒ ๋ชจ์…˜์„ ์ ์‘์‹œํ‚ค๊ธฐ ์œ„ํ•ด ์ฃผ์–ด์ง„ ์ฐธ์กฐ ๋ชจ์…˜์„ ์ˆ˜์ •ํ•˜๋Š” ๋ฐฉ์‹์œผ๋กœ ๋™์ž‘ํ•œ๋‹ค. ์šฐ๋ฆฌ์˜ ํ™•์žฅ๊ฐ€๋Šฅํ•œ ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ๋‹ค์–‘ํ•œ ์ข…๋ฅ˜์˜ ๊ทผ๊ณจ๊ฒฉ ์ธ์ฒด ๋ชจ๋ธ์„ ์ตœ์ ์˜ ๊ทผ์œก ์กฐํ•ฉ์„ ์‚ฌ์šฉํ•˜๋ฉฐ ๊ท ํ˜•์„ ์œ ์ง€ํ•˜๋„๋ก ์ œ์–ดํ•  ์ˆ˜ ์žˆ๋‹ค. ์šฐ๋ฆฌ๋Š” ๋‹ค์–‘ํ•œ ์Šคํƒ€์ผ๋กœ ๊ฑท๊ธฐ ๋ฐ ๋‹ฌ๋ฆฌ๊ธฐ, ๋ชจ๋ธ์˜ ๋ณ€ํ™” (๊ทผ์œก์˜ ์•ฝํ™”, ๊ฒฝ์ง, ๊ด€์ ˆ์˜ ํƒˆ๊ตฌ), ํ™˜๊ฒฝ์˜ ๋ณ€ํ™” (์™ธ๋ ฅ), ๋ชฉ์ ์˜ ๋ณ€ํ™” (ํ†ต์ฆ์˜ ๊ฐ์†Œ, ํšจ์œจ์„ฑ์˜ ์ตœ๋Œ€ํ™”)์— ๋Œ€ํ•œ ๋Œ€์‘, ๋ฐฉํ–ฅ ์ „ํ™˜, ํšŒ์ „, ์ธํ„ฐ๋ž™ํ‹ฐ๋ธŒํ•˜๊ฒŒ ๋ฐฉํ–ฅ์„ ๋ฐ”๊พธ๋ฉฐ ๊ฑท๊ธฐ ๋“ฑ๊ณผ ๊ฐ™์€ ๋ณด๋‹ค ๋‚œ์ด๋„ ๋†’์€ ๋™์ž‘๋“ค๋กœ ์ด๋ฃจ์–ด์ง„ ์˜ˆ์ œ๋ฅผ ํ†ตํ•ด ์šฐ๋ฆฌ์˜ ์ ‘๊ทผ ๋ฐฉ๋ฒ•์ด ํšจ์œจ์ ์ž„์„ ๋ณด์˜€๋‹ค.Controlling artificial humanoids to generate realistic human locomotion has been considered as an important problem in computer graphics and robotics. 
However, it has proved very difficult because of the underactuated nature of locomotion dynamics and the complexity of the human body that must be imitated and simulated. In this thesis, we present controllers for physically simulated humanoids that exhibit a rich set of human-like and resilient locomotion behaviors. Our approach exploits observable and measurable data about humans to overcome these difficulties: it uses motion data collected with motion capture systems and reconstructs measured physical and physiological properties of the human body. We propose a data-driven algorithm that controls torque-actuated biped models across a wide range of locomotion skills. The algorithm uses motion capture data and exploits the inherent robustness of human locomotion to realize human-like control; concretely, it takes a reference motion and generates joint torques that produce a human-like walking simulation. The key idea is to continuously modulate the reference motion so that even a simple tracking controller can reproduce it. Existing data-driven techniques such as motion blending, motion warping, and motion graphs can then be applied to biped control within this framework. To create more human-like simulated locomotion, we further present a locomotion control system for detailed models of the human body actuated through musculotendon dynamics. The simulated humanoids are based on measured properties of the human body and contain up to 120 muscles. Our algorithm computes the optimal coordination of muscle activations and actively modulates the reference motion, either to reproduce it faithfully or to adapt it to new conditions. The scalable algorithm can control various types of musculoskeletal humanoids while seeking harmonious coordination of many muscles and maintaining balance. We demonstrate the strength of our approach with examples in which simulated humanoids walk and run in various styles; adapt to changes in the model (e.g., muscle weakness, tightness, joint dislocation), the environment (e.g., external pushes), and the goal (e.g., pain reduction, efficiency maximization); and perform more challenging locomotion tasks such as turning, spinning, and walking while interactively steering direction.
Contents: 1 Introduction; 2 Previous Work; 3 Data-Driven Biped Control; 4 Locomotion Control for Many-Muscle Humanoids; 5 Conclusion; Appendices (Mathematical Definitions; Humanoid Models; Dynamics of Musculotendon Actuators); Glossary for Medical Terms; Bibliography.
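
    As a hedged sketch of the control idea in this abstract, continuously modulating the reference so that a simple tracking controller can follow it, the code below shows a joint-space PD tracking controller whose reference pose is nudged by a hypothetical balance-feedback term. The gains, state vectors, and the modulate_reference helper are illustrative inventions, not the thesis system.

```python
# Illustrative PD tracking with a modulated reference (toy example).
import numpy as np

def pd_torque(q, dq, q_ref, dq_ref, kp=300.0, kd=30.0):
    """Joint torques that track a (possibly modulated) reference pose."""
    return kp * (q_ref - q) + kd * (dq_ref - dq)

def modulate_reference(q_ref, com_error, gain=0.1):
    """Toy stand-in for continuous reference modulation: nudge the reference
    toward a pose that corrects the center-of-mass error."""
    return q_ref + gain * com_error

# Toy usage with made-up state vectors for a 3-joint chain.
q, dq = np.zeros(3), np.zeros(3)
q_ref, dq_ref = np.array([0.2, -0.4, 0.2]), np.zeros(3)
com_error = np.array([0.01, 0.0, -0.01])
tau = pd_torque(q, dq, modulate_reference(q_ref, com_error), dq_ref)
print(tau)
```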

    Learning Control Policies for Fall Prevention and Safety in Bipedal Locomotion

    The ability to recover from an unexpected external perturbation is a fundamental motor skill in bipedal locomotion. An effective response includes not just recovering balance and maintaining stability but also falling in a safe manner when balance recovery is physically infeasible. For bipedal robots such as humanoids and assistive robotic devices that aid humans in walking, controllers that provide this stability and safety can prevent damage to the robot and reduce injury-related medical costs. Designing such controllers is challenging because it involves generating highly dynamic motion for a high-dimensional, nonlinear, underactuated system with contacts. Despite prior advances in model-based and optimization methods, challenges such as the need for extensive domain knowledge, relatively long computation times, and limited robustness to changes in dynamics still make this an open problem. In this thesis, we address these issues by developing learning-based algorithms that synthesize push-recovery control policies for two kinds of robots: humanoid robots and assistive robotic devices for bipedal locomotion. Our work branches into two closely related directions: 1) learning safe falling and fall-prevention strategies for humanoid robots, and 2) learning fall-prevention strategies for humans using robotic assistive devices. To achieve this, we introduce a set of deep reinforcement learning (DRL) algorithms to learn control policies that improve safety while using these robots. To enable efficient learning, we present techniques that incorporate abstract dynamical models, curriculum learning, and a novel method of building a graph of policies into the learning framework. We also propose an approach for creating virtual human walking agents that exhibit gait characteristics similar to real-world human subjects, and use them to learn an assistive-device controller that helps the virtual human return to steady-state walking after an external push. Finally, we extend our work on assistive devices to the challenge of transferring a push-recovery policy to different individuals. Because walking and recovery characteristics differ significantly between individuals, exoskeleton policies must be fine-tuned for each person, which is a tedious, time-consuming, and potentially unsafe process. We propose to solve this by posing it as a transfer learning problem, in which a policy trained for one individual can adapt to another without fine-tuning.
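
    As a minimal, hedged sketch of the DRL setup described above (not the thesis algorithms), the code below runs a REINFORCE-style policy-gradient loop on a toy push-recovery task: an inverted pendulum perturbed by random pushes, rewarded for staying upright. The environment, reward, and network are invented for illustration only.

```python
# Toy push-recovery policy gradient (REINFORCE) sketch in PyTorch.
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 2))  # 2 discrete actions
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

def step(theta, omega, action, dt=0.02):
    """Toy inverted-pendulum dynamics: action applies torque, pushes are random."""
    torque = 2.0 * (action - 0.5)                 # map {0, 1} -> {-1, +1}
    push = 0.5 * (torch.rand(()) - 0.5)           # random external perturbation
    omega = omega + dt * (9.8 * torch.sin(theta) + torque + push)
    return theta + dt * omega, omega

for episode in range(300):
    theta, omega = torch.tensor(0.05), torch.tensor(0.0)
    logps, rewards = [], []
    for t in range(200):
        logits = policy(torch.stack([theta, omega]))
        dist = torch.distributions.Categorical(logits=logits)
        a = dist.sample()
        logps.append(dist.log_prob(a))
        theta, omega = step(theta.detach(), omega.detach(), a.float())
        rewards.append(1.0 if abs(theta) < 0.5 else 0.0)   # reward upright posture
        if abs(theta) > 0.5:                               # "fall" terminates episode
            break
    returns = torch.tensor(rewards).flip(0).cumsum(0).flip(0)  # returns-to-go
    loss = -(torch.stack(logps) * returns).sum()
    opt.zero_grad(); loss.backward(); opt.step()
```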