2,132 research outputs found
Learning and Generalization of Complex Robot Missions Using Motion Primitives
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Aerospace Engineering, August 2020. Kim Hyeon-jin.
Learning from demonstrations (LfD) is a promising approach that enables robots to perform specific movements. As robotic manipulators take over a growing variety of tasks, LfD algorithms are widely studied and used to specify robot configurations for many types of movement.
This dissertation presents an approach based on parametric dynamic movement primitives (PDMPs), a motion representation algorithm that is one of the relevant LfD techniques. Unlike existing motion representation algorithms, this work not only represents a prescribed motion but also computes new behavior by generalizing multiple demonstrations in the actual environment. The generalization process uses Gaussian process regression (GPR) to represent the nonlinear relationship between the PDMP parameters that determine motion and the corresponding environmental variables. When optimal demonstrations are provided as the dataset, the proposed algorithm also serves as a powerful optimal, real-time motion planner relative to existing planning algorithms.
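The GPR-based generalization step described in this abstract can be illustrated with a minimal sketch. It assumes a single scalar environmental variable and a single scalar PDMP parameter, a squared-exponential kernel, and a zero-mean prior; the names `fit_gpr` and `predict_style` and the numeric data are hypothetical and are not taken from the dissertation.

```python
import math

def rbf_kernel(a, b, length_scale):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_gpr(env_values, style_values, length_scale=0.2, noise_var=1e-8):
    """Return a function giving the GP posterior mean of the style parameter."""
    n = len(env_values)
    K = [[rbf_kernel(env_values[i], env_values[j], length_scale)
          + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve_linear(K, style_values)  # alpha = (K + noise * I)^-1 y

    def predict(env_query):
        return sum(w * rbf_kernel(env_query, x, length_scale)
                   for w, x in zip(alpha, env_values))
    return predict

# Hypothetical demonstrations: obstacle height (m) -> learned style parameter.
heights = [0.1, 0.2, 0.3, 0.4]
styles = [0.5, 0.9, 1.4, 2.0]
predict_style = fit_gpr(heights, styles)
new_style = predict_style(0.25)  # style parameter for an unseen height
```

In this sketch each demonstration contributes one (environment, parameter) pair, and a query for an unseen environment returns an interpolated parameter without replanning, which is the property that makes the mapping usable online.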
This dissertation also considers the safety of motion. Here, safety means keeping the system away from configurations that are unsafe. A safety criterion on the PDMP internal parameters is computed to check safety. The criterion covers the new behavior computed through the generalization process as well as the individual motions in the demonstration set. Demonstrations that cause unsafe movement are identified and removed, and the removed demonstrations are replaced by demonstrations verified against this criterion.
This work also presents an extension that reduces the number of demonstrations required by the PDMP framework. The extension is effective when a single mission consists of multiple sub-tasks and generalizing them would require numerous demonstrations. The full trajectories in the provided demonstrations are segmented into multiple sub-tasks representing unit motions, and multiple PDMPs are then formed independently for the correlated segments. A phase-decision process determines online which sub-task and associated PDMPs to execute, allowing multiple PDMPs to be configured autonomously within an integrated framework. GPR formulations are applied to obtain the execution time and regional goal configuration for each sub-task.
Finally, the proposed approach and its extension are validated in real experiments with mobile manipulators. The first two scenarios, on cooperative aerial transportation, demonstrate the strengths of the proposed technique in quick computation, generation of efficient movement, and safety assurance. The last scenario involves two mobile manipulators on ground vehicles and shows the effectiveness of the proposed extension in executing complex missions.
Korean abstract (translated): Learning from demonstrations (LfD) is a promising motion generation technique that enables a robot to perform a specific motion. As robot manipulators take over various tasks in human society, LfD algorithms are widely studied and used to generate motions for robots performing diverse missions.
This dissertation presents an algorithm based on parametric dynamic movement primitives (PDMPs), a motion-primitive-based motion reproduction algorithm among LfD techniques, and uses it to generate trajectories for mobile manipulators performing diverse missions. Unlike existing motion reproduction algorithms, this work does not stop at reproducing the motion expressed in the provided demonstrations; it also includes a process that generalizes the motion to new environments. The generalization process expresses the nonlinear relationship between the style parameters, which are the internal parameters of the PDMPs, and the environmental variables using Gaussian process regression (GPR). By learning from optimal demonstrations, the proposed technique can also serve as a powerful optimal real-time path planner.
This dissertation also considers the operational safety of the robot. Whereas demonstration management techniques in prior work were aimed at improving the robot's operational efficiency, this work proposes a new demonstration management technique that secures operational safety as a hard constraint. The proposed method computes a safety criterion over the style parameter values and uses it to remove demonstrations. The removed demonstrations are then replaced with demonstrations verified against the safety criterion, so that generalization performance does not degrade. This accounts for the safety of online motions as well as that of each individual demonstration, and secures safety during real-time operation of the robot manipulator. The proposed safety-aware demonstration management can also be applied to identify and efficiently reuse the demonstrations that remain usable when a change in the static setup of the environment might otherwise require replacing all demonstrations.
This dissertation further presents seg-PDMPs, an extension of PDMPs applicable to complex missions. The approach assumes that a complex mission generally consists of several simple sub-tasks. Unlike the original PDMPs, seg-PDMPs segment the whole trajectory into multiple unit motions representing the sub-tasks and build multiple PDMPs for each unit motion. The PDMPs generated for each unit motion are invoked automatically through a phase-decision process within an integrated framework. The execution time and the sub-goal for each phase are obtained from GPR-based relations with the environmental variables. As a result, this work effectively reduces the total number of required demonstrations and also improves the representation performance of each unit motion.
The proposed algorithms are validated through experiments with cooperative mobile robot manipulators. Three scenarios are covered: the first two, on aerial transportation, demonstrate that the PDMP technique achieves fast adaptation, mission efficiency, and safety for robot manipulators. The last scenario, an experiment with two robot manipulators mounted on ground vehicles, verifies that seg-PDMPs, the extension for complex missions, effectively generate generalized motions in changing environments.
1 Introduction
1.1 Motivations
1.2 Literature Survey
1.2.1 Conventional Motion Planning in Mobile Manipulations
1.2.2 Motion Representation Algorithms
1.2.3 Safety-guaranteed Motion Representation Algorithms
1.3 Research Objectives and Contributions
1.3.1 Motion Generalization in Motion Representation Algorithm
1.3.2 Motion Generalization with Safety Guarantee
1.3.3 Motion Generalization for Complex Missions
1.4 Thesis Organization
2 Background
2.1 DMPs
2.2 Mobile Manipulation Systems
2.2.1 Single Mobile Manipulation
2.2.2 Cooperative Mobile Manipulations
2.3 Experimental Setup
2.3.1 Test-beds for Aerial Manipulators
2.3.2 Test-beds for Robot Manipulators with Ground Vehicles
3 Motion Generalization in Motion Representation Algorithm
3.1 Parametric Dynamic Movement Primitives
3.2 Generalization Process in PDMPs
3.2.1 Environmental Parameters
3.2.2 Mapping Function
3.3 Simulation Results
3.3.1 Two-dimensional Hurdling Motion
3.3.2 Cooperative Aerial Transportation
4 Motion Generalization with Safety Guarantee
4.1 Safety Criterion in Style Parameter
4.2 Demonstration Management
4.3 Simulation Validation
4.3.1 Two-dimensional Hurdling Motion
4.3.2 Cooperative Aerial Transportation
5 Motion Generalization for Complex Missions
5.1 Overall Structure of Seg-PDMPs
5.2 Motion Segments
5.3 Phase-decision Process
5.4 Seg-PDMPs for Single Phase
5.5 Simulation Results
5.5.1 Initial/terminal Offsets
5.5.2 Style Generalization
5.5.3 Recombination
6 Experimental Validation and Results
6.1 Cooperative Aerial Transportation
6.2 Cooperative Mobile Hang-dry Mission
6.2.1 Demonstrations
6.2.2 Simulation Validation
6.2.3 Experimental Results
7 Conclusions
Abstract (in Korean)
Learning to Adapt the Parameters of Behavior Trees and Motion Generators (BTMGs) to Task Variations
The ability to learn new tasks and quickly adapt to different variations or
dimensions is an important attribute in agile robotics. In our previous work,
we have explored Behavior Trees and Motion Generators (BTMGs) as a robot arm
policy representation to facilitate the learning and execution of assembly
tasks. The current implementation of the BTMGs for a specific task may not be
robust to the changes in the environment and may not generalize well to
different variations of tasks. We propose to extend the BTMG policy
representation with a module that predicts BTMG parameters for a new task
variation. To achieve this, we propose a model that combines a Gaussian process
and a weighted support vector machine classifier. This model predicts the
performance measure and the feasibility of the predicted policy with BTMG
parameters and task variations as inputs. Using the outputs of the model, we
then construct a surrogate reward function that is utilized within an optimizer
to maximize the performance of a task over BTMG parameters for a fixed task
variation. To demonstrate the effectiveness of our proposed approach, we
conducted experimental evaluations on push and obstacle avoidance tasks in
simulation and with a real KUKA iiwa robot. Furthermore, we compared the
performance of our approach with four baseline methods.
Bayesian Disturbance Injection: Robust Imitation Learning of Flexible Policies
Scenarios requiring humans to choose from multiple seemingly optimal actions
are commonplace; however, standard imitation learning often fails to capture
this behavior. Instead, an over-reliance on replicating expert actions induces
inflexible and unstable policies, leading to poor generalizability in an
application. To address the problem, this paper presents the first imitation
learning framework that incorporates Bayesian variational inference for
learning flexible non-parametric multi-action policies, while simultaneously
robustifying the policies against sources of error, by introducing and
optimizing disturbances to create a richer demonstration dataset. This
combinatorial approach forces the policy to adapt to challenging situations,
enabling stable multi-action policies to be learned efficiently. The
effectiveness of our proposed method is evaluated through simulations and
real-robot experiments for a table-sweep task using the UR3 6-DOF robotic arm.
Results show that, through improved flexibility and robustness, the learning
performance and control safety are better than those of the comparison methods.
Comment: 7 pages, accepted by the 2021 International Conference on Robotics
and Automation (ICRA 2021)
Learning Skill-based Industrial Robot Tasks with User Priors
Robot skills systems are meant to reduce robot setup time for new
manufacturing tasks. Yet, for dexterous, contact-rich tasks, it is often
difficult to find the right skill parameters. One strategy is to learn these
parameters by allowing the robot system to learn directly on the task. For a
learning problem, a robot operator can typically specify the type and range of
values of the parameters. Nevertheless, given their prior experience, robot
operators should be able to help the learning process further by providing
educated guesses about where in the parameter space potential optimal solutions
could be found. Interestingly, such prior knowledge is not exploited in current
robot learning frameworks. We introduce an approach that combines user priors
and Bayesian optimization to allow fast optimization of robot industrial tasks
at robot deployment time. We evaluate our method on three tasks that are
learned in simulation as well as on two tasks that are learned directly on a
real robot system. Additionally, we transfer knowledge from the corresponding
simulation tasks by automatically constructing priors from well-performing
configurations for learning on the real system. To handle potentially
contradicting task objectives, the tasks are modeled as multi-objective
problems. Our results show that operator priors, both user-specified and
transferred, vastly accelerate the discovery of rich Pareto fronts, and
typically produce final performance far superior to the proposed baselines.
Comment: 8 pages, 6 figures, accepted at the 2022 IEEE International Conference on
Automation Science and Engineering (CASE)
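The Pareto fronts mentioned in this abstract can be illustrated with a minimal non-dominated filter. The sketch assumes every objective is to be minimized; the objective names (execution time, contact force) and the sample values are hypothetical and do not come from the paper, and the paper's own optimizer is not reproduced here.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep only the points that no other point dominates (minimization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical evaluations of skill parameters: (execution time in s, contact force in N).
evaluations = [(4.0, 9.0), (5.0, 6.0), (6.0, 7.0), (7.0, 3.0), (8.0, 4.0)]
front = pareto_front(evaluations)
# front == [(4.0, 9.0), (5.0, 6.0), (7.0, 3.0)]
```

Here (6.0, 7.0) is dropped because (5.0, 6.0) is better in both objectives, and (8.0, 4.0) is dropped because (7.0, 3.0) is; the remaining points are the trade-offs an operator would choose among.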
A structured prediction approach for robot imitation learning
We propose a structured prediction approach for robot imitation learning from demonstrations. Among various tools for robot imitation learning, supervised learning has been observed to have a prominent role. Structured prediction is a form of supervised learning that enables learning models to operate on output spaces with complex structures. Through the lens of structured prediction, we show how robots can learn to imitate trajectories belonging to not only Euclidean spaces but also Riemannian manifolds. Exploiting ideas from information theory, we propose a class of loss functions based on the f-divergence to measure the information loss between the demonstrated and reproduced probabilistic trajectories. Different types of f-divergence will result in different policies, which we call imitation modes. Furthermore, our approach enables the incorporation of spatial and temporal trajectory modulation, which is necessary for robots to adapt to changes in working conditions. We benchmark our algorithm against state-of-the-art methods in terms of trajectory reproduction and adaptation. The quantitative evaluation shows that our approach outperforms other algorithms regarding both accuracy and efficiency. We also report real-world experimental results on learning manifold trajectories in a polishing task with a KUKA LWR robot arm, illustrating the effectiveness of our algorithmic framework.
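As a small illustration of why different f-divergences give different losses, the sketch below evaluates the general definition D_f(P || Q) = sum_i q_i f(p_i / q_i) for two discrete distributions. The discrete setting, the function names, and the sample distributions are simplifying assumptions for illustration; the paper itself works with probabilistic trajectories.

```python
import math

def f_divergence(p, q, f):
    """D_f(P || Q) = sum_i q_i * f(p_i / q_i), assuming every q_i > 0."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def kl_generator(t):
    """f(t) = t log t gives the Kullback-Leibler divergence KL(P || Q)."""
    return t * math.log(t) if t > 0 else 0.0

def reverse_kl_generator(t):
    """f(t) = -log t gives the reverse divergence KL(Q || P); needs t > 0."""
    return -math.log(t)

def total_variation_generator(t):
    """f(t) = |t - 1| / 2 gives the total variation distance."""
    return 0.5 * abs(t - 1.0)

# Hypothetical demonstrated (p) and reproduced (q) distributions over three bins.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

kl = f_divergence(p, q, kl_generator)
reverse_kl = f_divergence(p, q, reverse_kl_generator)
tv = f_divergence(p, q, total_variation_generator)
```

Each generator penalizes the mismatch between `p` and `q` differently, so minimizing one or another leads to a different trade-off, which is the intuition behind the "imitation modes" described above.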