
254 research outputs found

Deep Predictive Policy Training using Reinforcement Learning

Author: Björkman Mårten
Ghadirzadeh Ali
Kragic Danica
Maki Atsuto
Publication date: 02/03/2017

Skilled robot task learning is best implemented by predictive action policies due to the inherent latency of sensorimotor processes. However, training such predictive policies is challenging as it involves finding a trajectory of motor activations for the full duration of the action. We propose a data-efficient deep predictive policy training (DPPT) framework with a deep neural network policy architecture which maps an image observation to a sequence of motor activations. The architecture consists of three sub-networks referred to as the perception, policy and behavior super-layers. The perception and behavior super-layers force an abstraction of visual and motor data trained with synthetic and simulated training samples, respectively. The policy super-layer is a small sub-network with fewer parameters that maps data in-between the abstracted manifolds. It is trained for each task using methods for policy search reinforcement learning. We demonstrate the suitability of the proposed architecture and learning framework by training predictive policies for skilled object grasping and ball throwing on a PR2 robot. The effectiveness of the method is illustrated by the fact that these tasks are trained using only about 180 real robot attempts with qualitative terminal rewards. Comment: This work is submitted to IEEE/RSJ International Conference on Intelligent Robots and Systems 2017 (IROS2017)
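The three-stage mapping described in this abstract (image, then abstracted state, then abstracted action, then motor trajectory) can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the `dense` helper, the random untrained weights and the names `IMAGE_SIDE`, `STATE_DIM`, `LATENT_ACTION_DIM`, `HORIZON` and `JOINTS` are all assumptions chosen only to show the data flow and the relative size of the policy stage.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes are hypothetical; the abstract does not state them.
IMAGE_SIDE = 32          # square input image
STATE_DIM = 5            # abstracted visual state
LATENT_ACTION_DIM = 5    # abstracted motor representation
HORIZON, JOINTS = 20, 7  # length of the motor sequence and number of joints


def dense(in_dim, out_dim):
    """Random affine layer with tanh, a stand-in for a trained sub-network."""
    weights = rng.normal(scale=0.1, size=(in_dim, out_dim))
    bias = np.zeros(out_dim)
    return lambda x: np.tanh(x @ weights + bias)


# Perception and behavior would be trained offline (synthetic / simulated data);
# only the small policy stage would be trained per task with policy search.
perception = dense(IMAGE_SIDE * IMAGE_SIDE, STATE_DIM)
policy = dense(STATE_DIM, LATENT_ACTION_DIM)
behavior = dense(LATENT_ACTION_DIM, HORIZON * JOINTS)


def predict_motor_sequence(image):
    """Map one image observation to a full sequence of motor activations."""
    state = perception(image.reshape(-1))   # image -> abstracted state
    latent_action = policy(state)           # state -> abstracted action
    return behavior(latent_action).reshape(HORIZON, JOINTS)


trajectory = predict_motor_sequence(rng.random((IMAGE_SIDE, IMAGE_SIDE)))
```

In this sketch the policy stage holds 30 parameters against several thousand in the perception stage, which mirrors the abstract's point that only a small sub-network has to be learned from real robot attempts.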

arXiv.org e-Print Archive

Planning and Learning: Path-Planning for Autonomous Vehicles, a Review of the Literature

Author: Cazenave Tristan
Guettier Christophe
Jacopin Eric
Osanlou Kevin
Publication date: 17/10/2023

This short review aims to make the reader familiar with state-of-the-art works relating to planning, scheduling and learning. First, we study state-of-the-art planning algorithms. We give a brief introduction to neural networks. Then we explore in more detail graph neural networks, a recent variant of neural networks suited for processing graph-structured inputs. We briefly describe the concept of reinforcement learning algorithms and some approaches designed to date. Next, we study some successful approaches combining neural networks for path-planning. Lastly, we focus on temporal planning problems with uncertainty. Comment: AAAI-format & update

arXiv.org e-Print Archive

Motion-Adaptive Few-Shot Fault Detection Method for Industrial Robot Gearboxes via a Residual Convolutional Neural Network

Author: 오영탁
Publication venue: Seoul National University Graduate School
Publication date: 01/08/2020

Thesis (Master's) -- Seoul National University Graduate School: Department of Mechanical Engineering, College of Engineering, August 2020. 윤병동.

Industrial robots are indispensable equipment for automated manufacturing processes because they can perform repetitive tasks with consistent precision and accuracy. However, a fault in an industrial robot can lead to an unexpected shutdown of the production line and significant economic losses, so fault detection is important. The gearbox, one of the main drivetrain components of an industrial robot, is often subjected to high torque loads, and faults occur frequently. When faults occur in the gearbox, the amplitude and frequency of the torque signal are modulated, which changes the characteristics of the torque signal. Although several previous studies have proposed fault detection methods for industrial robots using torque signals, it is still a challenge to extract fault-related features under various environmental and operating conditions and to detect faults in the complex motions used at industrial sites.

To overcome these difficulties, we propose a novel motion-adaptive few-shot (MAFS) fault detection method for industrial robot gearboxes using torque ripples via a one-dimensional (1D) residual-convolutional neural network (Res-CNN) and binary-supervised domain adaptation (BSDA). The overall procedure of the proposed method is as follows. First, moving-average filtering is applied to the torque signal to extract the data trend, and the torque ripples of the high-frequency band are obtained as the residual between the original signal and the filtered signal. Second, the state of the pre-processed torque ripples is classified under various operating and environmental conditions. It is shown that the Res-CNN 1) effectively distinguishes small differences between normal and fault torque ripples, and 2) focuses on important regions of the input data through the attention effect. Third, a Siamese network is constructed from the network pre-trained in the source domain, which consists of simple motions, and faults are detected through BSDA in the target domain, which consists of complex motions. As a result, 1) the similarities of the jointly shared physical mechanisms of torque ripples between simple and complex motions are learned, and 2) faults of the gearbox are adaptively detected while the industrial robot executes complex motions.

The proposed method achieved higher accuracy than other deep learning-based methods in few-shot conditions where only one cycle each of normal and fault data of complex motions is available. In addition, the transferable regions of the torque ripples after domain adaptation were highlighted using 1D guided Grad-CAM. The effectiveness of the proposed method was validated with experimental data of multi-axial welding motions at constant and transient speed, which are commonly executed in real industrial settings such as automobile manufacturing lines. Furthermore, the proposed method is expected to be applicable to other types of motions, such as inspection, painting and assembly. The source code is available on my GitHub page at https://github.com/oyt9306/MAFS.

Contents:
Chapter 1. Introduction: 1.1 Research Motivation; 1.2 Scope of Research; 1.3 Thesis Layout
Chapter 2. Research Backgrounds: 2.1 Interpretations of Torque Ripples (2.1.1 Causes of torque ripples; 2.1.1 Modulations on torque ripples due to gearbox faults); 2.2 Architectures of Res-CNN (2.2.1 Convolutional Operation; 2.2.2 Pooling Operation; 2.2.3 Activation; 2.2.4 Batch Normalization; 2.2.5 Residual Learning); 2.3 Domain Adaptation (DA) (2.3.1 Few-shot domain adaptation)
Chapter 3. Motion-Adaptive Few-Shot (MAFS) Fault Detection Method: 3.1 Pre-processing; 3.2 Network Pre-training; 3.3 Binary-Supervised Domain Adaptation (BSDA)
Chapter 4. Experimental Validations: 4.1 Experimental Settings; 4.2 Pre-trained Network Generation; 4.3 Motion-Adaptation with Few-Shot Learning
Chapter 5. Conclusion and Future Work: 5.1 Conclusion; 5.2 Contribution; 5.3 Future Work
Bibliography
Appendix A. 1D Guided Grad-CAM
Korean abstract
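The first pre-processing step in this abstract (a moving-average filter gives the trend, and the torque ripple is the residual between the original and filtered signals) can be illustrated with a short sketch. It is not taken from the thesis or its repository: the function name `extract_torque_ripple`, the window length, the edge padding and the synthetic signal are all assumptions made for illustration.

```python
import numpy as np


def extract_torque_ripple(torque, window=21):
    """Split a 1D torque signal into a moving-average trend and a residual ripple.

    `window` is a hypothetical filter length in samples (odd, so the filter
    is centred on each sample).
    """
    torque = np.asarray(torque, dtype=float)
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    # Repeat the edge values so the trend has the same length as the input
    # and is not pulled towards zero at the boundaries (an assumed choice).
    padded = np.pad(torque, half, mode="edge")
    kernel = np.ones(window) / window
    trend = np.convolve(padded, kernel, mode="valid")
    ripple = torque - trend  # residual: original minus filtered signal
    return trend, ripple


# Synthetic example: slow load change plus a small high-frequency component.
t = np.linspace(0.0, 1.0, 1001)
slow = 2.0 * t
fast = 0.1 * np.sin(2 * np.pi * 50 * t)
trend, ripple = extract_torque_ripple(slow + fast, window=21)
```

Because the filter is centred and symmetric, a constant or linearly changing load is absorbed into the trend away from the edges, so the residual retains mainly the higher-frequency content that a downstream classifier would receive.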

Goal-Conditioned End-to-End Visuomotor Control for Versatile Skill Primitives

Author: Groth Oliver
Hung Chia-Man
Posner Ingmar
Vedaldi Andrea
Publication date: 08/11/2020

Visuomotor control (VMC) is an effective means of achieving basic manipulation tasks such as pushing or pick-and-place from raw images. Conditioning VMC on desired goal states is a promising way of achieving versatile skill primitives. However, common conditioning schemes either rely on task-specific fine tuning, e.g. using one-shot imitation learning (IL), or on sampling approaches using a forward model of scene dynamics, i.e. model-predictive control (MPC), leaving deployability and planning horizon severely limited. In this paper we propose a conditioning scheme which avoids these pitfalls by learning the controller and its conditioning in an end-to-end manner. Our model predicts complex action sequences based directly on a dynamic image representation of the robot motion and the distance to a given target observation. In contrast to related works, this enables our approach to efficiently perform complex manipulation tasks from raw image observations without predefined control primitives or test time demonstrations. We report significant improvements in task success over representative MPC and IL baselines. We also demonstrate our model's generalisation capabilities in challenging, unseen tasks featuring visual noise, cluttered scenes and unseen object geometries. Comment: revised manuscript with additional baselines and generalisation experiments; 11 pages, 8 figures, 7 tables

arXiv.org e-Print Archive

Oxford University Research Archive

AdaptNet: Policy Adaptation for Physics-Based Character Control

Author: Andrews Sheldon
Karamouzas Ioannis
Kry Paul G.
McGuire Morgan
Neff Michael
Xie Kaixiang
Xu Pei
Zordan Victor
Publication date: 14/11/2023

Motivated by humans' ability to adapt skills in the learning of new ones, this paper presents AdaptNet, an approach for modifying the latent space of existing policies to allow new behaviors to be quickly learned from like tasks in comparison to learning from scratch. Building on top of a given reinforcement learning controller, AdaptNet uses a two-tier hierarchy that augments the original state embedding to support modest changes in a behavior and further modifies the policy network layers to make more substantive changes. The technique is shown to be effective for adapting existing physics-based controllers to a wide range of new styles for locomotion, new task targets, changes in character morphology and extensive changes in environment. Furthermore, it exhibits a significant increase in learning efficiency, as indicated by greatly reduced training times when compared to training from scratch or using other approaches that modify existing policies. Code is available at https://motion-lab.github.io/AdaptNet. Comment: SIGGRAPH Asia 2023. Video: https://youtu.be/WxmJSCNFb28. Website: https://motion-lab.github.io/AdaptNet, https://pei-xu.github.io/AdaptNe
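One possible reading of the two-tier idea in this abstract is sketched below: a frozen pre-trained policy receives a trainable offset on its state embedding (first tier) and a trainable change to a policy layer (second tier). This is not the AdaptNet code. The additive form of both modifications, the zero initialisation, the network sizes and every name (`W_enc`, `W_pol`, `W_inject`, `dW_pol`) are assumptions made only to illustrate the structure.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, LATENT_DIM, ACTION_DIM = 12, 8, 4  # hypothetical sizes

# Frozen weights of a hypothetical pre-trained controller.
W_enc = rng.normal(scale=0.3, size=(STATE_DIM, LATENT_DIM))
W_pol = rng.normal(scale=0.3, size=(LATENT_DIM, ACTION_DIM))


def base_policy(state):
    """Original controller: state -> embedding -> action."""
    z = np.tanh(state @ W_enc)
    return np.tanh(z @ W_pol)


# Trainable adaptation parameters. Zero initialisation is an assumption here:
# it makes the adapted policy start out identical to the base policy.
W_inject = np.zeros((STATE_DIM, LATENT_DIM))  # tier 1: offset on the embedding
dW_pol = np.zeros((LATENT_DIM, ACTION_DIM))   # tier 2: change to a policy layer


def adapted_policy(state):
    """Base controller with an augmented embedding and a modified policy layer."""
    z = np.tanh(state @ W_enc) + state @ W_inject
    return np.tanh(z @ (W_pol + dW_pol))


state = rng.normal(size=STATE_DIM)
base_action = base_policy(state)
adapted_action = adapted_policy(state)  # equal to base_action before training
```

Under these assumptions, training would update only `W_inject` and `dW_pol` while `W_enc` and `W_pol` stay fixed, so adaptation starts from the existing behavior rather than from scratch.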

arXiv.org e-Print Archive