254 research outputs found
Deep Predictive Policy Training using Reinforcement Learning
Skilled robot task learning is best implemented by predictive action policies
due to the inherent latency of sensorimotor processes. However, training such
predictive policies is challenging as it involves finding a trajectory of motor
activations for the full duration of the action. We propose a data-efficient
deep predictive policy training (DPPT) framework with a deep neural network
policy architecture which maps an image observation to a sequence of motor
activations. The architecture consists of three sub-networks referred to as the
perception, policy and behavior super-layers. The perception and behavior
super-layers force an abstraction of visual and motor data trained with
synthetic and simulated training samples, respectively. The policy super-layer
is a small sub-network with fewer parameters that maps data between the
abstracted manifolds. It is trained for each task using methods for policy
search reinforcement learning. We demonstrate the suitability of the proposed
architecture and learning framework by training predictive policies for skilled
object grasping and ball throwing on a PR2 robot. The effectiveness of the
method is illustrated by the fact that these tasks are trained using only about
180 real robot attempts with qualitative terminal rewards.
Comment: This work is submitted to IEEE/RSJ International Conference on
Intelligent Robots and Systems 2017 (IROS2017
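As a rough illustration of the data flow described in the abstract above (image observation, then perception, policy and behavior super-layers, then a sequence of motor activations), the following is a minimal sketch. It is not the authors' architecture: each super-layer is replaced by a single random linear map, and every size (`IMAGE_PIXELS`, `STATE_DIM`, `ACTION_LATENT_DIM`, `N_JOINTS`, `N_STEPS`) and function name is a hypothetical choice made only for this example.

```python
import random


def random_matrix(rows, cols, rng):
    """Build a rows x cols weight matrix with small random entries."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def count_params(weights):
    """Number of scalar weights in a matrix."""
    return sum(len(row) for row in weights)


# Hypothetical sizes, chosen only for illustration.
IMAGE_PIXELS = 64        # flattened image observation
STATE_DIM = 5            # abstracted visual state
ACTION_LATENT_DIM = 5    # abstracted motor representation
N_JOINTS, N_STEPS = 7, 20

rng = random.Random(0)
# Stand-ins for the three super-layers. In the abstract, perception and
# behavior are trained on synthetic and simulated data, and only the small
# policy part is trained per task with policy search.
perception = random_matrix(STATE_DIM, IMAGE_PIXELS, rng)
policy = random_matrix(ACTION_LATENT_DIM, STATE_DIM, rng)
behavior = random_matrix(N_JOINTS * N_STEPS, ACTION_LATENT_DIM, rng)


def predict_trajectory(image):
    """Map one image to a full sequence of motor activations."""
    state = linear(perception, image)    # image -> low-dimensional state
    latent = linear(policy, state)       # state -> motor latent
    flat = linear(behavior, latent)      # motor latent -> flat trajectory
    return [flat[t * N_JOINTS:(t + 1) * N_JOINTS] for t in range(N_STEPS)]


trajectory = predict_trajectory([0.5] * IMAGE_PIXELS)
print(len(trajectory), "time steps of", len(trajectory[0]), "activations")
print("policy parameters:", count_params(policy))
```

In this toy setup the policy map has far fewer weights than the other two, which mirrors the abstract's point that only a small sub-network needs to be trained on the real robot.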
Planning and Learning: Path-Planning for Autonomous Vehicles, a Review of the Literature
This short review aims to make the reader familiar with state-of-the-art
works relating to planning, scheduling and learning. First, we study
state-of-the-art planning algorithms. We give a brief introduction to neural
networks. Then we explore in more detail graph neural networks, a recent
variant of neural networks suited for processing graph-structured inputs. We
describe briefly the concept of reinforcement learning algorithms and some
approaches designed to date. Next, we study some successful approaches
combining neural networks for path-planning. Lastly, we focus on temporal
planning problems with uncertainty.
Comment: AAAI-format & update
Motion-Adaptive Few-Shot Fault Detection Method for Industrial Robot Gearboxes via a Residual Convolutional Neural Network
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Mechanical Engineering, August 2020. 윤병동.
Industrial robots are indispensable equipment for automated manufacturing because they perform repetitive tasks with consistent precision and accuracy. However, a fault in an industrial robot can cause an unexpected shutdown of the production line and significant economic losses, so fault detection is important. The gearbox, one of the main drivetrain components of an industrial robot, is often subjected to high torque loads, and faults occur frequently. When a fault occurs in the gearbox, the amplitude and frequency of the torque signal are modulated, which changes the characteristics of the torque signal. Although several previous studies have proposed fault detection methods for industrial robots using torque signals, it remains challenging to extract fault-related features under various environmental and operating conditions and to detect faults during the complex motions used at industrial sites.
To overcome these difficulties, we propose a novel motion-adaptive few-shot (MAFS) fault detection method for industrial robot gearboxes that uses torque ripples, a one-dimensional (1D) residual-convolutional neural network (Res-CNN), and binary-supervised domain adaptation (BSDA). The overall procedure is as follows. First, moving-average filtering is applied to the torque signal to extract the data trend, and the high-frequency torque ripples are obtained as the residual between the original signal and the filtered signal. Second, the state of the pre-processed torque ripples is classified under various operating and environmental conditions. The Res-CNN is shown to 1) effectively distinguish small differences between normal and faulty torque ripples and 2) focus on important regions of the input data through an attention effect. Third, a Siamese network is constructed from the network pre-trained on the source domain, which consists of simple motions, and faults are detected through BSDA on the target domain, which consists of complex motions. As a result, 1) the similarities in the physical mechanisms of torque ripples shared between simple and complex motions are learned, and 2) gearbox faults are adaptively detected while the industrial robot executes complex motions. The proposed method achieved the highest accuracy among the compared deep learning-based methods in few-shot conditions, where only one cycle each of normal and fault data from complex motions is available. In addition, the transferable regions of the torque ripples after domain adaptation were highlighted using 1D guided Grad-CAM.
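The first step of the procedure above (a moving-average filter for the trend, with the ripple taken as the residual) can be sketched as follows. This is an illustration under stated assumptions, not the thesis code: the abstract does not give the window length or the edge handling, so the centered window, the shrinking window at the edges, the window size of 21 samples and the synthetic test signal are all hypothetical choices.

```python
import math


def moving_average(signal, window):
    """Centered moving average; the window shrinks near the edges (assumed)."""
    half = window // 2
    trend = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        trend.append(sum(signal[lo:hi]) / (hi - lo))
    return trend


def torque_ripple(signal, window):
    """Residual between the original signal and its moving-average trend."""
    trend = moving_average(signal, window)
    return [s - t for s, t in zip(signal, trend)]


# Synthetic stand-in for a torque signal: a slow motion-dependent trend
# plus a small high-frequency component playing the role of the ripple.
n = 400
torque = [
    10.0 * math.sin(2 * math.pi * i / n)      # slow trend
    + 0.3 * math.sin(2 * math.pi * i / 8)     # fast ripple
    for i in range(n)
]

ripple = torque_ripple(torque, window=21)     # hypothetical window length
print("max |torque| =", round(max(abs(v) for v in torque), 2))
print("max |ripple| =", round(max(abs(v) for v in ripple), 2))
```

Because the moving average acts as a low-pass filter, subtracting it leaves mostly the high-frequency content, which in the method described above is what the network receives as input.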
The effectiveness of the proposed method was validated with experimental data from multi-axial welding motions at constant and transient speeds, which are commonly executed in real industrial settings such as automobile manufacturing lines. Furthermore, the proposed method is expected to be applicable to other types of motions, such as inspection, painting, and assembly. The source code is available at https://github.com/oyt9306/MAFS.
Chapter 1. Introduction
1.1 Research Motivation
1.2 Scope of Research
1.3 Thesis Layout
Chapter 2. Research Backgrounds
2.1 Interpretations of Torque Ripples
2.1.1 Causes of torque ripples
2.1.2 Modulations on torque ripples due to gearbox faults
2.2 Architectures of Res-CNN
2.2.1 Convolutional Operation
2.2.2 Pooling Operation
2.2.3 Activation
2.2.4 Batch Normalization
2.2.5 Residual Learning
2.3 Domain Adaptation (DA)
2.3.1 Few-shot domain adaptation
Chapter 3. Motion-Adaptive Few-Shot (MAFS) Fault Detection Method
3.1 Pre-processing
3.2 Network Pre-training
3.3 Binary-Supervised Domain Adaptation (BSDA)
Chapter 4. Experimental Validations
4.1 Experimental Settings
4.2 Pre-trained Network Generation
4.3 Motion-Adaptation with Few-Shot Learning
Chapter 5. Conclusion and Future Work
5.1 Conclusion
5.2 Contribution
5.3 Future Work
Bibliography
Appendix A. 1D Guided Grad-CAM
Abstract in Korean
Maste
Goal-Conditioned End-to-End Visuomotor Control for Versatile Skill Primitives
Visuomotor control (VMC) is an effective means of achieving basic
manipulation tasks such as pushing or pick-and-place from raw images.
Conditioning VMC on desired goal states is a promising way of achieving
versatile skill primitives. However, common conditioning schemes either rely on
task-specific fine tuning - e.g. using one-shot imitation learning (IL) - or on
sampling approaches using a forward model of scene dynamics, i.e.
model-predictive control (MPC), leaving deployability and planning horizon
severely limited. In this paper we propose a conditioning scheme which avoids
these pitfalls by learning the controller and its conditioning in an end-to-end
manner. Our model predicts complex action sequences based directly on a dynamic
image representation of the robot motion and the distance to a given target
observation. In contrast to related works, this enables our approach to
efficiently perform complex manipulation tasks from raw image observations
without predefined control primitives or test time demonstrations. We report
significant improvements in task success over representative MPC and IL
baselines. We also demonstrate our model's generalisation capabilities in
challenging, unseen tasks featuring visual noise, cluttered scenes and unseen
object geometries.
Comment: revised manuscript with additional baselines and generalisation
experiments; 11 pages, 8 figures, 7 table
AdaptNet: Policy Adaptation for Physics-Based Character Control
Motivated by humans' ability to adapt skills in the learning of new ones,
this paper presents AdaptNet, an approach for modifying the latent space of
existing policies to allow new behaviors to be quickly learned from like tasks
in comparison to learning from scratch. Building on top of a given
reinforcement learning controller, AdaptNet uses a two-tier hierarchy that
augments the original state embedding to support modest changes in a behavior
and further modifies the policy network layers to make more substantive
changes. The technique is shown to be effective for adapting existing
physics-based controllers to a wide range of new styles for locomotion, new
task targets, changes in character morphology and extensive changes in
environment. Furthermore, it exhibits a significant increase in learning
efficiency, as indicated by greatly reduced training times when compared to
training from scratch or using other approaches that modify existing policies.
Code is available at https://motion-lab.github.io/AdaptNet.
Comment: SIGGRAPH Asia 2023. Video: https://youtu.be/WxmJSCNFb28. Website:
https://motion-lab.github.io/AdaptNet, https://pei-xu.github.io/AdaptNe