62,905 research outputs found
Developmental Bayesian Optimization of Black-Box with Visual Similarity-Based Transfer Learning
We present a developmental framework based on a long-term memory and
reasoning mechanisms (Vision Similarity and Bayesian Optimisation). This
architecture allows a robot to optimize autonomously hyper-parameters that need
to be tuned from any action and/or vision module, treated as a black-box. The
learning can take advantage of past experiences (stored in the episodic and
procedural memories) in order to warm-start the exploration using a set of
hyper-parameters previously optimized from objects similar to the new unknown
one (stored in a semantic memory). As example, the system has been used to
optimized 9 continuous hyper-parameters of a professional software (Kamido)
both in simulation and with a real robot (industrial robotic arm Fanuc) with a
total of 13 different objects. The robot is able to find a good object-specific
optimization in 68 (simulation) or 40 (real) trials. In simulation, we
demonstrate the benefit of the transfer learning based on visual similarity, as
opposed to an amnesic learning (i.e. learning from scratch all the time).
Moreover, with the real robot, we show that the method consistently outperforms
the manual optimization from an expert with less than 2 hours of training time
to achieve more than 88% of success
Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning
Skilled robotic manipulation benefits from complex synergies between
non-prehensile (e.g. pushing) and prehensile (e.g. grasping) actions: pushing
can help rearrange cluttered objects to make space for arms and fingers;
likewise, grasping can help displace objects to make pushing movements more
precise and collision-free. In this work, we demonstrate that it is possible to
discover and learn these synergies from scratch through model-free deep
reinforcement learning. Our method involves training two fully convolutional
networks that map from visual observations to actions: one infers the utility
of pushes for a dense pixel-wise sampling of end effector orientations and
locations, while the other does the same for grasping. Both networks are
trained jointly in a Q-learning framework and are entirely self-supervised by
trial and error, where rewards are provided from successful grasps. In this
way, our policy learns pushing motions that enable future grasps, while
learning grasps that can leverage past pushes. During picking experiments in
both simulation and real-world scenarios, we find that our system quickly
learns complex behaviors amid challenging cases of clutter, and achieves better
grasping success rates and picking efficiencies than baseline alternatives
after only a few hours of training. We further demonstrate that our method is
capable of generalizing to novel objects. Qualitative results (videos), code,
pre-trained models, and simulation environments are available at
http://vpg.cs.princeton.eduComment: To appear at the International Conference On Intelligent Robots and
Systems (IROS) 2018. Project webpage: http://vpg.cs.princeton.edu Summary
video: https://youtu.be/-OkyX7Zlhi
- …