We develop an approach that benefits from large simulated datasets and takes
full advantage of the limited, but most relevant, online data. We propose a
variant of Bayesian optimization that alternates between using informed and
uninformed kernels. With this Bernoulli Alternation Kernel we ensure that
discrepancies between simulation and reality do not hinder adapting robot
control policies online. The proposed approach is applied to a challenging
real-world problem of task-oriented grasping with novel objects. Our further
contribution is a neural network architecture and training pipeline that use
experience from grasping objects in simulation to learn grasp stability scores.
We learn task scores from a labeled dataset with a convolutional network, which
is used to construct an informed kernel for our variant of Bayesian
optimization. Experiments on an ABB YuMi robot with real sensor data
demonstrate the success of our approach, despite the challenge of fulfilling task
requirements and high uncertainty over physical properties of objects.

Comment: To appear in 2nd Conference on Robot Learning (CoRL) 201
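
The alternation between informed and uninformed kernels can be illustrated with a minimal sketch. It shows one possible reading of the idea, not the authors' implementation: at each Bayesian optimization step a Bernoulli draw selects either an informed kernel (here, an RBF kernel over a hypothetical simulation-derived feature map) or an uninformed kernel (an RBF kernel over the raw parameters). All function names, the feature map `sim_features`, the probability `p_informed`, the UCB acquisition rule, and the toy objective are assumptions introduced for illustration.

```python
import numpy as np


def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel between two sets of row vectors."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)


def sim_features(x):
    """Hypothetical feature map standing in for simulation-based scores."""
    return np.sin(3.0 * x)


def informed_kernel(a, b):
    # Assumption: similarity is measured in the simulation-derived feature space.
    return rbf(sim_features(a), sim_features(b))


def uninformed_kernel(a, b):
    # Plain RBF kernel on the raw parameters, with no simulation knowledge.
    return rbf(a, b)


def gp_posterior(kernel, x_obs, y_obs, x_query, noise=1e-3):
    """Zero-mean GP posterior mean and standard deviation at x_query."""
    k_oo = kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_qo = kernel(x_query, x_obs)
    mean = k_qo @ np.linalg.solve(k_oo, y_obs)
    var = 1.0 - np.einsum("ij,ji->i", k_qo, np.linalg.solve(k_oo, k_qo.T))
    return mean, np.sqrt(np.clip(var, 0.0, None))


def alternating_bo(objective, n_iters=15, p_informed=0.5, seed=0):
    """Bayesian optimization that picks the kernel by a Bernoulli draw per step."""
    rng = np.random.default_rng(seed)
    candidates = np.linspace(0.0, 1.0, 101)[:, None]
    x_obs = candidates[rng.integers(len(candidates), size=2)]
    y_obs = np.array([objective(x[0]) for x in x_obs])
    for _ in range(n_iters):
        use_informed = rng.random() < p_informed  # Bernoulli(p_informed)
        kernel = informed_kernel if use_informed else uninformed_kernel
        mean, std = gp_posterior(kernel, x_obs, y_obs, candidates)
        x_next = candidates[np.argmax(mean + 2.0 * std)]  # UCB acquisition
        x_obs = np.vstack([x_obs, x_next])
        y_obs = np.append(y_obs, objective(x_next[0]))
    best = int(np.argmax(y_obs))
    return float(x_obs[best, 0]), float(y_obs[best])


def toy_objective(x):
    """Toy stand-in for a real-world score; maximum at x = 0.7."""
    return -(x - 0.7) ** 2


if __name__ == "__main__":
    print(alternating_bo(toy_objective))
```

In this sketch, steps that use the uninformed kernel let the search recover when the simulation-derived features are misleading, while steps that use the informed kernel exploit them when they are accurate.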