Sense, Think, Grasp: A study on visual and tactile information processing for autonomous manipulation

Abstract

Interacting with the environment using hands is one of the distinctive abilities of humans with respect to other species. This aptitude reflects on the crucial role played by objects\u2019 manipulation in the world that we have shaped for us. With a view of bringing robots outside industries for supporting people during everyday life, the ability of manipulating objects autonomously and in unstructured environments is therefore one of the basic skills they need. Autonomous manipulation is characterized by great complexity especially regarding the processing of sensors information to perceive the surrounding environment. Humans rely on vision for wideranging tridimensional information, prioprioception for the awareness of the relative position of their own body in the space and the sense of touch for local information when physical interaction with objects happens. The study of autonomous manipulation in robotics aims at transferring similar perceptive skills to robots so that, combined with state of the art control techniques, they could be able to achieve similar performance in manipulating objects. The great complexity of this task makes autonomous manipulation one of the open problems in robotics that has been drawing increasingly the research attention in the latest years. In this work of Thesis, we propose possible solutions to some key components of autonomous manipulation, focusing in particular on the perception problem and testing the developed approaches on the humanoid robotic platform iCub. When available, vision is the first source of information to be processed for inferring how to interact with objects. The object modeling and grasping pipeline based on superquadric functions we designed meets this need, since it reconstructs the object 3D model from partial point cloud and computes a suitable hand pose for grasping the object. Retrieving objects information with touch sensors only is a relevant skill that becomes crucial when vision is occluded, as happens for instance during physical interaction with the object. We addressed this problem with the design of a novel tactile localization algorithm, named Memory Unscented Particle Filter, capable of localizing and recognizing objects relying solely on 3D contact points collected on the object surface. Another key point of autonomous manipulation we report on in this Thesis work is bi-manual coordination. The execution of more advanced manipulation tasks in fact might require the use and coordination of two arms. Tool usage for instance often requires a proper in-hand object pose that can be obtained via dual-arm re-grasping. In pick-and-place tasks sometimes the initial and target position of the object do not belong to the same arm workspace, then requiring to use one hand for lifting the object and the other for locating it in the new position. At this regard, we implemented a pipeline for executing the handover task, i.e. the sequences of actions for autonomously passing an object from one robot hand on to the other. The contributions described thus far address specific subproblems of the more complex task of autonomous manipulation. This actually differs from what humans do, in that humans develop their manipulation skills by learning through experience and trial-and-error strategy. Aproper mathematical formulation for encoding this learning approach is given by Deep Reinforcement Learning, that has recently proved to be successful in many robotics applications. For this reason, in this Thesis we report also on the six month experience carried out at Berkeley Artificial Intelligence Research laboratory with the goal of studying Deep Reinforcement Learning and its application to autonomous manipulation

    Similar works