Training convolutional neural networks in virtual reality for grasp detection from 3D images

Abstract

Master's thesis in Cybernetics and Signal Processing

The focus of this project has been on training convolutional neural networks for grasp detection with synthetic data. Convolutional neural networks have had great success on a wide variety of computer vision tasks, but they require large amounts of labelled training data, which is currently nonexistent for grasp detection tasks. In this thesis, a novel approach for generating large amounts of synthetic data for grasp detection is proposed. By working solely with depth images, realistic-looking data can be generated from 3D models in a virtual environment. Simulated physics is proposed to ensure that the generated depth images capture objects in natural poses. Additionally, the use of heuristics for choosing the best grip vectors for the objects in relation to their environment is proposed, to serve as labels for the generated depth images. A virtual environment for synthetic depth-image generation was created, and a convolutional neural network was trained on the generated data. The results show that neural networks can find good grasps from the synthetic depth images for three different types of objects in cluttered scenes. A novel way of creating real-world data sets for grasping, using a head-mounted display and tracked hand controllers, is also proposed. The results indicate that this may enable fast and easy labelling of real data, which non-technical people can perform without training.
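The pipeline described above could be sketched in miniature as follows. This is a hedged illustration, not the thesis implementation: spheres stand in for the 3D object models, "settling" is reduced to placing each sphere on the ground plane rather than running a physics engine, the overhead depth camera is a simple orthographic height-map render, and the grasp heuristic is the crudest possible one (grasp the point closest to the camera). All function names and parameters here are hypothetical.

```python
import numpy as np

def render_depth(centers, radii, size=64, extent=1.0, camera_height=2.0):
    """Render a top-down orthographic depth image of spheres.

    Each pixel holds the distance from an overhead virtual camera to the
    nearest surface; background pixels get the ground-plane depth.
    """
    ys, xs = np.meshgrid(np.linspace(0.0, extent, size),
                         np.linspace(0.0, extent, size), indexing="ij")
    height = np.zeros((size, size))  # ground plane at height 0
    for (cx, cy, cz), r in zip(centers, radii):
        d2 = (xs - cx) ** 2 + (ys - cy) ** 2
        mask = d2 < r ** 2
        top = cz + np.sqrt(np.maximum(r ** 2 - d2, 0.0))
        height[mask] = np.maximum(height[mask], top[mask])
    return camera_height - height  # depth = distance from camera

def heuristic_grasp_label(depth):
    """Crude heuristic label: the pixel closest to the camera, i.e. the
    highest exposed object point, as a (row, col) top-grasp target."""
    return np.unravel_index(np.argmin(depth), depth.shape)

# Stand-in for physics settling: a sphere resting on the ground plane
# has its centre at height equal to its radius.
rng = np.random.default_rng(0)
radii = rng.uniform(0.05, 0.1, size=3)
centers = [(x, y, r) for (x, y), r in zip(rng.uniform(0.2, 0.8, (3, 2)), radii)]

depth = render_depth(centers, radii)        # synthetic training image
row, col = heuristic_grasp_label(depth)     # synthetic training label
```

In the actual pipeline each `(depth, label)` pair would come from physically simulated object drops and a richer grasp-quality heuristic, and the pairs would form the training set for the convolutional network.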
