92 research outputs found

    Vision-based Robotic Grasping in Simulation using Deep Reinforcement Learning

    Get PDF
    This thesis will investigate different robotic manipulation and grasping approaches. It will present an overview of robotic simulation environments, and offer an evaluation of PyBullet, CoppeliaSim, and Gazebo, comparing various features. The thesis further presents a background for current approaches to robotic manipulation and grasping by describing how the robotic movement and grasping can be organized. State-of-the-Art approaches for learning robotic grasping, both using supervised methods and reinforcement learning methods are presented. Two set of experiments will be conducted in PyBullet, illustrating how Deep Reinforcement Learning methods could be applied to train a 7 degrees of freedom robotic arm to grasp objects

    Vision-based and marker-less surgical tool detection and tracking: a review of the literature

    Get PDF
    In recent years, tremendous progress has been made in surgical practice for example with Minimally Invasive Surgery (MIS). To overcome challenges coming from deported eye-to-hand manipulation, robotic and computer-assisted systems have been developed. Having real-time knowledge of the pose of surgical tools with respect to the surgical camera and underlying anatomy is a key ingredient for such systems. In this paper, we present a review of the literature dealing with vision-based and marker-less surgical tool detection. This paper includes three primary contributions: (1) identification and analysis of data-sets used for developing and testing detection algorithms, (2) in-depth comparison of surgical tool detection methods from the feature extraction process to the model learning strategy and highlight existing shortcomings, and (3) analysis of validation techniques employed to obtain detection performance results and establish comparison between surgical tool detectors. The papers included in the review were selected through PubMed and Google Scholar searches using the keywords: “surgical tool detection”, “surgical tool tracking”, “surgical instrument detection” and “surgical instrument tracking” limiting results to the year range 2000 2015. Our study shows that despite significant progress over the years, the lack of established surgical tool data-sets, and reference format for performance assessment and method ranking is preventing faster improvement

    Data-Driven Grasp Synthesis - A Survey

    Full text link
    We review the work on data-driven grasp synthesis and the methodologies for sampling and ranking candidate grasps. We divide the approaches into three groups based on whether they synthesize grasps for known, familiar or unknown objects. This structure allows us to identify common object representations and perceptual processes that facilitate the employed data-driven grasp synthesis technique. In the case of known objects, we concentrate on the approaches that are based on object recognition and pose estimation. In the case of familiar objects, the techniques use some form of a similarity matching to a set of previously encountered objects. Finally for the approaches dealing with unknown objects, the core part is the extraction of specific features that are indicative of good grasps. Our survey provides an overview of the different methodologies and discusses open problems in the area of robot grasping. We also draw a parallel to the classical approaches that rely on analytic formulations.Comment: 20 pages, 30 Figures, submitted to IEEE Transactions on Robotic

    Ultra high frequency (UHF) radio-frequency identification (RFID) for robot perception and mobile manipulation

    Get PDF
    Personal robots with autonomy, mobility, and manipulation capabilities have the potential to dramatically improve quality of life for various user populations, such as older adults and individuals with motor impairments. Unfortunately, unstructured environments present many challenges that hinder robot deployment in ordinary homes. This thesis seeks to address some of these challenges through a new robotic sensing modality that leverages a small amount of environmental augmentation in the form of Ultra High Frequency (UHF) Radio-Frequency Identification (RFID) tags. Previous research has demonstrated the utility of infrastructure tags (affixed to walls) for robot localization; in this thesis, we specifically focus on tagging objects. Owing to their low-cost and passive (battery-free) operation, users can apply UHF RFID tags to hundreds of objects throughout their homes. The tags provide two valuable properties for robots: a unique identifier and receive signal strength indicator (RSSI, the strength of a tag's response). This thesis explores robot behaviors and radio frequency perception techniques using robot-mounted UHF RFID readers that enable a robot to efficiently discover, locate, and interact with UHF RFID tags applied to objects and people of interest. The behaviors and algorithms explicitly rely on the robot's mobility and manipulation capabilities to provide multiple opportunistic views of the complex electromagnetic landscape inside a home environment. The electromagnetic properties of RFID tags change when applied to common household objects. Objects can have varied material properties, can be placed in diverse orientations, and be relocated to completely new environments. We present a new class of optimization-based techniques for RFID sensing that are robust to the variation in tag performance caused by these complexities. We discuss a hybrid global-local search algorithm where a robot employing long-range directional antennas searches for tagged objects by maximizing expected RSSI measurements; that is, the robot attempts to position itself (1) near a desired tagged object and (2) oriented towards it. The robot first performs a sparse, global RFID search to locate a pose in the neighborhood of the tagged object, followed by a series of local search behaviors (bearing estimation and RFID servoing) to refine the robot's state within the local basin of attraction. We report on RFID search experiments performed in Georgia Tech's Aware Home (a real home). Our optimization-based approach yields superior performance compared to state of the art tag localization algorithms, does not require RF sensor models, is easy to implement, and generalizes to other short-range RFID sensor systems embedded in a robot's end effector. We demonstrate proof of concept applications, such as medication delivery and multi-sensor fusion, using these techniques. Through our experimental results, we show that UHF RFID is a complementary sensing modality that can assist robots in unstructured human environments.PhDCommittee Chair: Kemp, Charles C.; Committee Member: Abowd, Gregory; Committee Member: Howard, Ayanna; Committee Member: Ingram, Mary Ann; Committee Member: Reynolds, Matt; Committee Member: Tentzeris, Emmanoui

    Improved Deep Neural Networks for Generative Robotic Grasping

    Get PDF
    This thesis provides a thorough evaluation of current state-of-the-art robotic grasping methods and contributes to a subset of data-driven grasp estimation approaches, termed generative models. These models aim to directly generate grasp region proposals from a given image without the need for a separate analysis and ranking step, which can be computationally expensive. This approach allows for fully end-to-end training of a model and quick closed-loop operation of a robot arm. A number of limitations are identified within these generative models, which are identified and addressed. Contributions are proposed that directly target each stage of the training pipeline that help to form accurate grasp proposals and generalise better to unseen objects. Firstly, inspired by theories of object manipulation within the mammalian visual system, the use of multi-task learning in existing generative architectures is evaluated. This aims to improve the performance of grasping algorithms when presented with impoverished colour (RGB) data by training models to perform simultaneous tasks such as object categorisation, saliency detection, and depth reconstruction. Secondly, a novel loss function is introduced which improves overall performance by rewarding the network to focus only on learning grasps at suitable positions. This reduces overall training times and results in better performance on fewer training examples. The last contribution analyses the problems with the most common metric used for evaluating and comparing offline performance between different grasping models and algorithms. To this end, a Gaussian method of representing ground-truth labelled grasps is put forward, which optimal grasp locations tested in a simulated grasping environment. The combination of these novel additions to generative models results in improved grasp success, accuracy, and performance on common benchmark datasets compared to previous approaches. Furthermore, the efficacy of these contributions is also tested when transferred to a physical robotic arm, demonstrating the ability to effectively grasp previously unseen 3D printed objects of varying complexity and difficulty without the need for domain adaptation. Finally, the future directions are discussed for generative convolutional models within the overall field of robotic grasping

    Image Compositing for Segmentation of Surgical Tools Without Manual Annotations

    Get PDF
    Producing manual, pixel-accurate, image segmentation labels is tedious and time-consuming. This is often a rate-limiting factor when large amounts of labeled images are required, such as for training deep convolutional networks for instrument-background segmentation in surgical scenes. No large datasets comparable to industry standards in the computer vision community are available for this task. To circumvent this problem, we propose to automate the creation of a realistic training dataset by exploiting techniques stemming from special effects and harnessing them to target training performance rather than visual appeal. Foreground data is captured by placing sample surgical instruments over a chroma key (a.k.a. green screen) in a controlled environment, thereby making extraction of the relevant image segment straightforward. Multiple lighting conditions and viewpoints can be captured and introduced in the simulation by moving the instruments and camera and modulating the light source. Background data is captured by collecting videos that do not contain instruments. In the absence of pre-existing instrument-free background videos, minimal labeling effort is required, just to select frames that do not contain surgical instruments from videos of surgical interventions freely available online. We compare different methods to blend instruments over tissue and propose a novel data augmentation approach that takes advantage of the plurality of options. We show that by training a vanilla U-Net on semi-synthetic data only and applying a simple post-processing, we are able to match the results of the same network trained on a publicly available manually labeled real dataset
    • …
    corecore