2,431 research outputs found

    Integrated visual perception architecture for robotic clothes perception and manipulation

    Get PDF
    This thesis proposes a generic visual perception architecture for robotic clothes perception and manipulation. This proposed architecture is fully integrated with a stereo vision system and a dual-arm robot and is able to perform a number of autonomous laundering tasks. Clothes perception and manipulation is a novel research topic in robotics and has experienced rapid development in recent years. Compared to the task of perceiving and manipulating rigid objects, clothes perception and manipulation poses a greater challenge. This can be attributed to two reasons: firstly, deformable clothing requires precise (high-acuity) visual perception and dexterous manipulation; secondly, as clothing approximates a non-rigid 2-manifold in 3-space, that can adopt a quasi-infinite configuration space, the potential variability in the appearance of clothing items makes them difficult to understand, identify uniquely, and interact with by machine. From an applications perspective, and as part of EU CloPeMa project, the integrated visual perception architecture refines a pre-existing clothing manipulation pipeline by completing pre-wash clothes (category) sorting (using single-shot or interactive perception for garment categorisation and manipulation) and post-wash dual-arm flattening. To the best of the author’s knowledge, as investigated in this thesis, the autonomous clothing perception and manipulation solutions presented here were first proposed and reported by the author. All of the reported robot demonstrations in this work follow a perception-manipulation method- ology where visual and tactile feedback (in the form of surface wrinkledness captured by the high accuracy depth sensor i.e. CloPeMa stereo head or the predictive confidence modelled by Gaussian Processing) serve as the halting criteria in the flattening and sorting tasks, respectively. From scientific perspective, the proposed visual perception architecture addresses the above challenges by parsing and grouping 3D clothing configurations hierarchically from low-level curvatures, through mid-level surface shape representations (providing topological descriptions and 3D texture representations), to high-level semantic structures and statistical descriptions. A range of visual features such as Shape Index, Surface Topologies Analysis and Local Binary Patterns have been adapted within this work to parse clothing surfaces and textures and several novel features have been devised, including B-Spline Patches with Locality-Constrained Linear coding, and Topology Spatial Distance to describe and quantify generic landmarks (wrinkles and folds). The essence of this proposed architecture comprises 3D generic surface parsing and interpretation, which is critical to underpinning a number of laundering tasks and has the potential to be extended to other rigid and non-rigid object perception and manipulation tasks. The experimental results presented in this thesis demonstrate that: firstly, the proposed grasp- ing approach achieves on-average 84.7% accuracy; secondly, the proposed flattening approach is able to flatten towels, t-shirts and pants (shorts) within 9 iterations on-average; thirdly, the proposed clothes recognition pipeline can recognise clothes categories from highly wrinkled configurations and advances the state-of-the-art by 36% in terms of classification accuracy, achieving an 83.2% true-positive classification rate when discriminating between five categories of clothes; finally the Gaussian Process based interactive perception approach exhibits a substantial improvement over single-shot perception. Accordingly, this thesis has advanced the state-of-the-art of robot clothes perception and manipulation

    Autonomous clothes manipulation using a hierarchical vision architecture

    Get PDF
    This paper presents a novel robot vision architecture for perceiving generic 3-D clothes configurations. Our architecture is hierarchically structured, starting from low-level curvature features to mid-level geometric shapes and topology descriptions, and finally, high-level semantic surface descriptions. We demonstrate our robot vision architecture in a customized dual-arm industrial robot with our inhouse developed stereo vision system, carrying out autonomous grasping and dual-arm flattening. The experimental results show the effectiveness of the proposed dual-arm flattening using the stereo vision system compared with the single-arm flattening using the widely cited Kinect-like sensor as the baseline. In addition, the proposed grasping approach achieves satisfactory performance when grasping various kind of garments, verifying the capability of the proposed visual perception architecture to be adapted to more than one clothing manipulation tasks

    On the Calibration of Active Binocular and RGBD Vision Systems for Dual-Arm Robots

    Get PDF
    This paper describes a camera and hand-eye calibration methodology for integrating an active binocular robot head within a dual-arm robot. For this purpose, we derive the forward kinematic model of our active robot head and describe our methodology for calibrating and integrating our robot head. This rigid calibration provides a closedform hand-to-eye solution. We then present an approach for updating dynamically camera external parameters for optimal 3D reconstruction that are the foundation for robotic tasks such as grasping and manipulating rigid and deformable objects. We show from experimental results that our robot head achieves an overall sub millimetre accuracy of less than 0.3 millimetres while recovering the 3D structure of a scene. In addition, we report a comparative study between current RGBD cameras and our active stereo head within two dual-arm robotic testbeds that demonstrates the accuracy and portability of our proposed methodology

    Visual Perception of Garments for their Robotic Manipulation

    Get PDF
    Tématem předložené práce je strojové vnímání textilií založené na obrazové informaci a využité pro jejich robotickou manipulaci. Práce studuje několik reprezentativních textilií v běžných kognitivně-manipulačních úlohách, jako je například třídění neznámých oděvů podle typu nebo jejich skládání. Některé z těchto činností by v budoucnu mohly být vykonávány domácími robotickými pomocníky. Strojová manipulace s textiliemi je poptávaná také v průmyslu. Hlavní výzvou řešeného problému je měkkost a s tím související vysoká deformovatelnost textilií, které se tak mohou nacházet v bezpočtu vizuálně velmi odlišných stavů.The presented work addresses the visual perception of garments applied for their robotic manipulation. Various types of garments are considered in the typical perception and manipulation tasks, including their classification, folding or unfolding. Our work is motivated by the possibility of having humanoid household robots performing these tasks for us in the future, as well as by the industrial applications. The main challenge is the high deformability of garments, which can be posed in infinitely many configurations with a significantly varying appearance

    Robotic system for garment perception and manipulation

    Get PDF
    Mención Internacional en el título de doctorGarments are a key element of people’s daily lives, as many domestic tasks -such as laundry-, revolve around them. Performing such tasks, generally dull and repetitive, implies devoting many hours of unpaid labor to them, that could be freed through automation. But automation of such tasks has been traditionally hard due to the deformable nature of garments, that creates additional challenges to the already existing when performing object perception and manipulation. This thesis presents a Robotic System for Garment Perception and Manipulation that intends to address these challenges. The laundry pipeline as defined in this work is composed by four independent -but sequential- tasks: hanging, unfolding, ironing and folding. The aim of this work is the automation of this pipeline through a robotic system able to work on domestic environments as a robot household companion. Laundry starts by washing the garments, that then need to be dried, frequently by hanging them. As hanging is a complex task requiring bimanipulation skills and dexterity, a simplified approach is followed in this work as a starting point, by using a deep convolutional neural network and a custom synthetic dataset to study if a robot can predict whether a garment will hang or not when dropped over a hanger, as a first step towards a more complex controller. After the garment is dry, it has to be unfolded to ease recognition of its garment category for the next steps. The presented model-less unfolding method uses only color and depth information from the garment to determine the grasp and release points of an unfolding action, that is repeated iteratively until the garment is fully spread. Before storage, wrinkles have to be removed from the garment. For that purpose, a novel ironing method is proposed, that uses a custom wrinkle descriptor to locate the most prominent wrinkles and generate a suitable ironing plan. The method does not require a precise control of the light conditions of the scene, and is able to iron using unmodified ironing tools through a force-feedback-based controller. Finally, the last step is to fold the garment to store it. One key aspect when folding is to perform the folding operation in a precise manner, as errors will accumulate when several folds are required. A neural folding controller is proposed that uses visual feedback of the current garment shape, extracted through a deep neural network trained with synthetic data, to accurately perform a fold. All the methods presented to solve each of the laundry pipeline tasks have been validated experimentally on different robotic platforms, including a full-body humanoid robot.La ropa es un elemento clave en la vida diaria de las personas, no sólo a la hora de vestir, sino debido también a que muchas de las tareas domésticas que una persona debe realizar diariamente, como hacer la colada, requieren interactuar con ellas. Estas tareas, a menudo tediosas y repetitivas, obligan a invertir una gran cantidad de horas de trabajo no remunerado en su realización, las cuales podrían reducirse a través de su automatización. Sin embargo, automatizar dichas tareas ha sido tradicionalmente un reto, debido a la naturaleza deformable de las prendas, que supone una dificultad añadida a las ya existentes al llevar a cabo percepción y manipulación de objetos a través de robots. Esta tesis presenta un sistema robótico orientado a la percepción y manipulación de prendas, que pretende resolver dichos retos. La colada es una tarea doméstica compuesta de varias subtareas que se llevan a cabo de manera secuencial. En este trabajo, se definen dichas subtareas como: tender, desdoblar, planchar y doblar. El objetivo de este trabajo es automatizar estas tareas a través de un sistema robótico capaz de trabajar en entornos domésticos, convirtiéndose en un asistente robótico doméstico. La colada comienza lavando las prendas, las cuales han de ser posteriormente secadas, generalmente tendiéndolas al aire libre, para poder realizar el resto de subtareas con ellas. Tender la ropa es una tarea compleja, que requiere de bimanipulación y una gran destreza al manipular la prenda. Por ello, en este trabajo se ha optado por abordar una versión simplicada de la tarea de tendido, como punto de partida para llevar a cabo investigaciones más avanzadas en el futuro. A través de una red neuronal convolucional profunda y un conjunto de datos de entrenamiento sintéticos, se ha llevado a cabo un estudio sobre la capacidad de predecir el resultado de dejar caer una prenda sobre un tendedero por parte de un robot. Este estudio, que sirve como primer paso hacia un controlador más avanzado, ha resultado en un modelo capaz de predecir si la prenda se quedará tendida o no a partir de una imagen de profundidad de la misma en la posición en la que se dejará caer. Una vez las prendas están secas, y para facilitar su reconocimiento por parte del robot de cara a realizar las siguientes tareas, la prenda debe ser desdoblada. El método propuesto en este trabajo para realizar el desdoble no requiere de un modelo previo de la prenda, y utiliza únicamente información de profundidad y color, obtenida mediante un sensor RGB-D, para calcular los puntos de agarre y soltado de una acción de desdoble. Este proceso es iterativo, y se repite hasta que la prenda se encuentra totalmente desdoblada. Antes de almacenar la prenda, se deben eliminar las posibles arrugas que hayan surgido en el proceso de lavado y secado. Para ello, se propone un nuevo algoritmo de planchado, que utiliza un descriptor de arrugas desarrollado en este trabajo para localizar las arrugas más prominentes y generar un plan de planchado acorde a las condiciones de la prenda. A diferencia de otros métodos existentes, este método puede aplicarse en un entorno doméstico, ya que no requiere de un contol preciso de las condiciones de iluminación. Además, es capaz de usar las mismas herramientas de planchado que usaría una persona sin necesidad de realizar modificaciones a las mismas, a través de un controlador que usa realimentación de fuerza para aplicar una presión constante durante el planchado. El último paso al hacer la colada es doblar la prenda para almacenarla. Un aspecto importante al doblar prendas es ejecutar cada uno de los dobleces necesarios con precisión, ya que cada error o desfase cometido en un doblez se acumula cuando la secuencia de doblado está formada por varios dobleces consecutivos. Para llevar a cabo estos dobleces con la precisión requerida, se propone un controlador basado en una red neuronal, que utiliza realimentación visual de la forma de la prenda durante cada operación de doblado. Esta realimentación es obtenida a través de una red neuronal profunda entrenada con un conjunto de entrenamiento sintético, que permite estimar la forma en 3D de la parte a doblar a través de una imagen monocular de la misma. Todos los métodos descritos en esta tesis han sido validados experimentalmente con éxito en diversas plataformas robóticas, incluyendo un robot humanoide.Programa de Doctorado en Ingeniería Eléctrica, Electrónica y Automática por la Universidad Carlos III de MadridPresidente: Abderrahmane Kheddar.- Secretario: Ramón Ignacio Barber Castaño.- Vocal: Karinne Ramírez-Amar

    Data-Driven Grasp Synthesis - A Survey

    Full text link
    We review the work on data-driven grasp synthesis and the methodologies for sampling and ranking candidate grasps. We divide the approaches into three groups based on whether they synthesize grasps for known, familiar or unknown objects. This structure allows us to identify common object representations and perceptual processes that facilitate the employed data-driven grasp synthesis technique. In the case of known objects, we concentrate on the approaches that are based on object recognition and pose estimation. In the case of familiar objects, the techniques use some form of a similarity matching to a set of previously encountered objects. Finally for the approaches dealing with unknown objects, the core part is the extraction of specific features that are indicative of good grasps. Our survey provides an overview of the different methodologies and discusses open problems in the area of robot grasping. We also draw a parallel to the classical approaches that rely on analytic formulations.Comment: 20 pages, 30 Figures, submitted to IEEE Transactions on Robotic

    Scene understanding by robotic interactive perception

    Get PDF
    This thesis presents a novel and generic visual architecture for scene understanding by robotic interactive perception. This proposed visual architecture is fully integrated into autonomous systems performing object perception and manipulation tasks. The proposed visual architecture uses interaction with the scene, in order to improve scene understanding substantially over non-interactive models. Specifically, this thesis presents two experimental validations of an autonomous system interacting with the scene: Firstly, an autonomous gaze control model is investigated, where the vision sensor directs its gaze to satisfy a scene exploration task. Secondly, autonomous interactive perception is investigated, where objects in the scene are repositioned by robotic manipulation. The proposed visual architecture for scene understanding involving perception and manipulation tasks has four components: 1) A reliable vision system, 2) Camera-hand eye calibration to integrate the vision system into an autonomous robot’s kinematic frame chain, 3) A visual model performing perception tasks and providing required knowledge for interaction with scene, and finally, 4) A manipulation model which, using knowledge received from the perception model, chooses an appropriate action (from a set of simple actions) to satisfy a manipulation task. This thesis presents contributions for each of the aforementioned components. Firstly, a portable active binocular robot vision architecture that integrates a number of visual behaviours are presented. This active vision architecture has the ability to verge, localise, recognise and simultaneously identify multiple target object instances. The portability and functional accuracy of the proposed vision architecture is demonstrated by carrying out both qualitative and comparative analyses using different robot hardware configurations, feature extraction techniques and scene perspectives. Secondly, a camera and hand-eye calibration methodology for integrating an active binocular robot head within a dual-arm robot are described. For this purpose, the forward kinematic model of the active robot head is derived and the methodology for calibrating and integrating the robot head is described in detail. A rigid calibration methodology has been implemented to provide a closed-form hand-to-eye calibration chain and this has been extended with a mechanism to allow the camera external parameters to be updated dynamically for optimal 3D reconstruction to meet the requirements for robotic tasks such as grasping and manipulating rigid and deformable objects. It is shown from experimental results that the robot head achieves an overall accuracy of fewer than 0.3 millimetres while recovering the 3D structure of a scene. In addition, a comparative study between current RGB-D cameras and our active stereo head within two dual-arm robotic test-beds is reported that demonstrates the accuracy and portability of our proposed methodology. Thirdly, this thesis proposes a visual perception model for the task of category-wise objects sorting, based on Gaussian Process (GP) classification that is capable of recognising objects categories from point cloud data. In this approach, Fast Point Feature Histogram (FPFH) features are extracted from point clouds to describe the local 3D shape of objects and a Bag-of-Words coding method is used to obtain an object-level vocabulary representation. Multi-class Gaussian Process classification is employed to provide a probability estimate of the identity of the object and serves the key role of modelling perception confidence in the interactive perception cycle. The interaction stage is responsible for invoking the appropriate action skills as required to confirm the identity of an observed object with high confidence as a result of executing multiple perception-action cycles. The recognition accuracy of the proposed perception model has been validated based on simulation input data using both Support Vector Machine (SVM) and GP based multi-class classifiers. Results obtained during this investigation demonstrate that by using a GP-based classifier, it is possible to obtain true positive classification rates of up to 80\%. Experimental validation of the above semi-autonomous object sorting system shows that the proposed GP based interactive sorting approach outperforms random sorting by up to 30\% when applied to scenes comprising configurations of household objects. Finally, a fully autonomous visual architecture is presented that has been developed to accommodate manipulation skills for an autonomous system to interact with the scene by object manipulation. This proposed visual architecture is mainly made of two stages: 1) A perception stage, that is a modified version of the aforementioned visual interaction model, 2) An interaction stage, that performs a set of ad-hoc actions relying on the information received from the perception stage. More specifically, the interaction stage simply reasons over the information (class label and associated probabilistic confidence score) received from perception stage to choose one of the following two actions: 1) An object class has been identified with high confidence, so remove from the scene and place it in the designated basket/bin for that particular class. 2) An object class has been identified with less probabilistic confidence, since from observation and inspired from the human behaviour of inspecting doubtful objects, an action is chosen to further investigate that object in order to confirm the object’s identity by capturing more images from different views in isolation. The perception stage then processes these views, hence multiple perception-action/interaction cycles take place. From an application perspective, the task of autonomous category based objects sorting is performed and the experimental design for the task is described in detail
    corecore