209 research outputs found
Sensing Highly Non-Rigid Objects with RGBD Sensors for Robotic Systems
The goal of this research is to enable a robotic system to manipulate clothing and other highly non-rigid objects using an RGBD sensor. The focus of this thesis is to define and test various algorithms / models that are used to solve parts of the laundry process (i.e. handling, classifying, sorting, unfolding, and folding). First, a system is presented for automatically extracting and classifying items in a pile of laundry. Using only visual sensors, the robot identifies and extracts items sequentially from the pile. When an item is removed and isolated, a model is captured of the shape and appearance of the object, which is then compared against a dataset of known items. The contributions of this part of the laundry process are a novel method for extracting articles of clothing from a pile of laundry, a novel method of classifying clothing using interactive perception, and a multi-layer approach termed L-M-H, more specifically L-C-S-H for clothing classification. This thesis describes two different approaches to classify clothing into categories. The first approach relies upon silhouettes, edges, and other low-level image measurements of the articles of clothing. Experiments from the first approach demonstrate the ability of the system to efficiently classify and label into one of six categories (pants, shorts, short-sleeve shirt, long-sleeve shirt, socks, or underwear). These results show that, on average, classification rates using robot interaction are 59% higher than those that do not use interaction. The second approach relies upon color, texture, shape, and edge information from 2D and 3D data within a local and global perspective. The multi-layer approach compartmentalizes the problem into a high (H) layer, multiple mid-level (characteristics(C), selection masks(S)) layers, and a low (L) layer. This approach produces \u27local\u27 solutions to solve the global classification problem. Experiments demonstrate the ability of the system to efficiently classify each article of clothing into one of seven categories (pants, shorts, shirts, socks, dresses, cloths, or jackets). The results presented in this paper show that, on average, the classification rates improve by +27.47% for three categories, +17.90% for four categories, and +10.35% for seven categories over the baseline system, using support vector machines. Second, an algorithm is presented for automatically unfolding a piece of clothing. A piece of cloth is pulled in different directions at various points of the cloth in order to flatten the cloth. The features of the cloth are extracted and calculated to determine a valid location and orientation in which to interact with it. The features include the peak region, corner locations, and continuity / discontinuity of the cloth. In this thesis, a two-stage algorithm is presented, introducing a novel solution to the unfolding / flattening problem using interactive perception. Simulations using 3D simulation software, and experiments with robot hardware demonstrate the ability of the algorithm to flatten pieces of laundry using different starting configurations. These results show that, at most, the algorithm flattens out a piece of cloth from 11.1% to 95.6% of the canonical configuration. Third, an energy minimization algorithm is presented that is designed to estimate the configuration of a deformable object. This approach utilizes an RGBD image to calculate feature correspondence (using SURF features), depth values, and boundary locations. Input from a Kinect sensor is used to segment the deformable surface from the background using an alpha-beta swap algorithm. Using this segmentation, the system creates an initial mesh model without prior information of the surface geometry, and it reinitializes the configuration of the mesh model after a loss of input data. This approach is able to handle in-plane rotation, out-of-plane rotation, and varying changes in translation and scale. Results display the proposed algorithm over a dataset consisting of seven shirts, two pairs of shorts, two posters, and a pair of pants. The current approach is compared using a simulated shirt model in order to calculate the mean square error of the distance from the vertices on the mesh model to the ground truth, provided by the simulation model
Robotic system for garment perception and manipulation
Mención Internacional en el título de doctorGarments are a key element of people’s daily lives, as many
domestic tasks -such as laundry-, revolve around them. Performing
such tasks, generally dull and repetitive, implies devoting
many hours of unpaid labor to them, that could be freed
through automation. But automation of such tasks has been traditionally
hard due to the deformable nature of garments, that
creates additional challenges to the already existing when performing
object perception and manipulation. This thesis presents
a Robotic System for Garment Perception and Manipulation
that intends to address these challenges.
The laundry pipeline as defined in this work is composed
by four independent -but sequential- tasks: hanging, unfolding,
ironing and folding. The aim of this work is the automation of
this pipeline through a robotic system able to work on domestic
environments as a robot household companion.
Laundry starts by washing the garments, that then need to
be dried, frequently by hanging them. As hanging is a complex
task requiring bimanipulation skills and dexterity, a simplified
approach is followed in this work as a starting point, by using
a deep convolutional neural network and a custom synthetic
dataset to study if a robot can predict whether a garment will
hang or not when dropped over a hanger, as a first step towards
a more complex controller.
After the garment is dry, it has to be unfolded to ease recognition
of its garment category for the next steps. The presented
model-less unfolding method uses only color and depth information
from the garment to determine the grasp and release
points of an unfolding action, that is repeated iteratively until
the garment is fully spread.
Before storage, wrinkles have to be removed from the garment.
For that purpose, a novel ironing method is proposed,
that uses a custom wrinkle descriptor to locate the most prominent
wrinkles and generate a suitable ironing plan. The method
does not require a precise control of the light conditions of
the scene, and is able to iron using unmodified ironing tools
through a force-feedback-based controller.
Finally, the last step is to fold the garment to store it. One
key aspect when folding is to perform the folding operation in a precise manner, as errors will accumulate when several
folds are required. A neural folding controller is proposed that
uses visual feedback of the current garment shape, extracted
through a deep neural network trained with synthetic data, to
accurately perform a fold.
All the methods presented to solve each of the laundry pipeline
tasks have been validated experimentally on different robotic
platforms, including a full-body humanoid robot.La ropa es un elemento clave en la vida diaria de las personas,
no sólo a la hora de vestir, sino debido también a que muchas
de las tareas domésticas que una persona debe realizar diariamente,
como hacer la colada, requieren interactuar con ellas.
Estas tareas, a menudo tediosas y repetitivas, obligan a invertir
una gran cantidad de horas de trabajo no remunerado en
su realización, las cuales podrían reducirse a través de su automatización.
Sin embargo, automatizar dichas tareas ha sido
tradicionalmente un reto, debido a la naturaleza deformable de
las prendas, que supone una dificultad añadida a las ya existentes
al llevar a cabo percepción y manipulación de objetos a
través de robots. Esta tesis presenta un sistema robótico orientado
a la percepción y manipulación de prendas, que pretende
resolver dichos retos.
La colada es una tarea doméstica compuesta de varias subtareas
que se llevan a cabo de manera secuencial. En este trabajo,
se definen dichas subtareas como: tender, desdoblar, planchar
y doblar. El objetivo de este trabajo es automatizar estas tareas
a través de un sistema robótico capaz de trabajar en entornos
domésticos, convirtiéndose en un asistente robótico doméstico.
La colada comienza lavando las prendas, las cuales han de
ser posteriormente secadas, generalmente tendiéndolas al aire
libre, para poder realizar el resto de subtareas con ellas. Tender
la ropa es una tarea compleja, que requiere de bimanipulación
y una gran destreza al manipular la prenda. Por ello, en este
trabajo se ha optado por abordar una versión simplicada de
la tarea de tendido, como punto de partida para llevar a cabo
investigaciones más avanzadas en el futuro. A través de una red
neuronal convolucional profunda y un conjunto de datos de
entrenamiento sintéticos, se ha llevado a cabo un estudio sobre
la capacidad de predecir el resultado de dejar caer una prenda
sobre un tendedero por parte de un robot. Este estudio, que
sirve como primer paso hacia un controlador más avanzado,
ha resultado en un modelo capaz de predecir si la prenda se
quedará tendida o no a partir de una imagen de profundidad
de la misma en la posición en la que se dejará caer.
Una vez las prendas están secas, y para facilitar su reconocimiento
por parte del robot de cara a realizar las siguientes tareas, la prenda debe ser desdoblada. El método propuesto en
este trabajo para realizar el desdoble no requiere de un modelo
previo de la prenda, y utiliza únicamente información de profundidad
y color, obtenida mediante un sensor RGB-D, para
calcular los puntos de agarre y soltado de una acción de desdoble.
Este proceso es iterativo, y se repite hasta que la prenda se
encuentra totalmente desdoblada.
Antes de almacenar la prenda, se deben eliminar las posibles
arrugas que hayan surgido en el proceso de lavado y secado.
Para ello, se propone un nuevo algoritmo de planchado, que
utiliza un descriptor de arrugas desarrollado en este trabajo para
localizar las arrugas más prominentes y generar un plan de
planchado acorde a las condiciones de la prenda. A diferencia
de otros métodos existentes, este método puede aplicarse en un
entorno doméstico, ya que no requiere de un contol preciso de
las condiciones de iluminación. Además, es capaz de usar las
mismas herramientas de planchado que usaría una persona sin
necesidad de realizar modificaciones a las mismas, a través de
un controlador que usa realimentación de fuerza para aplicar
una presión constante durante el planchado.
El último paso al hacer la colada es doblar la prenda para
almacenarla. Un aspecto importante al doblar prendas es ejecutar
cada uno de los dobleces necesarios con precisión, ya que
cada error o desfase cometido en un doblez se acumula cuando
la secuencia de doblado está formada por varios dobleces
consecutivos. Para llevar a cabo estos dobleces con la precisión
requerida, se propone un controlador basado en una red neuronal,
que utiliza realimentación visual de la forma de la prenda
durante cada operación de doblado. Esta realimentación es obtenida
a través de una red neuronal profunda entrenada con
un conjunto de entrenamiento sintético, que permite estimar
la forma en 3D de la parte a doblar a través de una imagen
monocular de la misma.
Todos los métodos descritos en esta tesis han sido validados
experimentalmente con éxito en diversas plataformas robóticas,
incluyendo un robot humanoide.Programa de Doctorado en Ingeniería Eléctrica, Electrónica y Automática por la Universidad Carlos III de MadridPresidente: Abderrahmane Kheddar.- Secretario: Ramón Ignacio Barber Castaño.- Vocal: Karinne Ramírez-Amar
Active recognition and pose estimation of rigid and deformable objects in 3D space
Object recognition and pose estimation is a fundamental problem in computer vision and of utmost importance in robotic applications. Object recognition refers to the problem of recognizing certain object instances, or categorizing objects into specific classes. Pose estimation deals with estimating the exact position of the object in 3D space, usually expressed in Euler angles. There are generally two types of objects that require special care when designing solutions to the aforementioned problems: rigid and deformable. Dealing with deformable objects has been a much harder problem, and usually solutions that apply to rigid objects, fail when used for deformable objects due to the inherent assumptions made during the design.
In this thesis we deal with object categorization, instance recognition and pose estimation of both rigid and deformable objects. In particular, we are interested in a special type of deformable objects, clothes. We tackle the problem of autonomously recognizing and unfolding articles of clothing using a dual manipulator. This problem consists of grasping an article from a random point, recognizing it and then bringing it into an unfolded state by a dual arm robot. We propose a data-driven method for clothes recognition from depth images using Random Decision Forests. We also propose a method for unfolding an article of clothing after estimating and grasping two key-points, using Hough Forests. Both methods are implemented into a POMDP framework allowing the robot to interact optimally with the garments, taking into account uncertainty in the recognition and point estimation process. This active recognition and unfolding makes our system very robust to noisy observations. Our methods were tested on regular-sized clothes using a dual-arm manipulator. Our systems perform better in both accuracy and speed compared to state-of-the-art approaches.
In order to take advantage of the robotic manipulator and increase the accuracy of our system, we developed a novel approach to address generic active vision problems, called Active Random Forests. While state of the art focuses on best viewing parameters selection based on single view classifiers, we propose a multi-view classifier where the decision mechanism of optimally changing viewing parameters is inherent to the classification process. This has many advantages: a) the classifier exploits the entire set of captured images and does not simply aggregate probabilistically per view hypotheses; b) actions are based on learnt disambiguating features from all views and are optimally selected using the powerful voting scheme of Random Forests and c) the classifier can take into account the costs of actions. The proposed framework was applied to the same task of autonomously unfolding clothes by a robot, addressing the problem of best viewpoint selection in classification, grasp point and pose estimation of garments. We show great performance improvement compared to state of the art methods and our previous POMDP formulation.
Moving from deformable to rigid objects while keeping our interest to domestic robotic applications, we focus on object instance recognition and 3D pose estimation of household objects. We are particularly interested in realistic scenes that are very crowded and objects can be perceived under severe occlusions. Single shot-based 6D pose estimators with manually designed features are still unable to tackle such difficult scenarios for a variety of objects, motivating the research towards unsupervised feature learning and next-best-view estimation. We present a complete framework for both single shot-based 6D object pose estimation and next-best-view prediction based on Hough Forests, the state of the art object pose estimator that performs classification and regression jointly. Rather than using manually designed features we propose an unsupervised feature learnt from depth-invariant patches using a Sparse Autoencoder. Furthermore, taking advantage of the clustering performed in the leaf nodes of Hough Forests, we learn to estimate the reduction of uncertainty in other views, formulating the problem of selecting the next-best-view. To further improve 6D object pose estimation, we propose an improved joint registration and hypotheses verification module as a final refinement step to reject false detections. We provide two additional challenging datasets inspired from realistic scenarios to extensively evaluate the state of the art and our framework. One is related to domestic environments and the other depicts a bin-picking scenario mostly found in industrial settings. We show that our framework significantly outperforms state of the art both on public and on our datasets.
Unsupervised feature learning, although efficient, might produce sub-optimal features for our particular tast. Therefore in our last work, we leverage the power of Convolutional Neural Networks to tackled the problem of estimating the pose of rigid objects by an end-to-end deep regression network. To improve the moderate performance of the standard regression objective function, we introduce the Siamese Regression Network. For a given image pair, we enforce a similarity measure between the representation of the sample images in the feature and pose space respectively, that is shown to boost regression performance. Furthermore, we argue that our pose-guided feature learning using our Siamese Regression Network generates more discriminative features that outperform the state of the art. Last, our feature learning formulation provides the ability of learning features that can perform under severe occlusions, demonstrating high performance on our novel hand-object dataset.
Concluding, this work is a research on the area of object detection and pose estimation in 3D space, on a variety of object types. Furthermore we investigate how accuracy can be further improved by applying active vision techniques to optimally move the camera view to minimize the detection error.Open Acces
Visual Perception of Garments for their Robotic Manipulation
Tématem předložené práce je strojové vnímání textilií založené na obrazové informaci a využité pro jejich robotickou manipulaci. Práce studuje několik reprezentativních textilií v běžných kognitivně-manipulačních úlohách, jako je například třídění neznámých oděvů podle typu nebo jejich skládání. Některé z těchto činností by v budoucnu mohly být vykonávány domácími robotickými pomocníky. Strojová manipulace s textiliemi je poptávaná také v průmyslu. Hlavní výzvou řešeného problému je měkkost a s tím související vysoká deformovatelnost textilií, které se tak mohou nacházet v bezpočtu vizuálně velmi odlišných stavů.The presented work addresses the visual perception of garments applied for their robotic manipulation. Various types of garments are considered in the typical perception and manipulation tasks, including their classification, folding or unfolding. Our work is motivated by the possibility of having humanoid household robots performing these tasks for us in the future, as well as by the industrial applications. The main challenge is the high deformability of garments, which can be posed in infinitely many configurations with a significantly varying appearance
Data-driven robotic manipulation of cloth-like deformable objects : the present, challenges and future prospects
Manipulating cloth-like deformable objects (CDOs) is a long-standing problem in the robotics community. CDOs are flexible (non-rigid) objects that do not show a detectable level of compression strength while two points on the article are pushed towards each other and include objects such as ropes (1D), fabrics (2D) and bags (3D). In general, CDOs’ many degrees of freedom (DoF) introduce severe self-occlusion and complex state–action dynamics as significant obstacles to perception and manipulation systems. These challenges exacerbate existing issues of modern robotic control methods such as imitation learning (IL) and reinforcement learning (RL). This review focuses on the application details of data-driven control methods on four major task families in this domain: cloth shaping, knot tying/untying, dressing and bag manipulation. Furthermore, we identify specific inductive biases in these four domains that present challenges for more general IL and RL algorithms.Publisher PDFPeer reviewe
Flexible Object Manipulation
Flexible objects are a challenge to manipulate. Their motions are hard to predict, and the high number of degrees of freedom makes sensing, control, and planning difficult. Additionally, they have more complex friction and contact issues than rigid bodies, and they may stretch and compress. In this thesis, I explore two major types of flexible materials: cloth and string. For rigid bodies, one of the most basic problems in manipulation is the development of immobilizing grasps. The same problem exists for flexible objects. I have shown that a simple polygonal piece of cloth can be fully immobilized by grasping all convex vertices and no more than one third of the concave vertices. I also explored simple manipulation methods that make use of gravity to reduce the number of fingers necessary for grasping. I have built a system for folding a T-shirt using a 4 DOF arm and a fixed-length iron bar which simulates two fingers. The main goal with string manipulation has been to tie knots without the use of any sensing. I have developed single-piece fixtures capable of tying knots in fishing line, solder, and wire, along with a more complex track-based system for autonomously tying a knot in steel wire. I have also developed a series of different fixtures that use compressed air to tie knots in string. Additionally, I have designed four-piece fixtures, which demonstrate a way to fully enclose a knot during the insertion process, while guaranteeing that extraction will always succeed
Bimanual Interaction with Clothes. Topology, Geometry, and Policy Representations in Robots
Twardon L. Bimanual Interaction with Clothes. Topology, Geometry, and Policy Representations in Robots. Bielefeld: Universität Bielefeld; 2019.If anthropomorphic robots are to assist people with activities of daily living, they must be able to handle all kinds of everyday objects, including highly deformable ones such as garments. The present thesis begins with a detailed problem analysis of robotic interaction with and perception of clothes. We show that handling items of clothing is very challenging due to their complex dynamics and the vast number of degrees of freedom. As a result of our analysis, we obtain a topological, geometric, and functional description of garments that supports the development of reduced object and task representations. One of the key findings is that the boundary components, which typically correspond with the openings, characterize garments well, both in terms of their topology and their inherent purpose, namely dressing. We present a polygon-based and an interactive method for identifying boundary components using RGB-D vision with application to grasping. Moreover, we propose Active
Boundary Component Models (ABCMs), a constraint-based framework for tracking garment openings with point clouds. It is often difficult to maintain an accurate representation of the objects involved in contact-rich interaction tasks such as dressing assistance. Therefore, our policy optimization approach to putting a knit cap on a styrofoam head avoids modeling the details of the garment and its deformations. The experimental results suggest that a heuristic performance measure that takes into account the amount of contact established between the two objects is suitable for the task
- …