24 research outputs found
State of the Art in Face Recognition
Notwithstanding the tremendous effort to solve the face recognition problem, it is not yet possible to design a face recognition system with a potential close to human performance. New computer vision and pattern recognition approaches need to be investigated. New knowledge and perspectives from other fields, such as psychology and neuroscience, must also be incorporated into face recognition research to design a robust face recognition system. Indeed, much more effort is required to arrive at a human-like face recognition system. This book attempts to reduce the gap between the previous state of face recognition research and the future state.
Robotic system for garment perception and manipulation
Doctoral degree with International Mention (Mención Internacional en el título de doctor).
Garments are a key element of people's daily lives, as many domestic tasks, such as laundry, revolve around them. Performing such tasks, which are generally dull and repetitive, means devoting many hours of unpaid labor to them, hours that could be freed through automation. However, automating such tasks has traditionally been hard because of the deformable nature of garments, which adds further challenges to those that already exist in object perception and manipulation. This thesis presents a Robotic System for Garment Perception and Manipulation that intends to address these challenges.
The laundry pipeline as defined in this work is composed of four independent but sequential tasks: hanging, unfolding, ironing and folding. The aim of this work is to automate this pipeline through a robotic system able to work in domestic environments as a robot household companion.
Laundry starts by washing the garments, which then need to be dried, frequently by hanging them. As hanging is a complex task requiring bimanipulation skills and dexterity, this work follows a simplified approach as a starting point: it uses a deep convolutional neural network and a custom synthetic dataset to study whether a robot can predict if a garment will hang or not when dropped over a hanger, as a first step towards a more complex controller.
After the garment is dry, it has to be unfolded to ease recognition of its garment category for the next steps. The presented model-less unfolding method uses only color and depth information from the garment to determine the grasp and release points of an unfolding action, which is repeated iteratively until the garment is fully spread.
Before storage, wrinkles have to be removed from the garment. For that purpose, a novel ironing method is proposed that uses a custom wrinkle descriptor to locate the most prominent wrinkles and generate a suitable ironing plan. The method does not require precise control of the lighting conditions of the scene, and is able to iron using unmodified ironing tools through a force-feedback-based controller.
Finally, the last step is to fold the garment to store it. One key aspect of folding is to perform each folding operation precisely, as errors accumulate when several folds are required. A neural folding controller is proposed that uses visual feedback of the current garment shape, extracted through a deep neural network trained with synthetic data, to accurately perform a fold.
All the methods presented to solve each of the laundry pipeline tasks have been validated experimentally on different robotic platforms, including a full-body humanoid robot.
Clothing is a key element in people's daily lives, not only when getting dressed, but also because many of the domestic tasks a person must carry out every day, such as doing the laundry, require interacting with it. These tasks, often tedious and repetitive, demand a large number of hours of unpaid work, which could be reduced through automation. However, automating these tasks has traditionally been a challenge because of the deformable nature of garments, which adds a further difficulty to those that already exist when robots perform object perception and manipulation. This thesis presents a robotic system for garment perception and manipulation that aims to solve these challenges.
Laundry is a domestic task made up of several subtasks that are carried out sequentially. In this work, these subtasks are defined as hanging, unfolding, ironing and folding. The aim of this work is to automate these tasks through a robotic system capable of working in domestic environments, thereby becoming a domestic robotic assistant.
Laundry begins with washing the garments, which must then be dried, generally by hanging them outdoors, before the remaining subtasks can be performed on them. Hanging clothes is a complex task that requires bimanipulation and great dexterity in handling the garment. For this reason, this work addresses a simplified version of the hanging task, as a starting point for more advanced research in the future. Using a deep convolutional neural network and a synthetic training dataset, a study was carried out on a robot's ability to predict the outcome of dropping a garment over a hanger. This study, which serves as a first step towards a more advanced controller, resulted in a model able to predict whether the garment will stay hung or not from a depth image of the garment in the position from which it will be dropped.
Once the garments are dry, each garment must be unfolded to make it easier for the robot to recognize it for the subsequent tasks. The unfolding method proposed in this work does not require a prior model of the garment, and uses only depth and color information, obtained with an RGB-D sensor, to compute the grasp and release points of an unfolding action. This process is iterative and is repeated until the garment is completely unfolded.
Before storing the garment, any wrinkles that may have appeared during washing and drying must be removed. To this end, a new ironing algorithm is proposed that uses a wrinkle descriptor developed in this work to locate the most prominent wrinkles and generate an ironing plan suited to the condition of the garment. Unlike other existing methods, this method can be applied in a domestic environment, since it does not require precise control of the lighting conditions. Furthermore, it can use the same ironing tools a person would use, without any modification, through a controller that uses force feedback to apply constant pressure during ironing.
The last step in doing the laundry is folding the garment for storage. An important aspect of folding garments is executing each of the required folds precisely, since every error or offset made in one fold accumulates when the folding sequence consists of several consecutive folds. To perform these folds with the required precision, a neural-network-based controller is proposed that uses visual feedback of the garment shape during each folding operation. This feedback is obtained through a deep neural network trained with a synthetic training set, which makes it possible to estimate the 3D shape of the part to be folded from a monocular image of it.
All the methods described in this thesis have been successfully validated experimentally on several robotic platforms, including a humanoid robot.
Doctoral Programme in Electrical, Electronic and Automation Engineering (Programa de Doctorado en Ingeniería Eléctrica, Electrónica y Automática), Universidad Carlos III de Madrid.
President: Abderrahmane Kheddar. Secretary: Ramón Ignacio Barber Castaño. Member: Karinne Ramírez-Amar
Objects extraction and recognition for camera-based interaction : heuristic and statistical approaches
In this thesis, heuristic and probabilistic methods are applied to a number of problems in camera-based interaction. The goal is to provide solutions for a vision-based system that is able to extract and analyze objects of interest in camera images and to use that information for various interactions in mobile usage. New methods and new combinations of existing methods are developed for different applications, including text extraction from complex scene images, bar code reading performed by camera phones, and face/facial feature detection and facial expression manipulation.
The application-driven problems of camera-based interaction cannot be captured by a single, straightforward model that strongly simplifies reality. The solutions we found to be efficient were to first apply heuristic but easy-to-implement approaches to reduce the complexity of the problems and search for possible means, and then to use developed statistical learning approaches to deal with the remaining difficult but well-defined problems and achieve much better accuracy. The process can be evolved in some or all of the stages, and the combination of the approaches is problem-dependent.
The contribution of this thesis lies in two aspects: first, new features and approaches are proposed, either as heuristics or as statistical means, for concrete applications; second, engineering design that combines several methods for system optimization is studied. Geometrical characteristics and the alignment of text, texture features of bar codes, and structures of faces can all be extracted as heuristics for object extraction and further recognition. The boosting algorithm is one suitable choice for performing probabilistic learning and achieving the desired accuracy. New feature selection techniques are proposed for constructing the weak learner and applying the boosting output in concrete applications. Subspace methods such as manifold learning algorithms are introduced and tailored for facial expression analysis and synthesis. A modified generalized learning vector quantization method is proposed to deal with the blurring of bar code images. Efficient implementations that combine the approaches at a rational joint point are presented and the results are illustrated.
Visual and Camera Sensors
This book includes 13 papers published in the Special Issue "Visual and Camera Sensors" of the journal Sensors. The goal of this Special Issue was to invite high-quality, state-of-the-art research papers dealing with challenging issues in visual and camera sensors.
Entropy in Image Analysis II
Image analysis is a fundamental task for any application that requires extracting information from images. The analysis requires highly sophisticated numerical and analytical methods, particularly for applications in medicine, security, and other fields where the results of the processing consist of data of vital importance. This is evident from all the articles composing the Special Issue "Entropy in Image Analysis II", in which the authors used widely tested methods to verify their results. In reading the present volume, the reader will appreciate the richness of their methods and applications, in particular for medical imaging and image security, and a remarkable cross-fertilization among the proposed research areas.
LIPIcs, Volume 277, GIScience 2023, Complete Volume