Iterative Superquadric Recomposition of 3D Objects from Multiple Views
Humans are good at recomposing novel objects, i.e. they can identify
commonalities between unknown objects from general structure to finer detail,
an ability difficult to replicate by machines. We propose a framework, ISCO, to
recompose an object using 3D superquadrics as semantic parts directly from 2D
views without training a model that uses 3D supervision. To achieve this, we
optimize the superquadric parameters that compose a specific instance of the
object, comparing its rendered 3D view and 2D image silhouette. Our ISCO
framework iteratively adds new superquadrics wherever the reconstruction error
is high, abstracting first coarse regions and then finer details of the target
object. With this simple coarse-to-fine inductive bias, ISCO provides
consistent superquadrics for related object parts, despite not having any
semantic supervision. Since ISCO does not train any neural network, it is also
inherently robust to out-of-distribution objects. Experiments show that,
compared to recent single instance superquadrics reconstruction approaches,
ISCO provides consistently more accurate 3D reconstructions, even from images
in the wild. Code available at https://github.com/ExplainableML/ISCO. Comment: Accepted at ICCV 202
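The coarse-to-fine idea described in this abstract (add a new primitive wherever the reconstruction error is still high) can be illustrated with a small 2D toy. This is only a sketch under strong assumptions and is not the ISCO implementation: it works on a single binary silhouette, uses axis-aligned ellipses instead of rendered 3D superquadrics, and replaces the parameter optimization with a closed-form fit from pixel moments. The names `ellipse_mask` and `greedy_recompose` are hypothetical.

```python
import numpy as np

def ellipse_mask(shape, cy, cx, ry, rx):
    """Boolean mask of an axis-aligned ellipse (a 2D stand-in for a superquadric)."""
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    return ((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2 <= 1.0

def greedy_recompose(target, max_parts=8):
    """Add one primitive at a time over the still-uncovered part of the target.

    Returns the accepted primitives and the error after each accepted step,
    where the error is the number of pixels on which target and
    reconstruction disagree.
    """
    target = np.asarray(target, dtype=bool)
    recon = np.zeros_like(target)
    errors = [int(np.count_nonzero(target ^ recon))]
    parts = []
    for _ in range(max_parts):
        ys, xs = np.nonzero(target & ~recon)  # target pixels not yet explained
        if ys.size == 0:
            break
        # Assumption: fit the new primitive from the residual's moments
        # (centroid and 2x standard deviation) instead of optimizing it.
        cy, cx = ys.mean(), xs.mean()
        ry = max(2.0 * ys.std(), 1.0)
        rx = max(2.0 * xs.std(), 1.0)
        candidate = recon | ellipse_mask(target.shape, cy, cx, ry, rx)
        err = int(np.count_nonzero(target ^ candidate))
        if err >= errors[-1]:  # stop when a new part no longer helps
            break
        recon = candidate
        parts.append((cy, cx, ry, rx))
        errors.append(err)
    return parts, errors

# Usage: a "body" with a thinner "arm" attached to its right side.
silhouette = np.zeros((64, 64), dtype=bool)
silhouette[10:50, 20:40] = True
silhouette[25:35, 40:60] = True
parts, errors = greedy_recompose(silhouette)
print(len(parts), errors)
```

The first accepted ellipse covers the coarse bulk of the shape, and later ones are fitted only to what remains uncovered, which mirrors the coarse-to-fine ordering the abstract describes. A new part is kept only if it lowers the error, so the recorded error sequence is strictly decreasing by construction.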
Superquadric representation of scenes from multi-view range data
Object representation denotes representing three-dimensional (3D) real-world objects with known graphic or mathematical primitives recognizable to computers. This research has numerous applications for object-related tasks in areas including computer vision, computer graphics, and reverse engineering. Superquadrics, as volumetric and parametric models, have been selected as the representation primitives throughout this research. Superquadrics can represent a large family of solid shapes with a single equation and only a few parameters. This dissertation addresses superquadric representation of multi-part objects and multi-object scenes. Two issues motivate this research. First, superquadric representation of multi-part objects or multi-object scenes has been an unsolved problem due to the complex geometry of objects. Second, superquadrics recovered from single-view range data tend to have low confidence and accuracy because inherent occlusions leave object surfaces only partially scanned. To address these two problems, this dissertation proposes a multi-view superquadric representation algorithm. By incorporating both part decomposition and multi-view range data, the proposed algorithm is able not only to represent multi-part objects or multi-object scenes, but also to achieve high confidence and accuracy in the recovered superquadrics. The multi-view superquadric representation algorithm consists of (i) initial superquadric model recovery from single-view range data, (ii) pairwise view registration based on recovered superquadric models, (iii) view integration, (iv) part decomposition, and (v) final superquadric fitting for each decomposed part. Within the multi-view superquadric representation framework, this dissertation proposes a 3D part decomposition algorithm to automatically decompose multi-part objects or multi-object scenes into their constituent single parts consistent with human visual perception. 
Superquadrics can then be recovered for each decomposed single-part object. The proposed part decomposition algorithm is based on curvature analysis, and includes (i) Gaussian curvature estimation, (ii) boundary labeling, (iii) part growing and labeling, and (iv) post-processing. In addition, this dissertation proposes an extended view registration algorithm based on superquadrics. The proposed view registration algorithm is able to handle deformable superquadrics as well as 3D unstructured data sets. For superquadric fitting, two objective functions primarily used in the literature have been comprehensively investigated with respect to noise, viewpoints, sample resolutions, etc. The objective function that proved to perform better is used throughout this dissertation. In summary, the three algorithms (contributions) proposed in this dissertation are generic and flexible in the sense that they handle triangle meshes, which are standard surface primitives in computer vision and graphics. For each proposed algorithm, the dissertation presents both theory and experimental results. The results demonstrate the efficiency of the algorithms using both synthetic and real range data of a large variety of objects and scenes. In addition, the experimental results include comparisons with previous methods from the literature. Finally, the dissertation concludes with a summary of the contributions to the state of the art in superquadric representation, and presents possible future extensions to this research.
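The "single equation with only a few parameters" mentioned in this abstract is commonly written as the superquadric inside-outside function. The sketch below evaluates that standard form for a superquadric in its canonical pose (centered at the origin, axes aligned); it is not taken from the dissertation. The function name `inside_outside` and the parameter names `a1, a2, a3` (semi-axis lengths) and `e1, e2` (shape exponents) are conventions assumed here for illustration.

```python
import numpy as np

def inside_outside(points, a1, a2, a3, e1, e2):
    """Evaluate F(x, y, z) for a canonical-pose superquadric.

    F = ((x/a1)^(2/e2) + (y/a2)^(2/e2))^(e2/e1) + (z/a3)^(2/e1)

    F < 1: point is inside, F == 1: on the surface, F > 1: outside.
    `points` is an (N, 3) array; absolute values keep fractional
    powers real for negative coordinates.
    """
    x, y, z = np.abs(np.asarray(points, dtype=float)).T
    xy = (x / a1) ** (2.0 / e2) + (y / a2) ** (2.0 / e2)
    return xy ** (e2 / e1) + (z / a3) ** (2.0 / e1)

# Usage: with e1 = e2 = 1 the shape is an ordinary ellipsoid.
pts = np.array([[0.0, 0.0, 0.0],   # center
                [1.0, 0.0, 0.0],   # on the surface along x
                [0.0, 0.0, 6.0]])  # outside along z
F = inside_outside(pts, a1=1.0, a2=2.0, a3=3.0, e1=1.0, e2=1.0)
print(F < 1.0)  # which points lie strictly inside
```

Fitting objectives in the superquadric literature are typically built on how far `F` deviates from 1 at the measured points, so a function like this is the usual starting point for recovery from range data.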
Sense, Think, Grasp: A study on visual and tactile information processing for autonomous manipulation
Interacting with the environment using hands is one of the distinctive
abilities of humans with respect to other species. This aptitude is reflected in
the crucial role played by object manipulation in the world that we have
shaped for ourselves. With a view to bringing robots out of industrial settings to support
people in everyday life, the ability to manipulate objects
autonomously and in unstructured environments is therefore one of the basic
skills they need. Autonomous manipulation is characterized by great
complexity, especially regarding the processing of sensor information to
perceive the surrounding environment. Humans rely on vision for wide-ranging
three-dimensional information, proprioception for awareness of
the relative position of their own body in space, and the sense of touch
for local information when physical interaction with objects happens. The
study of autonomous manipulation in robotics aims at transferring similar
perceptive skills to robots so that, combined with state-of-the-art control
techniques, they can achieve similar performance in manipulating
objects. The great complexity of this task makes autonomous
manipulation one of the open problems in robotics, and it has been drawing
increasing research attention in recent years.
In this thesis, we propose possible solutions to some key components
of autonomous manipulation, focusing in particular on the perception
problem and testing the developed approaches on the humanoid robotic platform iCub. When available, vision is the first source of information
to be processed for inferring how to interact with objects. The object
modeling and grasping pipeline based on superquadric functions that we designed
meets this need, since it reconstructs the 3D model of the object from a partial
point cloud and computes a suitable hand pose for grasping the object.
Retrieving object information with touch sensors only is a relevant skill
that becomes crucial when vision is occluded, as happens for instance during
physical interaction with the object. We addressed this problem with
the design of a novel tactile localization algorithm, named Memory Unscented
Particle Filter, capable of localizing and recognizing objects relying solely
on 3D contact points collected on the object surface. Another key aspect of
autonomous manipulation that we report on in this thesis is bi-manual
coordination. The execution of more advanced manipulation tasks may in fact
require the use and coordination of two arms. Tool usage, for instance,
often requires a proper in-hand object pose that can be obtained via
dual-arm re-grasping. In pick-and-place tasks, the initial and
target positions of the object sometimes do not belong to the same arm workspace, which
requires using one hand to lift the object and the other to place it
in the new position. In this regard, we implemented a pipeline for executing
the handover task, i.e. the sequence of actions for autonomously passing an
object from one robot hand to the other.
The contributions described thus far address specific subproblems of
the more complex task of autonomous manipulation. This differs
from what humans do, in that humans develop their manipulation
skills by learning through experience and trial and error. A proper
mathematical formulation for encoding this learning approach is given by
Deep Reinforcement Learning, which has recently proved successful in
many robotics applications. For this reason, in this thesis we also report
on the six-month experience carried out at the Berkeley Artificial Intelligence
Research laboratory with the goal of studying Deep Reinforcement Learning
and its application to autonomous manipulation.
Surface and Volumetric Segmentation of Complex 3-D Objects Using Parametric Shape Models
The problem of part definition, description, and decomposition is central to shape recognition systems. In this dissertation, we develop an integrated framework for segmenting dense range data of complex 3-D scenes into their constituent parts in terms of surface and volumetric primitives. Unlike previous approaches, we use geometric properties derived from both surface and volumetric models to recover structured descriptions of complex objects without a priori domain knowledge or stored models.
To recover shape descriptions, we use bi-quadric models for surface representation and superquadric models for object-centered volumetric representation. The surface segmentation uses a novel approach of searching for the best piecewise description of the image in terms of bi-quadric (z = f(x,y)) models. It is used to generate the region adjacency graphs, to localize surface discontinuities, and to derive global shape properties of the surfaces. A superquadric model is recovered for the entire data set and residuals are computed to evaluate the fit. The goodness-of-fit value based on the inside-outside function and the mean-squared distance of the data from the model provide a quantitative evaluation of the model. The qualitative evaluation criteria check the local consistency of the model in the form of residual maps of overestimated and underestimated data regions.
The control structure invokes the models in a systematic manner, evaluates the intermediate descriptions, and integrates them to achieve final segmentation. Superquadric and bi-quadric models are recovered in parallel to incorporate the best of the coarse-to-fine and fine-to-coarse segmentation strategies. The model evaluation criteria determine the dimensionality of the scene, and decide whether to terminate the procedure, or selectively refine the segmentation by following a global-to-local part segmentation approach. The control module generates hypotheses about superquadric models at clusters of underestimated data and performs controlled extrapolation of the part-model by shrinking the global model. As the global model shrinks and the local models grow, they are evaluated and tested for termination or further segmentation.
We present results on real range images of scenes of varying complexity, including objects with occluding parts, and scenes where surface segmentation is not sufficient to guide the volumetric segmentation. We analyze the issue of segmentation of complex scenes thoroughly by studying the effect of missing data on volumetric model recovery, generating object-centered descriptions, and presenting a complete set of criteria for the evaluation of the superquadric models. We conclude by discussing the applications of our approach in data reduction, 3-D object recognition, geometric modeling, automatic model generation, object manipulation, and active vision.
Consistent depth video segmentation using adaptive surface models
We propose a new approach for the segmentation of 3-D point clouds into geometric surfaces using adaptive surface models. Starting from an initial configuration, the algorithm converges to a stable segmentation through a new iterative split-and-merge procedure, which includes an adaptive mechanism for the creation and removal of segments. This allows the segmentation to adjust to changing input data over the course of the video, leading to stable, temporally coherent, and traceable segments. We tested the method on a large variety of data acquired with different range imaging devices, including a structured-light sensor and a time-of-flight camera, and successfully segmented the videos into surface segments. We further demonstrated the feasibility of the approach using quantitative evaluations based on ground-truth data. This research is partially funded by the EU project IntellAct (FP7-269959), the Grup consolidat 2009 SGR155, the project PAU+ (DPI2011-27510), and the CSIC project CINNOVA (201150E088). B. Dellen acknowledges support from the Spanish Ministry of Science and Innovation through a Ramon y Cajal program. Peer Reviewe
CFD-DEM Modeling of Spouted Beds With Internal Devices Using PTV
195 p. This thesis focuses on extracting velocity profiles of solids, both spherical and irregular, in a spouted bed and on analyzing these values under the influence of different internal devices in the contactor and different flow rates. The analysis focused on a conical contactor, while a contactor with a prismatic profile was also used to analyze the effect of this geometry on the system dynamics. These experimental values for regular and irregular solids were modeled and simulated with a CFD-DEM model in which the continuous and discrete phases are coupled, in order to ensure realistic simulations capable of predicting parameters that are difficult to obtain experimentally and are crucial for the design and scale-up of these types of beds, such as the solids cycle times and the gas residence time distribution under different conditions. These parameters determine the capacity of a system and how effectively the reactor volume is used.