48 research outputs found

    Iterative Superquadric Recomposition of 3D Objects from Multiple Views

    Humans are good at recomposing novel objects, i.e. they can identify commonalities between unknown objects from general structure to finer detail, an ability that is difficult for machines to replicate. We propose a framework, ISCO, to recompose an object using 3D superquadrics as semantic parts directly from 2D views without training a model that uses 3D supervision. To achieve this, we optimize the superquadric parameters that compose a specific instance of the object, comparing its rendered 3D view and 2D image silhouette. Our ISCO framework iteratively adds new superquadrics wherever the reconstruction error is high, abstracting first coarse regions and then finer details of the target object. With this simple coarse-to-fine inductive bias, ISCO provides consistent superquadrics for related object parts, despite not having any semantic supervision. Since ISCO does not train any neural network, it is also inherently robust to out-of-distribution objects. Experiments show that, compared to recent single-instance superquadric reconstruction approaches, ISCO provides consistently more accurate 3D reconstructions, even from images in the wild. Code available at https://github.com/ExplainableML/ISCO. Comment: Accepted at ICCV 202

    Superquadric representation of scenes from multi-view range data

    Object representation denotes representing three-dimensional (3D) real-world objects with known graphical or mathematical primitives recognizable to computers. This research has numerous applications for object-related tasks in areas including computer vision, computer graphics, reverse engineering, etc. Superquadrics, as volumetric and parametric models, have been selected to be the representation primitives throughout this research. Superquadrics are able to represent a large family of solid shapes by a single equation with only a few parameters. This dissertation addresses superquadric representation of multi-part objects and multi-object scenes. Two issues motivate this research. First, superquadric representation of multi-part objects or multi-object scenes has been an unsolved problem due to the complex geometry of objects. Second, superquadrics recovered from single-view range data tend to have low confidence and accuracy due to partially scanned object surfaces caused by inherent occlusions. To address these two problems, this dissertation proposes a multi-view superquadric representation algorithm. By incorporating both part decomposition and multi-view range data, the proposed algorithm is able to not only represent multi-part objects or multi-object scenes, but also achieve high confidence and accuracy of recovered superquadrics. The multi-view superquadric representation algorithm consists of (i) initial superquadric model recovery from single-view range data, (ii) pairwise view registration based on recovered superquadric models, (iii) view integration, (iv) part decomposition, and (v) final superquadric fitting for each decomposed part. Within the multi-view superquadric representation framework, this dissertation proposes a 3D part decomposition algorithm to automatically decompose multi-part objects or multi-object scenes into their constituent single parts consistent with human visual perception.
Superquadrics can then be recovered for each decomposed single-part object. The proposed part decomposition algorithm is based on curvature analysis, and includes (i) Gaussian curvature estimation, (ii) boundary labeling, (iii) part growing and labeling, and (iv) post-processing. In addition, this dissertation proposes an extended view registration algorithm based on superquadrics. The proposed view registration algorithm is able to handle deformable superquadrics as well as 3D unstructured data sets. For superquadric fitting, two objective functions primarily used in the literature have been comprehensively investigated with respect to noise, viewpoints, sample resolutions, etc. The objective function that proved to perform better has been used throughout this dissertation. In summary, the three algorithms (contributions) proposed in this dissertation are generic and flexible in that they handle triangle meshes, which are standard surface primitives in computer vision and graphics. For each proposed algorithm, the dissertation presents both theory and experimental results. The results demonstrate the efficiency of the algorithms using both synthetic and real range data of a large variety of objects and scenes. In addition, the experimental results include comparisons with previous methods from the literature. Finally, the dissertation concludes with a summary of the contributions to the state of the art in superquadric representation, and presents possible future extensions to this research.
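
    For orientation, the "single equation with only a few parameters" mentioned in this abstract is commonly written in the literature as the inside-outside function below. The symbols are assumptions taken from the standard formulation, not from the abstract: a_1, a_2, a_3 are the sizes along the three axes and \varepsilon_1, \varepsilon_2 are the shape exponents. The dissertation itself may use a different parameterization.

```latex
\[
F(x, y, z) =
\left(
  \left|\frac{x}{a_1}\right|^{2/\varepsilon_2}
  + \left|\frac{y}{a_2}\right|^{2/\varepsilon_2}
\right)^{\varepsilon_2/\varepsilon_1}
+ \left|\frac{z}{a_3}\right|^{2/\varepsilon_1}
\]
% F < 1: the point lies inside the superquadric
% F = 1: the point lies on the surface
% F > 1: the point lies outside
```

    In this form a superquadric in its own coordinate frame has five parameters; placing it in a scene adds three translation and three rotation parameters, for eleven in total.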

    Sense, Think, Grasp: A study on visual and tactile information processing for autonomous manipulation

    Interacting with the environment using hands is one of the distinctive abilities of humans with respect to other species. This aptitude reflects the crucial role played by object manipulation in the world that we have shaped for ourselves. With a view to bringing robots out of industrial settings to support people in everyday life, the ability to manipulate objects autonomously and in unstructured environments is therefore one of the basic skills they need. Autonomous manipulation is characterized by great complexity, especially regarding the processing of sensor information to perceive the surrounding environment. Humans rely on vision for wide-ranging three-dimensional information, proprioception for awareness of the relative position of their own body in space, and the sense of touch for local information when physical interaction with objects happens. The study of autonomous manipulation in robotics aims at transferring similar perceptive skills to robots so that, combined with state-of-the-art control techniques, they can achieve similar performance in manipulating objects. The great complexity of this task makes autonomous manipulation one of the open problems in robotics, and it has drawn increasing research attention in recent years. In this thesis, we propose possible solutions to some key components of autonomous manipulation, focusing in particular on the perception problem and testing the developed approaches on the iCub humanoid robotic platform. When available, vision is the first source of information to be processed for inferring how to interact with objects. The object modeling and grasping pipeline based on superquadric functions that we designed meets this need, since it reconstructs the object's 3D model from a partial point cloud and computes a suitable hand pose for grasping the object.
Retrieving object information with touch sensors only is a relevant skill that becomes crucial when vision is occluded, as happens for instance during physical interaction with the object. We addressed this problem with the design of a novel tactile localization algorithm, named Memory Unscented Particle Filter, capable of localizing and recognizing objects relying solely on 3D contact points collected on the object surface. Another key aspect of autonomous manipulation we report on in this thesis is bi-manual coordination. The execution of more advanced manipulation tasks might in fact require the use and coordination of two arms. Tool usage, for instance, often requires a proper in-hand object pose that can be obtained via dual-arm re-grasping. In pick-and-place tasks, the initial and target positions of the object sometimes do not belong to the same arm workspace, which requires using one hand to lift the object and the other to place it in the new position. In this regard, we implemented a pipeline for executing the handover task, i.e. the sequence of actions for autonomously passing an object from one robot hand to the other. The contributions described thus far address specific subproblems of the more complex task of autonomous manipulation. This differs from what humans do, in that humans develop their manipulation skills by learning through experience and trial and error. A proper mathematical formulation for encoding this learning approach is given by Deep Reinforcement Learning, which has recently proved successful in many robotics applications. For this reason, in this thesis we also report on the six-month experience carried out at the Berkeley Artificial Intelligence Research laboratory with the goal of studying Deep Reinforcement Learning and its application to autonomous manipulation.

    Surface and Volumetric Segmentation of Complex 3-D Objects Using Parametric Shape Models

    The problem of part definition, description, and decomposition is central to shape recognition systems. In this dissertation, we develop an integrated framework for segmenting dense range data of complex 3-D scenes into their constituent parts in terms of surface and volumetric primitives. Unlike previous approaches, we use geometric properties derived from surface, as well as volumetric models, to recover structured descriptions of complex objects without a priori domain knowledge or stored models. To recover shape descriptions, we use bi-quadric models for surface representation and superquadric models for object-centered volumetric representation. The surface segmentation uses a novel approach of searching for the best piecewise description of the image in terms of bi-quadric (z = f(x,y)) models. It is used to generate the region adjacency graphs, to localize surface discontinuities, and to derive global shape properties of the surfaces. A superquadric model is recovered for the entire data set and residuals are computed to evaluate the fit. The goodness-of-fit value based on the inside-outside function and the mean-squared distance of data from the model provide quantitative evaluation of the model. The qualitative evaluation criteria check the local consistency of the model in the form of residual maps of overestimated and underestimated data regions. The control structure invokes the models in a systematic manner, evaluates the intermediate descriptions, and integrates them to achieve final segmentation. Superquadric and bi-quadric models are recovered in parallel to incorporate the best of the coarse-to-fine and fine-to-coarse segmentation strategies. The model evaluation criteria determine the dimensionality of the scene, and decide whether to terminate the procedure, or selectively refine the segmentation by following a global-to-local part segmentation approach.
The control module generates hypotheses about superquadric models at clusters of underestimated data and performs controlled extrapolation of the part-model by shrinking the global model. As the global model shrinks and the local models grow, they are evaluated and tested for termination or further segmentation. We present results on real range images of scenes of varying complexity, including objects with occluding parts, and scenes where surface segmentation is not sufficient to guide the volumetric segmentation. We analyze the issue of segmentation of complex scenes thoroughly by studying the effect of missing data on volumetric model recovery, generating object-centered descriptions, and presenting a complete set of criteria for the evaluation of the superquadric models. We conclude by discussing the applications of our approach in data reduction, 3-D object recognition, geometric modeling, automatic model generation, object manipulation, and active vision.
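
    The goodness-of-fit evaluation mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the standard superquadric inside-outside function in the model's own coordinate frame and uses the mean of squared residuals F^eps1 - 1 as one common fit measure. The function names, the parameter layout, and the exact residual are assumptions for illustration, not the dissertation's implementation.

```python
import numpy as np


def inside_outside(points, size, shape):
    """Evaluate the standard superquadric inside-outside function F.

    points: (N, 3) array in the superquadric's own frame (assumed).
    size:   (a1, a2, a3) extents along x, y, z.
    shape:  (eps1, eps2) shape exponents.
    F < 1 means inside, F == 1 on the surface, F > 1 outside.
    """
    points = np.asarray(points, dtype=float)
    a1, a2, a3 = size
    eps1, eps2 = shape
    x = np.abs(points[:, 0] / a1)
    y = np.abs(points[:, 1] / a2)
    z = np.abs(points[:, 2] / a3)
    xy = (x ** (2.0 / eps2) + y ** (2.0 / eps2)) ** (eps2 / eps1)
    return xy + z ** (2.0 / eps1)


def fit_residuals(points, size, shape):
    """Signed residual per point: negative inside the model, positive outside."""
    eps1 = shape[0]
    return inside_outside(points, size, shape) ** eps1 - 1.0


def goodness_of_fit(points, size, shape):
    """Mean squared residual; 0 when every point lies on the surface."""
    residuals = fit_residuals(points, size, shape)
    return float(np.mean(residuals ** 2))


if __name__ == "__main__":
    # Unit sphere: a1 = a2 = a3 = 1 and eps1 = eps2 = 1.
    pts = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
    print(inside_outside(pts, (1, 1, 1), (1, 1)))
    print(goodness_of_fit(pts, (1, 1, 1), (1, 1)))
```

    The sign of each residual gives a crude version of a residual map: points with negative residuals fall inside the model and points with positive residuals fall outside it. How these signs correspond to the abstract's "overestimated" and "underestimated" regions is not stated in the source.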

    Consistent depth video segmentation using adaptive surface models

    We propose a new approach for the segmentation of 3-D point clouds into geometric surfaces using adaptive surface models. Starting from an initial configuration, the algorithm converges to a stable segmentation through a new iterative split-and-merge procedure, which includes an adaptive mechanism for the creation and removal of segments. This allows the segmentation to adjust to changing input data over the course of the video, leading to stable, temporally coherent, and traceable segments. We tested the method on a large variety of data acquired with different range imaging devices, including a structured-light sensor and a time-of-flight camera, and successfully segmented the videos into surface segments. We further demonstrated the feasibility of the approach using quantitative evaluations based on ground-truth data. This research is partially funded by the EU project IntellAct (FP7-269959), the Grup consolidat 2009 SGR155, the project PAU+ (DPI2011-27510), and the CSIC project CINNOVA (201150E088). B. Dellen acknowledges support from the Spanish Ministry of Science and Innovation through a Ramon y Cajal program. Peer Reviewed

    CFD-DEM Modeling of Spouted Beds With Internal Devices Using PTV

    195 p. This thesis focuses on extracting velocity profiles of solids, both spherical and irregular, in a spouted bed and on analyzing these values under the influence of different internal devices in the contactor and different flow rates. The analysis focuses on a conical contactor, while a contactor with a prismatic profile was also used to analyze the effect of this geometry on the dynamics of the system. These experimental values for regular and irregular solids were modeled and simulated with a CFD-DEM model in which the continuous and discrete phases are coupled, in order to ensure realistic simulations capable of predicting parameters that are difficult to obtain experimentally and crucial for the design and scale-up of these types of beds, such as the solids cycle times and the gas residence time distribution under different conditions. These parameters determine the capacity of a system and how effectively the reactor volume is used.
