106 research outputs found

    Data-Driven Grasp Synthesis - A Survey

    We review the work on data-driven grasp synthesis and the methodologies for sampling and ranking candidate grasps. We divide the approaches into three groups based on whether they synthesize grasps for known, familiar, or unknown objects. This structure allows us to identify common object representations and perceptual processes that facilitate the employed data-driven grasp synthesis technique. For known objects, we concentrate on approaches based on object recognition and pose estimation. For familiar objects, the techniques use some form of similarity matching to a set of previously encountered objects. Finally, for approaches dealing with unknown objects, the core part is the extraction of specific features that are indicative of good grasps. Our survey provides an overview of the different methodologies and discusses open problems in the area of robot grasping. We also draw a parallel to the classical approaches that rely on analytic formulations. Comment: 20 pages, 30 figures, submitted to IEEE Transactions on Robotics.
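
    The "sampling and ranking" idea mentioned in this abstract can be illustrated with a toy sketch. It is not taken from any surveyed method: it assumes 2D contact points with outward unit normals and a parallel-jaw gripper, and the names `antipodal_score` and `rank_grasps` are hypothetical. Candidate contact pairs are enumerated and ranked by how well their normals oppose each other along the line joining the contacts.

```python
import itertools
import math


def antipodal_score(p1, n1, p2, n2):
    """Score in [0, 1]; 1 when both outward normals lie on the line joining
    the contacts and point away from each other (assumed quality heuristic)."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return 0.0
    ux, uy = dx / dist, dy / dist
    # The outward normal at p1 should point along -u, the one at p2 along +u.
    a1 = -(n1[0] * ux + n1[1] * uy)
    a2 = n2[0] * ux + n2[1] * uy
    return max(0.0, a1) * max(0.0, a2)


def rank_grasps(contacts, max_width):
    """contacts: list of (point, outward_unit_normal).
    Returns (score, i, j) tuples, best first, for pairs no wider than max_width."""
    candidates = []
    for (i, (p1, n1)), (j, (p2, n2)) in itertools.combinations(enumerate(contacts), 2):
        if math.dist(p1, p2) <= max_width:
            candidates.append((antipodal_score(p1, n1, p2, n2), i, j))
    return sorted(candidates, reverse=True)


if __name__ == "__main__":
    # Four contacts at the edge midpoints of a unit square.
    square = [
        ((0.0, 0.5), (-1.0, 0.0)),
        ((1.0, 0.5), (1.0, 0.0)),
        ((0.5, 1.0), (0.0, 1.0)),
        ((0.5, 0.0), (0.0, -1.0)),
    ]
    for score, i, j in rank_grasps(square, max_width=1.5):
        print(i, j, round(score, 3))
```

    In this toy setting the two opposite-edge pairs receive the highest score, while pairs on adjacent edges score lower. A real system would replace the hand-written score with a learned or analytic grasp-quality measure.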

    Developing a Semantic-Driven Hybrid Segmentation Method for Point Clouds of 3D Shapes

    With the rapid development of point cloud processing technologies and the availability of a wide range of 3D capturing devices, a geometric object from the real world can be directly represented digitally as a dense and fine point cloud. Decomposing a 3D shape represented as a point cloud into meaningful parts has important practical implications in the fields of computer graphics, virtual reality, and mixed reality. In this paper, a semantic-driven automated hybrid segmentation method is proposed for 3D point cloud shapes. Our method consists of three stages: semantic clustering, variational merging, and region remerging. In the first stage, a new point cloud feature, called the Local Concave-Convex Histogram, is introduced to first extract saddle regions complying with the semantic boundary feature. All other types of regions are then aggregated according to this extracted feature. This stage often produces multiple over-segmented convex regions, which are then remerged by a variational method based on narrow-band theory. Finally, to recombine regions with approximately similar shapes, an order relation is introduced to improve the weighting used in calculating the conventional Shape Diameter Function. We have conducted extensive experiments with the Princeton Dataset. The results show that the proposed algorithm outperforms the state-of-the-art algorithms in this area. We have also applied the proposed algorithm to point cloud data acquired directly from real 3D objects, where it also achieves excellent results. These results demonstrate that the proposed method is effective and universal.
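
    The abstract does not define the Local Concave-Convex Histogram, so the following is only a generic illustration of the kind of local convexity information such a feature could build on, not the paper's feature. It assumes oriented points (positions with outward unit normals) and uses a common pairwise test: a connection between two neighbouring points is convex when (n1 - n2) · (p1 - p2) > 0 and concave when it is negative. The names `classify_connection` and `local_counts` are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def classify_connection(p1, n1, p2, n2, eps=1e-6):
    """Label the connection between two oriented points.
    Normals that diverge along the connecting line indicate a convex
    connection; normals that converge indicate a concave one."""
    s = dot(sub(n1, n2), sub(p1, p2))
    if s > eps:
        return "convex"
    if s < -eps:
        return "concave"
    return "flat"


def local_counts(index, points, normals, neighbors):
    """Count convex, concave, and flat connections around one point.
    neighbors maps a point index to the indices of its neighbours."""
    counts = {"convex": 0, "concave": 0, "flat": 0}
    for j in neighbors[index]:
        label = classify_connection(points[index], normals[index], points[j], normals[j])
        counts[label] += 1
    return counts


if __name__ == "__main__":
    r = 2 ** -0.5
    # A V-shaped valley: two faces whose outward normals lean towards each other.
    points = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-2.0, 1.0, 0.0)]
    normals = [(r, r, 0.0), (-r, r, 0.0), (r, r, 0.0)]
    print(local_counts(0, points, normals, {0: [1, 2]}))
```

    Points whose neighbourhoods are dominated by concave connections are natural candidates for part boundaries, which is the intuition behind cutting shapes along concave creases.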

    Algorithms for the reconstruction, analysis, repairing and enhancement of 3D urban models from multiple data sources

    Over the last few years, there has been notable growth in the field of digitization of 3D buildings and urban environments. The substantial improvement of both scanning hardware and reconstruction algorithms has led to the development of representations of buildings and cities that can be remotely transmitted and inspected in real time. Among the applications that implement these technologies are several GPS navigators and virtual globes such as Google Earth or the tools provided by the Institut Cartogràfic i Geològic de Catalunya. In this thesis, we conceptualize cities as a collection of individual buildings. Hence, we focus on the individual processing of one structure at a time, rather than on the larger-scale processing of urban environments. Nowadays, there is a wide diversity of digitization technologies, and choosing the appropriate one is key for each particular application. Roughly, these techniques can be grouped into three main families: time-of-flight (terrestrial and aerial LiDAR); photogrammetry (street-level, satellite, and aerial imagery); and human-edited vector data (cadastre and other map sources). Each of these has its advantages in terms of covered area, data quality, economic cost, and processing effort. Plane- and car-mounted LiDAR devices are optimal for sweeping huge areas, but acquiring and calibrating such devices is not a trivial task. Moreover, the capturing process is done by scan lines, which need to be registered using GPS and inertial data. As an alternative, terrestrial LiDAR devices are more accessible but cover smaller areas, and their sampling strategy usually produces massive point clouds with over-represented planar regions. A less expensive option is street-level imagery. A dense set of images captured with a commodity camera can be fed to state-of-the-art multi-view stereo algorithms to produce sufficiently realistic reconstructions. Another advantage of this approach is that it captures high-quality color data; however, the resulting geometric information is usually of low quality. In this thesis, we analyze in depth some of the shortcomings of these data-acquisition methods and propose new ways to overcome them. We focus mainly on the technologies that allow high-quality digitization of individual buildings: terrestrial LiDAR for geometric information and street-level imagery for color information. Our main goal is the processing and completion of detailed 3D urban representations. For this, we work with multiple data sources and combine them when possible to produce models that can be inspected in real time. Our research has focused on the following contributions: effective and feature-preserving simplification of massive point clouds; normal estimation algorithms explicitly designed for LiDAR data; a low-stretch panoramic representation for point clouds; semantic analysis of street-level imagery for improved multi-view stereo reconstruction; color improvement through heuristic techniques and the registration of LiDAR and imagery data; and efficient and faithful visualization of massive point clouds using image-based techniques.

    Efficient 3D Segmentation, Registration and Mapping for Mobile Robots

    Sometimes simple is better! For certain situations and tasks, simple but robust methods can achieve the same or better results in the same or less time than related sophisticated approaches. In the context of robots operating in real-world environments, key challenges are perceiving objects of interest and obstacles as well as building maps of the environment and localizing therein. The goal of this thesis is to carefully analyze such problem formulations, to deduce valid assumptions and simplifications, and to develop simple solutions that are both robust and fast. All approaches make use of sensors capturing 3D information, such as consumer RGB-D cameras. Comparative evaluations show the performance of the developed approaches. For identifying objects and regions of interest in manipulation tasks, a real-time object segmentation pipeline is proposed. It exploits several common assumptions of manipulation tasks, such as objects resting on horizontal support surfaces and being well separated. It achieves real-time performance by using particularly efficient approximations in the individual processing steps, subsampling the input data where possible, and processing only relevant subsets of the data. The resulting pipeline segments 3D input data at up to 30 Hz. In order to obtain complete segmentations of the 3D input data, a second pipeline is proposed that approximates the sampled surface, smooths the underlying data, and segments the smoothed surface into coherent regions belonging to the same geometric primitive. It uses different primitive models and can reliably segment input data into planes, cylinders, and spheres. A thorough comparative evaluation shows state-of-the-art performance while computing such segmentations in near real time. The second part of the thesis addresses the registration of 3D input data, i.e., consistently aligning input captured from different view poses. Several methods are presented for different types of input data. For the particular application of mapping with micro aerial vehicles, where the 3D input data is particularly sparse, a pipeline is proposed that uses the same approximate surface reconstruction to exploit the measurement topology and a surface-to-surface registration algorithm that robustly aligns the data. Optimization of the resulting graph of determined view poses then yields globally consistent 3D maps. For sequences of RGB-D data, this pipeline is extended to include additional subsampling steps and an initial alignment of the data in local windows in the pose graph. In both cases, comparative evaluations show a robust and fast alignment of the input data.
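
    The tabletop assumptions named in this abstract (objects on a horizontal support surface, well separated) can be turned into a very small segmentation sketch. This is not the thesis pipeline. It assumes a gravity-aligned frame with z pointing up, takes the most populated height bin as the support plane, and clusters the remaining points by distance in the XY plane using a uniform grid for neighbour lookup. The function names and all parameter values are hypothetical.

```python
from collections import Counter, deque


def remove_support_plane(points, bin_size=0.01, margin=0.01):
    """Drop the dominant horizontal plane and keep only points above it.
    Assumes z is up and the support plane holds the most points."""
    bins = Counter(round(z / bin_size) for _, _, z in points)
    plane_bin, _ = bins.most_common(1)[0]
    plane_z = plane_bin * bin_size
    return [p for p in points if p[2] > plane_z + margin]


def euclidean_clusters(points, radius):
    """Label points so that points chained by XY distance <= radius share a label.
    A grid with cell size equal to radius limits each search to 9 cells."""
    cells = {}
    for idx, (x, y, _) in enumerate(points):
        cells.setdefault((int(x // radius), int(y // radius)), []).append(idx)
    labels = [-1] * len(points)
    n_clusters = 0
    for seed in range(len(points)):
        if labels[seed] != -1:
            continue
        labels[seed] = n_clusters
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            xi, yi, _ = points[i]
            cx, cy = int(xi // radius), int(yi // radius)
            for gx in (cx - 1, cx, cx + 1):
                for gy in (cy - 1, cy, cy + 1):
                    for j in cells.get((gx, gy), ()):
                        if labels[j] != -1:
                            continue
                        xj, yj, _ = points[j]
                        if (xi - xj) ** 2 + (yi - yj) ** 2 <= radius ** 2:
                            labels[j] = n_clusters
                            queue.append(j)
        n_clusters += 1
    return labels, n_clusters


if __name__ == "__main__":
    table = [(i * 0.05, j * 0.05, 0.0) for i in range(10) for j in range(10)]
    objects = [(0.1, 0.1, 0.05), (0.1, 0.1, 0.10), (0.12, 0.1, 0.05),
               (0.5, 0.5, 0.05), (0.5, 0.52, 0.08)]
    above = remove_support_plane(table + objects)
    print(euclidean_clusters(above, radius=0.05))
```

    Clustering in the XY plane only is a deliberate simplification that relies on the "well separated" assumption; stacked or touching objects would need a full 3D criterion.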

    Enhancing RGB-D SLAM Using Deep Learning


    LOD Generation for Urban Scenes

    We introduce a novel approach that reconstructs 3D urban scenes in the form of levels of detail (LODs). Starting from raw data sets such as surface meshes generated by multi-view stereo systems, our algorithm proceeds in three main steps: classification, abstraction and reconstruction. From geometric attributes and a set of semantic rules combined with a Markov random field, we classify the scene into four meaningful classes. The abstraction step detects and regularizes planar structures on buildings, fits icons on trees, roofs and facades, and performs filtering and simplification for LOD generation. The abstracted data are then provided as input to the reconstruction step, which generates watertight buildings through a min-cut formulation on a set of 3D arrangements. Our experiments on complex buildings and large-scale urban scenes show that our approach generates meaningful LODs while being robust and scalable. By combining semantic segmentation and abstraction, it also outperforms general mesh approximation approaches at preserving urban structures.

    Human perception-oriented segmentation for triangle meshes

    Mesh segmentation is an important topic in computer graphics, in particular in geometric computing, because mesh segmentation techniques find many applications in movies, computer animation, virtual reality, mesh compression, and games. In fact, triangle meshes are widely used in interactive applications, so their segmentation into meaningful parts (also called human-perceptual segmentation, perceptive segmentation, or meaningful segmentation) is often seen as a way of speeding up user interaction, detecting collisions between these mesh-covered objects in a 3D scene, and animating one or more meaningful parts (e.g., the head of a humanoid) independently of the other parts of a given object. No known technique is capable of correctly segmenting any mesh into meaningful parts. Some techniques are better suited to non-freeform objects (e.g., quadric mechanical parts), while others perform better in the domain of freeform objects. Only recently have some techniques been developed for the entire universe of objects and shapes. Even worse, most segmentation techniques are not entirely automated, in the sense that almost all of them require some sort of prerequisites and user assistance. In sum, these three challenges, related to perceptual proximity, generality, and automation, are at the core of the work described in this thesis. To face these challenges, we have developed the first contour-based mesh segmentation algorithm found in the literature, which is inspired by the edge-based segmentation techniques used in image analysis, as opposed to region-based segmentation techniques. Its leading idea is to first find the contour of each region, and then to identify and collect all of its inner triangles. The mesh regions found correspond to ups and downs, which need not be strictly convex or strictly concave, respectively. These regions, called relaxedly convex regions (or saliences) and relaxedly concave regions (or recesses), produce segmentations that are less sensitive to noise and, at the same time, more intuitive from the human point of view; hence the name human perception-oriented (HPO) segmentation. Moreover, unlike the current state of the art in mesh segmentation, the existence of these relaxed regions makes the algorithm suited to both non-freeform and freeform objects. In this thesis, we have also tackled a fourth challenge, which concerns the fusion of mesh segmentation and multiresolution. A plethora of segmentation techniques, as well as a number of multiresolution techniques, already exist in the literature for triangle meshes. However, it is not so common to find algorithms and data structures that fuse these two concepts, multiresolution and segmentation, into a symbiotic multiresolution scheme for both plain and segmented meshes, where a plain mesh is understood as a mesh with a single segment. We therefore introduce such a novel multiresolution segmentation scheme, called the extended Ghost Cell (xGC) scheme. This scheme preserves the shape of the meshes in both global and local terms, i.e., mesh segments and their boundaries, as well as creases and apices, are preserved no matter which level of resolution is used for simplification or refinement of the mesh. Moreover, unlike other segmentation schemes, it allows adjacent segments to differ by two or more resolution levels. This is particularly useful in computer animation, mesh compression and transmission, geometric computing, scientific visualization, and computer graphics. In short, this thesis presents a fully automatic, general, and human perception-oriented scheme that symbiotically integrates the concepts of mesh segmentation and multiresolution.
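
    The second half of the contour-based idea (find the contour of each region first, then collect its inner triangles) can be sketched as a flood fill over triangle adjacency that never crosses a contour edge. This is only an illustration under stated assumptions, not the thesis algorithm: the contour edges are taken as given input, the mesh is an indexed triangle list, and the names `edge_key` and `regions_from_contours` are hypothetical.

```python
from collections import defaultdict, deque


def edge_key(a, b):
    """Order-independent key for the undirected edge between vertices a and b."""
    return (a, b) if a < b else (b, a)


def regions_from_contours(triangles, contour_edges):
    """Assign a region label to every triangle.
    Triangles are connected through shared edges, except across contour edges."""
    contour = {edge_key(a, b) for a, b in contour_edges}
    edge_to_tris = defaultdict(list)
    for t, (a, b, c) in enumerate(triangles):
        for e in (edge_key(a, b), edge_key(b, c), edge_key(c, a)):
            edge_to_tris[e].append(t)

    labels = [-1] * len(triangles)
    region = 0
    for seed in range(len(triangles)):
        if labels[seed] != -1:
            continue
        labels[seed] = region
        queue = deque([seed])
        while queue:
            t = queue.popleft()
            a, b, c = triangles[t]
            for e in (edge_key(a, b), edge_key(b, c), edge_key(c, a)):
                if e in contour:
                    continue  # never grow a region across its contour
                for u in edge_to_tris[e]:
                    if labels[u] == -1:
                        labels[u] = region
                        queue.append(u)
        region += 1
    return labels


if __name__ == "__main__":
    # A strip of four triangles over vertices 0..5, cut by the edge (1, 4).
    strip = [(0, 1, 3), (1, 4, 3), (1, 2, 4), (2, 5, 4)]
    print(regions_from_contours(strip, [(1, 4)]))
```

    The hard part of a contour-based method is detecting closed, perceptually meaningful contours; once they are available, collecting the inner triangles reduces to the linear-time traversal shown here.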

    A Survey of Surface Reconstruction from Point Clouds

    The area of surface reconstruction has seen substantial progress in the past two decades. The traditional problem addressed by surface reconstruction is to recover the digital representation of a physical shape that has been scanned, where the scanned data contains a wide variety of defects. While much of the earlier work has focused on reconstructing a piecewise-smooth representation of the original shape, recent work has taken on more specialized priors to address significantly challenging data imperfections, where the reconstruction can take on different representations, not necessarily the explicit geometry. We survey the field of surface reconstruction and provide a categorization with respect to priors, data imperfections, and reconstruction output. By considering a holistic view of surface reconstruction, we show a detailed characterization of the field, highlight similarities between diverse reconstruction techniques, and provide directions for future work in surface reconstruction.

    Regular Grids: An Irregular Approach to the 3D Modelling Pipeline

    The 3D modelling pipeline covers the process by which a physical object is scanned to create a set of points that lie on its surface. These data are then cleaned to remove outliers or noise, and the points are reconstructed into a digital representation of the original object. The aim of this thesis is to present novel grid-based methods and provide several case studies of areas in the 3D modelling pipeline in which they may be effectively put to use. The first is a demonstration of how using a grid can significantly reduce the memory required to perform the reconstruction. The second is the detection of surface features (ridges, peaks, troughs, etc.) during the surface reconstruction process. The third contribution is the alignment of two meshes with zero prior knowledge, which is particularly suited to aligning two related, but not identical, models. The final contribution is the comparison of two similar meshes with support for both qualitative and quantitative outputs.
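
    As a generic illustration of how a regular grid can bound memory use in such a pipeline, the sketch below performs voxel-grid downsampling: all points falling into the same grid cell are replaced by their centroid, so the output size is limited by the number of occupied cells rather than the number of scanned points. This is a standard baseline and is not claimed to be the method of the thesis; the function name `voxel_downsample` and the cell size are assumptions.

```python
from collections import defaultdict


def voxel_downsample(points, cell):
    """Replace all points inside each cubic grid cell by their centroid.
    Only running sums per occupied cell are stored, not the points themselves."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for x, y, z in points:
        key = (int(x // cell), int(y // cell), int(z // cell))
        acc = sums[key]
        acc[0] += x
        acc[1] += y
        acc[2] += z
        acc[3] += 1
    return [(sx / n, sy / n, sz / n) for sx, sy, sz, n in sums.values()]


if __name__ == "__main__":
    cloud = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (1.5, 0.2, 0.2)]
    print(voxel_downsample(cloud, cell=1.0))
```

    Because the points are consumed one at a time, the input can be streamed from disk, which is one reason grid-based accumulators are attractive for very large scans.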