
1,188 research outputs found

SEGCloud: Semantic Segmentation of 3D Point Clouds

Author: Armeni Iro
Choy Christopher B.
Gwak JunYoung
Savarese Silvio
Tchapmi Lyne P.
Publication date: 20/10/2017

3D semantic scene labeling is fundamental to agents operating in the real world. In particular, labeling raw 3D point sets from sensors provides fine-grained semantics. Recent works leverage the capabilities of Neural Networks (NNs), but are limited to coarse voxel predictions and do not explicitly enforce global consistency. We present SEGCloud, an end-to-end framework to obtain 3D point-level segmentation that combines the advantages of NNs, trilinear interpolation (TI) and fully connected Conditional Random Fields (FC-CRF). Coarse voxel predictions from a 3D Fully Convolutional NN are transferred back to the raw 3D points via trilinear interpolation. Then the FC-CRF enforces global consistency and provides fine-grained semantics on the points. We implement the latter as a differentiable Recurrent NN to allow joint optimization. We evaluate the framework on two indoor and two outdoor 3D datasets (NYU V2, S3DIS, KITTI, Semantic3D.net), and show performance comparable or superior to the state of the art on all datasets.

Comment: Accepted as a spotlight at the International Conference of 3D Vision (3DV 2017)

arXiv.org e-Print Archive

Crossref
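
The SEGCloud abstract above describes transferring coarse voxel predictions back to raw 3D points via trilinear interpolation. A minimal sketch of that single step follows. It is not the authors' implementation: the function name `trilinear_interpolate`, the nested-list grid layout, and the convention that scores sit at voxel centres are all assumptions made for illustration.

```python
from math import floor


def trilinear_interpolate(grid, point, voxel_size=1.0):
    """Interpolate per-voxel class scores at a continuous 3D point.

    Assumption (not from the paper): grid[i][j][k] is a list of class
    scores located at the voxel centre ((i + 0.5) * voxel_size, ...).
    Points outside the span of the voxel centres are clamped to it.
    """
    dims = (len(grid), len(grid[0]), len(grid[0][0]))
    num_classes = len(grid[0][0][0])

    base, frac = [], []
    for coord, n in zip(point, dims):
        # Continuous index measured from the first voxel centre, clamped.
        g = min(max(coord / voxel_size - 0.5, 0.0), n - 1.0)
        i0 = min(int(floor(g)), max(n - 2, 0))
        base.append(i0)
        frac.append(g - i0)

    scores = [0.0] * num_classes
    # Blend the 8 surrounding voxel centres, weighted by proximity.
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                weight = 1.0
                idx = []
                for d, b, f, n in zip((dx, dy, dz), base, frac, dims):
                    weight *= f if d else 1.0 - f
                    idx.append(min(b + d, n - 1))
                cell = grid[idx[0]][idx[1]][idx[2]]
                for c in range(num_classes):
                    scores[c] += weight * cell[c]
    return scores
```

In a pipeline of the kind the abstract outlines, the interpolated per-point scores would then be passed to the CRF stage; that stage is not sketched here.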

Data-Driven Shape Analysis and Processing

Author: Huang Qixing
Kalogerakis Evangelos
Kim Vladimir G.
Xu Kai
Publication date: 23/02/2015

Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and applying this analysis to support intelligent modeling, editing, and visualization of geometric data. In contrast to traditional approaches, a key feature of data-driven approaches is that they aggregate information from a collection of shapes to improve the analysis and processing of individual shapes. In addition, they are able to learn models that reason about properties and relationships of shapes without relying on hard-coded rules or explicitly programmed instructions. We provide an overview of the main concepts and components of these techniques, and discuss their application to shape classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis, by reviewing the literature and relating the existing works with both qualitative and numerical comparisons. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing.

Comment: 10 pages, 19 figures

arXiv.org e-Print Archive

CiteSeerX

3D Reconstruction of Indoor Corridor Models Using Single Imagery and Video Sequences

Author: Jahromi Ali Baligh
Publication date: 11/05/2020

In recent years, 3D indoor modeling has gained more attention due to its role in the decision-making process of maintaining the status and managing the security of building indoor spaces. In this thesis, the problem of continuous indoor corridor space modeling has been tackled through two approaches. The first approach develops a modeling method based on middle-level perceptual organization. The second approach develops a visual Simultaneous Localisation and Mapping (SLAM) system with model-based loop closure. In the first approach, the image space was searched for a corridor layout that can be converted into a geometrically accurate 3D model. The Manhattan rule assumption was adopted, and indoor corridor layout hypotheses were generated through a random rule-based intersection of image physical line segments and virtual rays of orthogonal vanishing points. Volumetric reasoning, correspondences to physical edges, orientation map and geometric context of an image are all considered for scoring layout hypotheses. This approach provides physically plausible solutions while facing objects or occlusions in a corridor scene. In the second approach, Layout SLAM is introduced. Layout SLAM performs camera localization while mapping layout corners and normal point features in 3D space. Here, a new feature matching cost function was proposed considering both local and global context information. In addition, a rotation compensation variable makes Layout SLAM robust against the accumulation of camera orientation errors. Moreover, layout model matching of keyframes ensures accurate loop closures that prevent misassociation of newly visited landmarks with previously visited scene parts. The comparison of generated single image-based 3D models to ground truth models showed that average ratio differences in widths, heights and lengths were 1.8%, 3.7% and 19.2% respectively. Moreover, Layout SLAM performed with a maximum absolute trajectory error of 2.4 m in position and 8.2 degrees in orientation for an approximately 318 m path on the RAWSEEDS data set. Loop closing performed strongly for Layout SLAM and provided 3D indoor corridor layouts with less than 1.05 m displacement errors in length and less than 20 cm in width and height for an approximately 315 m path on the York University data set. The proposed methods can successfully generate 3D indoor corridor models compared to their major counterparts.

YorkSpace

Semantic Mapping of Road Scenes

Author: Sengupta S
Publication venue: Oxford Brookes University
Publication date: 01/01/2014

The problem of understanding road scenes has been at the forefront of the computer vision community for the last couple of years. This enables autonomous systems to navigate and understand the surroundings in which they operate. It involves reconstructing the scene and estimating the objects present in it, such as ‘vehicles’, ‘road’, ‘pavements’ and ‘buildings’. This thesis focusses on these aspects and proposes solutions to address them. First, we propose a solution to generate a dense semantic map from multiple street-level images. This map can be imagined as the bird’s eye view of the region with associated semantic labels for tens of kilometres of street-level data. We generate the overhead semantic view from street-level images. This is in contrast to existing approaches using satellite/overhead imagery for classification of urban regions, allowing us to produce a detailed semantic map for a large-scale urban area. Then we describe a method to perform large-scale dense 3D reconstruction of road scenes with associated semantic labels. Our method fuses the depth maps, generated from the stereo pairs across time, into a global 3D volume in an online fashion, in order to accommodate arbitrarily long image sequences. The object class labels estimated from the street-level stereo image sequence are used to annotate the reconstructed volume. Then we exploit the scene structure in object class labelling by performing inference over the meshed representation of the scene. By performing labelling over the mesh we solve two issues. Firstly, images often have redundant information, with multiple images describing the same scene. Solving these images separately is slow, whereas our method is approximately an order of magnitude faster in the inference stage compared to normal inference in the image domain. Secondly, multiple images, even though they describe the same scene, often result in inconsistent labelling. By solving a single mesh, we remove the inconsistency of labelling across the images. Also, our mesh-based labelling takes into account the object layout in the scene, which is often ambiguous in the image domain, thereby increasing the accuracy of object labelling. Finally, we perform labelling and structure computation through a hierarchical robust P^N Markov Random Field defined on voxels and super-voxels given by an octree. This allows us to infer the 3D structure and the object-class labels in a principled manner, through bounded approximate minimisation of a well-defined and well-studied energy functional. In this thesis, we also introduce two object-labelled datasets created from real-world data. The 15 kilometre Yotta Labelled dataset consists of 8,000 images per camera view of the roadways of the United Kingdom, with a subset of them annotated with object class labels, and the second dataset comprises ground truth object labels for the publicly available KITTI dataset. Both datasets are publicly available and, we hope, will be helpful to the vision research community.

Oxford Brookes University: RADAR


Real-time spatial modeling to detect and track resources on construction sites

Author: Teizer Jochen
Publication date: 01/01/2006

For more than 10 years the U.S. construction industry has experienced over 1,000 fatalities annually. Many fatalities may have been prevented had the individuals and equipment involved been more aware of and alert to the physical state of the environment around them. Awareness may be improved by automatic 3D (three-dimensional) sensing and modeling of the job site environment in real-time. Existing 3D modeling approaches based on range scanning techniques are capable of modeling static objects only, and thus cannot model in real-time dynamic objects in an environment comprised of moving humans, equipment, and materials. Emerging prototype 3D video range cameras offer another alternative by facilitating affordable, wide field of view, automated static and dynamic object detection and tracking at frame rates better than 1 Hz (real-time). This dissertation presents an empirical work and methodology to rapidly create a spatial model of construction sites and in particular to detect, model, and track the position, dimension, direction, and velocity of static and moving project resources in real-time, based on range data obtained from a three-dimensional video range camera in a static or moving position. Existing construction site 3D modeling approaches based on optical range sensing technologies (laser scanners, rangefinders, etc.) and 3D modeling approaches (dense, sparse, etc.) that offered potential solutions for this research are reviewed. The choice of an emerging sensing tool and preliminary experiments with this prototype sensing technology are discussed. These findings led to the development of a range data processing algorithm based on three-dimensional occupancy grids, which is demonstrated in detail. Testing and validation of the proposed algorithms have been conducted to quantify the performance of sensor and algorithm through extensive experimentation involving static and moving objects. Experiments in indoor laboratory and outdoor construction environments have been conducted with construction resources such as humans, equipment, materials, or structures to verify the accuracy of the occupancy grid modeling approach. Results show that modeling objects and measuring their position, dimension, direction, and speed had an accuracy level compatible with the requirements of active safety features for construction. Results demonstrate that video rate 3D data acquisition and analysis of construction environments can support effective detection, tracking, and convex hull modeling of objects. Exploiting rapidly generated three-dimensional models for improved visualization, communications, and process control has inherent value, broad application, and potential impact, e.g. as-built vs. as-planned comparison, condition assessment, maintenance, operations, and construction activities control. In combination with effective management practices, this sensing approach has the potential to assist equipment operators to avoid incidents that result in human injury, death, or collateral damage on construction sites.

Civil, Architectural, and Environmental Engineering

Texas ScholarWorks

Automatic Generation of Labeled 3D Point Clouds of Natural Environments with Gazebo.

Author: Martínez-Rodríguez Jorge Luis
Morales-Rodríguez Jesús
Moran Prados Mariano
Robles Alfredo
Sánchez Manuel
Publication venue: IEEE
Publication date: 27/05/2019

https://conferences.ieeeauthorcenter.ieee.org/author-ethics/guidelines-and-policies/post-publication-policies/#preprint

Progress in applying supervised learning for natural scene classification is impeded by the lack of appropriate datasets for training. This paper describes the automatic generation of synthetic three-dimensional (3D) scans of natural environments with each point labelled individually with its element class. The developed software employs the robotic simulator Gazebo to obtain range and intensity measurements from a 3D laser rangefinder aboard a ground mobile robot. Precisely, the returned intensity values are used to annotate every 3D point within its corresponding class 100% error free. Several examples are provided to show the utility of the proposed approach.

Crossref

Repositorio Institucional Universidad de Málaga

Automatic Reconstruction of Textured 3D Models

Author: Pitzer Benjamin
Publication venue: KIT Scientific Publishing
Publication date: 30/07/2019

Three-dimensional modeling and visualization of environments is an increasingly important problem. This work addresses the problem of automatic 3D reconstruction, and we present a system for unsupervised reconstruction of textured 3D models in the context of modeling indoor environments. We present solutions to all aspects of the modeling process and an integrated system for the automatic creation of large-scale 3D models.

Directory of Open Access Books (DOAB)

Automatic Extrinsic Calibration of Vision and Lidar by Maximizing Mutual Information

Author: Barzilai
Boughorbal
Chao
Cramer
Forrest
Gong
Hartley
Hausser
Hill
Kirkpatrick
Levenberg
Li
Maes
Maes
Marquadrt
McDonald
Mirzaei
Nelder
Pandey
Panin
Scott
Tamjidi
Unnikrishnan
Viola
Wenzel
Whittaker
Woods
Xu
Zhang
Publication venue: Wiley
Publication date: 01/08/2015

Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/112212/1/rob21542.pd

CiteSeerX

Crossref

Deep Blue Documents at the University of Michigan