
    Deep Learning for 3D Information Extraction from Indoor and Outdoor Point Clouds

    This thesis focuses on the challenges and opportunities that come with deep learning in the extraction of 3D information from point clouds. To achieve this, 3D information such as point-based or object-based attributes needs to be extracted from highly accurate and information-rich 3D data, which are commonly collected by LiDAR or RGB-D cameras from real-world environments. Driven by the breakthroughs brought by deep learning techniques and the accessibility of reliable 3D datasets, 3D deep learning frameworks have been investigated with a string of empirical successes. However, two main challenges lead to the complexity of deep-learning-based per-point labeling and object detection in real scenes. First, variations in sensing conditions and unconstrained environments result in unevenly distributed point clouds with various geometric patterns and incomplete shapes. Second, the irregular data format and the requirements for both accurate and efficient algorithms pose problems for deep learning models. To deal with these two challenges, this doctoral dissertation mainly considers the following four features when constructing 3D deep models for point-based or object-based information extraction: (1) the exploration of geometric correlations between local points when defining convolution kernels, (2) hierarchical local and global feature learning within an end-to-end trainable framework, (3) relation feature learning from nearby objects, and (4) the leveraging of 2D images for 3D object detection from point clouds. Correspondingly, this doctoral thesis proposes a set of deep learning frameworks for 3D information extraction, specifically scene segmentation and object detection, from indoor and outdoor point clouds. Firstly, an end-to-end geometric graph convolution architecture operating on a graph representation of the point cloud is proposed for semantic scene segmentation. Secondly, a 3D proposal-based object detection framework is constructed to extract the geometric information of objects and relation features among proposals for bounding box reasoning. Thirdly, a 2D-driven approach is proposed to detect 3D objects from point clouds in indoor and outdoor scenes; both semantic features from 2D images and context information in 3D space are explicitly exploited to enhance 3D detection performance. Qualitative and quantitative experiments compared with existing state-of-the-art models on indoor and outdoor datasets demonstrate the effectiveness of the proposed frameworks. Remaining challenges and future research directions that can help advance deep learning approaches for extracting 3D information from point clouds are addressed at the end of this thesis.
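
    As an illustration of point (1) above (geometric correlations between local points when defining convolution kernels), the following is a minimal sketch of a graph convolution over a k-nearest-neighbor point graph in the spirit of edge-convolution layers. It is not the thesis's actual architecture; the layer width, the choice of k, and the use of relative coordinates as edge features are assumptions made for this sketch.

        # Minimal sketch of a geometric graph convolution over a k-nearest-neighbor
        # point graph (illustrative only; layer width and k are assumptions).
        import torch
        import torch.nn as nn

        def knn_indices(xyz, k):
            # xyz: (N, 3) point coordinates; returns (N, k) neighbor indices
            # (each point is included as its own nearest neighbor).
            dist = torch.cdist(xyz, xyz)
            return dist.topk(k, largest=False).indices

        class GeometricGraphConv(nn.Module):
            def __init__(self, in_dim, out_dim, k=16):
                super().__init__()
                self.k = k
                # Shared MLP over edge features built from neighbor features
                # concatenated with the relative offset to each neighbor.
                self.mlp = nn.Sequential(nn.Linear(in_dim + 3, out_dim), nn.ReLU())

            def forward(self, xyz, feats):
                # xyz: (N, 3), feats: (N, C); returns (N, out_dim) per-point features.
                idx = knn_indices(xyz, self.k)                    # (N, k)
                neigh_feats = feats[idx]                          # (N, k, C)
                rel_xyz = xyz[idx] - xyz.unsqueeze(1)             # (N, k, 3) local geometry
                edge = torch.cat([neigh_feats, rel_xyz], dim=-1)  # (N, k, C + 3)
                return self.mlp(edge).max(dim=1).values           # max-pool over neighbors

        # Usage (hypothetical): initial features are simply the raw coordinates.
        # conv = GeometricGraphConv(in_dim=3, out_dim=64)
        # point_feats = conv(points_xyz, points_xyz)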

    3D point cloud video segmentation oriented to the analysis of interactions

    Given the widespread availability of point cloud data from consumer depth sensors, 3D point cloud segmentation has become a promising building block for high-level applications such as scene understanding and interaction analysis. It benefits from the richer information contained in real-world 3D data compared to 2D images. This also implies that the classical color segmentation challenges have shifted to RGBD data, and new challenges have emerged as the depth information is usually noisy, sparse and unorganized. Meanwhile, the lack of ground-truth labeling for 3D point clouds also limits the development and comparison of 3D point cloud segmentation methods. In this paper, we present two contributions: a novel graph-based point cloud segmentation method for RGBD stream data with interacting objects, and a new ground-truth labeling for a previously published data set. This data set focuses on interaction (merges and splits between 'object' point clouds), which differentiates it from the few existing labeled RGBD data sets, which are more oriented to Simultaneous Localization And Mapping (SLAM) tasks. The proposed point cloud segmentation method is evaluated against the 3D point cloud ground-truth labeling. Experiments show the promising results of our approach.
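
    For readers unfamiliar with graph-based point cloud segmentation, the sketch below is a generic baseline of the idea (not the method proposed in the paper): connect spatially close RGBD points, keep only edges whose endpoints have similar color, and take connected components as segments. The radius and color threshold are illustrative assumptions.

        # Generic graph-based segmentation of an RGBD point cloud: connect nearby
        # points, keep only edges with similar color, and take connected components
        # as segments. Radius and color threshold are illustrative assumptions.
        import numpy as np
        from scipy.spatial import cKDTree
        from scipy.sparse import coo_matrix
        from scipy.sparse.csgraph import connected_components

        def segment_rgbd_points(xyz, rgb, radius=0.03, color_thresh=0.15):
            # xyz: (N, 3) coordinates in meters, rgb: (N, 3) colors in [0, 1].
            tree = cKDTree(xyz)
            pairs = np.array(list(tree.query_pairs(radius)))       # spatially close pairs
            if len(pairs) == 0:
                return np.arange(len(xyz))                          # each point its own segment
            color_diff = np.linalg.norm(rgb[pairs[:, 0]] - rgb[pairs[:, 1]], axis=1)
            keep = pairs[color_diff < color_thresh]                 # edges between similar colors
            n = len(xyz)
            adj = coo_matrix((np.ones(len(keep)), (keep[:, 0], keep[:, 1])), shape=(n, n))
            _, labels = connected_components(adj, directed=False)
            return labels                                           # per-point segment id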

    Recognizing point clouds using conditional random fields

    Detecting objects in cluttered scenes is a necessary step for many robotic tasks and facilitates the interaction of the robot with its environment. Because of the availability of efficient 3D sensing devices such as the Kinect, methods for the recognition of objects in 3D point clouds have gained importance in recent years. In this paper, we propose a new supervised learning approach for the recognition of objects from 3D point clouds using Conditional Random Fields, a type of discriminative, undirected probabilistic graphical model. The various features and contextual relations of the objects are described by the potential functions in the graph. Our method allows for learning and inference from unorganized point clouds of arbitrary sizes and shows significant benefit in terms of computational speed during prediction when compared to a state-of-the-art approach based on constrained optimization.
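
    For reference, a conditional random field of this kind is typically defined over a graph G = (V, E) of object nodes through unary and pairwise potentials; the concrete features and potential functions used in the paper are not reproduced here:

        P(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \exp\big(-E(\mathbf{y}, \mathbf{x})\big),
        \qquad
        E(\mathbf{y}, \mathbf{x}) = \sum_{i \in V} \psi_u(y_i, \mathbf{x})
                                  + \sum_{(i, j) \in E} \psi_p(y_i, y_j, \mathbf{x})

    where y are the node labels, x the observed point cloud features, and Z(x) the partition function; learning fits the potentials to labeled data and prediction amounts to finding the label assignment that minimizes the energy E.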

    SEGCloud: Semantic Segmentation of 3D Point Clouds

    3D semantic scene labeling is fundamental to agents operating in the real world. In particular, labeling raw 3D point sets from sensors provides fine-grained semantics. Recent works leverage the capabilities of Neural Networks (NNs), but are limited to coarse voxel predictions and do not explicitly enforce global consistency. We present SEGCloud, an end-to-end framework to obtain 3D point-level segmentation that combines the advantages of NNs, trilinear interpolation (TI) and fully connected Conditional Random Fields (FC-CRF). Coarse voxel predictions from a 3D Fully Convolutional NN are transferred back to the raw 3D points via trilinear interpolation. Then the FC-CRF enforces global consistency and provides fine-grained semantics on the points. We implement the latter as a differentiable Recurrent NN to allow joint optimization. We evaluate the framework on two indoor and two outdoor 3D datasets (NYU V2, S3DIS, KITTI, Semantic3D.net), and show performance comparable or superior to the state of the art on all datasets. Accepted as a spotlight at the International Conference on 3D Vision (3DV 2017).
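
    The trilinear-interpolation step named above can be illustrated with a short sketch that transfers per-voxel class scores back to raw point coordinates. This is a minimal sketch, not the SEGCloud implementation; array shapes, the voxel-center convention, and boundary clipping are assumptions, and the FC-CRF refinement is omitted.

        # Sketch of transferring coarse voxel class scores back to raw points with
        # trilinear interpolation (shapes, origin handling, and boundary clipping
        # are assumptions; the FC-CRF refinement step is not shown).
        import numpy as np

        def trilinear_point_scores(voxel_scores, points, origin, voxel_size):
            # voxel_scores: (X, Y, Z, C) class scores on a voxel grid
            # points: (N, 3) raw point coordinates, origin: (3,) grid origin
            g = (points - origin) / voxel_size - 0.5   # coordinates relative to voxel centers
            g0 = np.floor(g).astype(int)               # lower-corner voxel index
            t = g - g0                                 # fractional offset in [0, 1)
            X, Y, Z, C = voxel_scores.shape
            out = np.zeros((len(points), C))
            for dx in (0, 1):
                for dy in (0, 1):
                    for dz in (0, 1):
                        idx = np.clip(g0 + [dx, dy, dz], 0, [X - 1, Y - 1, Z - 1])
                        w = (np.where(dx, t[:, 0], 1 - t[:, 0]) *
                             np.where(dy, t[:, 1], 1 - t[:, 1]) *
                             np.where(dz, t[:, 2], 1 - t[:, 2]))
                        out += w[:, None] * voxel_scores[idx[:, 0], idx[:, 1], idx[:, 2]]
            return out                                  # (N, C) interpolated per-point scores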

    A LiDAR Point Cloud Generator: from a Virtual World to Autonomous Driving

    3D LiDAR scanners are playing an increasingly important role in autonomous driving as they can generate depth information of the environment. However, creating large 3D LiDAR point cloud datasets with point-level labels requires a significant amount of manual annotation. This jeopardizes the efficient development of supervised deep learning algorithms, which are often data-hungry. We present a framework to rapidly create point clouds with accurate point-level labels from a computer game. The framework supports data collection from both auto-driving scenes and user-configured scenes. Point clouds from auto-driving scenes can be used as training data for deep learning algorithms, while point clouds from user-configured scenes can be used to systematically test the vulnerability of a neural network, and the falsifying examples can be used to make the neural network more robust through retraining. In addition, scene images can be captured simultaneously for sensor fusion tasks, with a method proposed to automatically calibrate between the point clouds and the captured scene images. We show a significant improvement in accuracy (+9%) in point cloud segmentation by augmenting the training dataset with the generated synthesized data. Our experiments also show that, by testing and retraining the network using point clouds from user-configured scenes, the weaknesses/blind spots of the neural network can be fixed.
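
    As a small illustration of the sensor-fusion side, the sketch below projects LiDAR points into a camera image given known calibration matrices. The transform and intrinsics are assumed to be provided here, whereas the paper's contribution is estimating this calibration automatically.

        # Sketch of using a LiDAR-to-camera calibration to project points into an
        # image, a typical sensor-fusion step. The transform and intrinsics are
        # assumed given; the paper's method for estimating them is not shown.
        import numpy as np

        def project_lidar_to_image(points_xyz, T_cam_lidar, K):
            # points_xyz: (N, 3) in the LiDAR frame
            # T_cam_lidar: (4, 4) rigid transform from LiDAR frame to camera frame
            # K: (3, 3) camera intrinsic matrix
            homo = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])  # (N, 4)
            cam = (T_cam_lidar @ homo.T).T[:, :3]       # points in the camera frame
            in_front = cam[:, 2] > 0                    # keep points in front of the camera
            uvw = (K @ cam[in_front].T).T
            uv = uvw[:, :2] / uvw[:, 2:3]               # perspective divide -> pixel coordinates
            return uv, in_front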