VConv-DAE: Deep Volumetric Shape Learning Without Object Labels
With the advent of affordable depth sensors, 3D capture becomes more and more
ubiquitous and already has made its way into commercial products. Yet,
capturing the geometry or complete shapes of everyday objects using scanning
devices (e.g. Kinect) still comes with several challenges that result in noise
or even incomplete shapes. Recent success in deep learning has shown how to
learn complex shape distributions in a data-driven way from large scale 3D CAD
model collections and to use them for 3D processing on volumetric
representations, thereby circumventing problems of topology and
tessellation. Prior work has shown encouraging results on problems ranging from
shape completion to recognition. We provide an analysis of such approaches and
discover that training as well as the resulting representation are strongly and
unnecessarily tied to the notion of object labels. Thus, we propose a fully
convolutional volumetric autoencoder that learns a volumetric representation
from noisy data by estimating voxel occupancy grids. The proposed method
outperforms prior work on challenging tasks like denoising and shape
completion. We also show that the obtained deep embedding gives competitive
performance when used for classification and promising results for shape
interpolation.
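The data side of the denoising setup described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the grid resolution, the flip rate, and the function names `voxelize`, `corrupt`, and `reconstruction_error` are assumptions made for this example, and the autoencoder itself is omitted.

```python
import numpy as np

def voxelize(points, resolution=30):
    """Map points in the unit cube [0, 1)^3 to a binary occupancy grid.
    The resolution is an assumed value for illustration."""
    grid = np.zeros((resolution,) * 3, dtype=np.uint8)
    idx = np.clip((points * resolution).astype(int), 0, resolution - 1)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return grid

def corrupt(grid, flip_rate=0.1, seed=0):
    """Flip each voxel independently with probability flip_rate.
    This simulates scanner noise; the noise model is an assumption."""
    rng = np.random.default_rng(seed)
    mask = rng.random(grid.shape) < flip_rate
    return np.where(mask, 1 - grid, grid).astype(np.uint8)

def reconstruction_error(pred, target):
    """Fraction of voxels whose occupancy differs between two grids."""
    return float(np.mean(pred != target))

points = np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.5], [0.99, 0.99, 0.99]])
clean = voxelize(points, resolution=4)
noisy = corrupt(clean, flip_rate=0.1)
# A denoising autoencoder would take `noisy` as input and be trained
# to output `clean`; no object label is needed for this objective.
baseline = reconstruction_error(noisy, clean)
```

The training target here is the clean grid itself, which is why no object labels enter the objective.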
Machine learning methods for 3D object classification and segmentation
Field of study: Computer science. Dr. Ye Duan, Thesis Supervisor. Includes vita. July 2018.
Object understanding is a fundamental problem in computer vision, and it has been extensively researched in recent years thanks to the availability of powerful GPUs and labelled data, especially in the context of images. However, 3D object understanding is still not on par with its 2D counterpart, and deep learning for 3D has not been fully explored yet. In this dissertation, I work on two approaches, both of which advance the state-of-the-art results in 3D classification and segmentation. The first approach, called MVRNN, is based on the multi-view paradigm. In contrast to MVCNN, which does not generate consistent results across different views, MVRNN treats the multi-view images as a temporal sequence, correlating the features and generating coherent segmentation across views. MVRNN demonstrated state-of-the-art performance on the Princeton Segmentation Benchmark dataset. The second approach, called PointGrid, is a hybrid method that combines points with a regular grid structure. 3D points can retain fine details but are irregular, which is a challenge for deep learning methods. A volumetric grid is simple and has a regular structure, but does not scale well with data resolution. PointGrid, which is simple, allows the fine details to be consumed by normal convolutions under a coarser-resolution grid. PointGrid achieved state-of-the-art performance on the ModelNet40 and ShapeNet datasets in 3D classification and object part segmentation.
Includes bibliographical references (pages 116-140).
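One way to picture the hybrid of points and a regular grid described above is to store a few raw point coordinates in every cell of a coarse grid. The abstract does not give this layout, so the sketch below is a guess at the general idea, not the dissertation's method. The function name `points_to_cells`, the per-cell capacity `k`, the zero-padding, and the keep-the-first-k rule are all assumptions for illustration.

```python
import numpy as np

def points_to_cells(points, resolution=4, k=2):
    """Store up to k point coordinates in each cell of a coarse grid.

    Returns a feature volume of shape (res, res, res, 3 * k) that an
    ordinary 3D convolution could consume, plus per-cell point counts.
    Cells with fewer than k points are zero-padded (an assumption).
    """
    cells = np.zeros((resolution,) * 3 + (k, 3), dtype=np.float32)
    counts = np.zeros((resolution,) * 3, dtype=np.int64)
    idx = np.clip((points * resolution).astype(int), 0, resolution - 1)
    for point, (x, y, z) in zip(points, idx):
        n = counts[x, y, z]
        if n < k:  # keep the first k points, drop the rest
            cells[x, y, z, n] = point
            counts[x, y, z] = n + 1
    features = cells.reshape((resolution,) * 3 + (k * 3,))
    return features, counts

points = np.array([
    [0.10, 0.10, 0.10],
    [0.15, 0.10, 0.10],
    [0.20, 0.20, 0.20],
    [0.90, 0.90, 0.90],
])
features, counts = points_to_cells(points, resolution=4, k=2)
```

Because each cell keeps exact coordinates rather than a single occupancy bit, sub-cell detail survives even though the grid itself stays coarse.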
Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis
We introduce a data-driven approach to complete partial 3D shapes through a
combination of volumetric deep neural networks and 3D shape synthesis. From a
partially-scanned input shape, our method first infers a low-resolution -- but
complete -- output. To this end, we introduce a 3D-Encoder-Predictor Network
(3D-EPN) which is composed of 3D convolutional layers. The network is trained
to predict and fill in missing data, and operates on an implicit surface
representation that encodes both known and unknown space. This allows us to
predict global structure in unknown areas at high accuracy. We then correlate
these intermediary results with 3D geometry from a shape database at test time.
In a final pass, we propose a patch-based 3D shape synthesis method that
imposes the 3D geometry from these retrieved shapes as constraints on the
coarsely-completed mesh. This synthesis process enables us to reconstruct
fine-scale detail and generate high-resolution output while respecting the
global mesh structure obtained by the 3D-EPN. Although our 3D-EPN outperforms
state-of-the-art completion methods, the main contribution in our work lies in
the combination of a data-driven shape predictor and analytic 3D shape
synthesis. In our results, we show extensive evaluations on a newly-introduced
shape completion benchmark for both real-world and synthetic data.
Robustness of 3D Deep Learning in an Adversarial Setting
Understanding the spatial arrangement and nature of real-world objects is of
paramount importance to many complex engineering tasks, including autonomous
navigation. Deep learning has revolutionized state-of-the-art performance for
tasks in 3D environments; however, relatively little is known about the
robustness of these approaches in an adversarial setting. The lack of
comprehensive analysis makes it difficult to justify deployment of 3D deep
learning models in real-world, safety-critical applications. In this work, we
develop an algorithm for analysis of pointwise robustness of neural networks
that operate on 3D data. We show that current approaches presented for
understanding the resilience of state-of-the-art models vastly overestimate
their robustness. We then use our algorithm to evaluate an array of
state-of-the-art models in order to demonstrate their vulnerability to
occlusion attacks. We show that, in the worst case, these networks can be
reduced to 0% classification accuracy after the occlusion of at most 6.5% of
the occupied input space. Comment: 10 pages, 8 figures, 1 table
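To make the notion of an occlusion attack concrete, the sketch below greedily removes occupied voxels until a classifier's label flips, then reports the occluded fraction. It is not the algorithm from the paper. The `margin` and `predict` functions are a hypothetical toy model standing in for a real network, and the greedy search is an assumed strategy chosen for brevity.

```python
import numpy as np

def margin(grid):
    """Hypothetical toy score: voxels in the upper half minus the lower half."""
    half = grid.shape[2] // 2
    return int(grid[:, :, half:].sum()) - int(grid[:, :, :half].sum())

def predict(grid):
    """Toy classifier: class 1 if the shape is top-heavy, else class 0."""
    return int(margin(grid) > 0)

def greedy_occlusion(grid, budget):
    """Zero out up to `budget` occupied voxels, each time choosing the one
    whose removal pushes the margin furthest toward the other class."""
    label = predict(grid)
    sign = 1 if label == 1 else -1
    attacked = grid.copy()
    for _ in range(budget):
        if predict(attacked) != label:
            break  # label already flipped
        best, best_margin = None, sign * margin(attacked)
        for voxel in map(tuple, np.argwhere(attacked == 1)):
            trial = attacked.copy()
            trial[voxel] = 0
            trial_margin = sign * margin(trial)
            if trial_margin < best_margin:
                best, best_margin = voxel, trial_margin
        if best is None:
            break  # no single removal helps
        attacked[best] = 0
    removed = int(grid.sum()) - int(attacked.sum())
    return attacked, removed, predict(attacked) != label

grid = np.zeros((4, 4, 4), dtype=np.uint8)
grid[0, 0, 2] = grid[1, 0, 3] = grid[2, 0, 2] = 1  # upper half
grid[0, 0, 0] = 1                                   # lower half
attacked, removed, flipped = greedy_occlusion(grid, budget=3)
occluded_fraction = removed / int(grid.sum())
```

The occluded fraction is measured against occupied voxels only, matching how the abstract states its 6.5% figure.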
High-Resolution Shape Completion Using Deep Neural Networks for Global Structure and Local Geometry Inference
We propose a data-driven method for recovering missing parts of 3D shapes.
Our method is based on a new deep learning architecture consisting of two
sub-networks: a global structure inference network and a local geometry
refinement network. The global structure inference network incorporates a long
short-term memorized context fusion module (LSTM-CF) that infers the global
structure of the shape based on multi-view depth information provided as part
of the input. It also includes a 3D fully convolutional (3DFCN) module that
further enriches the global structure representation according to volumetric
information in the input. Under the guidance of the global structure network,
the local geometry refinement network takes as input local 3D patches around
missing regions, and progressively produces a high-resolution, complete surface
through a volumetric encoder-decoder architecture. Our method jointly trains
the global structure inference and local geometry refinement networks in an
end-to-end manner. We perform qualitative and quantitative evaluations on six
object categories, demonstrating that our method outperforms existing
state-of-the-art work on shape completion. Comment: 8 pages paper, 11 pages supplementary material, ICCV spotlight paper