Learning to Reconstruct Shapes from Unseen Classes
From a single image, humans are able to perceive the full 3D shape of an
object by exploiting learned shape priors from everyday life. Contemporary
single-image 3D reconstruction algorithms aim to solve this task in a similar
fashion, but often end up with priors that are highly biased by training
classes. Here we present an algorithm, Generalizable Reconstruction (GenRe),
designed to capture more generic, class-agnostic shape priors. We achieve this
with an inference network and training procedure that combine 2.5D
representations of visible surfaces (depth and silhouette), spherical shape
representations of both visible and non-visible surfaces, and 3D voxel-based
representations, in a principled manner that exploits the causal structure of
how 3D shapes give rise to 2D images. Experiments demonstrate that GenRe
performs well on single-view shape reconstruction, and generalizes to diverse
novel objects from categories not seen during training.
Comment: NeurIPS 2018 (Oral). The first two authors contributed equally to this paper. Project page: http://genre.csail.mit.edu
3D Convolutional Neural Networks Initialized from Pretrained 2D Convolutional Neural Networks for Classification of Industrial Parts
Deep learning methods have been successfully applied to image processing, mainly using
2D vision sensors. Recently, the rise of depth cameras and other similar 3D sensors has opened
the field for new perception techniques. Nevertheless, 3D convolutional neural networks perform
slightly worse than other 3D deep learning methods, and even worse than their 2D version. In
this paper, we propose to improve 3D deep learning results by transferring the pretrained weights
learned in 2D networks to their corresponding 3D version. Using an industrial object recognition
context, we have analyzed different combinations of 3D convolutional networks (VGG16, ResNet,
Inception ResNet, and EfficientNet), comparing the recognition accuracy. The highest accuracy is
obtained with EfficientNetB0 using extrusion, with an accuracy of 0.9217, which is comparable to state-of-the-art methods. We also observed that the transfer approach improved the accuracy of the 3D version of Inception ResNet by up to 18% with respect to the 3D approach trained alone.
This paper has been supported by the project ELKARBOT under the Basque program ELKARTEK, grant agreement No. KK-2020/00092.
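The core idea of transferring 2D weights by extrusion can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes extrusion means repeating a pretrained 2D kernel along the new depth axis and normalizing by the depth so that responses to depth-constant inputs are preserved; the function name and the kernel shapes are hypothetical.

```python
import numpy as np

def extrude_2d_to_3d(w2d, depth):
    """Inflate a 2D conv kernel of shape (out, in, kh, kw) into a 3D kernel
    of shape (out, in, depth, kh, kw) by repeating it along a new depth axis
    and dividing by depth, so the mean response over depth is preserved."""
    w3d = np.repeat(w2d[:, :, np.newaxis, :, :], depth, axis=2)
    return w3d / depth

# Hypothetical example: inflate a pretrained 3x3 kernel bank to 3x3x3.
w2d = np.ones((64, 3, 3, 3), dtype=np.float32)  # (out, in, kh, kw)
w3d = extrude_2d_to_3d(w2d, depth=3)

assert w3d.shape == (64, 3, 3, 3, 3)
# Summing over the depth axis recovers the original 2D kernel, so a
# depth-constant 3D input produces the same activation as the 2D network.
assert np.allclose(w3d.sum(axis=2), w2d)
```

The division by depth is one possible normalization choice (used, for instance, in inflated-kernel initializations elsewhere in the literature); without it, activations would scale with the depth of the extruded kernel.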
HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-scale Point Clouds
We present a novel deep neural network architecture for end-to-end scene flow
estimation that directly operates on large-scale 3D point clouds. Inspired by
Bilateral Convolutional Layers (BCL), we propose novel DownBCL, UpBCL, and
CorrBCL operations that restore structural information from unstructured point
clouds, and fuse information from two consecutive point clouds. Operating on
discrete and sparse permutohedral lattice points, our architectural design is
parsimonious in computational cost. Our model can efficiently process a pair of
point cloud frames at once with a maximum of 86K points per frame. Our approach
achieves state-of-the-art performance on the FlyingThings3D and KITTI Scene
Flow 2015 datasets. Moreover, trained on synthetic data, our approach shows
great generalization ability on real-world data and on different point
densities without fine-tuning.