Analysis of Sampling Strategies for Implicit 3D Reconstruction
When training an implicit 3D reconstruction network, the sampling strategy
used for spatial query points affects the final performance of the model.
Existing works differ in their choice of sampling strategy, not only in the
spatial distribution of the query points but also in their density, which can
vary by orders of magnitude between works. To choose a sampling strategy,
current works essentially enumerate candidates in search of the best one,
which is highly inefficient. In this work, we explore the relationship between
sampling strategy and final network performance through classification
analysis and experimental comparison along three axes: the relationship
between network type and sampling strategy, the relationship between implicit
function and sampling strategy, and the impact of sampling density on model
performance. In addition, we propose two methods, linear sampling and distance
mask, to make the query-point sampling strategy more general and robust.
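
As a concrete illustration of the kind of query-point sampling strategies
compared here, the sketch below combines uniform volume sampling with
near-surface Gaussian sampling and then applies a simple distance mask. The
function name, parameters, and the particular masking rule are assumptions
for illustration, not the exact strategies evaluated in this work.

```python
import numpy as np

def sample_query_points(surface_pts, n_uniform=2048, n_near=2048,
                        sigma=0.05, bbox=(-0.5, 0.5), dist_thresh=0.1):
    """Draw query points for training an implicit reconstruction network.

    Combines uniform sampling in the bounding box with near-surface
    Gaussian sampling, then applies a simple distance mask that keeps
    only points within `dist_thresh` of the surface (an illustrative
    stand-in for the distance-mask idea, not the paper's exact rule).
    """
    lo, hi = bbox
    # Uniform samples spread over the whole volume.
    uniform = np.random.uniform(lo, hi, size=(n_uniform, 3))

    # Near-surface samples: jitter random surface points with Gaussian noise.
    idx = np.random.randint(0, len(surface_pts), size=n_near)
    near = surface_pts[idx] + np.random.normal(scale=sigma, size=(n_near, 3))

    queries = np.concatenate([uniform, near], axis=0)

    # Distance mask: drop queries far from every surface point (brute force).
    d = np.linalg.norm(queries[:, None, :] - surface_pts[None, :, :], axis=-1)
    keep = d.min(axis=1) < dist_thresh
    return queries[keep]

# Example usage with a random point cloud standing in for a real surface.
surface = np.random.uniform(-0.4, 0.4, size=(1000, 3))
pts = sample_query_points(surface)
```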
DeepJoin: Learning a Joint Occupancy, Signed Distance, and Normal Field Function for Shape Repair
We introduce DeepJoin, an automated approach to generate high-resolution
repairs for fractured shapes using deep neural networks. Existing approaches to
perform automated shape repair operate exclusively on symmetric objects,
require a complete proxy shape, or predict restoration shapes using
low-resolution voxels which are too coarse for physical repair. We generate a
high-resolution restoration shape by inferring a corresponding complete shape
and a break surface from an input fractured shape. We present a novel implicit
shape representation for fractured shape repair that combines the occupancy
function, signed distance function, and normal field. We demonstrate repairs
using our approach for synthetically fractured objects from ShapeNet, 3D scans
from the Google Scanned Objects dataset, objects in the style of ancient Greek
pottery from the QP Cultural Heritage dataset, and real fractured objects. We
outperform three baseline approaches in terms of chamfer distance and normal
consistency. Unlike existing approaches and restorations using subtraction,
DeepJoin restorations do not exhibit surface artifacts and join closely to the
fractured region of the fractured shape. Our code is available at:
https://github.com/Terascale-All-sensing-Research-Studio/DeepJoin.
Comment: To be published at SIGGRAPH Asia 2022 (Journal
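
To make the joint representation concrete, here is a minimal sketch of an
implicit network that predicts occupancy, signed distance, and a normal
vector for each query point, conditioned on a latent shape code. The layer
sizes and shared trunk are assumptions for illustration and do not reproduce
DeepJoin's published architecture.

```python
import torch
import torch.nn as nn

class JointImplicitField(nn.Module):
    """Illustrative sketch of a joint implicit shape representation that
    predicts occupancy, signed distance, and a normal per query point.
    Layer sizes and the shared trunk are assumptions, not DeepJoin's
    published architecture."""

    def __init__(self, latent_dim=256, hidden=512):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(latent_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.occ_head = nn.Linear(hidden, 1)      # occupancy probability
        self.sdf_head = nn.Linear(hidden, 1)      # signed distance
        self.normal_head = nn.Linear(hidden, 3)   # surface normal direction

    def forward(self, latent, xyz):
        # latent: (B, latent_dim), xyz: (B, N, 3)
        z = latent.unsqueeze(1).expand(-1, xyz.shape[1], -1)
        h = self.trunk(torch.cat([z, xyz], dim=-1))
        occ = torch.sigmoid(self.occ_head(h))
        sdf = self.sdf_head(h)
        normal = torch.nn.functional.normalize(self.normal_head(h), dim=-1)
        return occ, sdf, normal

# Example: one shape code queried at 4096 random points.
model = JointImplicitField()
occ, sdf, n = model(torch.randn(1, 256), torch.rand(1, 4096, 3))
```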
ROAD: Learning an Implicit Recursive Octree Auto-Decoder to Efficiently Encode 3D Shapes
Compact and accurate representations of 3D shapes are central to many
perception and robotics tasks. State-of-the-art learning-based methods can
reconstruct single objects but scale poorly to large datasets. We present a
novel recursive implicit representation to efficiently and accurately encode
large datasets of complex 3D shapes by recursively traversing an implicit
octree in latent space. Our implicit Recursive Octree Auto-Decoder (ROAD)
learns a hierarchically structured latent space enabling state-of-the-art
reconstruction results at a compression ratio above 99%. We also propose an
efficient curriculum learning scheme that naturally exploits the coarse-to-fine
properties of the underlying octree spatial representation. We explore the
scaling law relating latent space dimension, dataset size, and reconstruction
accuracy, showing that increasing the latent space dimension is enough to scale
to large shape datasets. Finally, we show that our learned latent space encodes
a coarse-to-fine hierarchical structure yielding reusable latents across
different levels of detail, and we provide qualitative evidence of
generalization to novel shapes outside the training set.
Comment: Accepted to Conference on Robot Learning (CoRL), 202
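
A minimal sketch of the recursive idea, assuming a single MLP that splits
each parent latent into eight child latents (one per octant) and a small head
that decodes occupancy at the leaves; the module names, sizes, and depth are
illustrative assumptions rather than ROAD's exact design.

```python
import torch
import torch.nn as nn

class RecursiveOctreeDecoder(nn.Module):
    """Illustrative sketch: recursively split a latent code into eight
    child latents, one per octant, down to a fixed depth, then decode
    occupancy per leaf latent. Names and sizes are assumptions, not
    ROAD's exact design."""

    def __init__(self, latent_dim=128, depth=3):
        super().__init__()
        self.depth = depth
        # One MLP maps a parent latent to 8 child latents.
        self.split = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 8 * latent_dim),
        )
        self.leaf_occ = nn.Linear(latent_dim, 1)

    def forward(self, root_latent):
        # root_latent: (B, latent_dim) -> leaf latents: (B, 8**depth, latent_dim)
        latents = root_latent.unsqueeze(1)               # (B, 1, D)
        for _ in range(self.depth):
            b, n, d = latents.shape
            latents = self.split(latents).view(b, n * 8, d)
        occ = torch.sigmoid(self.leaf_occ(latents))      # occupancy per leaf cell
        return latents, occ

# Example: a single shape code expanded into 8**3 = 512 leaf cells.
decoder = RecursiveOctreeDecoder()
leaves, occ = decoder(torch.randn(1, 128))
```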
Contrastive Object-level Pre-training with Spatial Noise Curriculum Learning
The goal of contrastive learning based pre-training is to leverage large
quantities of unlabeled data to produce a model that can be readily adapted
downstream. Current approaches revolve around solving an image discrimination
task: given an anchor image, an augmented counterpart of that image, and some
other images, the model must produce representations such that the distance
between the anchor and its counterpart is small, and the distances between the
anchor and the other images are large. There are two significant problems with
this approach: (i) by contrasting representations at the image-level, it is
hard to generate detailed object-sensitive features that are beneficial to
downstream object-level tasks such as instance segmentation; (ii) the
augmentation strategy of producing an augmented counterpart is fixed, making
learning less effective at the later stages of pre-training. In this work, we
introduce Curricular Contrastive Object-level Pre-training (CCOP) to tackle
these problems: (i) we use selective search to find rough object regions and
use them to build an inter-image object-level contrastive loss and an
intra-image object-level discrimination loss into our pre-training objective;
(ii) we present a curriculum learning mechanism that adaptively augments the
generated regions, which allows the model to consistently acquire a useful
learning signal, even in the later stages of pre-training. Our experiments show
that our approach improves on the MoCo v2 baseline by a large margin on
multiple object-level tasks when pre-training on multi-object scene image
datasets. Code is available at https://github.com/ChenhongyiYang/CCOP.
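
For reference, an inter-image object-level contrastive term can be written as
an InfoNCE loss over region embeddings from two augmented views. The sketch
below is a generic version of such a loss with an assumed temperature, not
CCOP's exact objective.

```python
import torch
import torch.nn.functional as F

def object_level_info_nce(anchor_feats, positive_feats, temperature=0.2):
    """Illustrative InfoNCE-style loss over object-region features.

    `anchor_feats` and `positive_feats` are (N, D) embeddings of the same
    N object regions under two augmented views; every other region in the
    batch serves as a negative. This is a generic sketch of an inter-image
    object-level contrastive objective, not CCOP's exact loss.
    """
    a = F.normalize(anchor_feats, dim=-1)
    p = F.normalize(positive_feats, dim=-1)
    logits = a @ p.t() / temperature          # (N, N) similarity matrix
    targets = torch.arange(a.shape[0], device=a.device)
    return F.cross_entropy(logits, targets)

# Example with random region embeddings.
loss = object_level_info_nce(torch.randn(64, 128), torch.randn(64, 128))
```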
Towards Generalising Neural Implicit Representations
Neural implicit representations have shown substantial improvements in
efficiently storing 3D data, when compared to conventional formats. However,
the focus of existing work has mainly been on storage and subsequent
reconstruction. In this work, we show that training neural representations for
reconstruction tasks alongside conventional tasks can produce more general
encodings that admit reconstructions of equal quality to single-task training,
while improving results on conventional tasks compared to single-task
encodings. We reformulate the semantic segmentation task, creating a more
representative task for implicit representation contexts, and through
multi-task experiments on reconstruction, classification, and segmentation,
show that our approach learns feature-rich encodings that admit equal
performance on each task.
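
A minimal sketch of the multi-task setup, assuming a shared latent code
feeding both an occupancy-based reconstruction head and a classification head
for a conventional task; the sizes, heads, and loss weighting are
illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class MultiTaskImplicitNet(nn.Module):
    """Illustrative sketch of training a neural implicit representation
    jointly for reconstruction and a conventional task (classification).
    The shared code, layer sizes, and heads are assumptions, not the
    paper's exact setup."""

    def __init__(self, latent_dim=256, n_classes=10, hidden=256):
        super().__init__()
        # Reconstruction head: (latent, xyz) -> occupancy logit at that point.
        self.recon = nn.Sequential(
            nn.Linear(latent_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )
        # Classification head operates on the latent code alone.
        self.cls = nn.Linear(latent_dim, n_classes)

    def forward(self, latent, xyz):
        z = latent.unsqueeze(1).expand(-1, xyz.shape[1], -1)
        occ_logit = self.recon(torch.cat([z, xyz], dim=-1))
        class_logit = self.cls(latent)
        return occ_logit, class_logit

# Assumed joint objective: BCE on occupancy plus weighted cross-entropy on labels.
model = MultiTaskImplicitNet()
occ_logit, class_logit = model(torch.randn(4, 256), torch.rand(4, 1024, 3))
occ_target = torch.randint(0, 2, occ_logit.shape).float()
cls_target = torch.randint(0, 10, (4,))
loss = (nn.functional.binary_cross_entropy_with_logits(occ_logit, occ_target)
        + 0.5 * nn.functional.cross_entropy(class_logit, cls_target))
```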