Deformable Shape Completion with Graph Convolutional Autoencoders
The availability of affordable and portable depth sensors has made scanning
objects and people simpler than ever. However, dealing with occlusions and
missing parts is still a significant challenge. The problem of reconstructing a
(possibly non-rigidly moving) 3D object from a single or multiple partial scans
has received increasing attention in recent years. In this work, we propose a
novel learning-based method for the completion of partial shapes. Unlike the
majority of existing approaches, our method focuses on objects that can undergo
non-rigid deformations. The core of our method is a variational autoencoder
with graph convolutional operations that learns a latent space for complete
realistic shapes. At inference, we optimize to find the representation in this
latent space that best fits the generated shape to the known partial input. The
completed shape exhibits a realistic appearance on the unknown part. We show
promising results towards the completion of synthetic and real scans of human
body and face meshes exhibiting different styles of articulation and
partiality.
Comment: CVPR 201
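The inference step described in this abstract (searching the latent space for the code whose decoded shape best fits the known partial input) can be sketched roughly as below. This is a minimal illustration under strong assumptions: a hypothetical linear decoder `W @ z + b` stands in for the trained graph-convolutional decoder, shapes are flattened coordinate vectors, and `fit_latent`, `mask`, and the step-size rule are illustrative choices, not the authors' implementation.

```python
import numpy as np

def fit_latent(W, b, x_partial, mask, steps=2000):
    """Gradient descent on z so that W @ z + b matches x_partial where mask is True.

    W, b: hypothetical linear decoder (assumption; the paper uses a learned decoder).
    mask: boolean vector marking the coordinates observed in the partial scan.
    """
    W_obs, b_obs, x_obs = W[mask], b[mask], x_partial[mask]
    # Step size 1/L, where L is the largest eigenvalue of W_obs^T W_obs.
    lr = 1.0 / np.linalg.norm(W_obs, 2) ** 2
    z = np.zeros(W.shape[1])
    for _ in range(steps):
        residual = W_obs @ z + b_obs - x_obs  # error on the known part only
        z -= lr * (W_obs.T @ residual)        # gradient of 0.5 * ||residual||^2
    return z

# Synthetic example: a "complete shape" generated by the toy decoder.
rng = np.random.default_rng(0)
n_coords, latent_dim = 60, 4
W = rng.normal(size=(n_coords, latent_dim))
b = rng.normal(size=n_coords)
x_full = W @ rng.normal(size=latent_dim) + b

mask = np.zeros(n_coords, dtype=bool)
mask[:30] = True                              # only half of the shape is observed
z_hat = fit_latent(W, b, np.where(mask, x_full, 0.0), mask)
x_completed = W @ z_hat + b                   # decoder fills in the unknown part
```

In this toy setting the unobserved half is recovered because the observed coordinates already determine the latent code; with a nonlinear learned decoder the same loop would typically use automatic differentiation instead of the closed-form gradient.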
Surface Networks
We study data-driven representations for three-dimensional triangle meshes,
which are one of the prevalent objects used to represent 3D geometry. Recent
works have developed models that exploit the intrinsic geometry of manifolds
and graphs, namely Graph Neural Networks (GNNs) and their spectral variants,
which learn from the local metric tensor via the Laplacian operator. Despite
offering excellent sample complexity and built-in invariances, intrinsic
geometry alone is invariant to isometric deformations, making it unsuitable for
many applications. To overcome this limitation, we propose several upgrades to
GNNs to leverage extrinsic differential geometry properties of
three-dimensional surfaces, increasing their modeling power.
In particular, we propose to exploit the Dirac operator, whose spectrum
detects principal curvature directions; this is in stark contrast with the
classical Laplace operator, which directly measures mean curvature. We call the
resulting models Surface Networks (SN). We prove that these models
define shape representations that are stable to deformation and to
discretization, and we demonstrate the efficiency and versatility of SNs on two
challenging tasks: temporal prediction of mesh deformations under non-linear
dynamics and generative models using a variational autoencoder framework with
encoders/decoders given by SNs.
Learning to Reconstruct Shapes from Unseen Classes
From a single image, humans are able to perceive the full 3D shape of an
object by exploiting learned shape priors from everyday life. Contemporary
single-image 3D reconstruction algorithms aim to solve this task in a similar
fashion, but often end up with priors that are highly biased by training
classes. Here we present an algorithm, Generalizable Reconstruction (GenRe),
designed to capture more generic, class-agnostic shape priors. We achieve this
with an inference network and training procedure that combine 2.5D
representations of visible surfaces (depth and silhouette), spherical shape
representations of both visible and non-visible surfaces, and 3D voxel-based
representations, in a principled manner that exploits the causal structure of
how 3D shapes give rise to 2D images. Experiments demonstrate that GenRe
performs well on single-view shape reconstruction, and generalizes to diverse
novel objects from categories not seen during training.
Comment: NeurIPS 2018 (Oral). The first two authors contributed equally to
this paper. Project page: http://genre.csail.mit.edu
CARPe Posterum: A Convolutional Approach for Real-time Pedestrian Path Prediction
Pedestrian path prediction is an essential topic in computer vision and video
understanding. Having insight into the movement of pedestrians is crucial for
ensuring safe operation in a variety of applications including autonomous
vehicles, social robots, and environmental monitoring. Current works in this
area utilize complex generative or recurrent methods to capture many possible
futures. However, despite the inherent real-time nature of predicting future
paths, little work has been done to explore accurate and computationally
efficient approaches for this task. To this end, we propose a convolutional
approach for real-time pedestrian path prediction, CARPe. It utilizes a
variation of Graph Isomorphism Networks in combination with an agile
convolutional neural network design to form a fast and accurate path prediction
approach. Notable results in both inference speed and prediction accuracy are
achieved, improving FPS considerably in comparison to current state-of-the-art
methods while delivering competitive accuracy on well-known path prediction
datasets.
Comment: AAAI-21 Camera Read
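For readers unfamiliar with Graph Isomorphism Networks, the standard GIN node update can be sketched as below. The abstract only says that CARPe uses "a variation" of GIN, so this shows the textbook layer rather than the paper's variant; the function name `gin_layer`, the two-layer MLP without biases, and the toy graph are assumptions made for illustration.

```python
import numpy as np

def gin_layer(H, A, W1, W2, eps=0.0):
    """Standard GIN update: MLP((1 + eps) * h_v + sum of neighbour features).

    H: (n_nodes, d) node features; A: (n_nodes, n_nodes) adjacency matrix.
    W1, W2: weights of a hypothetical two-layer MLP (biases omitted for brevity).
    """
    aggregated = (1.0 + eps) * H + A @ H       # sum aggregation over neighbours
    hidden = np.maximum(aggregated @ W1, 0.0)  # ReLU
    return hidden @ W2

# Toy path graph 0 - 1 - 2 with 2-dimensional node features.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
identity = np.eye(2)
out = gin_layer(H, A, identity, identity)      # identity MLP exposes the aggregation
```

With identity weights the output is just each node's own features plus the sum of its neighbours' features, which makes the aggregation step easy to inspect by hand.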
Recovery Guarantees for Quadratic Tensors with Limited Observations
We consider the tensor completion problem of predicting the missing entries
of a tensor. The commonly used CP model has a triple product form, but an
alternate family of quadratic models, which are sums of pairwise products
instead of a triple product, has emerged from applications such as
recommendation systems. Non-convex methods are the approach of choice for
learning quadratic models, and this work examines their sample complexity and
error guarantees. Our main result is that with the number of samples being only
linear in the dimension, all local minima of the mean squared error objective
are global minima and recover the original tensor accurately. The techniques
lead to simple proofs showing that convex relaxation can recover quadratic
tensors given a linear number of samples. We substantiate our theoretical
results with experiments on synthetic and real-world data, showing that
quadratic models have better performance than CP models in scenarios where
only a limited number of observations is available.
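The contrast between the two model families in this abstract can be made concrete with a small sketch. It assumes one natural reading of "sum of pairwise products": each entry is the sum of the three pairwise inner products of the factor rows. The names `cp_tensor` and `quadratic_tensor` and the factor shapes are illustrative, not taken from the paper.

```python
import numpy as np

def cp_tensor(U, V, W):
    """CP model: T[i, j, k] = sum_r U[i, r] * V[j, r] * W[k, r] (triple product)."""
    return np.einsum("ir,jr,kr->ijk", U, V, W)

def quadratic_tensor(U, V, W):
    """Quadratic model (assumed form): T[i, j, k] = <u_i, v_j> + <v_j, w_k> + <u_i, w_k>."""
    UV = U @ V.T  # shape (I, J)
    VW = V @ W.T  # shape (J, K)
    UW = U @ W.T  # shape (I, K)
    # Broadcast each pairwise term along the mode it does not involve.
    return UV[:, :, None] + VW[None, :, :] + UW[:, None, :]

rng = np.random.default_rng(0)
rank = 2
U = rng.normal(size=(2, rank))
V = rng.normal(size=(3, rank))
W = rng.normal(size=(4, rank))
T_cp = cp_tensor(U, V, W)
T_q = quadratic_tensor(U, V, W)
```

Both tensors are built from the same factor matrices; only the way the three factors are combined per entry differs.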
Encoding Robust Representation for Graph Generation
Generative networks have made it possible to generate meaningful signals such
as images and texts from simple noise. Recently, generative methods based on
GAN and VAE were developed for graphs and graph signals. However, the
mathematical properties of these methods are unclear, and training good
generative models is difficult. This work proposes a graph generation model
that uses a recent adaptation of Mallat's scattering transform to graphs. The
proposed model is naturally composed of an encoder and a decoder. The encoder
is a Gaussianized graph scattering transform, which is robust to signal and
graph manipulation. The decoder is a simple fully connected network that is
adapted to specific tasks, such as link prediction, signal generation on graphs
and full graph and signal generation. The training of our proposed system is
efficient since it is only applied to the decoder and the hardware requirements
are moderate. Numerical results demonstrate state-of-the-art performance of the
proposed system for both link prediction and graph and signal generation.
Comment: 9 pages, 7 figures, 6 table
Sparsity-Based Super Resolution for SEM Images
The scanning electron microscope (SEM) produces an image of a sample by
scanning it with a focused beam of electrons. The electrons interact with the
atoms in the sample, which emit secondary electrons that contain information
about the surface topography and composition. The sample is scanned by the
electron beam point by point, until an image of the surface is formed. Since
its invention in 1942, the SEM has become paramount in the discovery and
understanding of the nanometer world, and today it is used extensively in both
research and industry. In principle, SEMs can achieve resolution better than
one nanometer. However, for many applications, working at sub-nanometer
resolution implies an exceedingly large number of scanning points. For exactly
this reason, the SEM diagnostics of microelectronic chips is performed either
at high resolution (HR) over a small area or at low resolution (LR) while
capturing a larger portion of the chip. Here, we employ sparse coding and
dictionary learning to algorithmically enhance LR SEM images of microelectronic
chips up to the level of the HR images acquired by slow SEM scans, while
considerably reducing the noise. Our methodology consists of two steps: an
offline stage of learning a joint dictionary from a sequence of LR and HR
images of the same region in the chip, followed by a fast online
super-resolution step where the resolution of a new LR image is enhanced. We
provide several examples with typical chips used in the microelectronics
industry, as well as a statistical study on arbitrary images with
characteristic structural features. Conceptually, our method works well when
the images have similar characteristics. This work demonstrates that employing
sparsity concepts can greatly improve the performance of SEM, thereby
considerably increasing the scanning throughput without compromising on
analysis quality and resolution.
Comment: Final publication available at ACS Nano Letter
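The online step of a joint-dictionary scheme like the one described in this abstract can be sketched as follows: sparse-code a low-resolution patch over the LR dictionary, then reuse the same code with the HR dictionary. This is an illustration under assumptions, not the authors' pipeline: the dictionaries here are random stand-ins rather than learned ones, `D_lr` is made orthonormal so that the greedy solver recovers the code exactly, and `omp` is a small hand-written orthogonal matching pursuit.

```python
import numpy as np

def omp(D, y, n_nonzero, tol=1e-10):
    """Orthogonal matching pursuit: greedy sparse code of y over the columns of D."""
    coef = np.zeros(D.shape[1])
    residual, support = y.copy(), []
    for _ in range(n_nonzero):
        if np.linalg.norm(residual) <= tol:
            break  # y is already explained by the selected atoms
        # Select the atom most correlated with the current residual.
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
        coef[:] = 0.0
        coef[support] = sol
    return coef

rng = np.random.default_rng(0)
n_atoms, lr_dim, hr_dim = 8, 8, 32
# Stand-in dictionaries (assumption): in practice both are learned offline
# from paired LR/HR patches so that they share sparse codes.
D_lr, _ = np.linalg.qr(rng.normal(size=(lr_dim, n_atoms)))
D_hr = rng.normal(size=(hr_dim, n_atoms))

code_true = np.zeros(n_atoms)
code_true[[2, 5]] = [1.5, -0.7]          # a 2-sparse shared code
lr_patch = D_lr @ code_true              # observed low-resolution patch

code_hat = omp(D_lr, lr_patch, n_nonzero=2)
hr_patch_hat = D_hr @ code_hat           # super-resolved patch estimate
```

The design point is that only the cheap sparse-coding step runs online; the costly dictionary learning happens once, offline.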