3D Shape Understanding and Generation
In recent years, machine learning techniques have revolutionized solutions to longstanding image-based problems such as image classification, generation, semantic segmentation, and object detection. However, if we want to build agents that can successfully interact with the real world, those techniques need to be capable of reasoning about the world as it truly is: a three-dimensional space. There are two main challenges in handling 3D information in machine learning models. First, it is not clear what the best 3D representation is. For images, convolutional neural networks (CNNs) operating on raster images yield the best results in virtually all image-based benchmarks; for 3D data, the best combination of model and representation is still an open question. Second, 3D data is not available on the same scale as images: taking pictures is a common part of daily life, whereas capturing 3D content is usually restricted to specialized professionals. This thesis addresses both of these issues. Which model and representation should we use for generating and recognizing 3D data? What are efficient ways of learning 3D representations from few examples? Is it possible to leverage image data to build models capable of reasoning about the world in 3D?
Our research findings show that it is possible to build models that efficiently generate 3D shapes as irregularly structured representations. Those models require significantly less memory while generating higher-quality shapes than ones based on voxel and multi-view representations. We start by developing techniques to generate shapes represented as point clouds. This class of models leads to high-quality reconstructions and better unsupervised feature learning. However, since point clouds are not amenable to editing and human manipulation, we also present models capable of generating shapes as sets of shape handles -- simpler primitives that summarize complex 3D shapes and are specifically designed for high-level tasks and user interaction. Despite their effectiveness, those approaches require some form of 3D supervision, which is scarce. We present multiple alternatives to this problem. First, we investigate how approximate convex decomposition techniques can be used as self-supervision to improve recognition models when only a limited number of labels are available. Second, we study how neural network architectures induce shape priors that can be used in multiple reconstruction tasks -- using both volumetric and manifold representations. In this regime, reconstruction is performed from a single example -- either a sparse point cloud or multiple silhouettes. Finally, we demonstrate how to train generative models of 3D shapes without any 3D supervision by combining differentiable rendering techniques and Generative Adversarial Networks.
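The point-cloud generators described above map a latent code to an irregular set of 3D points rather than a voxel grid. A minimal, untrained numpy sketch of that idea (the architecture, sizes, and random weights are illustrative assumptions, not the thesis's models):

```python
import numpy as np

rng = np.random.default_rng(0)

def decode_point_cloud(z, n_points=1024, hidden=128):
    """Illustrative decoder: map a latent vector z to an n_points x 3
    point cloud via a tiny 2-layer MLP with random (untrained) weights."""
    d = z.shape[0]
    W1 = rng.standard_normal((d, hidden)) / np.sqrt(d)
    W2 = rng.standard_normal((hidden, n_points * 3)) / np.sqrt(hidden)
    h = np.tanh(z @ W1)                    # hidden activation
    return (h @ W2).reshape(n_points, 3)   # irregular set of 3D points

z = rng.standard_normal(64)                # latent code
cloud = decode_point_cloud(z)
print(cloud.shape)                         # one point per row, 3 coords
```

The memory advantage noted in the abstract follows from the output shape: N x 3 coordinates grow linearly with the number of surface points, whereas a voxel grid grows cubically with resolution.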
Nonparametric ridge estimation
We study the problem of estimating the ridges of a density function. Ridge
estimation is an extension of mode finding and is useful for understanding the
structure of a density. It can also be used to find hidden structure in point
cloud data. We show that, under mild regularity conditions, the ridges of the
kernel density estimator consistently estimate the ridges of the true density.
When the data are noisy measurements of a manifold, we show that the ridges are
close and topologically similar to the hidden manifold. To find the estimated
ridges in practice, we adapt the modified mean-shift algorithm proposed by
Ozertem and Erdogmus [J. Mach. Learn. Res. 12 (2011) 1249-1286]. Some numerical
experiments verify that the algorithm is accurate.
Comment: Published at http://dx.doi.org/10.1214/14-AOS1218 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
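The modified mean-shift algorithm of Ozertem and Erdogmus mentioned above (subspace-constrained mean shift) can be sketched compactly: take an ordinary mean-shift step, but project it onto the direction spanned by the Hessian eigenvector with the smallest eigenvalue, which is orthogonal to a 1-D ridge. The data, bandwidth, and iteration count below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
# Noisy samples from a circle: a 1-D hidden manifold in 2-D.
theta = rng.uniform(0, 2 * np.pi, 400)
X = np.c_[np.cos(theta), np.sin(theta)] + 0.1 * rng.standard_normal((400, 2))
h = 0.3  # Gaussian kernel bandwidth (assumed, not tuned)

def scms_step(x):
    """One subspace-constrained mean-shift update at point x."""
    diff = X - x
    w = np.exp(-np.sum(diff**2, axis=1) / (2 * h**2))   # kernel weights
    ms = (w[:, None] * X).sum(0) / w.sum() - x          # mean-shift vector
    # Hessian of the (unnormalized) Gaussian kernel density estimate at x
    H = ((w[:, None] * diff).T @ diff) / h**4 - w.sum() * np.eye(2) / h**2
    _, vecs = np.linalg.eigh(H)          # eigenvalues in ascending order
    v = vecs[:, :1]                      # smallest-eigenvalue direction
    return x + v @ (v.T @ ms)            # step constrained to that subspace

ridge = X.copy()
for _ in range(50):                      # iterate each point to the ridge
    ridge = np.array([scms_step(x) for x in ridge])

radii = np.linalg.norm(ridge, axis=1)
print(radii.std(), np.linalg.norm(X, axis=1).std())
```

After iteration, the points collapse onto a curve near the unit circle: the radial spread of the SCMS output is much smaller than that of the noisy input, illustrating the abstract's claim that the estimated ridge is close to the hidden manifold.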
Manifold Graph Signal Restoration using Gradient Graph Laplacian Regularizer
In the graph signal processing (GSP) literature, graph Laplacian regularizer
(GLR) was used for signal restoration to promote piecewise smooth / constant
reconstruction with respect to an underlying graph. However, for signals slowly
varying across graph kernels, GLR suffers from an undesirable "staircase"
effect. In this paper, focusing on manifold graphs -- collections of uniform
discrete samples on low-dimensional continuous manifolds -- we generalize GLR
to gradient graph Laplacian regularizer (GGLR) that promotes planar / piecewise
planar (PWP) signal reconstruction. Specifically, for a graph endowed with
sampling coordinates (e.g., 2D images, 3D point clouds), we first define a
gradient operator, using which we construct a gradient graph for nodes'
gradients in sampling manifold space. This maps to a gradient-induced nodal
graph (GNG) and a positive semi-definite (PSD) Laplacian matrix with planar
signals as the 0 frequencies. For manifold graphs without explicit sampling
coordinates, we propose a graph embedding method to obtain node coordinates via
fast eigenvector computation. We derive the mean-square-error minimizing
weight parameter for GGLR efficiently, trading off bias and variance of the
signal estimate. Experimental results show that GGLR outperformed previous
graph signal priors such as GLR and graph total variation (GTV) in a range of
graph signal restoration tasks.
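The baseline GLR prior that GGLR generalizes is easy to state concretely: denoising with a quadratic Laplacian penalty, x* = argmin ||y - x||^2 + mu x^T L x, reduces to solving the linear system (I + mu L) x* = y. A minimal numpy sketch on a path graph (the graph, signal, noise level, and mu are illustrative assumptions, and this shows GLR only, not the proposed GGLR):

```python
import numpy as np

n = 50
# Combinatorial Laplacian of a path graph with unit edge weights
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1.0          # endpoints have degree 1

rng = np.random.default_rng(0)
clean = np.where(np.arange(n) < n // 2, 0.0, 1.0)  # piecewise-constant signal
y = clean + 0.3 * rng.standard_normal(n)           # noisy observation

mu = 5.0                                           # regularization weight
x_hat = np.linalg.solve(np.eye(n) + mu * L, y)     # GLR-denoised estimate

print(np.mean((x_hat - clean)**2), np.mean((y - clean)**2))
```

Because x^T L x sums squared differences across edges, this prior favors piecewise-constant reconstructions, which is exactly why, per the abstract, slowly varying signals exhibit the "staircase" effect that motivates the planar / piecewise-planar GGLR prior.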