Disentangling Geometric Deformation Spaces in Generative Latent Shape Models
A complete representation of 3D objects requires characterizing the space of
deformations in an interpretable manner, from articulations of a single
instance to changes in shape across categories. In this work, we improve on a
prior generative model of geometric disentanglement for 3D shapes, wherein the
space of object geometry is factorized into rigid orientation, non-rigid pose,
and intrinsic shape. The resulting model can be trained from raw 3D shapes,
without correspondences, labels, or even rigid alignment, using a combination
of classical spectral geometry and probabilistic disentanglement of a
structured latent representation space. Our improvements include more
sophisticated handling of rotational invariance and the use of a diffeomorphic
flow network to bridge latent and spectral space. The geometric structuring of
the latent space imparts an interpretable characterization of the deformation
space of an object. Furthermore, it enables tasks like pose transfer and
pose-aware retrieval without requiring supervision. We evaluate our model on
its generative modelling, representation learning, and disentanglement
performance, showing improved rotation invariance and intrinsic-extrinsic
factorization quality over the prior model.
Comment: 22 page
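The abstract's reliance on spectral geometry for rotation-invariant shape description can be illustrated with a small sketch. This is not the paper's pipeline: it assumes a Gaussian-kernel graph Laplacian on a random point cloud, and the names `laplacian_spectrum`, `random_rotation`, `sigma`, and `k` are hypothetical. It only shows the rigid case: because pairwise Euclidean distances are preserved by rotation, the Laplacian spectrum is unchanged. Invariance to non-rigid pose would need an intrinsic (e.g. geodesic or mesh-based) operator, which is not attempted here.

```python
import numpy as np

def laplacian_spectrum(points, sigma=1.0, k=6):
    """Return the k smallest eigenvalues of a Gaussian-kernel graph Laplacian."""
    # Pairwise squared Euclidean distances; rigid motions leave these unchanged.
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)
    # Unnormalized Laplacian L = D - W (an illustrative choice of operator).
    laplacian = np.diag(weights.sum(axis=1)) - weights
    return np.linalg.eigvalsh(laplacian)[:k]

def random_rotation(rng):
    """Draw a 3x3 rotation matrix via QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))   # make the factorization unique
    if np.linalg.det(q) < 0:      # enforce det = +1 (proper rotation)
        q[:, 0] = -q[:, 0]
    return q

rng = np.random.default_rng(0)
shape = rng.normal(size=(50, 3))            # stand-in for a raw 3D shape
rotated = shape @ random_rotation(rng).T    # same shape, arbitrary orientation

spec_a = laplacian_spectrum(shape)
spec_b = laplacian_spectrum(rotated)
```

Comparing `spec_a` and `spec_b` with `np.allclose` should show agreement up to floating-point error, which is the sense in which a spectral descriptor needs no rigid alignment.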
Spectral methods for multimodal data analysis
Spectral methods have proven to be an important and versatile tool for a wide range of problems in computer graphics, machine learning, pattern recognition, and computer vision, where many key problems boil down to constructing a Laplacian operator and finding a few of its eigenvalues and eigenfunctions. Classical examples include the computation of diffusion distances on manifolds in computer graphics, and Laplacian eigenmaps and spectral clustering in machine learning. In many cases, one has to deal with multiple data spaces simultaneously. For example, clustering multimedia data in machine learning applications involves various modalities or "views" (e.g., text and images), and finding correspondence between shapes in computer graphics problems is an operation performed between two or more modalities. In this thesis, we develop a generalization of spectral methods to deal with multiple data spaces and apply them to problems from the domains of computer graphics, machine learning, and image processing. Our main construction is based on simultaneous diagonalization of Laplacian operators. We present an efficient numerical technique for computing joint approximate eigenvectors of two or more Laplacians in challenging noisy scenarios; this technique also appears to be the first general non-smooth manifold optimization method. Finally, we use the relation between joint approximate diagonalizability and approximate commutativity of operators to define a structural similarity measure for images. We use this measure to perform structure-preserving color manipulations of a given image.
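The link between approximate commutativity and joint approximate diagonalizability can be made concrete with a minimal sketch. It is not the thesis's optimization method: as a crude stand-in for a joint diagonalization solver, it simply uses the eigenvectors of the sum of two graph Laplacians as the shared basis. All names (`graph_laplacian`, `commutator_norm`, `offdiag_energy`, `random_weights`) and the random test graphs are assumptions made for illustration.

```python
import numpy as np

def graph_laplacian(weights):
    """Unnormalized Laplacian L = D - W of a symmetric weight matrix."""
    return np.diag(weights.sum(axis=1)) - weights

def commutator_norm(a, b):
    """Frobenius norm of AB - BA; zero exactly when the matrices commute."""
    return np.linalg.norm(a @ b - b @ a)

def offdiag_energy(a, b):
    """Off-diagonal energy left in both matrices by one shared orthonormal basis.

    The basis is the eigenvectors of A + B, a simple baseline rather than a
    proper joint approximate diagonalization solver.
    """
    _, basis = np.linalg.eigh(a + b)
    total = 0.0
    for m in (a, b):
        rotated = basis.T @ m @ basis
        total += np.sum(rotated ** 2) - np.sum(np.diag(rotated) ** 2)
    return total

def random_weights(rng, n):
    """Symmetric non-negative weights with a zero diagonal."""
    w = rng.uniform(size=(n, n))
    w = (w + w.T) / 2.0
    np.fill_diagonal(w, 0.0)
    return w

rng = np.random.default_rng(0)
w1 = random_weights(rng, 20)
w2 = w1 + 0.05 * random_weights(rng, 20)   # second "modality": perturbed weights
lap1, lap2 = graph_laplacian(w1), graph_laplacian(w2)

# Identical Laplacians commute and share an exact eigenbasis.
same_comm, same_off = commutator_norm(lap1, lap1), offdiag_energy(lap1, lap1)
# Perturbed Laplacians only approximately commute and approximately diagonalize.
pert_comm, pert_off = commutator_norm(lap1, lap2), offdiag_energy(lap1, lap2)
```

In this toy setting both quantities vanish for the identical pair and are positive for the perturbed pair, which is the qualitative behavior a commutativity-based structural similarity measure would build on.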