Neural 3D Morphable Models: Spiral Convolutional Networks for 3D Shape Representation Learning and Generation
Generative models for 3D geometric data arise in many important applications
in 3D computer vision and graphics. In this paper, we focus on 3D deformable
shapes that share a common topological structure, such as human faces and
bodies. Morphable Models and their variants, despite their linear formulation,
have been widely used for shape representation, while most of the recently
proposed nonlinear approaches resort to intermediate representations, such as
3D voxel grids or 2D views. In this work, we introduce a novel graph
convolutional operator, acting directly on the 3D mesh, that explicitly models
the inductive bias of the fixed underlying graph. This is achieved by enforcing
consistent local orderings of the vertices of the graph, through the spiral
operator, thus breaking the permutation invariance property that is adopted by
all the prior work on Graph Neural Networks. Our operator comes by construction
with desirable properties (anisotropic, topology-aware, lightweight,
easy-to-optimise), and by using it as a building block for traditional deep
generative architectures, we demonstrate state-of-the-art results on a variety
of 3D shape datasets compared to the linear Morphable Model and other graph
convolutional operators. Comment: to appear at ICCV 201
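The role of a fixed local vertex ordering can be illustrated with a minimal sketch. It assumes the spiral index table (`spirals`, one fixed-length vertex sequence per vertex) has already been computed from the mesh; the function name `spiral_conv` and the plain-list data layout are hypothetical and not taken from the paper.

```python
def spiral_conv(features, spirals, weight, bias):
    """Apply one shared linear map to features gathered in spiral order.

    features: list of V vectors, each of length C_in
    spirals:  list of V index sequences, each of length L (assumed precomputed)
    weight:   C_out rows, each of length L * C_in
    bias:     list of length C_out
    """
    out = []
    for spiral in spirals:
        # Concatenate neighbour features in the fixed spiral order. Because
        # position in the sequence selects the weight, the result depends on
        # the ordering, i.e. the operator is not permutation invariant.
        gathered = [x for v in spiral for x in features[v]]
        out.append([
            sum(w * g for w, g in zip(row, gathered)) + b
            for row, b in zip(weight, bias)
        ])
    return out
```

With weights `[1, 10, 100]` and three vertices whose spirals are rotations of one another, each vertex gets a different output even though all three see the same set of features, which is the effect of the consistent ordering.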
What's Behind the Mask: Understanding Masked Graph Modeling for Graph Autoencoders
Recent years have witnessed the emergence of a promising self-supervised
learning strategy, referred to as masked autoencoding. However, there is a lack
of theoretical understanding of how masking matters on graph autoencoders
(GAEs). In this work, we present masked graph autoencoder (MaskGAE), a
self-supervised learning framework for graph-structured data. Different from
standard GAEs, MaskGAE adopts masked graph modeling (MGM) as a principled
pretext task - masking a portion of edges and attempting to reconstruct the
missing part with partially visible, unmasked graph structure. To understand
whether MGM can help GAEs learn better representations, we provide both
theoretical and empirical evidence to comprehensively justify the benefits of
this pretext task. Theoretically, we establish close connections between GAEs
and contrastive learning, showing that MGM significantly improves the
self-supervised learning scheme of GAEs. Empirically, we conduct extensive
experiments on a variety of graph benchmarks, demonstrating the superiority of
MaskGAE over several state-of-the-art methods on both link prediction and node
classification tasks. Comment: KDD 2023 research track. Code available at
https://github.com/EdisonLeeeee/MaskGA
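The masking pretext task described above can be sketched as a split of the edge set into a visible part (given to the encoder) and a masked part (the reconstruction target). This is a minimal illustration that assumes uniform random edge masking; the function name `mask_edges` and its parameters are hypothetical, and the paper's own masking strategies may differ.

```python
import random

def mask_edges(edges, mask_ratio=0.5, seed=0):
    """Split edges into (visible, masked) lists.

    Assumption: edges are masked uniformly at random. The encoder would see
    only `visible`; the decoder would be trained to recover `masked`.
    """
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_masked = int(len(shuffled) * mask_ratio)
    masked = shuffled[:n_masked]
    visible = shuffled[n_masked:]
    return visible, masked
```

The two returned lists are disjoint and together cover the original edge set, so no masked edge leaks into the encoder's input.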
How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders
Masked Autoencoders (MAE) based on a reconstruction task have risen to be a
promising paradigm for self-supervised learning (SSL) and achieve
state-of-the-art performance across different benchmark datasets. However,
despite its impressive empirical success, there is still limited theoretical
understanding of it. In this paper, we propose a theoretical understanding of
how masking matters for MAE to learn meaningful features. We establish a close
connection between MAE and contrastive learning, which shows that MAE implicitly
aligns the mask-induced positive pairs. Built upon this connection, we develop
the first downstream guarantees for MAE methods, and analyze the effect of mask
ratio. Besides, as a result of the implicit alignment, we also point out the
dimensional collapse issue of MAE, and propose a Uniformity-enhanced MAE
(U-MAE) loss that can effectively address this issue and bring significant
improvements on real-world datasets, including CIFAR-10, ImageNet-100, and
ImageNet-1K. Code is available at (https://github.com/zhangq327/U-MAE)
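The idea of adding a uniformity term to a reconstruction loss can be sketched as follows. The abstract does not state the exact form of the U-MAE loss, so the penalty here (mean squared cosine similarity between features of different samples) and the names `uniformity_penalty`, `regularised_loss`, and `lam` are assumptions chosen for illustration only.

```python
import math

def uniformity_penalty(features):
    """Mean squared cosine similarity over all ordered pairs of distinct samples.

    Assumption: every feature vector is non-zero and there are at least two.
    The value is 0 when features are mutually orthogonal and 1 when they are
    all collinear (a fully collapsed representation).
    """
    normed = []
    for f in features:
        norm = math.sqrt(sum(x * x for x in f))
        normed.append([x / norm for x in f])
    n = len(normed)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                dot = sum(a * b for a, b in zip(normed[i], normed[j]))
                total += dot * dot
    return total / (n * (n - 1))

def regularised_loss(reconstruction_loss, features, lam=0.01):
    """Reconstruction loss plus a weighted uniformity penalty (hypothetical form)."""
    return reconstruction_loss + lam * uniformity_penalty(features)
```

A penalty of this kind grows as features align with one another, so minimising it pushes against the dimensional collapse that the abstract mentions.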
Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules
Masked graph modeling (MGM) excels in the self-supervised representation learning
of molecular graphs. Scrutinizing previous studies, we can reveal a common
scheme consisting of three key components: (1) graph tokenizer, which breaks a
molecular graph into smaller fragments (i.e., subgraphs) and converts them into
tokens; (2) graph masking, which corrupts the graph with masks; (3) graph
autoencoder, which first applies an encoder on the masked graph to generate the
representations, and then employs a decoder on the representations to recover
the tokens of the original graph. However, the previous MGM studies focus
extensively on graph masking and encoder, while there is limited understanding
of tokenizer and decoder. To bridge the gap, we first summarize popular
molecule tokenizers at the granularity of node, edge, motif, and Graph Neural
Networks (GNNs), and then examine their roles as the MGM's reconstruction
targets. Further, we explore the potential of adopting an expressive decoder in
MGM. Our results show that a subgraph-level tokenizer and a sufficiently
expressive decoder with remask decoding have a large impact on the encoder's
representation learning. Finally, we propose a novel MGM method SimSGT,
featuring a Simple GNN-based Tokenizer (SGT) and an effective decoding
strategy. We empirically validate that our method outperforms the existing
molecule self-supervised learning methods. Our codes and checkpoints are
available at https://github.com/syr-cn/SimSGT. Comment: NeurIPS 2023. 10 page
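The first two components of the scheme above (tokenizer and masking), together with the reconstruction targets the autoencoder would be trained on, can be sketched in a few lines. This is a toy illustration assuming the simplest node-level tokenizer, where each atom symbol is its own token; the names `node_tokenizer`, `mask_tokens`, and `MASK` are hypothetical and do not come from SimSGT.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token

def node_tokenizer(atoms):
    """Node-level tokenizer: each atom symbol is used directly as a token."""
    return list(atoms)

def mask_tokens(tokens, mask_ratio=0.25, seed=0):
    """Corrupt a non-empty token list and return (corrupted, targets).

    `targets` maps each masked position to its original token, which is what
    a decoder would be asked to recover.
    """
    rng = random.Random(seed)
    n_masked = max(1, int(len(tokens) * mask_ratio))
    masked_idx = sorted(rng.sample(range(len(tokens)), n_masked))
    corrupted = list(tokens)
    for i in masked_idx:
        corrupted[i] = MASK
    targets = {i: tokens[i] for i in masked_idx}
    return corrupted, targets
```

Swapping `node_tokenizer` for a coarser tokenizer (edge, motif, or GNN-based) would change only the targets, which is the design axis the abstract examines.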