Graph Contrastive Invariant Learning from the Causal Perspective
Graph contrastive learning (GCL), learning the node representation by
contrasting two augmented graphs in a self-supervised way, has attracted
considerable attention. GCL is usually believed to learn the invariant
representation. However, does this understanding always hold in practice? In
this paper, we first study GCL from the perspective of causality. By analyzing
GCL with the structural causal model (SCM), we discover that traditional GCL
may not learn the invariant representations well, due to the non-causal
information contained in the graph. How can we fix this and encourage current
GCL methods to learn better invariant representations? The SCM offers two requirements
and motivates us to propose a novel GCL method. In particular, we introduce the
spectral graph augmentation to simulate the intervention upon non-causal
factors. Then we design the invariance objective and independence objective to
better capture the causal factors. Specifically, (i) the invariance objective
encourages the encoder to capture the invariant information contained in causal
variables, and (ii) the independence objective aims to reduce the influence of
confounders on the causal variables. Experimental results demonstrate the
effectiveness of our approach on node classification tasks.
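To make the two objectives concrete, here is a minimal, hypothetical sketch in PyTorch. The encoder, the spectral augmentation, and the split of the embedding into causal and non-causal halves are placeholders standing in for the components described above, not the authors' implementation.

```python
# Hypothetical sketch of the two objectives described above; `encoder` and
# `augment` are placeholder callables (a GNN encoder and a spectral graph
# augmentation), and the causal/non-causal split is purely illustrative.
import torch
import torch.nn.functional as F

def invariance_loss(z1: torch.Tensor, z2: torch.Tensor) -> torch.Tensor:
    """Pull node embeddings from the two augmented views together."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    return (2 - 2 * (z1 * z2).sum(dim=-1)).mean()

def independence_loss(causal: torch.Tensor, confound: torch.Tensor) -> torch.Tensor:
    """Decorrelation surrogate: penalize cross-covariance between the assumed
    causal factors and the remaining (confounding) factors."""
    causal = (causal - causal.mean(0)) / (causal.std(0) + 1e-8)
    confound = (confound - confound.mean(0)) / (confound.std(0) + 1e-8)
    cross_cov = causal.t() @ confound / causal.shape[0]
    return cross_cov.pow(2).mean()

def training_loss(encoder, augment, graph, lam: float = 1.0) -> torch.Tensor:
    g1, g2 = augment(graph), augment(graph)   # simulated intervention on non-causal factors
    z1, z2 = encoder(g1), encoder(g2)
    causal, confound = z1.chunk(2, dim=-1)    # illustrative split of the embedding
    return invariance_loss(z1, z2) + lam * independence_loss(causal, confound)
```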
Hierarchical Contrastive Learning Enhanced Heterogeneous Graph Neural Network
Heterogeneous graph neural networks (HGNNs), as an emerging technique, have
shown a superior capacity for dealing with heterogeneous information networks
(HINs). However, most HGNNs follow a semi-supervised learning paradigm, which
notably limits their use in practice, since labels are usually scarce in
real applications. Recently, contrastive learning, a self-supervised method,
has become one of the most exciting learning paradigms and shows great potential
when no labels are available. In this paper, we study the problem of
self-supervised HGNNs and propose a novel co-contrastive learning mechanism for
HGNNs, named HeCo. Different from traditional contrastive learning, which only
focuses on contrasting positive and negative samples, HeCo employs a cross-view
contrastive mechanism. Specifically, two views of a HIN (the network schema and
meta-path views) are proposed to learn node embeddings, so as to capture both
local and high-order structures simultaneously. Then cross-view
contrastive learning, together with a view mask mechanism, is proposed to
extract positive and negative embeddings from the two views. This
enables the two views to collaboratively supervise each other and finally learn
high-level node embeddings. Moreover, to further boost the performance of HeCo,
two additional methods are designed to generate harder, high-quality negative
samples. Besides the invariant factors, view-specific factors
complementarily provide diverse structural information about different
nodes, which should also be contained in the final embeddings. Therefore, we
need to further explore each view independently and propose a modified model,
called HeCo++. Specifically, HeCo++ conducts hierarchical contrastive learning,
including cross-view and intra-view contrasts, which aims to enhance the mining
of the structures specific to each view.
Comment: This paper has been accepted by TKDE as a regular paper.
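As a rough illustration of the cross-view idea (not the released HeCo code), the sketch below contrasts node embeddings from the two views with an InfoNCE-style loss; `z_sc` and `z_mp` stand for the network-schema-view and meta-path-view embeddings, and `pos_mask` is an assumed binary matrix marking each node's positives.

```python
# Illustrative cross-view contrastive loss (assumed form, not HeCo's implementation).
import torch
import torch.nn.functional as F

def cross_view_contrast(z_sc: torch.Tensor, z_mp: torch.Tensor,
                        pos_mask: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """z_sc, z_mp: [n, d] node embeddings from the two views (after projection);
    pos_mask: [n, n] binary matrix, pos_mask[i, j] = 1 if j is a positive of i.
    Each node is assumed to have at least one positive (e.g., itself)."""
    z_sc, z_mp = F.normalize(z_sc, dim=-1), F.normalize(z_mp, dim=-1)
    sim = torch.exp(z_sc @ z_mp.t() / tau)   # cross-view similarity matrix
    pos = (sim * pos_mask).sum(dim=1)        # positives taken from the other view
    return -torch.log(pos / sim.sum(dim=1)).mean()

# A hierarchical (HeCo++-style) variant would add intra-view terms of the same
# form, e.g. contrasting z_mp against itself with the same positive mask.
```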
Provable Training for Graph Contrastive Learning
Graph Contrastive Learning (GCL) has emerged as a popular training approach
for learning node embeddings from augmented graphs without labels. Although the
key principle of maximizing the similarity between positive node pairs while
minimizing it between negative node pairs is well established, some fundamental
problems remain unclear. Considering the complex graph structure, are some
nodes consistently well-trained and following this principle even with
different graph augmentations? Or are there some nodes more likely to be
untrained across graph augmentations and violate the principle? How to
distinguish these nodes and further guide the training of GCL? To answer these
questions, we first present experimental evidence showing that the training of
GCL is indeed imbalanced across all nodes. To address this problem, we propose
the metric "node compactness", which is the lower bound of how a node follows
the GCL principle related to the range of augmentations. We further derive the
form of node compactness theoretically through bound propagation, which can be
integrated into binary cross-entropy as a regularization. To this end, we
propose the PrOvable Training (POT) for GCL, which regularizes the training of
GCL to encode node embeddings that follows the GCL principle better. Through
extensive experiments on various benchmarks, POT consistently improves the
existing GCL approaches, serving as a friendly plugin.
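A hedged sketch of how a per-node compactness score could be plugged in as a binary cross-entropy regularizer on top of an existing GCL loss is given below; the bound-propagation derivation of the compactness term itself is omitted, and `compactness` is simply assumed to be given.

```python
# Assumed plug-in form of the regularizer (illustrative; the paper derives the
# node-compactness term via bound propagation, which is not reproduced here).
import torch
import torch.nn.functional as F

def pot_regularized_loss(gcl_loss: torch.Tensor,
                         compactness: torch.Tensor,  # [num_nodes], higher = follows the principle better
                         alpha: float = 0.1) -> torch.Tensor:
    # Treat each node's compactness as a logit and push it toward 1 with BCE,
    # so poorly trained nodes contribute a larger regularization term.
    target = torch.ones_like(compactness)
    reg = F.binary_cross_entropy_with_logits(compactness, target)
    return gcl_loss + alpha * reg
```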
Generalizing Graph Neural Networks on Out-Of-Distribution Graphs
Most Graph Neural Networks (GNNs) are proposed without considering the agnostic
distribution shifts between training and testing graphs, which degrades their
generalization ability in Out-Of-Distribution (OOD) settings. The fundamental
reason for this degradation is that most GNNs are developed under the I.I.D.
hypothesis. In such a setting, GNNs tend to exploit subtle statistical
correlations in the training set for prediction, even when those correlations
are spurious. However, such spurious correlations may change in testing
environments, leading to the failure of GNNs. Therefore, eliminating the
impact of spurious correlations is crucial for
stable GNNs. To this end, we propose a general causal representation framework,
called StableGNN. The main idea is to extract high-level representations from
graph data first and resort to the distinguishing ability of causal inference
to help the model get rid of spurious correlations. In particular, we exploit a
graph pooling layer to extract subgraph-based representations as high-level
representations. Furthermore, we propose a causal variable distinguishing
regularizer to correct the biased training distribution. Hence, GNNs would
concentrate more on the stable correlations. Extensive experiments on both
synthetic and real-world OOD graph datasets verify the effectiveness,
flexibility, and interpretability of the proposed framework.
Comment: IEEE TPAMI 202
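To illustrate one plausible form of a causal variable distinguishing regularizer (an assumption on our part, not the exact StableGNN objective), the sketch below learns sample weights that decorrelate the dimensions of the pooled, high-level representations.

```python
# Illustrative decorrelation-style regularizer via sample reweighting
# (assumed form, not the exact StableGNN regularizer).
import torch

def decorrelation_regularizer(h: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """h: [n, d] subgraph-pooled high-level representations;
    w: [n] non-negative sample weights (e.g., softmax of learnable logits)."""
    w = w / w.sum()
    mean = (w.unsqueeze(1) * h).sum(dim=0, keepdim=True)
    hc = h - mean
    cov = (w.unsqueeze(1) * hc).t() @ hc          # weighted covariance, [d, d]
    off_diag = cov - torch.diag(torch.diag(cov))  # keep only cross-dimension terms
    return off_diag.pow(2).sum()

# Training could alternate between updating the sample weights to minimize this
# term and updating the GNN with the weighted task loss plus the regularizer.
```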