Understanding Negative Sampling in Graph Representation Learning
Graph representation learning has been extensively studied in recent years. Despite its potential for generating continuous embeddings of various networks, inferring high-quality representations over large corpora of nodes remains challenging in terms of both effectiveness and efficiency. Sampling is critical to achieving these performance goals. Prior work usually focuses on sampling positive node pairs, while the strategy for negative sampling is left insufficiently explored. To bridge this gap, we systematically analyze the role of negative sampling from the perspectives of both objective and risk, theoretically demonstrating that negative sampling is as important as positive sampling in determining both the optimization objective and the resulting variance. To the best of our knowledge, we are the first to derive the theory and quantify that the negative sampling distribution should be positively but sub-linearly correlated with the positive sampling distribution. Guided by this theory, we propose MCNS, which approximates the positive distribution with self-contrast approximation and accelerates negative sampling via Metropolis-Hastings. We evaluate our method on 5 datasets covering a broad range of downstream graph learning tasks, including link prediction, node classification, and personalized recommendation, across a total of 19 experimental settings. These comprehensive experimental results demonstrate its robustness and superiority.
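The abstract's central prescription (a negative distribution positively but sub-linearly correlated with the positive one, sampled via Metropolis-Hastings) can be illustrated with a small sketch. Everything below, from the toy power-law distribution to the exponent 0.75 and the uniform proposal, is an illustrative assumption rather than MCNS's actual implementation:

```python
# A minimal sketch of sub-linear negative sampling via Metropolis-Hastings.
# The positive distribution here is a toy power-law proxy; MCNS estimates it
# by self-contrast approximation, which is omitted for brevity.
import numpy as np

rng = np.random.default_rng(0)
ALPHA = 0.75  # sub-linear exponent: q_neg(x) proportional to p_pos(x)**ALPHA (assumed value)

p_pos = rng.pareto(2.0, size=1000) + 1.0   # toy positive distribution over 1000 nodes
p_pos /= p_pos.sum()
q_neg = p_pos ** ALPHA                     # unnormalized target; MH never needs the constant

def mh_negative_samples(n_samples, burn_in=100):
    """Draw node indices from q_neg using Metropolis-Hastings with a uniform proposal."""
    samples, x = [], rng.integers(len(q_neg))
    for step in range(burn_in + n_samples):
        y = rng.integers(len(q_neg))       # symmetric proposal, so the density ratio suffices
        if rng.random() < min(1.0, q_neg[y] / q_neg[x]):
            x = y                          # accept the move
        if step >= burn_in:
            samples.append(int(x))
    return samples

print(mh_negative_samples(5))
```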
NS4AR: A new, focused on sampling areas sampling method in graphical recommendation Systems
The effectiveness of a graphical recommender system depends on the quantity and quality of negative sampling. This paper selects several typical recommender system models, along with some of the latest negative sampling strategies applied to them, as baselines. Building on a typical graphical recommender model, we divide the sampling region into assigned-n areas and use AdaSim to assign different weights to these areas, forming the positive and negative sets. Given the volume and significance of negative items, we also propose a subset selection model to narrow down the core negative samples.
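To make the area-based idea concrete, here is a hypothetical sketch: items are ranked by a similarity score (standing in for AdaSim, which the abstract does not define), split into a fixed number of areas, and negatives are drawn area-first according to assigned weights. All names, scores, and weights are invented for illustration:

```python
# Illustrative area-weighted negative sampling; not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)
n_items, n_areas = 1000, 4

sim = rng.random(n_items)                      # placeholder for AdaSim similarity scores
order = np.argsort(-sim)                       # items ranked from most to least similar
areas = np.array_split(order, n_areas)         # contiguous rank bands act as "areas"
area_weights = np.array([0.1, 0.2, 0.3, 0.4])  # hypothetical per-area sampling weights

def sample_negatives(k):
    """Pick an area by weight, then an item uniformly within it, k times."""
    picks = []
    for _ in range(k):
        a = rng.choice(n_areas, p=area_weights)
        picks.append(int(rng.choice(areas[a])))
    return picks

print(sample_negatives(5))
```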
Neighborhood-based Hard Negative Mining for Sequential Recommendation
Negative sampling plays a crucial role in training successful sequential
recommendation models. Instead of merely employing random negative sample
selection, numerous strategies have been proposed to mine informative negative
samples to enhance training and performance. However, few of these approaches
utilize structural information. In this work, we observe that as training
progresses, the distributions of node-pair similarities in different groups
with varying degrees of neighborhood overlap change significantly, suggesting
that item pairs in distinct groups may possess different negative
relationships. Motivated by this observation, we propose a Graph-based Negative
sampling approach based on Neighborhood Overlap (GNNO) to exploit structural
information hidden in user behaviors for negative mining. GNNO first constructs
a global weighted item transition graph using training sequences. Subsequently,
it mines hard negative samples based on the degree of overlap with the target
item on the graph. Furthermore, GNNO employs curriculum learning to control the
hardness of negative samples, progressing from easy to difficult. Extensive
experiments on three Amazon benchmarks demonstrate GNNO's effectiveness in
consistently enhancing the performance of various state-of-the-art models and
surpassing existing negative sampling strategies. The code will be released at
https://github.com/floatSDSDS/GNNO.
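A rough sketch of the pipeline the abstract describes (build an item transition graph from training sequences, measure neighborhood overlap with the target item, and ramp hardness with a curriculum) might look as follows. The unweighted graph, Jaccard overlap, and linear schedule are simplifying assumptions, not necessarily GNNO's exact choices:

```python
# Sketch of overlap-based hard negative mining with an easy-to-hard curriculum.
from collections import defaultdict

def build_transition_graph(sequences):
    """Item transition graph from training sequences (unweighted here for brevity)."""
    nbrs = defaultdict(set)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            nbrs[a].add(b)
            nbrs[b].add(a)
    return nbrs

def overlap(nbrs, i, j):
    """Jaccard overlap of two items' neighborhoods on the graph."""
    union = len(nbrs[i] | nbrs[j])
    return len(nbrs[i] & nbrs[j]) / union if union else 0.0

def mine_negative(nbrs, target, candidates, epoch, total_epochs):
    """Curriculum: pick the candidate whose overlap is closest to a rising hardness level."""
    hardness = epoch / total_epochs   # 0 (easy) -> 1 (hard) as training progresses
    return min(candidates, key=lambda c: abs(overlap(nbrs, target, c) - hardness))

g = build_transition_graph([[1, 2, 3], [2, 3, 4], [1, 3, 4]])
print(mine_negative(g, target=3, candidates=[1, 2, 4], epoch=5, total_epochs=10))
```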
Robust Training of Temporal GNNs using Nearest Neighbours based Hard Negatives
Temporal graph neural networks (TGNNs) have exhibited state-of-the-art performance in future-link prediction tasks. These TGNNs are typically trained with an unsupervised loss based on uniform random negative sampling. As a result, for a given positive example, the loss is computed over uninformative negatives, which introduces redundancy and leads to sub-optimal performance. In this paper, we propose a modified unsupervised training scheme for TGNNs that replaces uniform negative sampling with importance-based negative sampling. We theoretically motivate and define a dynamically computed distribution for sampling negative examples. Finally, through empirical evaluations on three real-world datasets, we show that TGNNs trained with a loss based on the proposed negative sampling deliver consistently superior performance.
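Replacing uniform negatives with an importance-based distribution can be sketched as below: candidates closer to the true destination node in the current embedding space receive higher sampling probability. The softmax form, the temperature, and the placeholder embeddings are assumptions for illustration, not the paper's exact distribution:

```python
# Sketch of importance-based negative sampling for a temporal link (src, dst, t).
import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 16))                   # stand-in for current TGNN node embeddings
emb /= np.linalg.norm(emb, axis=1, keepdims=True)  # unit-normalize for cosine similarity

def importance_negatives(dst, candidates, k, temperature=0.1):
    """Sample k negatives with probability proportional to softmax(sim(dst, c) / T)."""
    logits = (emb[candidates] @ emb[dst]) / temperature
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return rng.choice(candidates, size=k, replace=False, p=p)

candidates = np.arange(1, 100)                     # every node except dst = 0
print(importance_negatives(dst=0, candidates=candidates, k=5))
```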
Learning Robust Node Representations on Graphs
Graph neural networks (GNNs), a popular methodology for node representation learning on graphs, currently focus mainly on preserving the smoothness and identifiability of node representations. A robust node representation on graphs should further satisfy the stability property, meaning that the representation is resistant to slight perturbations of the input. In this paper, we introduce the
stability of node representations in addition to the smoothness and
identifiability, and develop a novel method called contrastive graph neural
networks (CGNN) that learns robust node representations in an unsupervised
manner. Specifically, CGNN maintains the stability and identifiability by a
contrastive learning objective, while preserving the smoothness with existing
GNN models. Furthermore, the proposed method is a generic framework that can be
equipped with many other backbone models (e.g., GCN, GraphSAGE, and GAT).
Extensive experiments on four benchmarks under both transductive and inductive
learning setups demonstrate the effectiveness of our method in comparison with
recent supervised and unsupervised models.
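The stability idea can be illustrated with a generic InfoNCE objective that pulls a node's clean and slightly perturbed representations together while pushing other nodes' representations apart. This is a minimal sketch under assumed placeholder embeddings, not CGNN's exact loss:

```python
# Generic contrastive (InfoNCE-style) loss between clean and perturbed node views.
import numpy as np

rng = np.random.default_rng(0)

def info_nce(z_clean, z_pert, temperature=0.5):
    """Each node's perturbed view is its positive; all other nodes act as negatives."""
    z1 = z_clean / np.linalg.norm(z_clean, axis=1, keepdims=True)
    z2 = z_pert / np.linalg.norm(z_pert, axis=1, keepdims=True)
    logits = (z1 @ z2.T) / temperature                 # pairwise view similarities
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    log_p = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_p))                    # diagonal entries are the positives

z = rng.normal(size=(32, 8))                           # placeholder node embeddings
z_perturbed = z + 0.01 * rng.normal(size=z.shape)      # proxy for an input-side perturbation
print(info_nce(z, z_perturbed))
```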
Not All Negatives Are Worth Attending to: Meta-Bootstrapping Negative Sampling Framework for Link Prediction
The rapid development of graph neural networks (GNNs) has spurred progress in link prediction, which now achieves promising performance across various applications. Unfortunately, through a comprehensive analysis, we find that current link predictors with dynamic negative samplers (DNSs) suffer from a migration phenomenon between "easy" and "hard" samples, which runs counter to the DNS preference for "hard" negatives and thus severely hinders their capability. To this end, we propose the MeBNS framework, a general plugin that can potentially improve current negative-sampling-based link predictors. In particular, we devise a Meta-learning Supported Teacher-student GNN (MST-GNN) that is not only built upon a teacher-student architecture to alleviate the migration between "easy" and "hard" samples but also equipped with a meta-learning-based sample re-weighting module that helps the student GNN distinguish "hard" samples in a fine-grained manner. To effectively guide the learning of MST-GNN, we prepare a Structure-enhanced Training Data Generator (STD-Generator) and an Uncertainty-based Meta Data Collector (UMD-Collector) to support the teacher and student GNNs, respectively. Extensive experiments show that MeBNS achieves remarkable performance across six link prediction benchmark datasets.
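For context, the dynamic negative sampler (DNS) that the abstract builds on can be sketched in a few lines: draw several random candidates and keep the one the current model scores highest, i.e. the "hardest". The dot-product scorer, candidate count, and embeddings below are placeholders:

```python
# Bare-bones dynamic negative sampling: keep the highest-scoring random candidate.
import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=(500, 16))   # placeholder node embeddings from a link predictor

def dns_negative(src, num_candidates=10):
    """Return the candidate the current model finds hardest to separate from src."""
    cands = rng.integers(0, emb.shape[0], size=num_candidates)
    scores = emb[cands] @ emb[src]          # illustrative dot-product link scores
    return int(cands[np.argmax(scores)])    # "hard" negative: highest predicted score

print(dns_negative(src=0))
```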
Adversarial Curriculum Graph Contrastive Learning with Pair-wise Augmentation
Graph contrastive learning (GCL) has emerged as a pivotal technique in the
domain of graph representation learning. A crucial aspect of effective GCL is
the caliber of generated positive and negative samples, which is intrinsically
dictated by their resemblance to the original data. Nevertheless, precise
control over similarity during sample generation presents a formidable
challenge, often impeding the effective discovery of representative graph
patterns. To address this challenge, we propose a novel framework: Adversarial Curriculum Graph Contrastive Learning (ACGCL), which leverages pair-wise augmentation to generate graph-level positive and negative samples with controllable similarity, together with subgraph contrastive learning to discern effective graph patterns therein. Within the ACGCL
framework, we have devised a novel adversarial curriculum training methodology
that facilitates progressive learning by sequentially increasing the difficulty
of distinguishing the generated samples. Notably, this approach overcomes the sparsity issue common to conventional curriculum learning strategies by adaptively concentrating on more challenging training data.
Finally, a comprehensive assessment of ACGCL is conducted through extensive
experiments on six well-known benchmark datasets, wherein ACGCL conspicuously
surpasses a set of state-of-the-art baselines.
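The curriculum component, in which the similarity of generated negatives to the anchor rises over training so that discrimination becomes progressively harder, can be sketched schematically. The interpolation-based "augmentation" below is a stand-in for ACGCL's pair-wise graph augmentation, not its actual mechanism:

```python
# Schematic curriculum over negative-sample difficulty in contrastive training.
import numpy as np

rng = np.random.default_rng(0)

def make_negative(anchor, difficulty):
    """Blend the anchor with noise; higher difficulty yields a more anchor-like negative."""
    noise = rng.normal(size=anchor.shape)
    return difficulty * anchor + (1.0 - difficulty) * noise

anchor = rng.normal(size=8)
for epoch in range(0, 10, 3):
    difficulty = epoch / 10.0                  # monotone easy-to-hard schedule
    neg = make_negative(anchor, difficulty)
    cos = neg @ anchor / (np.linalg.norm(neg) * np.linalg.norm(anchor))
    print(f"epoch {epoch}: negative-anchor cosine similarity = {cos:.2f}")
```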