Adaptive Pairwise Encodings for Link Prediction
Link prediction is a common task on graph-structured data that has seen
applications in a variety of domains. Classically, hand-crafted heuristics were
used for this task. Heuristic measures are chosen such that they correlate well
with the underlying factors related to link formation. In recent years, a new
class of methods has emerged that combines the advantages of message-passing
neural networks (MPNNs) and heuristic methods. These methods perform
predictions by using the output of an MPNN in conjunction with a "pairwise
encoding" that captures the relationship between nodes in the candidate link.
They have been shown to achieve strong performance on numerous datasets.
However, current pairwise encodings often contain a strong inductive bias,
using the same underlying factors to classify all links. This limits the
ability of existing methods to learn how to properly classify a variety of
different links that may form from different factors. To address this
limitation, we propose a new method, LPFormer, which attempts to adaptively
learn the pairwise encodings for each link. LPFormer models the link factors
via an attention module that learns the pairwise encoding that exists between
nodes by modeling multiple factors integral to link prediction. Extensive
experiments demonstrate that LPFormer can achieve SOTA performance on numerous
datasets while maintaining efficiency.
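To make the notion of a hand-crafted pairwise heuristic concrete, the sketch below scores a candidate link with two classical measures, common neighbors and Adamic-Adar. It is only an illustration of the fixed heuristics that the abstract contrasts with learned encodings, not LPFormer itself; the toy graph and the function names are assumptions introduced here.

```python
import math


def build_adjacency(edges):
    """Return a dict mapping each node to the set of its neighbors (undirected)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Number of nodes adjacent to both u and v."""
    return len(adj.get(u, set()) & adj.get(v, set()))


def adamic_adar(adj, u, v):
    """Sum of 1 / log(degree) over shared neighbors of u and v.

    Assumes u != v, so every shared neighbor has degree >= 2 and log > 0.
    """
    shared = adj.get(u, set()) & adj.get(v, set())
    return sum(1.0 / math.log(len(adj[w])) for w in shared)


# Hypothetical toy graph used only for illustration.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]
adj = build_adjacency(edges)

cn_score = common_neighbors(adj, 0, 3)
aa_score = adamic_adar(adj, 0, 3)
print(cn_score, round(aa_score, 4))
```

Both scores apply the same rule to every candidate pair, which is the kind of fixed inductive bias the abstract describes.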
Neural Common Neighbor with Completion for Link Prediction
Despite its outstanding performance in various graph tasks, vanilla Message
Passing Neural Network (MPNN) usually fails in link prediction tasks, as it
only uses representations of two individual target nodes and ignores the
pairwise relation between them. To capture the pairwise relations, some models
add manual features to the input graph and use the output of MPNN to produce
pairwise representations. In contrast, others directly use manual features as
pairwise representations. Though this simplification avoids applying a GNN to
each link individually and thus improves scalability, these models still have
much room for performance improvement due to the hand-crafted and unlearnable
pairwise features. To upgrade performance while maintaining scalability, we
propose Neural Common Neighbor (NCN), which uses learnable pairwise
representations. To further boost NCN, we study the unobserved link problem.
The incompleteness of the graph is ubiquitous and leads to distribution shifts
between the training and test set, loss of common neighbor information, and
performance degradation of models. Therefore, we propose two intervention
methods: common neighbor completion and target link removal. Combining the two
methods with NCN, we propose Neural Common Neighbor with Completion (NCNC). NCN
and NCNC outperform recent strong baselines by large margins. NCNC achieves
state-of-the-art performance in link prediction tasks. Our code is available at
https://github.com/GraphPKU/NeuralCommonNeighbor
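The idea of a learnable pairwise representation built from common neighbors can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation: the embeddings are hard-coded stand-ins for MPNN output, and the specific combination used here (element-wise product of the two target embeddings, concatenated with the sum of common-neighbor embeddings) is an assumption about how such a representation might be formed.

```python
def neighbor_sets(num_nodes, edges):
    """Return a list where entry i is the set of neighbors of node i."""
    adj = [set() for _ in range(num_nodes)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def pairwise_representation(embeddings, adj, i, j):
    """Build a vector for candidate link (i, j).

    First half: element-wise product of the two target-node embeddings.
    Second half: sum of the embeddings of the common neighbors of i and j.
    """
    dim = len(embeddings[i])
    hadamard = [a * b for a, b in zip(embeddings[i], embeddings[j])]
    pooled = [0.0] * dim
    for u in adj[i] & adj[j]:
        for k in range(dim):
            pooled[k] += embeddings[u][k]
    return hadamard + pooled  # list concatenation


# Hypothetical embeddings standing in for the output of a trained MPNN.
embeddings = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0], [1.0, 1.0]]
adj = neighbor_sets(4, [(0, 1), (0, 2), (1, 3), (2, 3)])

z = pairwise_representation(embeddings, adj, 0, 3)
print(z)
```

In a full model, a vector like `z` would be passed to a trainable scoring head; because the embeddings themselves are learned, the pairwise feature is no longer a fixed count.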
Link Mining for Kernel-based Compound-Protein Interaction Predictions Using a Chemogenomics Approach
Virtual screening (VS) is widely used during computational drug discovery to
reduce costs. Chemogenomics-based virtual screening (CGBVS) can be used to
predict new compound-protein interactions (CPIs) from known CPI network data
using several methods, including machine learning and data mining. Although
CGBVS facilitates highly efficient and accurate CPI prediction, it has poor
performance for prediction of new compounds for which CPIs are unknown. The
pairwise kernel method (PKM) is a state-of-the-art CGBVS method and shows high
accuracy for prediction of new compounds. In this study, on the basis of link
mining, we improved the PKM by combining link indicator kernel (LIK) and
chemical similarity and evaluated the accuracy of these methods. The proposed
method obtained an average area under the precision-recall curve (AUPR) value
of 0.562, which was higher than that achieved by the conventional Gaussian
interaction profile (GIP) method (0.425), and the calculation time was only
increased by a few percent.
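For orientation, the Gaussian interaction profile (GIP) baseline mentioned above can be sketched as a kernel over binary interaction profiles. This follows the commonly used definition, with the bandwidth normalized by the mean squared profile norm; the parameter name `gamma_prime` and the toy profiles are assumptions, and the sketch does not reproduce the proposed link indicator kernel.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian kernel between interaction profiles.

    profiles[a] is a 0/1 vector marking which targets entity a interacts with.
    The bandwidth is gamma_prime divided by the mean squared profile norm,
    which assumes at least one known interaction in the data.
    """
    n = len(profiles)
    sq_norms = [sum(x * x for x in p) for p in profiles]
    gamma = gamma_prime / (sum(sq_norms) / n)
    kernel = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            dist_sq = sum((x - y) ** 2 for x, y in zip(profiles[a], profiles[b]))
            kernel[a][b] = math.exp(-gamma * dist_sq)
    return kernel


# Hypothetical compound-by-protein interaction profiles.
profiles = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
K = gip_kernel(profiles)
print(K[0][1], K[0][2])
```

A compound with no known interactions has an all-zero profile, so a kernel of this form carries little information for it, which is consistent with the weakness on new compounds noted in the abstract.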
Learning from Heterogeneity: A Dynamic Learning Framework for Hypergraphs
Graph neural networks (GNNs) have gained increasing popularity in recent years
owing to their capability and flexibility in modeling complex graph-structured
data. Among all graph learning methods, hypergraph learning is a technique for
exploring the implicit higher-order correlations when training the embedding
space of the graph. In this paper, we propose a hypergraph learning framework
named LFH that is capable of dynamic hyperedge construction and attentive
embedding update utilizing the heterogeneity attributes of the graph.
Specifically, in our framework, the high-quality features are first generated
by the pairwise fusion strategy that utilizes explicit graph structure
information when generating initial node embedding. Afterwards, a hypergraph is
constructed through the dynamic grouping of implicit hyperedges, followed by
the type-specific hypergraph learning process. To evaluate the effectiveness of
our proposed framework, we conduct comprehensive experiments on several popular
datasets with eleven state-of-the-art models on both node classification and
link prediction tasks, which fall into categories of homogeneous pairwise graph
learning, heterogeneous pairwise graph learning, and hypergraph learning. The
experiment results demonstrate a significant performance gain (average 12.5% in
node classification and 13.3% in link prediction) compared with recent
state-of-the-art methods.
Disentangling Node Attributes from Graph Topology for Improved Generalizability in Link Prediction
Link prediction is a crucial task in graph machine learning with diverse
applications. We explore the interplay between node attributes and graph
topology and demonstrate that incorporating pre-trained node attributes
improves the generalization power of link prediction models. Our proposed
method, UPNA (Unsupervised Pre-training of Node Attributes), solves the
inductive link prediction problem by learning a function that takes a pair of
node attributes and predicts the probability of an edge, as opposed to Graph
Neural Networks (GNN), which can be prone to topological shortcuts in graphs
with power-law degree distribution. In this manner, UPNA learns a significant
part of the latent graph generation mechanism since the learned function can be
used to add incoming nodes to a growing graph. By leveraging pre-trained node
attributes, we overcome observational bias and make meaningful predictions
about unobserved nodes, surpassing state-of-the-art performance (3X to 34X
improvement on benchmark datasets). UPNA can be applied to various pairwise
learning tasks and integrated with existing link prediction models to enhance
their generalizability and bolster graph generative models.
Comment: 17 pages, 6 figures
Network link prediction by global silencing of indirect correlations
Predicting physical and functional links between cellular components is a fundamental challenge of biology and network science. Yet, correlations, a ubiquitous input for biological link prediction, are affected by both direct and indirect effects, confounding our ability to identify true pairwise interactions. Here we exploit the fundamental properties of dynamical correlations in networks to develop a method to silence indirect effects. The method receives as input the observed correlations between node pairs and uses a matrix transformation to turn the correlation matrix into a highly discriminative silenced matrix, which enhances only the terms associated with direct causal links. Achieving perfect accuracy in model systems, we test the method against empirical data collected for the Escherichia coli regulatory interaction network, showing that it improves on the best-performing link prediction methods. Overall, the silencing methodology helps translate the abundant correlation data into valuable local information, with applications ranging from link prediction to inferring the dynamical mechanisms governing biological networks.
Latent Space Model for Multi-Modal Social Data
With the emergence of social networking services, researchers enjoy the
increasing availability of large-scale heterogeneous datasets capturing online
user interactions and behaviors. Traditional analysis of techno-social systems
data has focused mainly on describing either the dynamics of social
interactions, or the attributes and behaviors of the users. However,
overwhelming empirical evidence suggests that the two dimensions affect one
another, and therefore they should be jointly modeled and analyzed in a
multi-modal framework. The benefits of such an approach include the ability to
build better predictive models, leveraging social network information as well
as user behavioral signals. To this purpose, here we propose the Constrained
Latent Space Model (CLSM), a generalized framework that combines Mixed
Membership Stochastic Blockmodels (MMSB) and Latent Dirichlet Allocation (LDA)
incorporating a constraint that forces the latent space to concurrently
describe the multiple data modalities. We derive an efficient inference
algorithm based on Variational Expectation Maximization that has a
computational cost linear in the size of the network, thus making it feasible
to analyze massive social datasets. We validate the proposed framework on two
problems: prediction of social interactions from user attributes and behaviors,
and behavior prediction exploiting network information. We perform experiments
with a variety of multi-modal social systems, spanning location-based social
networks (Gowalla), social media services (Instagram, Orkut), e-commerce and
review sites (Amazon, Ciao), and finally citation networks (Cora). The results
indicate significant improvement in prediction accuracy over state-of-the-art
methods, and demonstrate the flexibility of the proposed approach for
addressing a variety of different learning problems commonly occurring with
multi-modal social data.
Comment: 12 pages, 7 figures, 2 tables
Higher-order temporal network effects through triplet evolution
We study the evolution of networks through ‘triplets’ (three-node graphlets). We develop a method to compute a transition matrix to describe the evolution of triplets in temporal networks. To identify the importance of higher-order interactions in the evolution of networks, we compare both artificial and real-world data to a model based on pairwise interactions only. The significant differences between the computed matrix and the calculated matrix from the fitted parameters demonstrate that non-pairwise interactions exist for various real-world systems in space and time, such as our data sets. Furthermore, this also reveals that different patterns of higher-order interaction are involved in different real-world situations. To test our approach, we then use these transition matrices as the basis of a link prediction algorithm. We investigate our algorithm’s performance on four temporal networks, comparing our approach against ten other link prediction methods. Our results show that higher-order interactions in both space and time play a crucial role in the evolution of networks, as we find that our method, along with two other methods based on non-local interactions, gives the best overall performance. The results also confirm the concept that the higher-order interaction patterns, i.e., triplet dynamics, can help us understand and predict the evolution of different real-world systems.
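A triplet transition matrix of the kind described above can be sketched by comparing two snapshots of a temporal network. The sketch makes a simplifying assumption that is not taken from the abstract: a triplet's state is just the number of edges among its three nodes (0 to 3), and the graph is undirected. The function names and the toy snapshots are likewise hypothetical.

```python
from itertools import combinations


def triplet_state(edge_set, a, b, c):
    """State of a triplet = number of edges present among its three nodes."""
    return sum(frozenset(pair) in edge_set for pair in ((a, b), (a, c), (b, c)))


def transition_matrix(nodes, edges_t0, edges_t1):
    """Row-normalized 4x4 matrix of triplet state transitions between snapshots.

    Entry [s][t] estimates the probability that a triplet in state s at the
    first snapshot is in state t at the second. Rows with no observations
    are left as zeros.
    """
    e0 = {frozenset(e) for e in edges_t0}
    e1 = {frozenset(e) for e in edges_t1}
    counts = [[0] * 4 for _ in range(4)]
    for a, b, c in combinations(nodes, 3):
        counts[triplet_state(e0, a, b, c)][triplet_state(e1, a, b, c)] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([x / total if total else 0.0 for x in row])
    return matrix


# Hypothetical snapshots: the open path 0-1-2 closes into a triangle.
nodes = [0, 1, 2, 3]
snapshot_t0 = [(0, 1), (1, 2)]
snapshot_t1 = [(0, 1), (1, 2), (0, 2)]

T = transition_matrix(nodes, snapshot_t0, snapshot_t1)
print(T[2][3])
```

Enumerating all triplets costs on the order of n^3 operations, so this form is only practical for small networks; a matrix estimated this way could then be used to rank candidate links by how likely they are to move a triplet into a more connected state.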