E-CLoG: Counting edge-centric local graphlets
In recent years, graphlet counting has emerged as an important task in topological graph analysis. However, existing work on graphlet counting obtains graphlet counts for the entire network as a whole. These methods capture the key graphical patterns that prevail in a given network, but they fail to meet the demands of most real-life graph-related prediction tasks, such as link prediction and edge/node classification, which require building features for an individual edge (or vertex) of a network. To meet the demands of such applications, efficient algorithms are needed for counting local graphlets within the context of an edge (or a vertex). In this work, we propose an efficient method, titled E-CLoG, for counting all local graphlets of size 3, 4, and 5 within the context of a given edge, for all of its distinct edge orbits. We also provide a shared-memory, multi-core implementation of E-CLoG, which makes it even more scalable for very large real-world networks. In particular, we obtain strong scaling on a variety of graphs (14x-20x on 36 cores). We provide extensive experimental results to demonstrate the efficiency and effectiveness of the proposed method. For instance, we show that E-CLoG is faster than existing work by multiple orders of magnitude; for the Wordnet graph, E-CLoG counts all local graphlets of size 3, 4, and 5 in 1.5 hours using a single thread and in only a few minutes using the parallel implementation, whereas the baseline method does not finish in more than 4 days. We also show that local graphlet counts around an edge are much better features for link prediction than well-known topological features; in our experiments they improve the AUC value by 10% to 45% when predicting future links in three real-life social and collaboration networks.
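To make the idea of edge-centric local counts concrete, the following is a minimal sketch for the smallest case only: the 3-node graphlets (triangles and open 2-paths) that contain a given edge. It is not the E-CLoG algorithm, which also covers sizes 4 and 5 and distinguishes edge orbits; the function name `edge_local_3_graphlets` and the adjacency-set representation are assumptions made for illustration.

```python
def edge_local_3_graphlets(adj, u, v):
    """Count the 3-node graphlets that contain the edge (u, v).

    adj maps each vertex to the set of its neighbors
    (undirected graph, no self-loops). Hypothetical helper,
    not taken from the E-CLoG paper.
    Returns (triangles, open_paths).
    """
    if v not in adj[u]:
        raise ValueError("(u, v) must be an edge of the graph")
    # A common neighbor w closes the triangle u-v-w.
    triangles = len(adj[u] & adj[v])
    # Every other neighbor of u or of v extends the edge into an open 2-path.
    open_paths = (len(adj[u]) - 1 - triangles) + (len(adj[v]) - 1 - triangles)
    return triangles, open_paths


# Toy graph: triangle 0-1-2 plus a pendant vertex 3 attached to vertex 1.
toy = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
print(edge_local_3_graphlets(toy, 0, 1))  # (1, 1)
print(edge_local_3_graphlets(toy, 1, 3))  # (0, 2)
```

The resulting pair of counts could serve as a small feature vector for the edge, in the spirit of the link prediction features described in the abstract.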
Multi-Scale Link Prediction
The automated analysis of social networks has become an important problem due
to the proliferation of social networks, such as LiveJournal, Flickr and
Facebook. The scale of these social networks is massive and continues to grow
rapidly. An important problem in social network analysis is proximity
estimation that infers the closeness of different users. Link prediction, in
turn, is an important application of proximity estimation. However, many
methods for computing proximity measures have high computational complexity and
are thus prohibitive for large-scale link prediction problems. One way to
address this problem is to estimate proximity measures via low-rank
approximation. However, a single low-rank approximation may not be sufficient
to represent the behavior of the entire network. In this paper, we propose
Multi-Scale Link Prediction (MSLP), a framework for link prediction, which can
handle massive networks. The basic idea of MSLP is to construct low-rank
approximations of the network at multiple scales in an efficient manner. Based
on this approach, MSLP combines predictions at multiple scales to make robust
and accurate predictions. Experimental results on real-life datasets with more
than a million nodes show the superior performance and scalability of our
method. Comment: 20 pages, 10 figures
Gravity-Inspired Graph Autoencoders for Directed Link Prediction
Graph autoencoders (AE) and variational autoencoders (VAE) recently emerged
as powerful node embedding methods. In particular, graph AE and VAE were
successfully leveraged to tackle the challenging link prediction problem,
aiming at figuring out whether some pairs of nodes from a graph are connected
by unobserved edges. However, these models focus on undirected graphs and
therefore ignore the potential direction of the link, which is limiting for
numerous real-life applications. In this paper, we extend the graph AE and VAE
frameworks to address link prediction in directed graphs. We present a new
gravity-inspired decoder scheme that can effectively reconstruct directed
graphs from a node embedding. We empirically evaluate our method on three
different directed link prediction tasks, for which standard graph AE and VAE
perform poorly. We achieve competitive results on three real-world graphs,
outperforming several popular baselines. Comment: ACM International Conference on Information and Knowledge Management
(CIKM 2019)
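The abstract does not spell out the decoder, so the following is only a sketch of one way a gravity-style score can be asymmetric. It assumes each node has an embedding vector and a scalar "mass" term, and that the probability of a directed edge i → j grows with the mass of the target j and shrinks with the log of the squared distance between the embeddings. The function name `gravity_score`, the parameter `lam`, and the `eps` guard are hypothetical choices for illustration.

```python
import math


def gravity_score(z, mass, i, j, lam=1.0, eps=1e-12):
    """Sketch of a gravity-style probability for the directed edge i -> j.

    z    : dict mapping node -> embedding (sequence of floats)
    mass : dict mapping node -> scalar mass term (assumed to be learned)
    lam  : weight of the distance term (assumed hyperparameter)
    eps  : small constant that avoids log(0) for identical embeddings
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(z[i], z[j]))
    # Only the target's mass enters the score, which makes it asymmetric.
    logit = mass[j] - lam * math.log(sq_dist + eps)
    return 1.0 / (1.0 + math.exp(-logit))


# Two nodes at distance 1; node 1 is "heavier" than node 0.
z = {0: (0.0, 0.0), 1: (1.0, 0.0)}
mass = {0: 0.0, 1: 2.0}
print(gravity_score(z, mass, 0, 1))  # score for 0 -> 1
print(gravity_score(z, mass, 1, 0))  # score for 1 -> 0
```

Because the distance term is symmetric, the direction of the predicted link in this sketch comes entirely from the difference between the two mass terms.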
A Degeneracy Framework for Scalable Graph Autoencoders
In this paper, we present a general framework to scale graph autoencoders
(AE) and graph variational autoencoders (VAE). This framework leverages graph
degeneracy concepts to train models only from a dense subset of nodes instead
of using the entire graph. Together with a simple yet effective propagation
mechanism, our approach significantly improves scalability and training speed
while preserving performance. We evaluate and discuss our method on several
variants of existing graph AE and VAE, providing the first application of these
models to large graphs with up to millions of nodes and edges. We achieve
empirically competitive results w.r.t. several popular scalable node embedding
methods, which emphasizes the relevance of pursuing further research towards
more scalable graph AE and VAE. Comment: International Joint Conference on Artificial Intelligence (IJCAI
2019)
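As an illustration of the graph degeneracy concepts mentioned above, the sketch below computes core numbers by repeatedly peeling the lowest-degree vertex, then extracts a k-core as the dense subset of nodes. It assumes the k-core decomposition is the relevant notion and shows only the selection of the dense subset, not the training or propagation steps; the names `core_numbers` and `k_core` are hypothetical.

```python
def core_numbers(adj):
    """Return the core number of every vertex via min-degree peeling.

    adj maps each vertex to the set of its neighbors (undirected graph).
    This simple version runs in O(n^2) time; it is a sketch, not an
    optimized implementation.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel the vertex with the smallest remaining degree.
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core


def k_core(adj, k):
    """Vertices whose core number is at least k (the dense subset)."""
    return {v for v, c in core_numbers(adj).items() if c >= k}


# Toy graph: triangle 0-1-2 plus a pendant vertex 3 attached to vertex 1.
toy = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
print(core_numbers(toy))
print(k_core(toy, 2))  # {0, 1, 2}
```

A model would then be trained on the subgraph induced by `k_core(adj, k)`, with larger `k` giving a smaller and denser training set.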
Representation Learning for Attributed Multiplex Heterogeneous Network
Network embedding (or graph embedding) has been widely used in many
real-world applications. However, existing methods mainly focus on networks
with single-typed nodes/edges and cannot scale well to handle large networks.
Many real-world networks consist of billions of nodes and edges of multiple
types, and each node is associated with different attributes. In this paper, we
formalize the problem of embedding learning for the Attributed Multiplex
Heterogeneous Network and propose a unified framework to address this problem.
The framework supports both transductive and inductive learning. We also give
a theoretical analysis of the proposed framework, showing its connection with
previous work and proving its greater expressiveness. We conduct systematic
evaluations of the proposed framework on four different genres of challenging
datasets: Amazon, YouTube, Twitter, and Alibaba. Experimental results
demonstrate that with the learned embeddings from the proposed framework, we
can achieve statistically significant improvements (e.g., 5.99-28.23% lift in
F1 score; p << 0.01, t-test) over previous state-of-the-art methods for link
prediction. The framework has also been successfully deployed on the
recommendation system of a worldwide leading e-commerce company, Alibaba Group.
Results of the offline A/B tests on product recommendation further confirm the
effectiveness and efficiency of the framework in practice. Comment: Accepted to KDD 2019. Website: https://sites.google.com/view/gatn