Understanding Coarsening for Embedding Large-Scale Graphs
A significant portion of the data today, e.g., social networks, web
connections, etc., can be modeled by graphs. A proper analysis of graphs with
Machine Learning (ML) algorithms has the potential to yield far-reaching
insights into many areas of research and industry. However, the irregular
structure of graph data constitutes an obstacle for running ML tasks on graphs
such as link prediction, node classification, and anomaly detection. Graph
embedding is a compute-intensive process of representing graphs as a set of
vectors in a d-dimensional space, which in turn makes it amenable to ML tasks.
Many approaches have been proposed in the literature to improve the performance
of graph embedding, e.g., using distributed algorithms, accelerators, and
pre-processing techniques. Graph coarsening, one such pre-processing step,
structurally approximates a given large graph with a smaller one. As the
literature suggests, the cost of embedding significantly
decreases when coarsening is employed. In this work, we thoroughly analyze the
impact of the coarsening quality on the embedding performance both in terms of
speed and accuracy. Our experiments with a state-of-the-art, fast graph
embedding tool show that there is an interplay between the coarsening decisions
taken and the embedding quality.
Comment: 10 pages, 6 figures, submitted to 2020 IEEE International Conference on Big Data
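The coarsening step the abstract refers to can be illustrated with a minimal sketch. The following is not the paper's implementation; it shows one common coarsening scheme (greedy edge matching), with all names and the random-matching policy chosen for illustration:

```python
import random

def coarsen(edges, num_nodes, seed=0):
    """One level of graph coarsening via greedy random edge matching.

    Each matched pair of endpoints is merged into a super-node;
    unmatched nodes are carried over unchanged. Returns the coarse
    edge list, the coarse node count, and a mapping from each
    original node to its super-node.
    """
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)

    mapping = {}   # original node -> coarse node id
    next_id = 0
    for u, v in shuffled:  # greedily merge both endpoints of unmatched edges
        if u not in mapping and v not in mapping and u != v:
            mapping[u] = mapping[v] = next_id
            next_id += 1
    for node in range(num_nodes):  # carry over unmatched nodes
        if node not in mapping:
            mapping[node] = next_id
            next_id += 1

    # Re-wire edges onto super-nodes, dropping self-loops and duplicates.
    coarse_edges = {
        (min(mapping[u], mapping[v]), max(mapping[u], mapping[v]))
        for u, v in edges
        if mapping[u] != mapping[v]
    }
    return sorted(coarse_edges), next_id, mapping
```

Applying this repeatedly yields a hierarchy of progressively smaller graphs; the embedding is then computed on a coarse level at a fraction of the original cost, which is the cost reduction the abstract analyzes.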
SMGRL: Scalable Multi-resolution Graph Representation Learning
Graph convolutional networks (GCNs) allow us to learn topologically-aware
node embeddings, which can be useful for classification or link prediction.
However, they cannot capture long-range dependencies between nodes
without stacking additional layers, which in turn leads to over-smoothing and
increased time and space complexity. Further, the complex dependencies between
nodes make mini-batching challenging, limiting their applicability to large
graphs. We propose a Scalable Multi-resolution Graph Representation Learning
(SMGRL) framework that enables us to learn multi-resolution node embeddings
efficiently. Our framework is model-agnostic and can be applied to any existing
GCN model. We dramatically reduce training costs by training only on a
reduced-dimension coarsening of the original graph, then exploit
self-similarity to apply the resulting algorithm at multiple resolutions. The
resulting multi-resolution embeddings can be aggregated to yield high-quality
node embeddings that capture both long- and short-range dependencies. Our
experiments show that this leads to improved classification accuracy, without
incurring high computational costs.
Comment: 22 pages
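The coarsen-then-lift pattern described above can be sketched as follows. This is not SMGRL's code; it is a minimal illustration, with hypothetical function names, of how embeddings trained on a coarse graph can be projected back to the original nodes and combined across resolutions:

```python
def lift_embeddings(coarse_emb, mapping):
    """Project embeddings learned on the coarse graph back onto the
    original nodes: each fine node inherits its super-node's vector.
    In practice a local refinement pass (e.g. a few GCN layers on
    the original graph) would typically follow this step.
    """
    return {node: coarse_emb[c] for node, c in mapping.items()}

def aggregate(levels):
    """Concatenate per-level embeddings into one multi-resolution
    vector per node (concatenation is one simple aggregation choice;
    averaging or learned weighting are alternatives).
    """
    nodes = levels[0].keys()
    return {n: [x for emb in levels for x in emb[n]] for n in nodes}
```

The aggregated vectors combine coarse-level (long-range) and fine-level (short-range) structure, which is the property the abstract credits for the improved classification accuracy.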
Recent Advances in Graph Partitioning
We survey recent trends in practical algorithms for balanced graph
partitioning together with applications and future research directions