GoFFish: A Sub-Graph Centric Framework for Large-Scale Graph Analytics
Large scale graph processing is a major research area for Big Data
exploration. Vertex centric programming models like Pregel are gaining traction
due to their simple abstraction that allows for scalable execution on
distributed systems naturally. However, there are limitations to this approach
which cause vertex centric algorithms to under-perform due to poor compute to
communication overhead ratio and slow convergence of iterative supersteps. In
this paper we introduce GoFFish, a scalable sub-graph centric framework
co-designed with a distributed persistent graph storage for large scale graph
analytics on commodity clusters. We introduce a sub-graph centric programming
abstraction that combines the scalability of a vertex centric approach with the
flexibility of shared memory sub-graph computation. We map Connected
Components, SSSP and PageRank algorithms to this model to illustrate its
flexibility. Further, we empirically analyze GoFFish using several real world
graphs and demonstrate its significant performance improvement, orders of
magnitude in some cases, compared to Apache Giraph, the leading open source
vertex centric implementation.
Comment: Under review by a conference, 201
A Scalable Null Model for Directed Graphs Matching All Degree Distributions: In, Out, and Reciprocal
Degree distributions are arguably the most important property of real world
networks. The classic edge configuration model or Chung-Lu model can generate
an undirected graph with any desired degree distribution. This serves as a good
null model to compare algorithms or perform experimental studies. Furthermore,
there are scalable algorithms that implement these models and they are
invaluable in the study of graphs. However, networks in the real-world are
often directed, and have a significant proportion of reciprocal edges. A
stronger relation exists between two nodes when they each point to one another
(reciprocal edge) as compared to when only one points to the other (one-way
edge). Despite their importance, reciprocal edges have been disregarded by most
directed graph models.
We propose a null model for directed graphs inspired by the Chung-Lu model
that matches the in-, out-, and reciprocal-degree distributions of the real
graphs. Our algorithm is scalable and requires $O(m)$ random numbers to
generate a graph with $m$ edges. We perform a series of experiments on real
datasets and compare with existing graph models.
Comment: Camera ready version for IEEE Workshop on Network Science; fixed some
typos in tabl
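For orientation, the undirected edge-sampling idea behind the Chung-Lu model mentioned in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the directed, reciprocal-aware algorithm the paper proposes: the function name `fast_chung_lu` and its parameters are hypothetical, and the sketch draws each endpoint with probability proportional to its desired degree, so degrees are matched only in expectation and self-loops or repeated edges may occur.

```python
import random
from itertools import accumulate


def fast_chung_lu(degrees, seed=None):
    """Sample an undirected edge list whose expected degrees follow `degrees`.

    Hypothetical helper for illustration: it draws m = sum(degrees) // 2
    edges and picks each endpoint with probability proportional to the
    node's desired degree. Self-loops and multi-edges are not filtered.
    """
    rng = random.Random(seed)
    m = sum(degrees) // 2                  # number of edges to draw
    nodes = range(len(degrees))
    cum = list(accumulate(degrees))        # cumulative weights for sampling
    edges = []
    for _ in range(m):
        # Two weighted draws per edge, i.e. 2m random draws in total.
        u, v = rng.choices(nodes, cum_weights=cum, k=2)
        edges.append((u, v))
    return edges


if __name__ == "__main__":
    print(fast_chung_lu([3, 2, 2, 1], seed=1))
```

Because each edge costs a constant number of random draws, the total work grows linearly with the number of edges, which is the kind of scaling the abstract refers to.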
Void Traversal for Guaranteed Delivery in Geometric Routing
Geometric routing algorithms like GFG (GPSR) are lightweight, scalable
algorithms that can be used to route in resource-constrained ad hoc wireless
networks. However, such algorithms run on planar graphs only. To efficiently
construct a planar graph, they require a unit-disk graph. To make the topology
unit-disk, the maximum link length in the network has to be selected
conservatively. In practical settings, this leads to designs where the node
density is rather high. Moreover, the network diameter of a planar subgraph is
greater than that of the original graph, which leads to longer routes. To remedy this
problem, we propose a void traversal algorithm that works on arbitrary
geometric graphs. We describe how to use this algorithm for geometric routing
with guaranteed delivery and compare its performance with GFG.
Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable
There has been significant recent interest in parallel graph processing due
to the need to quickly analyze the large graphs available today. Many graph
codes have been designed for distributed memory or external memory. However,
today even the largest publicly-available real-world graph (the Hyperlink Web
graph with over 3.5 billion vertices and 128 billion edges) can fit in the
memory of a single commodity multicore server. Nevertheless, most experimental
work in the literature reports results on much smaller graphs, and the ones for
the Hyperlink graph use distributed or external memory. Therefore, it is
natural to ask whether we can efficiently solve a broad class of graph problems
on this graph in memory.
This paper shows that theoretically-efficient parallel graph algorithms can
scale to the largest publicly-available graphs using a single machine with a
terabyte of RAM, processing them in minutes. We give implementations of
theoretically-efficient parallel algorithms for 20 important graph problems. We
also present the optimizations and techniques that we used in our
implementations, which were crucial in enabling us to process these large
graphs quickly. We show that the running times of our implementations
outperform existing state-of-the-art implementations on the largest real-world
graphs. For many of the problems that we consider, this is the first time they
have been solved on graphs at this scale. We have made the implementations
developed in this work publicly available as the Graph-Based Benchmark Suite
(GBBS).
Comment: This is the full version of the paper appearing in the ACM Symposium
on Parallelism in Algorithms and Architectures (SPAA), 201
Instantly Decodable Network Coding for Real-Time Scalable Video Broadcast over Wireless Networks
In this paper, we study a real-time scalable video broadcast over wireless
networks in instantly decodable network coded (IDNC) systems. Such real-time
scalable video has a hard deadline and imposes a decoding order on the video
layers. We first derive the upper bound on the probability that the individual
completion times of all receivers meet the deadline. Using this probability, we
design two prioritized IDNC algorithms, namely the expanding window IDNC
(EW-IDNC) algorithm and the non-overlapping window IDNC (NOW-IDNC) algorithm.
These algorithms provide a high level of protection to the most important video
layer before considering additional video layers in coding decisions. Moreover,
in these algorithms, we select an appropriate packet combination over a given
number of video layers so that these video layers are decoded by the maximum
number of receivers before the deadline. We formulate this packet selection
problem as a two-stage maximal clique selection problem over an IDNC graph.
Simulation results over a real scalable video stream show that our proposed
EW-IDNC and NOW-IDNC algorithms improve the received video quality compared to
the existing IDNC algorithms.
Compact routing on the Internet AS-graph
Compact routing algorithms have been presented as candidates for scalable routing in the future Internet, achieving near-shortest path routing with considerably less forwarding state than the Border Gateway Protocol. Prior analyses have shown strong performance on power-law random graphs, but to better understand the applicability of compact routing algorithms in the context of the Internet, they must be evaluated against real-world data. To this end, we present the first systematic analysis of the behaviour of the Thorup-Zwick (TZ) and Brady-Cowen (BC) compact routing algorithms on snapshots of the Internet Autonomous System graph spanning a 14-year period. Both algorithms are shown to offer consistently strong performance on the AS graph, producing small forwarding tables with low stretch for all snapshots tested. We find that the average stretch for the TZ algorithm increases slightly as the AS graph has grown, while previous results on synthetic data suggested the opposite would be true. We also present new results to show which features of the algorithms contribute to their strong performance on these graphs.