96,153 research outputs found

    GoFFish: A Sub-Graph Centric Framework for Large-Scale Graph Analytics

    Full text link
    Large scale graph processing is a major research area for Big Data exploration. Vertex centric programming models like Pregel are gaining traction due to their simple abstraction that allows for scalable execution on distributed systems naturally. However, there are limitations to this approach which cause vertex centric algorithms to under-perform due to poor compute to communication overhead ratio and slow convergence of iterative superstep. In this paper we introduce GoFFish a scalable sub-graph centric framework co-designed with a distributed persistent graph storage for large scale graph analytics on commodity clusters. We introduce a sub-graph centric programming abstraction that combines the scalability of a vertex centric approach with the flexibility of shared memory sub-graph computation. We map Connected Components, SSSP and PageRank algorithms to this model to illustrate its flexibility. Further, we empirically analyze GoFFish using several real world graphs and demonstrate its significant performance improvement, orders of magnitude in some cases, compared to Apache Giraph, the leading open source vertex centric implementation.Comment: Under review by a conference, 201

    A Scalable Null Model for Directed Graphs Matching All Degree Distributions: In, Out, and Reciprocal

    Full text link
    Degree distributions are arguably the most important property of real world networks. The classic edge configuration model or Chung-Lu model can generate an undirected graph with any desired degree distribution. This serves as a good null model to compare algorithms or perform experimental studies. Furthermore, there are scalable algorithms that implement these models and they are invaluable in the study of graphs. However, networks in the real-world are often directed, and have a significant proportion of reciprocal edges. A stronger relation exists between two nodes when they each point to one another (reciprocal edge) as compared to when only one points to the other (one-way edge). Despite their importance, reciprocal edges have been disregarded by most directed graph models. We propose a null model for directed graphs inspired by the Chung-Lu model that matches the in-, out-, and reciprocal-degree distributions of the real graphs. Our algorithm is scalable and requires O(m)O(m) random numbers to generate a graph with mm edges. We perform a series of experiments on real datasets and compare with existing graph models.Comment: Camera ready version for IEEE Workshop on Network Science; fixed some typos in tabl

    Void Traversal for Guaranteed Delivery in Geometric Routing

    Full text link
    Geometric routing algorithms like GFG (GPSR) are lightweight, scalable algorithms that can be used to route in resource-constrained ad hoc wireless networks. However, such algorithms run on planar graphs only. To efficiently construct a planar graph, they require a unit-disk graph. To make the topology unit-disk, the maximum link length in the network has to be selected conservatively. In practical setting this leads to the designs where the node density is rather high. Moreover, the network diameter of a planar subgraph is greater than the original graph, which leads to longer routes. To remedy this problem, we propose a void traversal algorithm that works on arbitrary geometric graphs. We describe how to use this algorithm for geometric routing with guaranteed delivery and compare its performance with GFG

    Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

    Full text link
    There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest publicly-available real-world graph (the Hyperlink Web graph with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server. Nevertheless, most experimental work in the literature report results on much smaller graphs, and the ones for the Hyperlink graph use distributed or external memory. Therefore, it is natural to ask whether we can efficiently solve a broad class of graph problems on this graph in memory. This paper shows that theoretically-efficient parallel graph algorithms can scale to the largest publicly-available graphs using a single machine with a terabyte of RAM, processing them in minutes. We give implementations of theoretically-efficient parallel algorithms for 20 important graph problems. We also present the optimizations and techniques that we used in our implementations, which were crucial in enabling us to process these large graphs quickly. We show that the running times of our implementations outperform existing state-of-the-art implementations on the largest real-world graphs. For many of the problems that we consider, this is the first time they have been solved on graphs at this scale. We have made the implementations developed in this work publicly-available as the Graph-Based Benchmark Suite (GBBS).Comment: This is the full version of the paper appearing in the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 201

    Instantly Decodable Network Coding for Real-Time Scalable Video Broadcast over Wireless Networks

    Full text link
    In this paper, we study a real-time scalable video broadcast over wireless networks in instantly decodable network coded (IDNC) systems. Such real-time scalable video has a hard deadline and imposes a decoding order on the video layers.We first derive the upper bound on the probability that the individual completion times of all receivers meet the deadline. Using this probability, we design two prioritized IDNC algorithms, namely the expanding window IDNC (EW-IDNC) algorithm and the non-overlapping window IDNC (NOW-IDNC) algorithm. These algorithms provide a high level of protection to the most important video layer before considering additional video layers in coding decisions. Moreover, in these algorithms, we select an appropriate packet combination over a given number of video layers so that these video layers are decoded by the maximum number of receivers before the deadline. We formulate this packet selection problem as a two-stage maximal clique selection problem over an IDNC graph. Simulation results over a real scalable video stream show that our proposed EW-IDNC and NOW-IDNC algorithms improve the received video quality compared to the existing IDNC algorithms

    Compact routing on the Internet AS-graph

    Get PDF
    Compact routing algorithms have been presented as candidates for scalable routing in the future Internet, achieving near-shortest path routing with considerably less forwarding state than the Border Gateway Protocol. Prior analyses have shown strong performance on power-law random graphs, but to better understand the applicability of compact routing algorithms in the context of the Internet, they must be evaluated against real- world data. To this end, we present the first systematic analysis of the behaviour of the Thorup-Zwick (TZ) and Brady-Cowen (BC) compact routing algorithms on snapshots of the Internet Autonomous System graph spanning a 14 year period. Both algorithms are shown to offer consistently strong performance on the AS graph, producing small forwarding tables with low stretch for all snapshots tested. We find that the average stretch for the TZ algorithm increases slightly as the AS graph has grown, while previous results on synthetic data suggested the opposite would be true. We also present new results to show which features of the algorithms contribute to their strong performance on these graphs
    corecore