A Comparison of Parallel Graph Processing Implementations
The rapidly growing number of large network analysis problems has led to the
emergence of many parallel and distributed graph processing systems---one
survey in 2014 identified over 80. Since then, the landscape has evolved; some
packages have become inactive while more are being developed. Determining the
best approach for a given problem is infeasible for most developers. To enable
easy, rigorous, and repeatable comparison of the capabilities of such systems,
we present an approach and associated software for analyzing the performance
and scalability of parallel, open-source graph libraries. We demonstrate our
approach on five graph processing packages: GraphMat, the Graph500, the Graph
Algorithm Platform Benchmark Suite, GraphBIG, and PowerGraph, using synthetic
and real-world datasets. We examine previously overlooked aspects of parallel
graph processing performance such as phases of execution and energy usage for
three algorithms: breadth first search, single source shortest paths, and
PageRank, and compare our results to Graphalytics.

Comment: 10 pages, 10 figures. Submitted to EuroPar 2017 and rejected; revised and submitted to IEEE Cluster 201
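Of the three algorithms studied, PageRank is the one most often computed by simple power iteration. A minimal dense NumPy sketch of that computation (not tied to any of the five packages benchmarked above, and assuming the convention that `adj[i, j] = 1` means an edge from i to j) is:

```python
import numpy as np

def pagerank(adj, damping=0.85, tol=1e-10, max_iter=100):
    """Power-iteration PageRank on a dense adjacency matrix.

    adj[i, j] = 1 means an edge i -> j; dangling nodes (no out-edges)
    are treated as linking uniformly to every node.
    """
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    # Row-normalize, substituting a uniform row for dangling nodes,
    # then transpose to get a column-stochastic transition matrix.
    M = np.where(out_deg[:, None] > 0,
                 adj / np.maximum(out_deg, 1)[:, None],
                 1.0 / n).T
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_next = damping * (M @ r) + (1 - damping) / n
        if np.abs(r_next - r).sum() < tol:
            break
        r = r_next
    return r_next
```

A production system would of course use sparse matrices and distributed vectors; this sketch only shows the iteration that the benchmarked packages implement at scale.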
Distributed Triangle Counting in the Graphulo Matrix Math Library
Triangle counting is a key algorithm for large graph analysis. The Graphulo
library provides a framework for implementing graph algorithms on the Apache
Accumulo distributed database. In this work we adapt two algorithms for
counting triangles, one that uses the adjacency matrix and another that also
uses the incidence matrix, to the Graphulo library for server-side processing
inside Accumulo. Cloud-based experiments show a similar performance profile for
these different approaches on the family of power-law Graph500 graphs, for
which data skew increasingly becomes the bottleneck. These results motivate the
design of skew-aware hybrid algorithms that we propose for future work.

Comment: Honorable mention in the 2017 IEEE HPEC Graph Challenge
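The adjacency-matrix approach that the paper adapts rests on a classical identity: each triangle contributes six closed walks of length three, so the triangle count equals trace(A³)/6. A minimal dense NumPy sketch of that identity (Graphulo's actual implementation runs server-side over Accumulo tables, not on dense arrays) is:

```python
import numpy as np

def count_triangles(adj):
    """Count triangles in an undirected simple graph from its adjacency matrix.

    Each triangle yields six closed walks of length 3, so the
    count is trace(A^3) / 6.
    """
    a = np.asarray(adj, dtype=np.int64)
    assert (a == a.T).all() and (np.diag(a) == 0).all(), "need undirected, loop-free"
    return int(np.trace(a @ a @ a)) // 6
```

On skewed power-law graphs the cost of forming A² is dominated by the highest-degree rows, which is exactly the bottleneck the abstract describes.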
The LDBC Graphalytics Benchmark
In this document, we describe LDBC Graphalytics, an industrial-grade
benchmark for graph analysis platforms. The main goal of Graphalytics is to
enable the fair and objective comparison of graph analysis platforms. Due to
the diversity of bottlenecks and performance issues such platforms need to
address, Graphalytics consists of a set of selected deterministic algorithms
for full-graph analysis, standard graph datasets, synthetic dataset generators,
and reference output for validation purposes. Its test harness produces deep
metrics that quantify multiple kinds of system scalability, weak and strong,
and of robustness, such as resilience to failures and performance variability. The benchmark
also balances comprehensiveness with runtime necessary to obtain the deep
metrics. The benchmark comes with open-source software for generating
performance data, for validating algorithm results, for monitoring and sharing
performance data, and for obtaining the final benchmark result as a standard
performance report.
Performance Introspection of Graph Databases
The explosion of graph data in social and biological networks, recommendation systems, provenance databases, etc. makes graph storage and processing of paramount importance. We present a performance introspection framework for graph databases, PIG, which provides both a toolset and methodology for understanding graph database performance. PIG consists of a hierarchical collection of benchmarks that compose to produce performance models; the models provide a way to illuminate the strengths and weaknesses of a particular implementation. The suite has three layers of benchmarks: primitive operations, composite access patterns, and graph algorithms. While the framework could be used to compare different graph database systems, its primary goal is to help explain the observed performance of a particular system. Such introspection allows one to evaluate the degree to which systems exploit their knowledge of graph access patterns. We present both the PIG methodology and infrastructure and then demonstrate its efficacy by analyzing the popular Neo4j and DEX graph databases.
Computing methods for parallel processing and analysis on complex networks
Solving many present-day problems requires modeling complex systems in order to
simulate and understand their behavior.
A good example of such a complex system is the Facebook social network, which
represents people and their relationships. Another example is the Internet, composed
of a vast number of servers, computers, modems, and routers. Every scientific field (physics,
economics, politics, and so on) has complex systems, which are complex because of the
large volume of data required to represent them and the speed at which their structure changes.
Analyzing the behavior of these complex systems is important for creating simulations or
discovering their dynamics, with the main goal of understanding how they work.
Some complex systems cannot be easily modeled directly; we can begin by analyzing their
structure. This is possible by creating a network model, mapping the problem's entities and
the relations between them.
Some popular analyses of the structure of a network are:
• Community detection – discovering how the entities are grouped
• Identifying the most important entities – measuring a node's influence over the
network
• Computing features of the whole network – such as the diameter, the number of
triangles, the clustering coefficient, and the shortest path between two entities.
Multiple algorithms have been created to perform these analyses over the network
model; however, when executed on a single machine they take a long time to complete,
or cannot be executed at all due to the machine's limited resources.
As ever more demanding applications have appeared to run these kinds of analyses,
several parallel programming models and different kinds of hardware architecture
have been created to deal with large data inputs, reduce execution time, save power,
and improve computational efficiency on each machine, while also taking the
application's requirements into account.
Parallelizing these algorithms is a challenge because:
• We need to analyze data dependences to implement a parallel version of the
algorithm, always keeping in mind the scalability and performance of the code.
• We must implement the algorithm for a parallel programming model such as
MapReduce (Apache Hadoop), RDDs (Apache Spark), or Pregel (Apache Giraph), which
are oriented to big data, or for HPC models such as MPI + OpenMP, OmpSs, or CUDA.
• We must distribute the input data across the processing platform's nodes, or offload it
onto accelerators such as GPUs or FPGAs.
• Storing the input data and the results of the processing requires distributed
file systems (HDFS), distributed NoSQL databases (object databases,
graph databases, document databases), or traditional relational
databases (Oracle, SQL Server).
In this master's thesis, we perform graph processing using Apache big data tools,
mainly running tests on MareNostrum III and the Amazon cloud for several community
detection algorithms, using SNAP graphs with ground-truth communities, and
comparing their parallel execution times and scalability.
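As a concrete illustration of the kind of community detection algorithm evaluated in such work, here is a minimal single-machine sketch of label propagation, chosen only as a representative example; the thesis does not necessarily use this exact algorithm, and the tie-breaking rule here is an assumption made for determinism:

```python
import random
from collections import Counter

def label_propagation(edges, max_iter=20, seed=0):
    """Minimal label-propagation community detection.

    edges: iterable of undirected (u, v) pairs. Each node starts in its
    own community; nodes repeatedly adopt the most frequent label among
    their neighbors (smallest label on ties) until no label changes.
    """
    rng = random.Random(seed)
    neigh = {}
    for u, v in edges:
        neigh.setdefault(u, set()).add(v)
        neigh.setdefault(v, set()).add(u)
    label = {v: v for v in neigh}
    nodes = list(neigh)
    for _ in range(max_iter):
        rng.shuffle(nodes)          # random visit order each sweep
        changed = False
        for v in nodes:
            counts = Counter(label[u] for u in neigh[v])
            best = max(counts.values())
            choices = sorted(l for l, c in counts.items() if c == best)
            new = label[v] if label[v] in choices else choices[0]
            if new != label[v]:
                label[v] = new
                changed = True
        if not changed:
            break
    return label
```

A distributed version of this pattern maps naturally onto Pregel-style vertex programs, which is one reason frameworks like Apache Giraph are a common target for it.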
LDBC Graphalytics: A Benchmark for Large-Scale Graph Analysis on Parallel and Distributed Platforms
In this paper we introduce LDBC Graphalytics, a new industrial-grade benchmark for graph analysis platforms. It consists of six deterministic algorithms, standard datasets, synthetic dataset generators, and reference output, that enable the objective comparison of graph analysis platforms. Its test harness produces deep metrics that quantify multiple kinds of system scalability, such as horizontal/vertical and weak/strong, and of robustness, such as failures and performance variability. The benchmark comes with open-source software for generating data and monitoring performance. We describe and analyze six implementations of the benchmark (three from the community, three from the industry), providing insights into the strengths and weaknesses of the platforms. Key to our contribution, vendors perform the tuning and benchmarking of their platforms
Graphulo Implementation of Server-Side Sparse Matrix Multiply in the Accumulo Database
The Apache Accumulo database excels at distributed storage and indexing and
is ideally suited for storing graph data. Many big data analytics compute on
graph data and persist their results back to the database. These graph
calculations are often best performed inside the database server. The GraphBLAS
standard provides a compact and efficient basis for a wide range of graph
applications through a small number of sparse matrix operations. In this
article, we implement GraphBLAS sparse matrix multiplication server-side by
leveraging Accumulo's native, high-performance iterators. We compare the
mathematics and performance of inner and outer product implementations, and
show how an outer product implementation achieves optimal performance near
Accumulo's peak write rate. We offer our work as a core component to the
Graphulo library that will deliver matrix math primitives for graph analytics
within Accumulo.

Comment: To be presented at IEEE HPEC 2015: http://www.ieee-hpec.org
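The outer-product formulation the article favors accumulates C = A·B as the sum over k of the outer product of A's k-th column with B's k-th row. A minimal Python sketch over coordinate-format dictionaries illustrates the formulation (Graphulo itself is implemented in Java inside Accumulo's iterator framework; this sketch only shows the math):

```python
from collections import defaultdict

def outer_product_spgemm(a, b):
    """Sparse C = A*B via the outer-product formulation.

    a, b: dicts mapping (row, col) -> value. For each shared index k,
    emit one partial product per nonzero pair in A[:, k] and B[k, :].
    """
    cols_a = defaultdict(list)            # k -> [(i, A[i, k])]
    for (i, k), v in a.items():
        cols_a[k].append((i, v))
    rows_b = defaultdict(list)            # k -> [(j, B[k, j])]
    for (k, j), v in b.items():
        rows_b[k].append((j, v))
    c = defaultdict(float)
    for k in cols_a.keys() & rows_b.keys():
        for i, av in cols_a[k]:
            for j, bv in rows_b[k]:
                c[(i, j)] += av * bv      # accumulate partial products
    return dict(c)
```

The appeal of this formulation for a write-optimized store like Accumulo is that partial products can be streamed out as they are produced and summed lazily on ingest, which is how the implementation can approach the database's peak write rate.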
Adaptation, deployment and evaluation of a railway simulator in cloud environments
Many scientific areas make extensive use of computer simulations to study real-world
processes. As these simulations become more complex and resource-intensive, traditional
programming paradigms running on supercomputers have proven to be limited by
their hardware resources.
The Cloud and its elastic nature has been increasingly seen as a valid alternative
for simulation execution, as it aims to provide virtually infinite resources, thus
unlimited scalability. In order to benefit from this, simulators must be adapted to
this paradigm since cloud migration tends to add virtualization and communication
overhead.
This work has the main objective of migrating a power-consumption railway
simulator to the Cloud, with minimal impact on the original code and preserving
performance. We propose a data-centric adaptation based on MapReduce that distributes
the simulation load across several nodes while minimising data transmission.
We deployed our solution on an Amazon EC2 virtual cluster and measured its
performance. We did the same on our local cluster to compare the solution's performance
against the original application when the Cloud's overhead is not present.
Our tests show that the resulting application is highly scalable and delivers better
overall performance than the original simulator in both environments.
This document summarises the author's work during the whole adaptation development
process.
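The data-centric MapReduce adaptation described above can be illustrated with a toy sketch, in which the per-segment computation and the record layout are hypothetical stand-ins for the real simulator:

```python
from collections import defaultdict

def simulate_segment(segment):
    """Hypothetical stand-in for the simulator's per-segment computation."""
    train, distance_km, kwh_per_km = segment
    return train, distance_km * kwh_per_km

def mapreduce_consumption(segments):
    """Data-centric MapReduce sketch: map each track segment to a
    (train, energy) pair, then reduce by summing energy per train id.
    In the real deployment the map tasks run on separate cluster nodes.
    """
    totals = defaultdict(float)
    for train, energy in map(simulate_segment, segments):  # map phase
        totals[train] += energy                            # reduce phase
    return dict(totals)
```

Partitioning by track segment keeps each map task's input local to its node, which is the property that minimises the data transmission the abstract mentions.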