6,337 research outputs found
GraphX: Unifying Data-Parallel and Graph-Parallel Analytics
From social networks to language modeling, the growing scale and importance
of graph data has driven the development of numerous new graph-parallel systems
(e.g., Pregel, GraphLab). By restricting the computation that can be expressed
and introducing new techniques to partition and distribute the graph, these
systems can efficiently execute iterative graph algorithms orders of magnitude
faster than more general data-parallel systems. However, the same restrictions
that enable the performance gains also make it difficult to express many of the
important stages in a typical graph-analytics pipeline: constructing the graph,
modifying its structure, or expressing computation that spans multiple graphs.
As a consequence, existing graph analytics pipelines compose graph-parallel and
data-parallel systems using external storage systems, leading to extensive data
movement and complicated programming model.
To address these challenges we introduce GraphX, a distributed graph
computation framework that unifies graph-parallel and data-parallel
computation. GraphX provides a small, core set of graph-parallel operators
expressive enough to implement the Pregel and PowerGraph abstractions, yet
simple enough to be cast in relational algebra. GraphX uses a collection of
query optimization techniques such as automatic join rewrites to efficiently
implement these graph-parallel operators. We evaluate GraphX on real-world
graphs and workloads and demonstrate that GraphX achieves comparable
performance as specialized graph computation systems, while outperforming them
in end-to-end graph pipelines. Moreover, GraphX achieves a balance between
expressiveness, performance, and ease of use
Recommended from our members
A data-driven model for parallel interpretation of logic programms [sic]
The main objective of this paper is to present a model of computation which permits logic programs to be executed on a highly-parallel computer architecture. It demonstrates how logic programs may be converted into collections of dataflow graphs in which resolution is viewed as a process of finding matches between certain graph templates and portions of the dataflow graphs. This graph fitting process is carried out by tokens propogating asynchronously through the dataflow graph; thus computation is entirely data-driven, without the need for any centralized control. It is shown that at the implementation level the proposed model is very similar to a general dataflow system and hence a dataflow architecture could easily be extended to support the proposed model
Modeling Resolution of Resources Contention in Synchronous Data Flow Graphs
Synchronous Data Flow graphs are widely adopted in the designing of streaming applications, but were originally formulated to describe only how an application is partitioned and which data are exchanged among different tasks. Since Synchronous Data Flow graphs are often used to describe and evaluate complete design solutions, missing information (e.g., mapping, scheduling, etc.) has to be included in them by means of further actors and channels to obtain accurate evaluations. To address this issue preserving the simplicity of the representation, techniques that model data transfer delays by means of ad-hoc actors have been proposed, but they model independently each communication ignoring contentions. Moreover, they do not usually consider at all delays due to buffer contentions, potentially overestimating the throughput of a design solution. In this paper a technique to extend Synchronous Data Flow graphs by adding ad-hoc actors and channels to model resolution of resources contentions is proposed. The results show that the number of added actors and channels is limited but that they can significantly increase the Synchronous Data Flow graph accuracy
Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code
This paper introduces Tiramisu, a polyhedral framework designed to generate
high performance code for multiple platforms including multicores, GPUs, and
distributed machines. Tiramisu introduces a scheduling language with novel
extensions to explicitly manage the complexities that arise when targeting
these systems. The framework is designed for the areas of image processing,
stencils, linear algebra and deep learning. Tiramisu has two main features: it
relies on a flexible representation based on the polyhedral model and it has a
rich scheduling language allowing fine-grained control of optimizations.
Tiramisu uses a four-level intermediate representation that allows full
separation between the algorithms, loop transformations, data layouts, and
communication. This separation simplifies targeting multiple hardware
architectures with the same algorithm. We evaluate Tiramisu by writing a set of
image processing, deep learning, and linear algebra benchmarks and compare them
with state-of-the-art compilers and hand-tuned libraries. We show that Tiramisu
matches or outperforms existing compilers and libraries on different hardware
architectures, including multicore CPUs, GPUs, and distributed machines.Comment: arXiv admin note: substantial text overlap with arXiv:1803.0041
Recommended from our members
Behavioral synthesis from VHDL using structured modeling
This dissertation describes work in behavioral synthesis involving the development of a VHDL Synthesis System VSS which accepts a VHDL behavioral input specification and performs technology independent synthesis to generate a circuit netlist of generic components. The VHDL language is used for input and output descriptions. An intermediate representation which incorporates signal typing and component attributes simplifies compilation and facilitates design optimization.A Structured Modeling methodology has been developed to suggest standard VHDL modeling practices for synthesis. Structured modeling provides recommendations for the use of available VHDL description styles so that optimal designs will be synthesized.A design composed of generic components is synthesized from the input description through a process of Graph Compilation, Graph Criticism, and Design Compilation. Experiments were performed to demonstrate the effects of different modeling styles on the quality of the design produced by VSS. Several alternative VHDL models were examined for each benchmark, illustrating the improvements in design quality achieved when Structured Modeling guidelines were followed
A transformation-based approach to business process management in the cloud
Business Process Management (BPM) has gained a lot of popularity in the last two decades, since it allows organizations to manage and optimize their business processes. However, purchasing a BPM system can be an expensive investment for a company, since not only the software itself needs to be purchased, but also hardware is required on which the process engine should run, and personnel need to be hired or allocated for setting up and maintaining the hardware and the software. Cloud computing gives its users the opportunity of using computing resources in a pay-per-use manner, and perceiving these resources as unlimited. Therefore, the application of cloud computing technologies to BPM can be extremely beneficial specially for small and middle-size companies. Nevertheless, the fear of losing or exposing sensitive data by placing these data in the cloud is one of the biggest obstacles to the deployment of cloud-based solutions in organizations nowadays. In this paper we introduce a transformation-based approach that allows companies to control the parts of their business processes that should be allocated to their own premises and to the cloud, to avoid unwanted exposure of confidential data and to profit from the high performance of cloud environments. In our approach, the user annotates activities and data that should be placed in the cloud or on-premise, and an automated transformation generates the process fragments for cloud and on-premise deployment. The paper discusses the challenges of developing the transformation and presents a case study that demonstrates the applicability of the approach
Scheduling and Compiling Rate-Synchronous Programs with End-To-End Latency Constraints
We present an extension of the synchronous-reactive model for specifying multi-rate systems. A set of periodically executed components and their communication dependencies are expressed in a Lustre-like programming language with features for load balancing, resource limiting, and specifying end-to-end latencies. The language abstracts from execution time and phase offsets. This permits simple clock typing rules and a stream-based semantics, but requires each component to execute within an overall base period. A program is compiled to a single periodic task in two stages. First, Integer Linear Programming is used to determine phase offsets using standard encodings for dependencies and load balancing, and a novel encoding for end-to-end latency. Second, a code generation scheme is adapted to produce step functions. As a result, components are synchronous relative to their respective rates, but not necessarily simultaneous relative to the base period. This approach has been implemented in a prototype compiler and validated on an industrial application
The role of concurrency in an evolutionary view of programming abstractions
In this paper we examine how concurrency has been embodied in mainstream
programming languages. In particular, we rely on the evolutionary talking
borrowed from biology to discuss major historical landmarks and crucial
concepts that shaped the development of programming languages. We examine the
general development process, occasionally deepening into some language, trying
to uncover evolutionary lineages related to specific programming traits. We
mainly focus on concurrency, discussing the different abstraction levels
involved in present-day concurrent programming and emphasizing the fact that
they correspond to different levels of explanation. We then comment on the role
of theoretical research on the quest for suitable programming abstractions,
recalling the importance of changing the working framework and the way of
looking every so often. This paper is not meant to be a survey of modern
mainstream programming languages: it would be very incomplete in that sense. It
aims instead at pointing out a number of remarks and connect them under an
evolutionary perspective, in order to grasp a unifying, but not simplistic,
view of the programming languages development process
- …