PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development
This paper describes PlinyCompute, a system for development of
high-performance, data-intensive, distributed computing tools and libraries. In
the large, PlinyCompute presents the programmer with a very high-level,
declarative interface, relying on automatic, relational-database style
optimization to figure out how to stage distributed computations. However, in
the small, PlinyCompute presents the capable systems programmer with a
persistent object data model and API (the "PC object model") and associated
memory management system that has been designed from the ground up for
high-performance, distributed, data-intensive computing. This contrasts with most
other Big Data systems, which are constructed on top of the Java Virtual
Machine (JVM), and hence must at least partially cede performance-critical
concerns such as memory management (including layout and de/allocation) and
virtual method/function dispatch to the JVM. This hybrid approach---declarative
in the large, trusting the programmer's ability to utilize the PC object model
efficiently in the small---results in a system that is ideal for the
development of reusable, data-intensive tools and libraries. Through extensive
benchmarking, we show that implementing complex object manipulation and
non-trivial, library-style computations on top of PlinyCompute can result in a
speedup of 2x to more than 50x compared to equivalent implementations
on Spark.
Comment: 48 pages, including references and Appendi
Proceedings of the 3rd Workshop on Domain-Specific Language Design and Implementation (DSLDI 2015)
The goal of the DSLDI workshop is to bring together researchers and
practitioners interested in sharing ideas on how DSLs should be designed,
implemented, supported by tools, and applied in realistic application contexts.
We are interested both in discovering how already known domains such as graph
processing or machine learning can best be supported by DSLs and in
exploring new domains that could be targeted by DSLs. More generally, we are
interested in building a community that can drive forward the development of
modern DSLs. These informal post-proceedings contain the submitted talk
abstracts to the 3rd DSLDI workshop (DSLDI'15), and a summary of the panel
discussion on Language Composition.
Teaching Parallel Programming Using Java
This paper presents an overview of the "Applied Parallel Computing" course
taught to final year Software Engineering undergraduate students in Spring 2014
at NUST, Pakistan. The main objective of the course was to introduce practical
parallel programming tools and techniques for shared and distributed memory
concurrent systems. A unique aspect of the course was that Java was used as the
principal programming language. The course was divided into three sections. The
first section covered parallel programming techniques for shared memory systems
that include multicore and Symmetric Multi-Processor (SMP) systems. In this
section, Java threads were taught as a viable programming API for such systems.
The second section was dedicated to parallel programming tools meant for
distributed memory systems including clusters and networks of computers. We used
MPJ Express, a Java MPI library, for conducting programming assignments and lab
work for this section. The third and final section covered advanced topics
including the MapReduce programming model using Hadoop and General-Purpose
Computing on Graphics Processing Units (GPGPU).
Comment: 8 Pages, 6 figures, MPJ Express, MPI Java, Teaching Parallel
Programmin
Proof-Pattern Recognition and Lemma Discovery in ACL2
We present a novel technique for combining statistical machine learning for
proof-pattern recognition with symbolic methods for lemma discovery. The
resulting tool, ACL2(ml), gathers proof statistics and uses statistical
pattern recognition to pre-process data from libraries, and then suggests
auxiliary lemmas in new proofs by analogy with already seen examples. This
paper presents the implementation of ACL2(ml) alongside theoretical
descriptions of the proof-pattern recognition and lemma discovery methods
involved in it.
Automatically Leveraging MapReduce Frameworks for Data-Intensive Applications
MapReduce is a popular programming paradigm for developing large-scale,
data-intensive computation. Many frameworks that implement this paradigm have
recently been developed. To leverage these frameworks, however, developers must
become familiar with their APIs and rewrite existing code. Casper is a new tool
that automatically translates sequential Java programs into the MapReduce
paradigm. Casper identifies potential code fragments to rewrite and translates
them in two steps: (1) Casper uses program synthesis to search for a program
summary (i.e., a functional specification) of each code fragment. The summary
is expressed using a high-level intermediate language resembling the MapReduce
paradigm and verified to be semantically equivalent to the original using a
theorem prover. (2) Casper generates executable code from the summary, using
either the Hadoop, Spark, or Flink API. We evaluated Casper by automatically
converting real-world, sequential Java benchmarks to MapReduce. The resulting
benchmarks perform up to 48.2x faster compared to the original.
Comment: 12 pages, additional 4 pages of references and appendi
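As a rough illustration of the kind of rewrite described above, the sketch
below places a sequential word-count loop next to an equivalent formulation
split into map, shuffle, and reduce phases. It is plain Python rather than
Java, it is not Casper's output or its intermediate language, and the function
names word_count_sequential and word_count_mapreduce are hypothetical, chosen
only for this example.

```python
from collections import defaultdict
from functools import reduce


def word_count_sequential(lines):
    """Sequential loop: the style of code a translator would take as input."""
    counts = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def word_count_mapreduce(lines):
    """Same computation expressed as map, shuffle, and reduce phases."""
    # Map: emit one (word, 1) pair per word occurrence.
    pairs = [(word, 1) for line in lines for word in line.split()]
    # Shuffle: group the emitted values by key.
    groups = defaultdict(list)
    for word, one in pairs:
        groups[word].append(one)
    # Reduce: combine each key's values with an associative operator.
    return {word: reduce(lambda a, b: a + b, ones)
            for word, ones in groups.items()}


# Both formulations should agree on any input (illustrative data).
sample = ["a b a", "b c"]
assert word_count_sequential(sample) == word_count_mapreduce(sample)
```

Because the reduce step only uses an associative, commutative operator, the
per-key reductions could in principle run on separate workers, which is the
property MapReduce frameworks rely on.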
FooPar: A Functional Object Oriented Parallel Framework in Scala
We present FooPar, an extension for highly efficient Parallel Computing in
the multi-paradigm programming language Scala. Scala offers concise and clean
syntax and integrates functional programming features. Our framework FooPar
combines these features with parallel computing techniques. FooPar has a
modular design and supports easy access to different communication backends for
distributed memory architectures as well as high-performance math libraries. In
this article we use it to parallelize matrix-matrix multiplication and show its
scalability by an isoefficiency analysis. In addition, results based on an
empirical analysis on two supercomputers are given. We achieve close-to-optimal
performance with respect to theoretical peak performance. Based on this result
we conclude that FooPar allows full access to Scala's design features without
suffering from performance drops when compared to implementations purely based
on C and MPI.
Group Communication Patterns for High Performance Computing in Scala
We developed a Functional object-oriented Parallel framework (FooPar) for
high-level high-performance computing in Scala. Central to this framework are
Distributed Memory Parallel Data structures (DPDs), i.e., collections of data
distributed in a shared-nothing system together with parallel operations on
these data. In this paper, we first present FooPar's architecture and the idea
of DPDs and group communications. Then, we show how DPDs can be implemented
elegantly and efficiently in Scala based on the Traversable/Builder pattern,
unifying Functional and Object-Oriented Programming. We prove the correctness
and safety of one communication algorithm and show how specification testing
(via ScalaCheck) can be used to bridge the gap between proof and
implementation. Furthermore, we show that the group communication operations of
FooPar outperform those of the MPJ Express open source MPI-bindings for Java,
both asymptotically and empirically. FooPar has already been shown to be
capable of achieving close-to-optimal performance for dense matrix-matrix
multiplication via JNI. In this article, we present results on a parallel
implementation of the Floyd-Warshall algorithm in FooPar, achieving more than
94% efficiency compared to the serial version on a cluster using 100 cores for
matrices of dimension 38000 x 38000.
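For readers unfamiliar with the algorithm mentioned above, the following is a
minimal serial sketch of Floyd-Warshall in Python, assuming a dense distance
matrix with no negative cycles. It is not FooPar code and says nothing about
how FooPar distributes the work. The helper parallel_efficiency is a
hypothetical name that uses the conventional definition of efficiency
(speedup divided by core count); the abstract does not state which definition
the authors used.

```python
import math


def floyd_warshall(dist):
    """Return all-pairs shortest-path distances for a dense matrix.

    dist[i][j] is the edge weight from i to j, math.inf if there is no
    edge, and 0 on the diagonal. The input matrix is not modified.
    """
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):  # allow vertex k as an intermediate hop
        row_k = d[k]
        for i in range(n):
            d_ik = d[i][k]
            if d_ik == math.inf:
                continue  # no path i -> k, so nothing can improve
            row_i = d[i]
            for j in range(n):
                candidate = d_ik + row_k[j]
                if candidate < row_i[j]:
                    row_i[j] = candidate
    return d


def parallel_efficiency(t_serial, t_parallel, cores):
    """Conventional efficiency: t_serial / (cores * t_parallel)."""
    return t_serial / (cores * t_parallel)


# Illustrative graph: edges 0->1 (4), 0->2 (1), 2->1 (2).
INF = math.inf
graph = [
    [0, 4, 1],
    [INF, 0, INF],
    [INF, 2, 0],
]
shortest = floyd_warshall(graph)
assert shortest[0][1] == 3  # 0 -> 2 -> 1 beats the direct edge
```

The outer loop over k must run in order, while the updates for a fixed k are
independent across rows, which is what makes the algorithm a natural fit for
row- or block-distributed matrices.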