155,497 research outputs found
Learning from the Success of MPI
The Message Passing Interface (MPI) has been extremely successful as a
portable way to program high-performance parallel computers. This success has
occurred in spite of the view of many that message passing is difficult and
that other approaches, including automatic parallelization and directive-based
parallelism, are easier to use. This paper argues that MPI has succeeded
because it addresses all of the important issues in providing a parallel
programming model.Comment: 12 pages, 1 figur
Group Communication Patterns for High Performance Computing in Scala
We developed a Functional object-oriented Parallel framework (FooPar) for
high-level high-performance computing in Scala. Central to this framework are
Distributed Memory Parallel Data structures (DPDs), i.e., collections of data
distributed in a shared nothing system together with parallel operations on
these data. In this paper, we first present FooPar's architecture and the idea
of DPDs and group communications. Then, we show how DPDs can be implemented
elegantly and efficiently in Scala based on the Traversable/Builder pattern,
unifying Functional and Object-Oriented Programming. We prove the correctness
and safety of one communication algorithm and show how specification testing
(via ScalaCheck) can be used to bridge the gap between proof and
implementation. Furthermore, we show that the group communication operations of
FooPar outperform those of the MPJ Express open source MPI-bindings for Java,
both asymptotically and empirically. FooPar has already been shown to be
capable of achieving close-to-optimal performance for dense matrix-matrix
multiplication via JNI. In this article, we present results on a parallel
implementation of the Floyd-Warshall algorithm in FooPar, achieving more than
94 % efficiency compared to the serial version on a cluster using 100 cores for
matrices of dimension 38000 x 38000
Towards formal models and languages for verifiable Multi-Robot Systems
Incorrect operations of a Multi-Robot System (MRS) may not only lead to
unsatisfactory results, but can also cause economic losses and threats to
safety. These threats may not always be apparent, since they may arise as
unforeseen consequences of the interactions between elements of the system.
This call for tools and techniques that can help in providing guarantees about
MRSs behaviour. We think that, whenever possible, these guarantees should be
backed up by formal proofs to complement traditional approaches based on
testing and simulation.
We believe that tailored linguistic support to specify MRSs is a major step
towards this goal. In particular, reducing the gap between typical features of
an MRS and the level of abstraction of the linguistic primitives would simplify
both the specification of these systems and the verification of their
properties. In this work, we review different agent-oriented languages and
their features; we then consider a selection of case studies of interest and
implement them useing the surveyed languages. We also evaluate and compare
effectiveness of the proposed solution, considering, in particular, easiness of
expressing non-trivial behaviour.Comment: Changed formattin
Issues about the Adoption of Formal Methods for Dependable Composition of Web Services
Web Services provide interoperable mechanisms for describing, locating and
invoking services over the Internet; composition further enables to build
complex services out of simpler ones for complex B2B applications. While
current studies on these topics are mostly focused - from the technical
viewpoint - on standards and protocols, this paper investigates the adoption of
formal methods, especially for composition. We logically classify and analyze
three different (but interconnected) kinds of important issues towards this
goal, namely foundations, verification and extensions. The aim of this work is
to individuate the proper questions on the adoption of formal methods for
dependable composition of Web Services, not necessarily to find the optimal
answers. Nevertheless, we still try to propose some tentative answers based on
our proposal for a composition calculus, which we hope can animate a proper
discussion
Strategic Directions in Object-Oriented Programming
This paper has provided an overview of the field of object-oriented programming. After presenting a historical perspective and some major achievements in the field, four research directions were introduced: technologies integration, software components, distributed programming, and new paradigms. In general there is a need to continue research in traditional areas:\ud
(1) as computer systems become more and more complex, there is a need to further develop the work on architecture and design; \ud
(2) to support the development of complex systems, there is a need for better languages, environments, and tools; \ud
(3) foundations in the form of the conceptual framework and other theories must be extended to enhance the means for modeling and formal analysis, as well as for understanding future computer systems
Parallelizing the QUDA Library for Multi-GPU Calculations in Lattice Quantum Chromodynamics
Graphics Processing Units (GPUs) are having a transformational effect on
numerical lattice quantum chromodynamics (LQCD) calculations of importance in
nuclear and particle physics. The QUDA library provides a package of mixed
precision sparse matrix linear solvers for LQCD applications, supporting single
GPUs based on NVIDIA's Compute Unified Device Architecture (CUDA). This
library, interfaced to the QDP++/Chroma framework for LQCD calculations, is
currently in production use on the "9g" cluster at the Jefferson Laboratory,
enabling unprecedented price/performance for a range of problems in LQCD.
Nevertheless, memory constraints on current GPU devices limit the problem sizes
that can be tackled. In this contribution we describe the parallelization of
the QUDA library onto multiple GPUs using MPI, including strategies for the
overlapping of communication and computation. We report on both weak and strong
scaling for up to 32 GPUs interconnected by InfiniBand, on which we sustain in
excess of 4 Tflops.Comment: 11 pages, 7 figures, to appear in the Proceedings of Supercomputing
2010 (submitted April 12, 2010
- …