1,361 research outputs found

    Session-Based Programming for Parallel Algorithms: Expressiveness and Performance

    Full text link
    This paper investigates session programming and typing of benchmark examples to compare productivity, safety and performance with other communications programming languages. Parallel algorithms are used to examine the above aspects due to their extensive use of message passing for interaction, and their increasing prominence in algorithmic research with the rising availability of hardware resources such as multicore machines and clusters. We contribute new benchmark results for SJ, an extension of Java for type-safe, binary session programming, against MPJ Express, a Java messaging system based on the MPI standard. In conclusion, we observe that (1) despite rich libraries and functionality, MPI remains a low-level API, and can suffer from commonly perceived disadvantages of explicit message passing such as deadlocks and unexpected message types, and (2) the benefits of high-level session abstraction, which has significant impact on program structure to improve readability and reliability, and session type-safety can greatly facilitate the task of communications programming whilst retaining competitive performance

    Behavioral types in programming languages

    Get PDF
    A recent trend in programming language research is to use behav- ioral type theory to ensure various correctness properties of large- scale, communication-intensive systems. Behavioral types encompass concepts such as interfaces, communication protocols, contracts, and choreography. The successful application of behavioral types requires a solid understanding of several practical aspects, from their represen- tation in a concrete programming language, to their integration with other programming constructs such as methods and functions, to de- sign and monitoring methodologies that take behaviors into account. This survey provides an overview of the state of the art of these aspects, which we summarize as the pragmatics of behavioral types

    Resiliency in numerical algorithm design for extreme scale simulations

    Get PDF
    This work is based on the seminar titled ‘Resiliency in Numerical Algorithm Design for Extreme Scale Simulations’ held March 1–6, 2020, at Schloss Dagstuhl, that was attended by all the authors. Advanced supercomputing is characterized by very high computation speeds at the cost of involving an enormous amount of resources and costs. A typical large-scale computation running for 48 h on a system consuming 20 MW, as predicted for exascale systems, would consume a million kWh, corresponding to about 100k Euro in energy cost for executing 1023 floating-point operations. It is clearly unacceptable to lose the whole computation if any of the several million parallel processes fails during the execution. Moreover, if a single operation suffers from a bit-flip error, should the whole computation be declared invalid? What about the notion of reproducibility itself: should this core paradigm of science be revised and refined for results that are obtained by large-scale simulation? Naive versions of conventional resilience techniques will not scale to the exascale regime: with a main memory footprint of tens of Petabytes, synchronously writing checkpoint data all the way to background storage at frequent intervals will create intolerable overheads in runtime and energy consumption. Forecasts show that the mean time between failures could be lower than the time to recover from such a checkpoint, so that large calculations at scale might not make any progress if robust alternatives are not investigated. More advanced resilience techniques must be devised. The key may lie in exploiting both advanced system features as well as specific application knowledge. Research will face two essential questions: (1) what are the reliability requirements for a particular computation and (2) how do we best design the algorithms and software to meet these requirements? While the analysis of use cases can help understand the particular reliability requirements, the construction of remedies is currently wide open. One avenue would be to refine and improve on system- or application-level checkpointing and rollback strategies in the case an error is detected. Developers might use fault notification interfaces and flexible runtime systems to respond to node failures in an application-dependent fashion. Novel numerical algorithms or more stochastic computational approaches may be required to meet accuracy requirements in the face of undetectable soft errors. These ideas constituted an essential topic of the seminar. The goal of this Dagstuhl Seminar was to bring together a diverse group of scientists with expertise in exascale computing to discuss novel ways to make applications resilient against detected and undetected faults. In particular, participants explored the role that algorithms and applications play in the holistic approach needed to tackle this challenge. This article gathers a broad range of perspectives on the role of algorithms, applications and systems in achieving resilience for extreme scale simulations. The ultimate goal is to spark novel ideas and encourage the development of concrete solutions for achieving such resilience holistically.Peer Reviewed"Article signat per 36 autors/es: Emmanuel Agullo, Mirco Altenbernd, Hartwig Anzt, Leonardo Bautista-Gomez, Tommaso Benacchio, Luca Bonaventura, Hans-Joachim Bungartz, Sanjay Chatterjee, Florina M. Ciorba, Nathan DeBardeleben, Daniel Drzisga, Sebastian Eibl, Christian Engelmann, Wilfried N. Gansterer, Luc Giraud, Dominik G ̈oddeke, Marco Heisig, Fabienne Jezequel, Nils Kohl, Xiaoye Sherry Li, Romain Lion, Miriam Mehl, Paul Mycek, Michael Obersteiner, Enrique S. Quintana-Ortiz, Francesco Rizzi, Ulrich Rude, Martin Schulz, Fred Fung, Robert Speck, Linda Stals, Keita Teranishi, Samuel Thibault, Dominik Thonnes, Andreas Wagner and Barbara Wohlmuth"Postprint (author's final draft

    PARALLEX FILE SYSTEM (PXFS): BRIDGING THE GAP BETWEEN EXASCALE PROCESSING CAPABILITIES AND I/O PERFORMANCE

    Get PDF
    Due to processors reaching the maximum performance allowable by current technology, architectural trends for computer systems continue to increase the number of cores per processing chip to maximize system performance. Most estimates suggest massively parallel systems will be available within the decade, containing millions of cores and capable of exaFlops of performance. New models of execution are necessary to maximize processor utilization and minimize power costs for these exascale systems. ParalleX is one such execution model, which attempts to address inefficiencies of current execution models by exposing fine-grained parallelism, increasing system utilization using asynchronous workflow, and resolving resource contention through the use of adaptive and dynamic resource scheduling. A particularly important aspect of these exascale execution models is the design of the I/O subsystem, which has seen limited performance increases compared to processor and network technologies. Parallel file systems have been designed to help alleviate the poor performance of storage technologies by distributing file data across multiple nodes of a parallel system to maximize the aggregate throughput attainable by file system clients. However, the design of parallel file systems needs to be modified to explicitly address the inherent high-latency of remote file system operations without degrading file system performance and scalability. We present modifications to OrangeFS, a high-performance, working model parallel file system geared towards the facilitation of research in the field of parallel I/O, to help address the inefficiencies of current file systems. We deem our resultant parallel file system implementation ParalleX File System (PXFS), as it attempts to support the features required by the I/O subsystem of the ParalleX execution model. Specifically, PXFS offers mechanisms for masking the latency of file system operations, defining meaningful computation to be overlapped with file system communication, and maintaining the high-performance and scalability exhibited by OrangeFS. Our results indicate PXFS successfully improves file system performance and supports the semantics of ParalleX with limited programmer intervention, potentially simplifying the design and increasing the performance of many ParalleX applications

    Guaranteed bandwidth implementation of message passing interface on workstation clusters

    Get PDF
    Due to their wide availability, networks of workstations (NOW) are an attractive platform for parallel processing. Parallel programming environments such as Parallel Virtual Machine (PVM), and Message Passing Interface (MPI) offer the user a convenient way to express parallel computing and communication for a network of workstations. Currently, a number of MPI implementations are available that offer low (average ) latency and high bandwidth environments to users by utilizing an efficient MPI library specification and high speed networks. In addition to high bandwidth and low average latency requirements, mission critical distributed applications, audio/video communications require a completely different type of service, guaranteed bandwidth and worst case delays (worst case latency) to be guaranteed by underlying protocol. The hypothesis presented in this paper is that it is possible to provide an application a low level reliable transport protocol with performance and guaranteed bandwidth as close to the hardware on which it is executing. The hypothesis is proven by designing and implementing a reliable high performance message passing protocol interface which also provides the guaranteed bandwidth to MPI and to mission critical distributed MPI applications. This protocol interface works with the Fiber Distributed Data Interface (FDDI) driver which has been designed and implemented for Performance Technology Inc. commercial high performance FDDI product, the Station Management Software 7.3, and the ADI / MPICH (Argonne National Laboratory and Mississippi State University\u27s free MPI implementation)

    Java Grande Forum Report: Making Java Work for High-End Computing

    Get PDF
    This document describes the Java Grande Forum and includes its initial deliverables.Theseare reports that convey a succinct set of recommendations from this forum to SunMicrosystems and other purveyors of Javaâ„¢ technology that will enable GrandeApplications to be developed with the Java programming language

    Doctor of Philosophy

    Get PDF
    dissertationMessage passing (MP) has gained a widespread adoption over the years, so much so, that even heterogeneous embedded multicore systems are running programs that are developed using message passing libraries. Such a phenomenon is a shift in computing practices, since, traditionally MP programs have been developed specifically for high performance computing. With growing importance and the complexity of MP programs in today's times, it becomes absolutely imperative to have formal tools and sound methodologies that can help reason about the correctness of the program. It has been demonstrated by many researchers in the area of concurrent program verification that a suitable strategy to verify programs which rely heavily on nondeterminism, is dynamic verification. Dynamic verification integrates the best features of testing and model checking. In the area of MP program verification, however, there have been only a handful of dynamic verifiers. These dynamic verifiers, despite their strengths, suffer from the explosion in execution scenarios. All existing dynamic verifiers, to our knowledge, exhaustively explore the nondeterministic choices in an MP program. It is apparent that an MP program with many nondeterministic constructs will quickly inundate such tools. This dissertation focuses on the problem of containing the exponential space of execution scenarios (or interleavings) while providing a soundness and completeness guarantee over safety properties of MP programs (specifically deadlocks). We present a predictive verification methodology and an associated framework, called MAAPED(Messaging Application Analysis with Predictive Error Discovery), that operates in polynomial time over MP programs to detect deadlocks among other safety property violations. In brief, we collect a single execution trace of an MP program and without re-running other execution schedules, reliably construct the artifacts necessary to predict any mishappening in an unexplored execution schedule with the aforementioned formal guarantee. The main contributions of the thesis are the following: The Functionally Irrelevant Barrier Algorithm to increase program productivity and ease in verification complexity. A sound pragmatic strategy to reduce the interleaving space of existing dynamic verifiers which is complete only for a certain class of MPI programs. A generalized matches-before ordering for MP programs. A predictive polynomial time verification framework as an alternate solution in the dynamic MP verification landscape. A soundness and completeness proof for the predictive framework's deadlock detection strategy for many formally characterized classes of MP programs. In the process of developing solutions that are mentioned above, we also collected important experiences relating to the development of dynamic verification schedulers. We present those experiences as a minor contribution of this thesis
    • …
    corecore