54 research outputs found

    Java Grande Forum Report: Making Java Work for High-End Computing

    Get PDF
    This document describes the Java Grande Forum and includes its initial deliverables.Theseare reports that convey a succinct set of recommendations from this forum to SunMicrosystems and other purveyors of Java™ technology that will enable GrandeApplications to be developed with the Java programming language

    Device level communication libraries for high‐performance computing in Java

    Get PDF
    This is the peer reviewed version of the following article: Taboada, G. L., Touriño, J. , Doallo, R. , Shafi, A. , Baker, M. and Carpenter, B. (2011), Device level communication libraries for high‐performance computing in Java. Concurrency Computat.: Pract. Exper., 23: 2382-2403. doi:10.1002/cpe.1777, which has been published in final form at https://doi.org/10.1002/cpe.1777. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions.[Abstract] Since its release, the Java programming language has attracted considerable attention from the high‐performance computing (HPC) community because of its portability, high programming productivity, and built‐in multithreading and networking support. As a consequence, several initiatives have been taken to develop a high‐performance Java message‐passing library to program distributed memory architectures, such as clusters. The performance of Java message‐passing applications relies heavily on the communications performance. Thus, the design and implementation of low‐level communication devices that support message‐passing libraries is an important research issue in Java for HPC. MPJ Express is our Java message‐passing implementation for developing high‐performance parallel Java applications. Its public release currently contains three communication devices: the first one is built using the Java New Input/Output (NIO) package for the TCP/IP; the second one is specifically designed for the Myrinet Express library on Myrinet; and the third one supports thread‐based shared memory communications. Although these devices have been successfully deployed in many production environments, previous performance evaluations of MPJ Express suggest that the buffering layer, tightly coupled with these devices, incurs a certain degree of copying overhead, which represents one of the main performance penalties. This paper presents a more efficient Java message‐passing communications device, based on Java Input/Output sockets, that avoids this buffering overhead. Moreover, this device implements several strategies, both in the communication protocol and in the HPC hardware support, which optimizes Java message‐passing communications. In order to evaluate its benefits, this paper analyzes the performance of this device comparatively with other Java and native message‐passing libraries on various high‐speed networks, such as Gigabit Ethernet, Scalable Coherent Interface, Myrinet, and InfiniBand, as well as on a shared memory multicore scenario. The reported communication overhead reduction encourages the upcoming incorporation of this device in MPJ ExpressMinisterio de Ciencia e Innovación; TIN2010-16735

    MPJ: MPI-like message passing for Java

    Get PDF
    Recently, there has been a lot of interest in using Java for parallel programming. Efforts have been hindered by lack of standard Java parallel programming APIs. To alleviate this problem, various groups started projects to develop Java message passing systems modelled on the successful Message Passing Interface (MPI). Official MPI bindings are currently defined only for C, Fortran, and C++, so early MPI-like environments for Java have been divergent. This paper relates an effort undertaken by a working group of the Java Grande Forum, seeking a consensus on an MPI-like API, to enhance the viability of parallel programming using Java

    STAPL-RTS: A Runtime System for Massive Parallelism

    Get PDF
    Modern High Performance Computing (HPC) systems are complex, with deep memory hierarchies and increasing use of computational heterogeneity via accelerators. When developing applications for these platforms, programmers are faced with two bad choices. On one hand, they can explicitly manage machine resources, writing programs using low level primitives from multiple APIs (e.g., MPI+OpenMP), creating efficient but rigid, difficult to extend, and non-portable implementations. Alternatively, users can adopt higher level programming environments, often at the cost of lost performance. Our approach is to maintain the high level nature of the application without sacrificing performance by relying on the transfer of high level, application semantic knowledge between layers of the software stack at an appropriate level of abstraction and performing optimizations on a per-layer basis. In this dissertation, we present the STAPL Runtime System (STAPL-RTS), a runtime system built for portable performance, suitable for massively parallel machines. While the STAPL-RTS abstracts and virtualizes the underlying platform for portability, it uses information from the upper layers to perform the appropriate low level optimizations that restore the performance characteristics. We outline the fundamental ideas behind the design of the STAPL-RTS, such as the always distributed communication model and its asynchronous operations. Through appropriate code examples and benchmarks, we prove that high level information allows applications written on top of the STAPL-RTS to attain the performance of optimized, but ad hoc solutions. Using the STAPL library, we demonstrate how this information guides important decisions in the STAPL-RTS, such as multi-protocol communication coordination and request aggregation using established C++ programming idioms. Recognizing that nested parallelism is of increasing interest for both expressivity and performance, we present a parallel model that combines asynchronous, one-sided operations with isolated nested parallel sections. Previous approaches to nested parallelism targeted either static applications through the use of blocking, isolated sections, or dynamic applications by using asynchronous mechanisms (i.e., recursive task spawning) which come at the expense of isolation. We combine the flexibility of dynamic task creation with the isolation guarantees of the static models by allowing the creation of asynchronous, one-sided nested parallel sections that work in tandem with the more traditional, synchronous, collective nested parallelism. This allows selective, run-time customizable use of parallelism in an application, based on the input and the algorithm

    Functional Testing Approaches for "BIFST-able" tlm_fifo

    Get PDF
    Evolution of Electronic System Level design methodologies, allows a wider use of Transaction-Level Modeling (TLM). TLM is a high-level approach to modeling digital systems that emphasizes on separating communications among modules from the details of functional units. This paper explores different functional testing approaches for the implementation of Built-in Functional Self Test facilities in the TLM primitive channel tlm_fifo. In particular, it focuses on three different test approaches based on a finite state machine model of tlm_fifo, functional fault models, and march tests respectivel

    Running parallel applications on a heterogeneous environment with accessible development practices and automatic scalability

    Get PDF
    Grid computing makes it possible to gather large quantities of resources to work on a problem. In order to exploit this potential, a framework that presents the resources to the user programmer in a form that maintains productivity is necessary. The framework must not only provide accessible development, but it must make efficient use of the resources. The Seeds framework is proposed. It uses the current Grid and distributed computing middleware to provide a parallel programming environment to a wider community of programmers. The framework was used to investigate the feasibility of scaling skeleton/pattern parallel programming into Grid computing. The research accomplished two goals: it made parallel programming on the Grid more accessible to domain­specific programmers, and it made parallel programs scale on a heterogeneous resource environ­ ment. Programming is made easier to the programmer by using skeleton and pat­ tern­based programming approaches that effectively isolate the program from the envi­ ronment. To extend the pattern approach, the pattern adder operator is proposed, imple­ mented and tested. The results show the pattern operator can reduce the number of lines of code when compared with an MPJ­Express implementation for a stencil algorithm while having an overhead of at most ten microseconds per iteration. The research in scal­ ability involved adapting existing load­balancing techniques to skeletons and patterns re­ quiring little additional configuration on the part of the programmer. The hierarchical de­ pendency concept is proposed as well, which uses a streamed data flow programming model. The concept introduces data flow computation hibernation and dependencies that can split to accommodate additional processors. The results from implementing skeleton/patterns on hierarchical dependencies show an 18.23% increase in code is neces­ sary to enable automatic scalability. The concept can increase speedup depending on the algorithm and grain size

    MPJ: A Proposed Java Message Passing API and Environment for High Performance Computing

    Full text link
    In this paper we sketch out a proposed reference implementation for message passing in Java (MPJ), an MPI-like API from the Message-Passing Working Group of the Java Grande Forum [1,2]. The proposal relies heavily on RMI and Jini for finding computational resources, creating slave processes, and handling failures. User-level communication is implemented efficiently directly on top of Java sockets. 1

    A Pure Java Parallel Flow Solver

    Get PDF
    In this paper an overview is given on the "Have Java" project to attain a pure Java parallel Navier-Stokes flow solver (JParNSS) based on the thread concept and remote method invocation (RMI). The goal of this project is to produce an industrial flow solver running on an arbitrary sequential or parallel architecture, utilizing the Internet, capable of handling the most complex 3D geometries as well as flow physics, and also linking to codes in other areas such as aeroelasticity etc. Since Java is completely object-oriented the code has been written in an object-oriented programming (OOP) style. The code also includes a graphics user interface (GUI) as well as an interactive steering package for the parallel architecture. The Java OOP approach provides profoundly improved software productivity, robustness, and security as well as reusability and maintainability. OOP allows code construction similar to the aerodynamic design process because objects can be software coded and integrated, reflecting actual design procedures. In addition, Java is the programming language of the Internet and thus Java is the programming language of the Internet and thus Java objects on disparate machines or even separate networks can be connected. We explain the motivation for the design of JParNSS along with its capabilities that set it apart from other solvers. In the first two sections we present a discussion of the Java language as the programming tool for aerospace applications. In section three the objectives of the Have Java project are presented. In the next section the layer structures of JParNSS are discussed with emphasis on the parallelization and client-server (RMI) layers. JParNSS, like its predecessor ParNSS (ANSI-C), is based on the multiblock idea, and allows for arbitrarily complex topologies. Grids are accepted in GridPro property settings, grids of any size or block number can be directly read by JParNSS without any further modifications, requiring no additional preparation time for the solver input. In the last section, computational results are presented, with emphasis on multiprocessor Pentium and Sun parallel systems run by the Solaris operating system (OS)

    Use of paralelism and distributed processing of data in practice

    Get PDF
    katedra: KMO; přílohy: 1 CD ROM; rozsah: 41Bakalárská práce je zamerena na problematiku paralelního zpracování dat a možného využití techto principu v praxi V práci jsou rozebrány principy paralelismu, Amdahluv zákon, standard MPI, ale predevším je soustredena na paralelizacní schopnosti programovacího jazyka Java. Ty jsou zastoupené v technologiích Multithreading, Serializace a Remote Method Invocation. Na záver práce je funkcní systém, který je schopný distribuovaných výpoctu, otestován na modelové úloze.This bachelor diploma is focused on problematics of parallel data processing and possible usage of this principles in practice. In the work, there are explained principles of parallelism, Amdahl´s law, MPI standard, but above all is this work focused on parallel abilities of Java programming language. These are supplied by technologies Multithreading, Serialization and Remote Method Invocation. In the end of this work is functional system, able to carry on distributed computing, tested on model excersise
    corecore