598 research outputs found

    Distributed Virtual System (DIVIRS) Project

    Get PDF
    As outlined in our continuation proposal 92-ISI-50R (revised) on contract NCC 2-539, we are (1) developing software, including a system manager and a job manager, that will manage available resources and that will enable programmers to program parallel applications in terms of a virtual configuration of processors, hiding the mapping to physical nodes; (2) developing communications routines that support the abstractions implemented in item one; (3) continuing the development of file and information systems based on the virtual system model; and (4) incorporating appropriate security measures to allow the mechanisms developed in items 1 through 3 to be used on an open network. The goal throughout our work is to provide a uniform model that can be applied to both parallel and distributed systems. We believe that multiprocessor systems should exist in the context of distributed systems, allowing them to be more easily shared by those that need them. Our work provides the mechanisms through which nodes on multiprocessors are allocated to jobs running within the distributed system and the mechanisms through which files needed by those jobs can be located and accessed

    Algorithmic Based Fault Tolerance Applied to High Performance Computing

    Full text link
    We present a new approach to fault tolerance for High Performance Computing system. Our approach is based on a careful adaptation of the Algorithmic Based Fault Tolerance technique (Huang and Abraham, 1984) to the need of parallel distributed computation. We obtain a strongly scalable mechanism for fault tolerance. We can also detect and correct errors (bit-flip) on the fly of a computation. To assess the viability of our approach, we have developed a fault tolerant matrix-matrix multiplication subroutine and we propose some models to predict its running time. Our parallel fault-tolerant matrix-matrix multiplication scores 1.4 TFLOPS on 484 processors (cluster jacquard.nersc.gov) and returns a correct result while one process failure has happened. This represents 65% of the machine peak efficiency and less than 12% overhead with respect to the fastest failure-free implementation. We predict (and have observed) that, as we increase the processor count, the overhead of the fault tolerance drops significantly

    Extending and Implementing the Self-adaptive Virtual Processor for Distributed Memory Architectures

    Get PDF
    Many-core architectures of the future are likely to have distributed memory organizations and need fine grained concurrency management to be used effectively. The Self-adaptive Virtual Processor (SVP) is an abstract concurrent programming model which can provide this, but the model and its current implementations assume a single address space shared memory. We investigate and extend SVP to handle distributed environments, and discuss a prototype SVP implementation which transparently supports execution on heterogeneous distributed memory clusters over TCP/IP connections, while retaining the original SVP programming model

    Java on Networks of Workstations (JavaNOW): A Parallel Computing Framework Inspired by Linda and the Message Passing Interface (MPI)

    Get PDF
    Networks of workstations are a dominant force in the distributed computing arena, due primarily to the excellent price/performance ratio of such systems when compared to traditionally massively parallel architectures. It is therefore critical to develop programming languages and environments that can help harness the raw computational power available on these systems. In this article, we present JavaNOW (Java on Networks of Workstations), a Java‐based framework for parallel programming on networks of workstations. It creates a virtual parallel machine similar to the MPI (Message Passing Interface) model, and provides distributed associative shared memory similar to the Linda memory model but with a richer set of primitive operations. JavaNOW provides a simple yet powerful framework for performing computation on networks of workstations. In addition to the Linda memory model, it provides for shared objects, implicit multithreading, implicit synchronization, object dataflow, and collective communications similar to those defined in MPI. JavaNOW is also a component of the Computational Neighborhood, a Java‐enabled suite of services for desktop computational sharing. The intent of JavaNOW is to present an environment for parallel computing that is both expressive and reliable and ultimately can deliver good to excellent performance. As JavaNOW is a work in progress, this article emphasizes the expressive potential of the JavaNOW environment and presents preliminary performance results only

    Vcluster: A Portable Virtual Computing Library For Cluster Computing

    Get PDF
    Message passing has been the dominant parallel programming model in cluster computing, and libraries like Message Passing Interface (MPI) and Portable Virtual Machine (PVM) have proven their novelty and efficiency through numerous applications in diverse areas. However, as clusters of Symmetric Multi-Processor (SMP) and heterogeneous machines become popular, conventional message passing models must be adapted accordingly to support this new kind of clusters efficiently. In addition, Java programming language, with its features like object oriented architecture, platform independent bytecode, and native support for multithreading, makes it an alternative language for cluster computing. This research presents a new parallel programming model and a library called VCluster that implements this model on top of a Java Virtual Machine (JVM). The programming model is based on virtual migrating threads to support clusters of heterogeneous SMP machines efficiently. VCluster is implemented in 100% Java, utilizing the portability of Java to address the problems of heterogeneous machines. VCluster virtualizes computational and communication resources such as threads, computation states, and communication channels across multiple separate JVMs, which makes a mobile thread possible. Equipped with virtual migrating thread, it is feasible to balance the load of computing resources dynamically. Several large scale parallel applications have been developed using VCluster to compare the performance and usage of VCluster with other libraries. The results of the experiments show that VCluster makes it easier to develop multithreading parallel applications compared to conventional libraries like MPI. At the same time, the performance of VCluster is comparable to MPICH, a widely used MPI library, combined with popular threading libraries like POSIX Thread and OpenMP. In the next phase of our work, we implemented thread group and thread migration to demonstrate the feasibility of dynamic load balancing in VCluster. We carried out experiments to show that the load can be dynamically balanced in VCluster, resulting in a better performance. Thread group also makes it possible to implement collective communication functions between threads, which have been proved to be useful in process based libraries

    One-Sided Communication for High Performance Computing Applications

    Get PDF
    Thesis (Ph.D.) - Indiana University, Computer Sciences, 2009Parallel programming presents a number of critical challenges to application developers. Traditionally, message passing, in which a process explicitly sends data and another explicitly receives the data, has been used to program parallel applications. With the recent growth in multi-core processors, the level of parallelism necessary for next generation machines is cause for concern in the message passing community. The one-sided programming paradigm, in which only one of the two processes involved in communication actively participates in message transfer, has seen increased interest as a potential replacement for message passing. One-sided communication does not carry the heavy per-message overhead associated with modern message passing libraries. The paradigm offers lower synchronization costs and advanced data manipulation techniques such as remote atomic arithmetic and synchronization operations. These combine to present an appealing interface for applications with random communication patterns, which traditionally present message passing implementations with difficulties. This thesis presents a taxonomy of both the one-sided paradigm and of applications which are ideal for the one-sided interface. Three case studies, based on real-world applications, are used to motivate both taxonomies and verify the applicability of the MPI one-sided communication and Cray SHMEM one-sided interfaces to real-world problems. While our results show a number of short-comings with existing implementations, they also suggest that a number of applications could benefit from the one-sided paradigm. Finally, an implementation of the MPI one-sided interface within Open MPI is presented, which provides a number of unique performance features necessary for efficient use of the one-sided programming paradigm
    corecore