4 research outputs found

    Concurrent use of two programming tools for heterogeneous supercomputers

    Get PDF
    In this thesis, a demostration of the heterogeneous use of two programming paradigms for heterogeneous computing called Cluster-M and HAsC is presented. Both paradigms can efficiently support heterogeneous networks by preserving a level of abstraction which does not include any architecture mapping details. Furthermore, they are both machine independent and hence are scalable. Unlike, almost all existing heterogeneous orchestration tools which are MIMD based, HAsC is based on the fundamental concepts of SIMD associative computing. HAsC models a heterogeneous network as a coarse grained associative computer and is designed to optimize the execution of problems with large ratios of computations to instructions. Ease of programming and execution speed, not the utilization of idle resources are the primary goals of HAsC On the other hand, Cluster-M is a generic technique that can be applied to both coarse grained as well as fine grained networks. Cluster-M provides an environment for porting various tasks onto the machines in a heterogeneous suite such that resources utilization is maximized and the overall execution time is minimized. An illustration of how these two paradigms can be used together to provide an efficient medium for heterogeneous programming is included. Finally, their scalability is discussed

    Implementation of an automatic mapping tool for massively parallel computing

    Get PDF
    In this thesis, an implementation of a generic technique for fine grain mapping of portable parallel algorithms onto multiprocessor architectures is presented. The implemented mapping algorithm is a component of Cluster-M. Cluster-M is a novel parallel programming tool which facilitates the design and mapping of portable softwares onto various parallel systems. The other components of Cluster-M are the Specifications and the Representations. Using the Specifications, machine independent parallel algorithms are presented in a clustered fashion specifying the concurrent computations and communications at every step of the overall execution. The Representations, on the other hand, are a form of clustering the underlying architecture to simplify the mapping process. The mapping algorithm implemented and tested in this thesis is an efficient method for matching the Specification clusters to the Representation clusters

    Mapping of portable parallel programs

    Get PDF
    An efficient parallel program designed for a parallel architecture includes a detailed outline of accurate assignments of concurrent computations onto processors, and data transfers onto communication links, such that the overall execution time is minimized. This process may be complex depending on the application task and the target multiprocessor architecture. Furthermore, this process is to be repeated for every different architecture even though the application task may be the same. Consequently, this has a major impact on the ever increasing cost of software development for multiprocessor systems. A remedy for this problem would be to design portable parallel programs which can be mapped efficiently onto any computer system. In this dissertation, we present a portable programming tool called Cluster-M. The three components of Cluster-M are the Specification Module, the Representation Module, and the Mapping Module. In the Specification Module, for a given problem, a machine-independent program is generated and represented in the form of a clustered task graph called Spec graph. Similarly, in the Representation Module, for a given architecture or heterogeneous suite of computers, a clustered system graph called Rep graph is generated. The Mapping Module is responsible for efficient mapping of Spec graphs onto Rep graphs. As part of this module, we present the first algorithm which produces a near-optimal mapping of an arbitrary non-uniform machine-independent task graph with M modules, onto an arbitrary non-uniform task-independent system graph having N processors, in 0(M P) time, where P = max(M, N). Our experimental results indicate that Cluster-M produces better or similar mapping results compared to other leading techniques which work only for restricted task or system graphs

    Reactive Scheduling of DAG Applications on Heterogeneous and Dynamic Distributed Computing Systems

    Get PDF
    Institute for Computing Systems ArchitectureEmerging technologies enable a set of distributed resources across a network to be linked together and used in a coordinated fashion to solve a particular parallel application at the same time. Such applications are often abstracted as directed acyclic graphs (DAGs), in which vertices represent application tasks and edges represent data dependencies between tasks. Effective scheduling mechanisms for DAG applications are essential to exploit the tremendous potential of computational resources. The core issues are that the availability and performance of resources, which are already by their nature heterogeneous, can be expected to vary dynamically, even during the course of an execution. In this thesis, we first consider the problem of scheduling DAG task graphs onto heterogeneous resources with changeable capabilities. We propose a list-scheduling heuristic approach, the Global Task Positioning (GTP) scheduling method, which addresses the problem by allowing rescheduling and migration of tasks in response to significant variations in resource characteristics. We observed from experiments with GTP that in an execution with relatively frequent migration, it may be that, over time, the results of some task have been copied to several other sites, and so a subsequent migrated task may have several possible sources for each of its inputs. Some of these copies may now be more quickly accessible than the original, due to dynamic variations in communication capabilities. To exploit this observation, we extended our model with a Copying Management(CM) function, resulting in a new version, the Global Task Positioning with copying facilities (GTP/c) system. The idea is to reuse such copies, in subsequent migration of placed tasks, in order to reduce the impact of migration cost on makespan. Finally, we believe that fault tolerance is an important issue in heterogeneous and dynamic computational environments as the availability of resources cannot be guaranteed. To address the problem of processor failure, we propose a rewinding mechanism which rewinds the progress of the application to a previous state, thereby preserving the execution in spite of the failed processor(s). We evaluate our mechanisms through simulation, since this allow us to generate repeatable patterns of resource performance variation. We use a standard benchmark set of DAGs, comparing performance against that of competing algorithms from the scheduling literature
    corecore