17 research outputs found

    Management of SPMD based parallel processing on clusters of workstations

    Full text link
    Current attempts to manage parallel applications on Clusters of Workstations (COWs) have either generally followed the parallel execution environment approach or been extensions to existing network operating systems, both of which do not provide complete or satisfactory solutions. The efficient and transparent management of parallelism within the COW environment requires enhanced methods of process instantiation, mapping of parallel process to workstations, maintenance of process relationships, process communication facilities, and process coordination mechanisms. The aim of this research is to synthesise, design, develop and experimentally study a system capable of efficiently and transparently managing SPMD parallelism on a COW. This system should both improve the performance of SPMD based parallel programs and relieve the programmer from the involvement into parallelism management in order to allow them to concentrate on application programming. It is also the aim of this research to show that such a system, to achieve these objectives, is best achieved by adding new special services and exploiting the existing services of a client/server and microkernel based distributed operating system. To achieve these goals the research methods of the experimental computer science should be employed. In order to specify the scope of this project, this work investigated the issues related to parallel processing on COWs and surveyed a number of relevant systems including PVM, NOW and MOSIX. It was shown that although the MOSIX system provide a number of good services related to parallelism management, none of the system forms a complete solution. The problems identified with these systems include: instantiation services that are not suited to parallel processing; duplication of services between the parallelism management environment and the operating system; and poor levels of transparency. A high performance and transparent system capable of managing the execution of SPMD parallel applications was synthesised and the specific services of process instantiation, process mapping and process interaction detailed. The process instantiation service designed here provides the capability to instantiate parallel processes using either creation or duplication methods and also supports multiple and group based instantiation which is specifically design for SPMD parallel processing. The process mapping service provides the combination of process allocation and dynamic load balancing to ensure the load of a COW remains balanced not only at the time a parallel program is initialised but also during the execution of the program. The process interaction service guarantees to maintain transparently process relationships, communications and coordination services between parallel processes regardless of their location within the COW. The combination of these services provides an original architecture and organisation of a system that is capable of fully managing the execution of SPMD parallel applications on a COW. A logical design of a parallelism management system was developed derived from the synthesised system and was shown that it should ideally be based on a distributed operating system employing the client server model. The client/server based distributed operating system provides the level of transparency, modularity and flexibility necessary for a complete parallelism management system. The services identified in the synthesised system have been mapped to a set of server processes including: Process Instantiation Server providing advanced multiple and group based process creation and duplication; Process Mapping Server combining load collection, process allocation and dynamic load balancing services; and Process Interaction Server providing transparent interprocess communication and coordination. A Process Migration Server was also identified as vital to support both the instantiation and mapping servers. The RHODOS client/server and microkernel based distributed operating system was selected to carry out research into the detailed design and to be used for the implementation this parallelism management system. RHODOS was enhanced to provide the required servers and resulted in the development of the REX Manager, Global Scheduler and Process Migration Manager to provide the services of process instantiation, mapping and migration, respectively. The process interaction services were already provided within RHODOS and only required some extensions to the existing Process Manager and IPC Managers. Through a variety of experiments it was shown that when this system was used to support the execution of SPMD parallel applications the overall execution times were improved, especially when multiple and group based instantiation services are employed. The RHODOS PMS was also shown to greatly reduce the programming burden experienced by users when writing SPMD parallel applications by providing a small set of powerful primitives specially designed to support parallel processing. The system was also shown to be applicable and has been used in a variety of other research areas such as Distributed Shared Memory, Parallelising Compilers and assisting the port of PVM to the RHODOS system. The RHODOS Parallelism Management System (PMS) provides a unique and creative solution to the problem of transparently and efficiently controlling the execution of SPMD parallel applications on COWs. Combining advanced services such as multiple and group based process creation and duplication; combined process allocation and dynamic load balancing; and complete COW wide transparency produces a totally new system that addresses many of the problems not addressed in other systems

    Automated parallel application creation and execution tool for clusters

    Full text link
    This research investigated an automated approach to re-writing traditional sequential computer programs into parallel programs for networked computers. A tool was designed and developed for generating parallel programs automatically and also executing these parallel programs on a network of computers. Performance is maximized by utilising all idle resources

    Preserving message integrity in dynamic process migration

    Get PDF
    Processor and network management have a great impact on the performance of Distributed Memory Parallel Computers. Dynamic Process Migration allows load balancing and communication balancing at execution time. Managing the communications involving the migrating process is one of the problems that Dynamic Process Migration implies. To study this problem, which we have called the Message Integrity Problem, six algorithms have been analysed. These algorithms have been studied by sequential simulation, and have also been implemented in a parallel machine for different user process patterns in the presence of dynamic migration. To compare the algorithms, different performance parameters have been considered. The results obtained have given preliminary information about the algorithms behaviour, and have allowed us to perform an initial comparative evaluation among them

    Preserving message integrity in dynamic process migration

    Get PDF
    Processor and network management have a great impact on the performance of Distributed Memory Parallel Computers. Dynamic Process Migration allows load balancing and communication balancing at execution time. Managing the communications involving the migrating process is one of the problems that Dynamic Process Migration implies. To study this problem, which we have called the Message Integrity Problem, six algorithms have been analysed. These algorithms have been studied by sequential simulation, and have also been implemented in a parallel machine for different user process patterns in the presence of dynamic migration. To compare the algorithms, different performance parameters have been considered. The results obtained have given preliminary information about the algorithms behaviour, and have allowed us to perform an initial comparative evaluation among them.Facultad de Inform谩tic

    Platform for reliable computing on clusters using group communications

    Full text link
    Shared clusters represent an excellent platform for the execution of parallel applications given their low price/performance ratio and the presence of cluster infrastructure in many organisations. The focus of recent research efforts are on parallelism management, transport and efficient access to resources, and making clusters easy to use. In this thesis, we examine reliable parallel computing on clusters. The aim of this research is to demonstrate the feasibility of developing an operating system facility providing transport fault tolerance using existing, enhanced and newly built operating system services for supporting parallel applications. In particular, we use existing process duplication and process migration services, and synthesise a group communications facility for use in a transparent checkpointing facility. This research is carried out using the methods of experimental computer science. To provide a foundation for the synthesis of the group communications and checkpointing facilities, we survey and review related work in both fields. For group communications, we examine the V Distributed System, the x-kernel and Psync, the ISIS Toolkit, and Horus. We identify a need for services that consider the placement of processes on computers in the cluster. For Checkpointing, we examine Manetho, KeyKOS, libckpt, and Diskless Checkpointing. We observe the use of remote computer memories for storing checkpoints, and the use of copy-on-write mechanisms to reduce the time to create a checkpoint of a process. We propose a group communications facility providing two sets of services: user-oriented services and system-oriented services. User-oriented services provide transparency and target application. System-oriented services supplement the user-oriented services for supporting other operating systems services and do not provide transparency. Additional flexibility is achieved by providing delivery and ordering semantics independently. An operating system facility providing transparent checkpointing is synthesised using coordinated checkpointing. To ensure a consistent set of checkpoints are generated by the facility, instead of blindly blocking the processes of a parallel application, only non-deterministic events are blocked. This allows the processes of the parallel application to continue execution during the checkpoint operation. Checkpoints are created by adapting process duplication mechanisms, and checkpoint data is transferred to remote computer memories and disk for storage using the mechanisms of process migration. The services of the group communications facility are used to coordinate the checkpoint operation, and to transport checkpoint data to remote computer memories and disk. Both the group communications facility and the checkpointing facility have been implemented in the GENESIS cluster operating system and provide proof-of-concept. GENESIS uses a microkernel and client-server based operating system architecture, and is demonstrated to provide an appropriate environment for the development of these facilities. We design a number of experiments to test the performance of both the group communications facility and checkpointing facility, and to provide proof-of-performance. We present our approach to testing, the challenges raised in testing the facilities, and how we overcome them. For group communications, we examine the performance of a number of delivery semantics. Good speed-ups are observed and system-oriented group communication services are shown to provide significant performance advantages over user-oriented semantics in the presence of packet loss. For checkpointing, we examine the scalability of the facility given different levels of resource usage and a variable number of computers. Low overheads are observed for checkpointing a parallel application. It is made clear by this research that the microkernel and client-server based cluster operating system provide an ideal environment for the development of a high performance group communications facility and a transparent checkpointing facility for generating a platform for reliable parallel computing on clusters

    Preserving message integrity in dynamic process migration

    Get PDF
    Processor and network management have a great impact on the performance of Distributed Memory Parallel Computers. Dynamic Process Migration allows load balancing and communication balancing at execution time. Managing the communications involving the migrating process is one of the problems that Dynamic Process Migration implies. To study this problem, which we have called the Message Integrity Problem, six algorithms have been analysed. These algorithms have been studied by sequential simulation, and have also been implemented in a parallel machine for different user process patterns in the presence of dynamic migration. To compare the algorithms, different performance parameters have been considered. The results obtained have given preliminary information about the algorithms behaviour, and have allowed us to perform an initial comparative evaluation among them.Facultad de Inform谩tic

    Ver枚ffentlichungen und Vortr盲ge 2003 der Mitgleider der Fakult盲t f眉r Informatik

    Get PDF

    Parallelization of the maximum likelihood approach to phylogenetic inference

    Get PDF
    Phylogenetic inference refers to the reconstruction of evolutionary relationships among various species, usually presented in the form of a tree. DNA sequences are most often used to determine these relationships. The results of phylogenetic inference have many important applications, including protein function determination, drug discovery, disease tracking and forensics. There are several popular computational methods used for phylogenetic inference, among them distance-based (i.e. neighbor joining), maximum parsimony, maximum likelihood, and Bayesian methods. This thesis focuses on the maximum likelihood method, which is regarded as one of the most accurate methods, with its computational demand being the main hindrance to its widespread use. Maximum likelihood is generally considered to be a heuristic method providing a statistical evaluation of the results, where potential tree topologies are judged by how well they predict the observed sequences. While there have been several previous efforts to parallelize the maximum likelihood method, sequential implementations are more widely used in the biological research community. This is due to a lack of confidence in the results produced by the more recent, parallel programs. However, because phylogenetic inference can be extremely computationally intensive, with the number of possible tree topologies growing exponentially with the number of species, parallelization is necessary to reduce the computation time to a reasonable amount. A parallel program was developed for phylogenetic inference based on the trusted algorithms of fastDNAml, a sequential program for phylogenetic inference utilizing the maximum likelihood approach. Parallelization is achieved using the popular master/workers scheme, where workers evaluate potential tree topologies in parallel. Three innovative optimizations are employed to alleviate the associated communication bottleneck encountered when using the master/workers technique with large-scale systems and problems. First, message packing reduces the number of messages sent out by the master, along with the associated overheads. Secondly, allowing workers to keep the best trees evaluated reduces the number of messages received by the master, as low-scoring results are discarded by the workers. Finally, multiple masters are utilized to parallelize the responsibilities of what is traditionally a single master process. These last two optimizations led to a dramatic improvement in performance over the unoptimized parallelization under the conditions tested. Message packing, however, demonstrated a slight reduction in performance. Although testing with large-scale systems and problems was not possible, results for all three optimizations suggested likely performance enhancement under such conditions, potentially leading to relief of the bottleneck
    corecore