104 research outputs found

    Management of SPMD based parallel processing on clusters of workstations

    Full text link
    Current attempts to manage parallel applications on Clusters of Workstations (COWs) have either generally followed the parallel execution environment approach or been extensions to existing network operating systems, both of which do not provide complete or satisfactory solutions. The efficient and transparent management of parallelism within the COW environment requires enhanced methods of process instantiation, mapping of parallel process to workstations, maintenance of process relationships, process communication facilities, and process coordination mechanisms. The aim of this research is to synthesise, design, develop and experimentally study a system capable of efficiently and transparently managing SPMD parallelism on a COW. This system should both improve the performance of SPMD based parallel programs and relieve the programmer from the involvement into parallelism management in order to allow them to concentrate on application programming. It is also the aim of this research to show that such a system, to achieve these objectives, is best achieved by adding new special services and exploiting the existing services of a client/server and microkernel based distributed operating system. To achieve these goals the research methods of the experimental computer science should be employed. In order to specify the scope of this project, this work investigated the issues related to parallel processing on COWs and surveyed a number of relevant systems including PVM, NOW and MOSIX. It was shown that although the MOSIX system provide a number of good services related to parallelism management, none of the system forms a complete solution. The problems identified with these systems include: instantiation services that are not suited to parallel processing; duplication of services between the parallelism management environment and the operating system; and poor levels of transparency. A high performance and transparent system capable of managing the execution of SPMD parallel applications was synthesised and the specific services of process instantiation, process mapping and process interaction detailed. The process instantiation service designed here provides the capability to instantiate parallel processes using either creation or duplication methods and also supports multiple and group based instantiation which is specifically design for SPMD parallel processing. The process mapping service provides the combination of process allocation and dynamic load balancing to ensure the load of a COW remains balanced not only at the time a parallel program is initialised but also during the execution of the program. The process interaction service guarantees to maintain transparently process relationships, communications and coordination services between parallel processes regardless of their location within the COW. The combination of these services provides an original architecture and organisation of a system that is capable of fully managing the execution of SPMD parallel applications on a COW. A logical design of a parallelism management system was developed derived from the synthesised system and was shown that it should ideally be based on a distributed operating system employing the client server model. The client/server based distributed operating system provides the level of transparency, modularity and flexibility necessary for a complete parallelism management system. The services identified in the synthesised system have been mapped to a set of server processes including: Process Instantiation Server providing advanced multiple and group based process creation and duplication; Process Mapping Server combining load collection, process allocation and dynamic load balancing services; and Process Interaction Server providing transparent interprocess communication and coordination. A Process Migration Server was also identified as vital to support both the instantiation and mapping servers. The RHODOS client/server and microkernel based distributed operating system was selected to carry out research into the detailed design and to be used for the implementation this parallelism management system. RHODOS was enhanced to provide the required servers and resulted in the development of the REX Manager, Global Scheduler and Process Migration Manager to provide the services of process instantiation, mapping and migration, respectively. The process interaction services were already provided within RHODOS and only required some extensions to the existing Process Manager and IPC Managers. Through a variety of experiments it was shown that when this system was used to support the execution of SPMD parallel applications the overall execution times were improved, especially when multiple and group based instantiation services are employed. The RHODOS PMS was also shown to greatly reduce the programming burden experienced by users when writing SPMD parallel applications by providing a small set of powerful primitives specially designed to support parallel processing. The system was also shown to be applicable and has been used in a variety of other research areas such as Distributed Shared Memory, Parallelising Compilers and assisting the port of PVM to the RHODOS system. The RHODOS Parallelism Management System (PMS) provides a unique and creative solution to the problem of transparently and efficiently controlling the execution of SPMD parallel applications on COWs. Combining advanced services such as multiple and group based process creation and duplication; combined process allocation and dynamic load balancing; and complete COW wide transparency produces a totally new system that addresses many of the problems not addressed in other systems

    Platform for reliable computing on clusters using group communications

    Full text link
    Shared clusters represent an excellent platform for the execution of parallel applications given their low price/performance ratio and the presence of cluster infrastructure in many organisations. The focus of recent research efforts are on parallelism management, transport and efficient access to resources, and making clusters easy to use. In this thesis, we examine reliable parallel computing on clusters. The aim of this research is to demonstrate the feasibility of developing an operating system facility providing transport fault tolerance using existing, enhanced and newly built operating system services for supporting parallel applications. In particular, we use existing process duplication and process migration services, and synthesise a group communications facility for use in a transparent checkpointing facility. This research is carried out using the methods of experimental computer science. To provide a foundation for the synthesis of the group communications and checkpointing facilities, we survey and review related work in both fields. For group communications, we examine the V Distributed System, the x-kernel and Psync, the ISIS Toolkit, and Horus. We identify a need for services that consider the placement of processes on computers in the cluster. For Checkpointing, we examine Manetho, KeyKOS, libckpt, and Diskless Checkpointing. We observe the use of remote computer memories for storing checkpoints, and the use of copy-on-write mechanisms to reduce the time to create a checkpoint of a process. We propose a group communications facility providing two sets of services: user-oriented services and system-oriented services. User-oriented services provide transparency and target application. System-oriented services supplement the user-oriented services for supporting other operating systems services and do not provide transparency. Additional flexibility is achieved by providing delivery and ordering semantics independently. An operating system facility providing transparent checkpointing is synthesised using coordinated checkpointing. To ensure a consistent set of checkpoints are generated by the facility, instead of blindly blocking the processes of a parallel application, only non-deterministic events are blocked. This allows the processes of the parallel application to continue execution during the checkpoint operation. Checkpoints are created by adapting process duplication mechanisms, and checkpoint data is transferred to remote computer memories and disk for storage using the mechanisms of process migration. The services of the group communications facility are used to coordinate the checkpoint operation, and to transport checkpoint data to remote computer memories and disk. Both the group communications facility and the checkpointing facility have been implemented in the GENESIS cluster operating system and provide proof-of-concept. GENESIS uses a microkernel and client-server based operating system architecture, and is demonstrated to provide an appropriate environment for the development of these facilities. We design a number of experiments to test the performance of both the group communications facility and checkpointing facility, and to provide proof-of-performance. We present our approach to testing, the challenges raised in testing the facilities, and how we overcome them. For group communications, we examine the performance of a number of delivery semantics. Good speed-ups are observed and system-oriented group communication services are shown to provide significant performance advantages over user-oriented semantics in the presence of packet loss. For checkpointing, we examine the scalability of the facility given different levels of resource usage and a variable number of computers. Low overheads are observed for checkpointing a parallel application. It is made clear by this research that the microkernel and client-server based cluster operating system provide an ideal environment for the development of a high performance group communications facility and a transparent checkpointing facility for generating a platform for reliable parallel computing on clusters

    Automated parallel application creation and execution tool for clusters

    Full text link
    This research investigated an automated approach to re-writing traditional sequential computer programs into parallel programs for networked computers. A tool was designed and developed for generating parallel programs automatically and also executing these parallel programs on a network of computers. Performance is maximized by utilising all idle resources

    Single system image: A survey

    Get PDF
    Single system image is a computing paradigm where a number of distributed computing resources are aggregated and presented via an interface that maintains the illusion of interaction with a single system. This approach encompasses decades of research using a broad variety of techniques at varying levels of abstraction, from custom hardware and distributed hypervisors to specialized operating system kernels and user-level tools. Existing classification schemes for SSI technologies are reviewed, and an updated classification scheme is proposed. A survey of implementation techniques is provided along with relevant examples. Notable deployments are examined and insights gained from hands-on experience are summarized. Issues affecting the adoption of kernel-level SSI are identified and discussed in the context of technology adoption literature

    Selected aspects of the development of the RHODOS naming facility

    Full text link

    Towards An Integrated Tourism Landscape Analysis And Evaluation. The Case Study Of Lindos (Rhodes, Greece)

    Get PDF
    This PhD thesis aims at responding to the emerging need for theoretical and methodological deepening of the multiple relationships developed between tourism and landscape, attempting, on the one hand to provide an appropriate conceptualization of the term “tourism landscape” and, on the other hand, a methodological framework for its integrated analysis and evaluation. In order to achieve this aim, a deep bibliographic research was performed, exploring research paradigms from various disciplinary fields, such as spatial sciences, landscape ecology, environmental psychology and cultural geography. On the basis of these bibliographic references, a theoretical and methodological framework has been developed, built upon three key factors of the tourism landscape (contextual factors, landscape character and mental images). In order to apply and verify this framework, the case study of Lindos has been adopted. The main reason for which Lindos area has been chosen as a unit of analysis is that the landscape of Lindos belongs to the characteristic typology of the Aegean coastal landscape, where the protected image of traditional settlements paradoxically coexists with great landscape transformations due to tourism development. Research methods used include both those aiming at the acquisition and analysis of objective data based on expert-based techniques (such as remote sensing, statistical analysis of primary and secondary data, application of indicators) as well as those focusing on the acquisition and analysis of subjective data through a questionnaire-based survey from which primary data have been collected from a considerable number of tourists. Using these methods, in the first place it has been possible to draw significant results concerning the case study which have allowed formulating some suggestions for future landscape management; in the second place, from this study, general considerations about the limits and potentialities of an integrated research on tourism landscape have been emerged

    Cells in Space

    Get PDF
    Discussions and presentations addressed three aspects of cell research in space: the suitability of the cell as a subject in microgravity experiments, the requirements for generic flight hardware to support cell research, and the potential for collaboration between academia, industry, and government to develop these studies in space. Synopses are given for the presentations and follow-on discussions at the conference and papers are presented from which the presentations were based. An Executive Summary outlines the recommendations and conclusions generated at the conference
    corecore