Remora: implementing adaptive parallelism on a heterogeneous cluster of networked workstations
Computers connected to a local area network are often fully utilized only for short periods of time. In fact, most workstations are not used at all for a significant portion of the day. The combined "idle time" of the workstations on a network constitutes a significant computing resource, which is generally wasted. If harnessed properly, such a resource could constitute a cheap alternative to expensive high-performance computers. Adaptive parallelism refers to the parallel execution of a computation on a dynamically changing set of processors. This thesis investigates the viability of this approach as a vehicle to harness the "idle cycles" available on a heterogeneous cluster of networked computers. A system called Remora, which implements adaptive parallelism via the Linda programming paradigm, is presented. Experiments performed using Remora show that adaptive parallelism provides an efficient vehicle for using idle processor cycles, without having an adverse effect on the tasks which constitute the normal workload of the computers being used.
Transparent Adaptive Parallelism on NOWs using OpenMP
We present a system that allows OpenMP programs to execute on a network of workstations with a variable number of nodes. The ability to adapt to a variable number of nodes allows a program to take advantage of additional nodes that become available after it starts execution, or to gracefully scale down when the number of available nodes is reduced. We demonstrate that the cost of adaptation is modest; the system allows a program to adapt at a moderate rate without much performance loss. Two ideas underlie the efficiency of our design. First, we recognize that OpenMP programs exhibit convenient adaptation points during their execution, points at which the cost of adaptation can be much reduced. Second, by allowing a process a certain grace period before it must leave a node, we ensure that most adaptations can occur at these adaptation points, and thus at low cost. Migration of a process, a much more expensive method for providing adaptivity, is used only as a back-up solution, when the process cannot reach an adaptation point within the grace period. Our implementation consists of an OpenMP pre-processor that generates TreadMarks distributed shared memory (DSM) programs, and a version of TreadMarks modified to adapt to a variable number of nodes. Using a DSM as the underlying substrate facilitates the data (re-)distribution necessary after an adaptation.
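The grace-period rule in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the function name `handle_leave_request`, the representation of adaptation points as a list of timestamps, and the returned tuples are all assumptions made for illustration.

```python
def handle_leave_request(request_time, adaptation_points, grace_period):
    """Decide how a process vacates a node (hypothetical model).

    adaptation_points: sorted times at which the process would reach an
    adaptation point. Returns ("adapt", t) if one falls within the grace
    period after the request, otherwise ("migrate", deadline).
    """
    deadline = request_time + grace_period
    for t in adaptation_points:
        if request_time <= t <= deadline:
            # Cheap path: adapt at the first adaptation point in the window.
            return ("adapt", t)
    # Expensive fallback: no adaptation point reached before the deadline.
    return ("migrate", deadline)


# Example: adaptation points at t = 1, 4 and 9, grace period of 3 time units.
print(handle_leave_request(2.0, [1.0, 4.0, 9.0], 3.0))
print(handle_leave_request(5.0, [1.0, 4.0, 9.0], 3.0))
```

The first call finds an adaptation point inside the window and adapts there; the second finds none and falls back to migration at the deadline.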
Cooperating runtime systems in LiPS
Performing computation using networks of workstations is increasingly becoming an alternative to using a supercomputer. This approach is motivated by the vast quantities of unused idle time available in workstation networks. Unlike computing on a tightly coupled parallel computer, where a fixed number of processor nodes is used within a computation, the number of usable nodes in a workstation network is constantly changing over time. Additionally, workstations are more frequently subject to outages, e.g. due to reboots. The question arises how applications that adapt smoothly to this environment should be realized. LiPS is a system for distributed computing using idle cycles in networks of workstations. The system, in its version 2.3, is currently used at the Universität des Saarlandes in Saarbrücken, Germany, to run computationally intensive applications in the field of cryptography on a network of approximately 250 workstations, and should be enhanced to work within an environment of more than 1000 machines all over the world within the next few years. In this paper we present the runtime systems of LiPS along with performance measurements taken with the current LiPS development version 2.4.
An Evaluation of Adaptive Execution of OpenMP Task Parallel Programs
We present a system that allows task parallel OpenMP programs to execute on a network of workstations (NOW) with a variable number of nodes. Such adaptivity, generally called adaptive parallelism, is important in a multi-user NOW environment, enabling the system to expand the computation onto idle nodes or withdraw from otherwise occupied nodes. We focus on task parallel applications in this paper, but the system also lets data parallel applications run adaptively. When an adaptation is requested, we let all processes complete their current tasks, then the system executes an extra OpenMP join-fork sequence not present in the application code. Here, the system can change the number of nodes without involving the application, as processes do not have a compute-relevant private process state. We show that the cost of adaptation is low, and we explain why the cost is lower for task parallel applications than for data parallel applications.
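The "finish the current task, then join and fork with a new team size" idea can be sketched as a toy simulation. This is only an illustration under simplifying assumptions, not the system described: work proceeds in rounds, each worker completes exactly one task per round, and the names `run_adaptive` and `schedule` are hypothetical.

```python
from collections import deque


def run_adaptive(tasks, schedule, initial_workers):
    """Simulate task-parallel execution with a changing team size.

    schedule maps a round number to the team size that takes effect at the
    start of that round (standing in for the extra join-fork sequence).
    No task is interrupted: resizing happens only between rounds.
    Returns a log of (team_size, tasks_completed_in_round) pairs.
    """
    queue = deque(tasks)
    workers = initial_workers
    log = []
    round_no = 0
    while queue:
        if round_no in schedule:
            workers = schedule[round_no]  # adapt at the join-fork point
        if workers < 1:
            raise ValueError("team size must be at least 1")
        batch = [queue.popleft() for _ in range(min(workers, len(queue)))]
        log.append((workers, batch))
        round_no += 1
    return log


# Seven tasks; the team grows to 3 at round 1 and shrinks to 1 at round 2.
for team, done in run_adaptive(list(range(7)), {1: 3, 2: 1}, 2):
    print(team, done)
```

Because tasks are drawn from a shared queue and workers hold no private state between tasks, resizing the team in this model needs no data to be moved, which is the intuition behind the lower adaptation cost claimed for task parallel programs.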
Data Parallel Programming in an Adaptive Environment
For better utilization of computing resources, it is important to consider parallel programming environments in which the number of available processors varies at runtime. In this paper, we discuss runtime support for data parallel programming in such an adaptive environment. Executing data parallel programs in an adaptive environment requires redistributing data when the number of processors changes, and also requires determining new loop bounds and communication patterns for the new set of processors. We have developed a runtime library to provide this support. We discuss how the runtime library can be used by compilers to generate code for an adaptive environment. We also present performance results for a multiblock Navier-Stokes solver run on a network of workstations using PVM for message passing. Our experiments show that if the number of processors is not varied frequently, the cost of data redistribution is not significant compared to the time required for the actual computations. (Also cross-referenced as UMIACS-TR-94-109)
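To make "determining new loop bounds" and "redistributing data" concrete, here is a minimal sketch for a one-dimensional block distribution. It does not reproduce the runtime library described in the abstract; the block layout and the function names `block_bounds`, `owner`, and `redistribution_plan` are assumptions chosen for illustration.

```python
def block_bounds(n, p, rank):
    """Half-open range [lo, hi) of the n indices owned by `rank` among p
    processors; the first n % p ranks receive one extra element."""
    base, extra = divmod(n, p)
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi


def owner(n, p, i):
    """Rank that owns index i under the same block distribution."""
    base, extra = divmod(n, p)
    cutoff = extra * (base + 1)  # indices below this sit in the larger blocks
    if i < cutoff:
        return i // (base + 1)
    return extra + (i - cutoff) // base


def redistribution_plan(n, old_p, new_p):
    """List of (src, dst, lo, hi) transfers for every index range whose
    owner changes when the processor count goes from old_p to new_p."""
    plan = []
    for dst in range(new_p):
        lo, hi = block_bounds(n, new_p, dst)
        i = lo
        while i < hi:
            src = owner(n, old_p, i)
            _, src_hi = block_bounds(n, old_p, src)
            end = min(hi, src_hi)
            if src != dst:
                plan.append((src, dst, i, end))
            i = end
    return plan


# Shrinking from 4 processors to 2 for a 10-element array.
print([block_bounds(10, 2, r) for r in range(2)])
print(redistribution_plan(10, 4, 2))
```

Each surviving processor would use `block_bounds` for its new loop range, while the plan lists which index ranges have to be sent between old and new owners.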
- …