3 research outputs found

    Domain Decomposition vs. Master-Slave in Apparently Homogeneous Systems

    Decentralized Scheduling for Many-Task Applications in the Hybrid Cloud

    While cloud computing has transformed how we solve many computing tasks, some scientific and many-task applications are not efficiently executed on cloud resources. Decentralized scheduling, as studied in grid computing, can provide a scalable system to organize cloud resources and schedule a variety of work. By measuring simulations of two algorithms, the fully decentralized Organic Grid and the partially decentralized Air Traffic Controller from IBM, we establish that decentralization is a workable approach, and that there are bottlenecks that can impact partially centralized algorithms. Through measurements in the cloud, we verify that our simulation approach is sound, and assess the variable performance of cloud resources. We propose a scheduler that measures the capabilities of the resources available to execute a task and distributes work dynamically at run time. Our scheduling algorithm is evaluated experimentally, and we show that performance-aware scheduling in a cloud environment can provide improvements in execution time. This provides a framework by which a variety of parameters can be weighed to make job-specific and context-aware scheduling decisions. Our measurements examine the usefulness of benchmarking as a metric to measure a node's performance and to drive scheduling. Benchmarking provides an advantage over simple queue-based scheduling on distributed systems whose members vary in actual performance, but the NAS benchmark we use does not always correlate perfectly with actual performance. The utilized hardware is examined, as are enforced performance variations, and we observe the changes in performance that result from running on a system in which different workers receive different CPU allocations. As we see that performance metrics are useful near the end of the execution of a large job, we create a new metric from historical data of partially completed work, and use that to drive execution time down further.
Interdependent task graph work is introduced and described as a next step in improving cloud scheduling. Realistic task graph problems are defined and a scheduling approach is introduced. This dissertation lays the groundwork to expand the types of problems that can be solved efficiently in the cloud environment.
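
The idea of benchmark-driven scheduling described in the abstract above can be illustrated with a minimal sketch. This is not the dissertation's scheduler: the function names, the use of a single per-worker speed score as a stand-in for a benchmark result, and the round-robin baseline standing in for "simple queue-based scheduling" are all assumptions made for illustration only.

```python
def schedule_performance_aware(task_costs, worker_speeds):
    """Greedy sketch: give each task to the worker expected to finish it first.

    task_costs    -- hypothetical work units per task
    worker_speeds -- hypothetical benchmark scores (work units per second)
    Returns (assignment, makespan).
    """
    finish = [0.0] * len(worker_speeds)  # projected busy-until time per worker
    assignment = []
    for cost in task_costs:
        # Estimate when each worker would complete this task; ties go to the lowest index.
        best = min(
            range(len(worker_speeds)),
            key=lambda w: finish[w] + cost / worker_speeds[w],
        )
        finish[best] += cost / worker_speeds[best]
        assignment.append(best)
    return assignment, max(finish)


def schedule_round_robin(task_costs, worker_speeds):
    """Baseline sketch (assumed stand-in): hand out tasks in turn, ignoring speed."""
    finish = [0.0] * len(worker_speeds)
    assignment = []
    for i, cost in enumerate(task_costs):
        w = i % len(worker_speeds)
        finish[w] += cost / worker_speeds[w]
        assignment.append(w)
    return assignment, max(finish)


if __name__ == "__main__":
    costs = [4.0] * 6      # six equal tasks
    speeds = [1.0, 2.0]    # the second worker benchmarks twice as fast
    print(schedule_round_robin(costs, speeds))
    print(schedule_performance_aware(costs, speeds))
```

With two workers of unequal speed, the speed-aware variant shifts more tasks to the faster worker and shortens the overall completion time in this toy case. Real benchmark scores, as the abstract notes, do not always correlate perfectly with actual performance, so a sketch like this overstates how reliable the estimate is.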

    Adaptive Parallelism under Equus

    This paper describes adaptively parallel computations under Equus. These computations execute on a processor pool, and expand and contract as the number of processor nodes allocated to them varies over their run-time. They are based upon a hierarchical master-worker structure. The number of worker processes changes with the number of allocated nodes, and so does the number of processes that act as servers to them (such as the masters). The paper uses an image-processing example to describe how workers and servers are added and withdrawn at run-time. The affected processes are synchronised, communication linkages are changed, and in some cases state is transferred between them. Reconfigurations are transparent to worker (and other client) processes, but not all can be made transparent to servers. The paper concludes by discussing the Equus techniques and outlining future work. 1: Introduction This paper describes the design of reconfigurable distributed computations (RDCs) that execute..