Tighter bound for MULTIFIT scheduling on uniform processors
We examine one of the basic, well-studied problems of scheduling theory: the nonpreemptive assignment of independent tasks to m parallel processors with the objective of minimizing the makespan. Because this problem is NP-complete and apparently intractable in general, much effort has been directed toward devising fast algorithms that find near-optimal schedules. Two well-known heuristic algorithms, LPT (largest processing time first) and MULTIFIT (MF for short), find schedules whose makespans are within factors of 4/3 and 13/11, respectively, of the minimum possible makespan when the m parallel processors are identical. If the processors are uniform, the best worst-case performance ratio bounds known are 1.583 and 1.40, respectively. In this paper we tighten the bound for the MF algorithm on the uniform-processor system to 1.382. On the basis of some of our general results and other investigations, we conjecture that the bound could be tightened further to 1.366.
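For orientation, the classical MULTIFIT idea for identical processors can be sketched as a binary search over a common bin capacity, using first-fit decreasing (FFD) packing as the feasibility test. This is a minimal illustration of the textbook algorithm only, not the paper's uniform-processor analysis; the function names and the fixed iteration count are our own choices.

```python
def ffd_fits(tasks, m, capacity):
    """First-Fit Decreasing: pack tasks into bins of size `capacity`,
    largest task first; feasible if at most m bins are opened."""
    bins = []
    for t in sorted(tasks, reverse=True):
        for i, load in enumerate(bins):
            if load + t <= capacity:
                bins[i] = load + t
                break
        else:
            bins.append(t)
    return len(bins) <= m

def multifit(tasks, m, iterations=20):
    """Binary-search the capacity between trivial lower and upper bounds;
    return the smallest capacity found for which FFD fits into m bins."""
    lo = max(max(tasks), sum(tasks) / m)       # no schedule can beat this
    hi = max(max(tasks), 2 * sum(tasks) / m)   # FFD always fits here
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if ffd_fits(tasks, m, mid):
            hi = mid
        else:
            lo = mid
    return hi
```

On the uniform-processor variant studied in the paper, each processor i of speed s_i would instead get its own bin capacity proportional to s_i; the analysis of how far the returned capacity can exceed the optimal makespan is what the 1.40 and 1.382 bounds quantify.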
Parallel memetic algorithms for independent job scheduling in computational grids
In this chapter we present parallel implementations of Memetic Algorithms (MAs) for the problem of scheduling independent jobs in computational grids. Scheduling in computational grids is known to be computationally demanding. In this work we exploit the intrinsic parallel nature of MAs as well as the fact that computational grids offer a large amount of resources, part of which could be used to compute an efficient allocation of jobs to grid resources.
The parallel models exploited in this work for MAs include both fine-grained and coarse-grained parallelization and their hybridization. The resulting schedulers have been tested on different grid scenarios generated by a grid simulator to match different possible configurations of computational grids in terms of size (number of jobs and resources) and computational characteristics of resources. All in all, the results of this work show that parallel MAs are a very good alternative for meeting different performance requirements on fast scheduling of jobs to grid resources.
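A memetic algorithm combines evolutionary search with local refinement of each offspring. The following is a minimal sequential sketch for independent-job scheduling, assuming an ETC-style matrix `etc[j][r]` (expected time to compute job j on resource r) and a makespan objective; the parallel fine-grained and coarse-grained variants described above distribute exactly these steps across processes. All names, the neighborhood, and the replacement policy are our illustrative choices, not the chapter's implementation.

```python
import random

def makespan(assign, etc):
    """Makespan of an assignment: job j runs on resource assign[j]."""
    loads = [0.0] * len(etc[0])
    for j, r in enumerate(assign):
        loads[r] += etc[j][r]
    return max(loads)

def local_search(assign, etc):
    """Hill-climb: repeatedly move a single job to another resource
    while this strictly reduces the makespan."""
    best, best_ms = list(assign), makespan(assign, etc)
    improved = True
    while improved:
        improved = False
        for j in range(len(best)):
            for r in range(len(etc[0])):
                cand = list(best)
                cand[j] = r
                ms = makespan(cand, etc)
                if ms < best_ms:
                    best, best_ms, improved = cand, ms, True
    return best

def memetic(etc, pop_size=20, gens=50, seed=0):
    """Minimal memetic loop: one-point crossover, point mutation,
    then local search on the child (the 'memetic' refinement step)."""
    rng = random.Random(seed)
    n, m = len(etc), len(etc[0])
    pop = [[rng.randrange(m) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(gens):
        a, b = rng.sample(pop, 2)
        cut = rng.randrange(1, n)
        child = a[:cut] + b[cut:]
        child[rng.randrange(n)] = rng.randrange(m)   # point mutation
        child = local_search(child, etc)             # memetic refinement
        pop.sort(key=lambda s: makespan(s, etc))
        if makespan(child, etc) < makespan(pop[-1], etc):
            pop[-1] = child                          # replace the worst
    return min(pop, key=lambda s: makespan(s, etc))
```

In a coarse-grained parallelization, each process would run this loop on its own subpopulation and periodically exchange migrants; in a fine-grained one, the local-search evaluations themselves are spread over processors.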
A Three-Level Parallelisation Scheme and Application to the Nelder-Mead Algorithm
We consider a three-level parallelisation scheme. The second and third levels
define a classical two-level parallelisation scheme and some load balancing
algorithm is used to distribute tasks among processes. It is well-known that
for many applications the efficiency of parallel algorithms of the second and
third level starts to drop once a critical parallelisation degree is
reached. This weakness of the two-level template is addressed by introducing
one additional parallelisation level. As an alternative to the basic solver,
some new or modified algorithms are considered on this level. The idea of the
proposed methodology is to increase the parallelisation degree by using less
efficient algorithms in comparison with the basic solver. As an example we
investigate two modified Nelder-Mead methods. For the selected application, a
few partial differential equations are solved numerically on the second level,
and on the third level Wang's parallel algorithm is used to solve systems
of linear equations with tridiagonal matrices. A greedy workload balancing
heuristic is proposed, which is oriented to the case of a large number of
available processors. The complexity estimates of the computational tasks are
model-based, i.e. they use empirical computational data.
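A greedy workload balancing heuristic of the kind described above can be sketched in the LPT spirit: take tasks in decreasing order of their (model-based) cost estimates and always place the next one on the currently least-loaded processor. This is a generic sketch under our own naming, not the paper's exact heuristic; `task_costs` stands in for its empirical complexity estimates.

```python
import heapq

def greedy_balance(task_costs, p):
    """Assign tasks (name -> estimated cost) to p processors:
    largest estimated task first, onto the least-loaded processor."""
    heap = [(0.0, i) for i in range(p)]     # (current load, processor id)
    heapq.heapify(heap)
    assignment = {}
    for task, cost in sorted(task_costs.items(), key=lambda kv: -kv[1]):
        load, i = heapq.heappop(heap)       # least-loaded processor
        assignment[task] = i
        heapq.heappush(heap, (load + cost, i))
    # predicted makespan is the largest resulting load
    return assignment, max(load for load, _ in heap)
```

The heap makes each placement O(log p), which matches the paper's orientation toward a large number of available processors; the quality of the balance then depends entirely on how accurate the model-based cost estimates are.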
Pipelining the Fast Multipole Method over a Runtime System
Fast Multipole Methods (FMM) are a fundamental operation for the simulation
of many physical problems. The high performance design of such methods usually
requires carefully tuning the algorithm for both the targeted physics and the
hardware. In this paper, we propose a new approach that achieves high
performance across architectures. Our method consists of expressing the FMM
algorithm as a task flow and employing a state-of-the-art runtime system,
StarPU, in order to process the tasks on the different processing units. We
carefully design the task flow, the mathematical operators, their Central
Processing Unit (CPU) and Graphics Processing Unit (GPU) implementations, as
well as scheduling schemes. We compute the potentials and forces of 200
million particles in 48.7 seconds on a homogeneous 160-core SGI Altix UV 100
and of 38 million particles in 13.34 seconds on a heterogeneous 12-core Intel
Nehalem processor enhanced with 3 Nvidia M2090 Fermi GPUs. Comment: Research
Report No. RR-7981 (2012).
Multiprocessor task scheduling in multistage hybrid flowshops: a genetic algorithm approach
This paper considers multiprocessor task scheduling in a multistage hybrid flow-shop environment. The objective is to minimize the makespan, that is, the completion time of all the tasks in the last stage. This problem is of practical interest in the textile and process industries. A genetic algorithm (GA) is developed to solve the problem. The GA is tested against a lower bound from the literature as well as against heuristic rules on a test bed comprising 400 problems with up to 100 jobs, 10 stages, and up to five processors at each stage. For small problems, solutions found by the GA are compared to optimal solutions, which are obtained by total enumeration. For larger problems, optimum solutions are estimated by a statistical prediction technique. Computational results show that the GA is both effective and efficient for the current problem. Test problems are provided on a web site at www.benchmark.ibu.edu.tr/mpt-h; fsp
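A GA for this kind of problem typically evolves job sequences and evaluates each one with a schedule decoder. The sketch below is a simplified decoder, assuming every task occupies a single processor at each stage (the paper's multiprocessor tasks would reserve several processors simultaneously) and that jobs enter every stage in the same sequence order; the names and the earliest-free-processor rule are our illustrative choices.

```python
def flowshop_makespan(sequence, proc_time, n_procs):
    """Decode a job sequence into a hybrid-flowshop schedule.
    proc_time[j][s]: processing time of job j at stage s;
    n_procs[s]: number of identical parallel processors at stage s.
    Each job starts at a stage once it has left the previous stage
    and the earliest-free processor of that stage becomes available."""
    stages = len(n_procs)
    free = [[0.0] * n_procs[s] for s in range(stages)]  # processor free times
    done = {j: 0.0 for j in sequence}                   # job completion times
    for s in range(stages):
        for j in sequence:
            i = min(range(n_procs[s]), key=lambda k: free[s][k])
            start = max(free[s][i], done[j])
            done[j] = start + proc_time[j][s]
            free[s][i] = done[j]
    return max(done.values())   # makespan: last completion at the last stage
```

The GA's chromosome would be `sequence`, with crossover and mutation operating on the permutation, and this decoder supplying the fitness that the paper compares against its lower bound.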