Search CORE

12,676 research outputs found

On contiguous and non-contiguous parallel task scheduling

Author
Publication venue: Springer
Publication date
Field of study

An efficient processor allocation strategy that maintains a high degree of contiguity among processors in 2D mesh connected multicomputers

Author: Abaneh I.
Bani-Mohammad S.
Mackenzie L.M.
Ould-Khaoua M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2007
Field of study

Two strategies are used for the allocation of jobs to processors connected by mesh topologies: contiguous allocation and non-contiguous allocation. In non-contiguous allocation, a job request can be split into smaller parts that are allocated to non-adjacent free sub-meshes rather than always waiting until a single sub-mesh of the requested size and shape is available. Lifting the contiguity condition is expected to reduce processor fragmentation and increase system utilization. However, the distances traversed by messages can be long, and as a result the communication overhead, especially contention, is increased. The extra communication overhead depends on how the allocation request is partitioned and assigned to free sub-meshes. This paper presents a new Non-contiguous allocation algorithm, referred to as Greedy-Available-Busy-List (GABL for short), which can decrease the communication overhead among processors allocated to a given job. The simulation results show that the new strategy can reduce the communication overhead and substantially improve performance in terms of parameters such as job turnaround time and system utilization. Moreover, the results reveal that the Shortest-Service-Demand-First (SSD) scheduling strategy is much better than the First-Come-First-Served (FCFS) scheduling strategy

Crossref

Enlighten

A performance comparison of the contiguous allocation strategies in 3D mesh connected multicomputers

Author: B.-S. Yoo
C. Peterson
G.-M. Chiu
H. Choo
I. Ababneh
I. Ababneh
J. Wei
K. Windisch
K.-H. Seo
K.-H. Seo
L. He
M. Harchol-Balter
S. Bani-Mohammad
S. Bani-Mohammad
V. Tabatabaee
W. Athas
Y. Zhu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007
Field of study

The performance of contiguous allocation strategies can be significantly affected by the distribution of job execution times. In this paper, the performance of the existing contiguous allocation strategies for 3D mesh multicomputers is re-visited in the context of heavy-tailed distributions (e.g., a Bounded Pareto distribution). The strategies are evaluated and compared using simulation experiments for both First-Come-First-Served (FCFS) and Shortest-Service-Demand (SSD) scheduling strategies under a variety of system loads and system sizes. The results show that the performance of the allocation strategies degrades considerably when job execution times follow a heavy-tailed distribution. Moreover, SSD copes much better than FCFS scheduling strategy in the presence of heavy-tailed job execution times. The results also show that the strategies that depend on a list of allocated sub-meshes for both allocation and deallocation have lower allocation overhead and deliver good system performance in terms of average turnaround time and mean system utilization

Crossref

Enlighten

The effect of real workloads and stochastic workloads on the performance of allocation and scheduling algorithms in 2D mesh multicomputers

Author: Abaneh I.
Bani-Mohammad S.
Ferguson J.D.
Mackenzie L.M.
Ould-Khaoua M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2008
Field of study

The performance of the existing non-contiguous processor allocation strategies has been traditionally carried out by means of simulation based on a stochastic workload model to generate a stream of incoming jobs. To validate the performance of the existing algorithms, there has been a need to evaluate the algorithms' performance based on a real workload trace. In this paper, we evaluate the performance of several well-known processor allocation and job scheduling strategies based on a real workload trace and compare the results against those obtained from using a stochastic workload. Our results reveal that the conclusions reached on the relative performance merits of the allocation strategies when a real workload trace is used are in general compatible with those obtained when a stochastic workload is used

Crossref

Enlighten

Bounding Cache Miss Costs of Multithreaded Computations Under General Schedulers

Author: Cole Richard
Ramachandran Vijaya
Publication venue
Publication date: 28/09/2017
Field of study

We analyze the caching overhead incurred by a class of multithreaded algorithms when scheduled by an arbitrary scheduler. We obtain bounds that match or improve upon the well-known

O(Q+S \cdot (M/B))

caching cost for the randomized work stealing (RWS) scheduler, where

S

is the number of steals,

Q

is the sequential caching cost, and

M

and

B

are the cache size and block (or cache line) size respectively.Comment: Extended abstract in Proceedings of ACM Symp. on Parallel Alg. and Architectures (SPAA) 2017, pp. 339-350. This revision has a few small updates including a missing citation and the replacement of some big Oh terms with precise constant

arXiv.org e-Print Archive

Crossref

A reclaimer scheduling problem arising in coal stockyard management

Author: Angelelli Enrico
Kalinowski Thomas
Kapoor Reena
Savelsbergh Martin W. P.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015
Field of study

We study a number of variants of an abstract scheduling problem inspired by the scheduling of reclaimers in the stockyard of a coal export terminal. We analyze the complexity of each of the variants, providing complexity proofs for some and polynomial algorithms for others. For one, especially interesting variant, we also develop a constant factor approximation algorithm.Comment: 26 page

arXiv.org e-Print Archive

University of Newcastle's Digital Repository

Crossref

Archivio istituzionale della ricerca - Università di Brescia

Parallel, iterative solution of sparse linear systems: Models and architectures

Author: Patrick M. L.
Reed D. A.
Publication venue
Publication date
Field of study

A model of a general class of asynchronous, iterative solution methods for linear systems is developed. In the model, the system is solved by creating several cooperating tasks that each compute a portion of the solution vector. A data transfer model predicting both the probability that data must be transferred between two tasks and the amount of data to be transferred is presented. This model is used to derive an execution time model for predicting parallel execution time and an optimal number of tasks given the dimension and sparsity of the coefficient matrix and the costs of computation, synchronization, and communication. The suitability of different parallel architectures for solving randomly sparse linear systems is discussed. Based on the complexity of task scheduling, one parallel architecture, based on a broadcast bus, is presented and analyzed

NASA Technical Reports Server

Closing the Gap for Pseudo-Polynomial Strip Packing

Author: Jansen Klaus
Rau Malin
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 27th Annual European Symposium on Algorithms (ESA 2019)
Publication date: 01/01/2019
Field of study

Two-dimensional packing problems are a fundamental class of optimization problems and Strip Packing is one of the most natural and famous among them. Indeed it can be defined in just one sentence: Given a set of rectangular axis parallel items and a strip with bounded width and infinite height, the objective is to find a packing of the items into the strip minimizing the packing height. We speak of pseudo-polynomial Strip Packing if we consider algorithms with pseudo-polynomial running time with respect to the width of the strip. It is known that there is no pseudo-polynomial time algorithm for Strip Packing with a ratio better than 5/4 unless P = NP. The best algorithm so far has a ratio of 4/3 + epsilon. In this paper, we close the gap between inapproximability result and currently known algorithms by presenting an algorithm with approximation ratio 5/4 + epsilon. The algorithm relies on a new structural result which is the main accomplishment of this paper. It states that each optimal solution can be transformed with bounded loss in the objective such that it has one of a polynomial number of different forms thus making the problem tractable by standard techniques, i.e., dynamic programming. To show the conceptual strength of the approach, we extend our result to other problems as well, e.g., Strip Packing with 90 degree rotations and Contiguous Moldable Task Scheduling, and present algorithms with approximation ratio 5/4 + epsilon for these problems as well

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server