Ankara : The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent Univ., 2013.
Thesis (Ph. D.) -- Bilkent University, 2013.
Includes bibliographical references leaves 136-150.

We study the problem of assigning nonuniform tasks onto heterogeneous systems.
We investigate two distinct problems in this context. The first problem is the
one-dimensional partitioning of nonuniform workload arrays with optimal load
balancing. The second problem is the assignment of nonuniform independent
tasks onto heterogeneous systems.
For one-dimensional partitioning of nonuniform workload arrays, we investigate
two cases: chain-on-chain partitioning (CCP), where the order of the processors
is specified, and chain partitioning (CP), where processor permutation
is allowed. We present polynomial-time algorithms to solve the CCP problem
optimally, while we prove that the CP problem is NP-complete. Our empirical
studies show that our proposed exact algorithms for the CCP problem produce
substantially better results than the state-of-the-art heuristics while the solution
times remain comparable.
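
To make the CCP setting concrete, the following is a minimal sketch of a textbook dynamic program for chain-on-chain partitioning with a fixed processor order. It is not one of the exact algorithms proposed in the thesis; the function name `ccp_bottleneck`, the `speeds` parameter, and the cost model (load divided by processor speed) are assumptions made for illustration only.

```python
from itertools import accumulate


def ccp_bottleneck(weights, speeds):
    """Return the optimal bottleneck value for a chain-on-chain partition.

    weights: task weights in chain order (assumed non-negative).
    speeds:  processor speeds in their fixed order (assumed positive).
    Processor p receives a contiguous, possibly empty, block of tasks,
    and its cost is (sum of its block) / speeds[p].
    Illustrative O(K * N^2) dynamic program, not the thesis's algorithm.
    """
    n = len(weights)
    prefix = [0] + list(accumulate(weights))
    inf = float("inf")
    # best[j]: optimal bottleneck for the first j tasks using the
    # processors seen so far. With no processors, only j = 0 is feasible.
    best = [0.0] + [inf] * n
    for s in speeds:
        new = [inf] * (n + 1)
        for j in range(n + 1):
            for i in range(j + 1):
                # Tasks i..j-1 go to the current processor.
                cost = max(best[i], (prefix[j] - prefix[i]) / s)
                if cost < new[j]:
                    new[j] = cost
        best = new
    return best[n]


if __name__ == "__main__":
    tasks = [4, 2, 6, 3, 5]
    print(ccp_bottleneck(tasks, [1, 1]))
    print(ccp_bottleneck(tasks, [1, 3]))
    print(ccp_bottleneck(tasks, [3, 1]))
```

In this sketch, swapping the two speeds changes the optimal bottleneck for the same task chain, which is why allowing processor permutation (the CP problem) is a different and harder problem than CCP.
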
For the independent task assignment problem, we investigate improving the
performance of the well-known and widely used constructive heuristics MinMin,
MaxMin and Sufferage. All three heuristics are known to run in O(KN^2) time in
assigning N tasks to K processors. In this thesis, we present our work on an algorithmic
improvement that asymptotically decreases the running time complexity
of MinMin to O(KN log N) without affecting its solution quality. Furthermore,
we combine the newly proposed MinMin algorithm with MaxMin as well as Sufferage,
obtaining two hybrid algorithms. The motivation behind the former hybrid
algorithm is to address the drawback of MaxMin in solving problem instances
with highly skewed cost distributions while also improving the running time performance
of MaxMin. The latter hybrid algorithm improves the running time
performance of Sufferage without degrading its solution quality. The proposed
algorithms are easy to implement, and we illustrate them with detailed pseudocode.
The experimental results over a large number of real-life datasets show
that the proposed fast MinMin algorithm and the proposed hybrid algorithms
perform significantly better than their traditional counterparts as well as more
recent state-of-the-art assignment heuristics. For the large datasets used in the
experiments, MinMin, MaxMin, and Sufferage, as well as recent state-of-the-art
heuristics, require days, weeks, or even months to produce a solution, whereas all
of the proposed algorithms produce solutions within only two or three minutes.
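
For reference, the traditional MinMin heuristic that serves as the baseline above can be sketched as follows. This is the classic O(KN^2) formulation, not the O(KN log N) variant proposed in the thesis; the function name `min_min` and the `etc` matrix layout (expected time of task t on processor p) are assumptions for illustration.

```python
def min_min(etc):
    """Classic MinMin constructive heuristic (illustrative sketch).

    etc[t][p]: expected execution time of task t on processor p.
    Returns (assignment, makespan), where assignment[t] is the
    processor chosen for task t.
    Each of the N rounds scans all unassigned tasks and all K
    processors, giving O(K * N^2) time overall.
    """
    n = len(etc)
    k = len(etc[0]) if n else 0
    ready = [0] * k            # current finish time of each processor
    assignment = [None] * n
    done = [False] * n
    for _ in range(n):
        best = None            # (finish_time, task, processor)
        for t in range(n):
            if done[t]:
                continue
            for p in range(k):
                finish = ready[p] + etc[t][p]
                if best is None or finish < best[0]:
                    best = (finish, t, p)
        finish, t, p = best
        # Commit the task with the smallest minimum completion time.
        assignment[t] = p
        ready[p] = finish
        done[t] = True
    return assignment, max(ready, default=0)


if __name__ == "__main__":
    etc = [[3, 5], [2, 4], [6, 1]]
    print(min_min(etc))
```

The inner double loop is repeated once per task, which is where the quadratic dependence on N comes from; MaxMin and Sufferage differ only in which task is committed in each round.
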
For the independent task assignment problem, we also investigate adopting
the multi-level framework, which has been successfully used in several applications,
including graph and hypergraph partitioning. For the coarsening phase of the
multi-level framework, we present an efficient matching algorithm which runs in
O(KN) time in most cases. For the uncoarsening phase, we present two refinement
algorithms: an efficient O(KN)-time move-based refinement and an efficient
O(K^2 N log N)-time swap-based refinement. Our results indicate that the multi-level
approach improves the quality of task assignments, while also improving the running
time performance, especially for large datasets.
As a realistic distributed application of the independent task assignment problem,
we introduce the site-to-crawler assignment problem, where a large number
of geographically distributed web servers are crawled by a multi-site distributed
crawling system and the objective is to minimize the duration of the crawl. We
show that this problem can be modeled as an independent task assignment problem.
As a solution to the problem, we evaluate a large number of state-of-the-art
task assignment heuristics selected from the literature as well as the improved
versions and the newly developed multi-level task assignment algorithm. We
compare the performance of different approaches through simulations on very
large, real-life web datasets. Our results indicate that multi-site web crawling
efficiency can be considerably improved using the independent task assignment
approach, when compared to relatively easy-to-implement, yet naive baselines.

Tabak, E Kartal
Ph.D