
    Approximation algorithms for min-max resource sharing and malleable tasks scheduling

    This thesis deals with approximation algorithms for problems in mathematical programming, combinatorial optimization, and their applications. We first study the min-max resource-sharing problem (with the packing problem as the linear case) with $M$ nonnegative convex constraints on a convex set $B$, a class of convex programs. In general, block solvers are required for solving such problems. We generalize the algorithm by Grigoriadis et al. to the case with only weak approximate block solvers. In this way we obtain an approximation algorithm that needs at most $O(M(\ln M+\epsilon^{-2}\ln\epsilon^{-1}))$ calls to the block solver for any given relative accuracy $\epsilon\in(0,1)$. This is the first bound independent of the data and of the approximation ratio of the block solver. As applications of the min-max resource-sharing problem, we study the multicast congestion problem in communication networks and the range assignment problem in arbitrary ad-hoc networks, and present improved approximation algorithms for both. We also study the problem of scheduling malleable tasks with precedence constraints. We are given $m$ identical processors and $n$ tasks. For each task the processing time is a discrete function of the number of processors allotted to it. In addition, the tasks must be processed according to the precedence constraints. The goal is to minimize the makespan (maximum completion time) of the resulting schedule. We improve the previous best approximation ratio of $3+\sqrt{5}\approx 5.236$ to $100/43+100(\sqrt{4349}-7)/2451\approx 4.730598$. Finally, we propose a new model for malleable tasks and develop an approximation algorithm for the scheduling problem with a ratio of $100/63+100(\sqrt{6469}+13)/5481\approx 3.291919$. We also show that our results are very close to the best asymptotic ones.
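    The price-directive idea behind such resource-sharing algorithms can be sketched for the linear (packing) case with a multiplicative-weights loop: maintain a price per constraint, call a block solver to minimize the priced objective over the block, and average the iterates. Everything below (the function name, the step size, the toy instance) is an illustrative assumption, not the thesis's algorithm; the exact block solver here stands in for the weak approximate one the thesis handles.

```python
import math

def min_max_share(A, T=500, eta=0.1):
    """Approximately solve min_{x in simplex} max_m (A x)_m by
    multiplicative weights.  The 'block solver' here is exact: given
    prices p it returns the simplex vertex minimizing p^T A x."""
    M, n = len(A), len(A[0])
    w = [1.0] * M                      # one weight per constraint
    x_avg = [0.0] * n
    for _ in range(T):
        s = sum(w)
        p = [wi / s for wi in w]       # prices on the constraints
        # block solver call: cheapest single coordinate under current prices
        col_cost = [sum(p[m] * A[m][j] for m in range(M)) for j in range(n)]
        j_star = min(range(n), key=lambda j: col_cost[j])
        x_avg[j_star] += 1.0 / T       # average of the iterates
        for m in range(M):             # penalize heavily loaded constraints
            w[m] *= math.exp(eta * A[m][j_star])
    return x_avg

# toy instance: min over the simplex of max(x1, x2); the optimum is 1/2
A = [[1.0, 0.0], [0.0, 1.0]]
x = min_max_share(A)
val = max(sum(row[j] * x[j] for j in range(2)) for row in A)
```

    With a weak approximate block solver one would replace the exact argmin by any solution within the solver's ratio; the iteration count bound quoted above is what controls how many such calls are needed.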

    Scheduling Malleable Tasks with Precedence Constraints

    In this paper we propose an approximation algorithm for scheduling malleable tasks with precedence constraints. Based on an interesting model for malleable tasks with continuous processor allotments by Prasanna and Musicus \cite{PrMu91,PrMu94,PrMu96}, we define two natural assumptions for malleable tasks: the processing time of any malleable task is non-increasing in the number of processors allotted, and the speedup is concave in the number of processors. We show that under these assumptions the work function of any malleable task is non-decreasing in the number of processors and is convex in the processing time. Furthermore, we propose a two-phase approximation algorithm for the scheduling problem. In the first phase we solve a linear program to obtain a fractional allotment for all tasks. By rounding the fractional solution, each malleable task is assigned a number of processors. In the second phase a variant of the list scheduling algorithm is employed. In the two phases we use parameters $\mu\in\{1,\dots,\lfloor (m+1)/2\rfloor\}$ and $\rho\in [0,1]$ for the allotment and the rounding, respectively, where $m$ is the number of processors. By choosing appropriate values of the parameters, we show (via a nonlinear program) that the approximation ratio of our algorithm is at most $100/63+100(\sqrt{6469}+13)/5481\approx 3.291919$. We also show that our result is asymptotically tight.
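    The second phase's core idea, list scheduling under precedence constraints with a fixed processor allotment, can be sketched as follows. This is a generic greedy list scheduler, not the paper's specific variant; the function name and the toy instance are illustrative, and it assumes every allotment fits on the machine ($\mathrm{alloc}[j] \le m$).

```python
import heapq

def list_schedule(m, alloc, ptime, preds):
    """Greedy list scheduling on m processors: start any ready task whose
    allotment alloc[j] fits into the free processors; otherwise advance
    time to the next completion.  preds[j] lists j's predecessors."""
    n = len(alloc)
    succs = [[] for _ in range(n)]
    indeg = [0] * n
    for j in range(n):
        for p in preds[j]:
            succs[p].append(j)
            indeg[j] += 1
    ready = [j for j in range(n) if indeg[j] == 0]
    running, free, makespan = [], m, 0.0   # running: heap of (finish, task)
    t = 0.0
    while ready or running:
        # start every ready task that currently fits
        fitting = [j for j in ready if alloc[j] <= free]
        while fitting:
            j = fitting.pop()
            ready.remove(j)
            free -= alloc[j]
            heapq.heappush(running, (t + ptime[j], j))
            fitting = [j for j in ready if alloc[j] <= free]
        # advance to the earliest completion and release its processors
        t, j = heapq.heappop(running)
        free += alloc[j]
        makespan = max(makespan, t)
        for s in succs[j]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)
    return makespan

# two independent unit-width tasks, then a 2-processor task depending on both
ms = list_schedule(2, [1, 1, 2], [2.0, 2.0, 1.0], [[], [], [0, 1]])
```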

    Theory and Engineering of Scheduling Parallel Jobs

    Scheduling is very important for the efficient utilization of modern parallel computing systems. In this thesis, four main research areas of scheduling are investigated: the interplay and distribution of decision makers, efficient schedule computation, efficient scheduling for the memory hierarchy, and energy efficiency. The main result is a provably fast and efficient scheduling algorithm for malleable jobs. Experiments demonstrate the importance and the potential of scheduling that takes the memory hierarchy into account.

    A (2+ɛ)-approximation for scheduling parallel jobs in platforms

    We consider the problem of \textsc{Scheduling Parallel Jobs in Heterogeneous Platforms}: We are given a set $\mathcal{J}=\{1,\ldots,n\}$ of $n$ jobs, where a job $j\in\mathcal{J}$ is described by a pair $(p_j,q_j)$ of a processing time $p_j\in\mathbb{Q}_{>0}$ and the number of processors $q_j\in\mathbb{N}$ required to execute $j$. We are also given a set $\mathcal{B}$ of $N$ heterogeneous platforms $P_1,\ldots,P_N$, where each $P_i$ contains $m_i$ processors, for $i\in\{1,\ldots,N\}$. The objective is to find a schedule of the jobs on the platforms minimizing the makespan, i.e., the latest finishing time of a job. Unless $\mathcal{P}=\mathcal{NP}$, there is no approximation algorithm with absolute ratio strictly better than $2$ for this problem. We give a $(2+\epsilon)$-approximation for the problem, improving the previously best known absolute approximation ratio of $3$.
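    Analyses of such approximation ratios typically compare the produced schedule against simple lower bounds on the optimal makespan. Two classic ones, sketched below with assumed names and an assumed toy instance (not taken from the paper): no schedule can beat the longest job, nor the total work spread over all processors.

```python
def makespan_lower_bound(jobs, platform_sizes):
    """jobs: list of (processing_time, processors_required) pairs;
    platform_sizes: processor count m_i of each platform.
    Returns max of the length bound and the area (total work) bound."""
    longest = max(p for p, q in jobs)                      # some job must run fully
    total_work = sum(p * q for p, q in jobs)               # processor-time units
    area = total_work / sum(platform_sizes)                # perfect packing bound
    return max(longest, area)

# a 3-unit job on 2 processors and a 1-unit sequential job, platforms of size 2 and 1
lb = makespan_lower_bound([(3.0, 2), (1.0, 1)], [2, 1])
```

    Here the length bound ($3$) dominates the area bound ($7/3$), so any schedule of length at most $2 \cdot 3$ certifies a $2$-approximation on this instance.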

    Performance optimization and energy efficiency of big-data computing workflows

    Next-generation e-science is producing colossal amounts of data, now frequently termed Big Data, on the order of terabytes at present and petabytes or even exabytes in the foreseeable future. These scientific applications typically feature data-intensive workflows composed of moldable parallel computing jobs, such as MapReduce, with intricate inter-job dependencies. The granularity of task partitioning in each moldable job of such big-data workflows has a significant impact on workflow completion time, energy consumption, and financial cost if executed in clouds, all of which remain largely unexplored. This dissertation conducts an in-depth investigation into the properties of moldable jobs and provides an experiment-based validation of the performance model in which the total workload of a moldable job increases with the degree of parallelism. Furthermore, this dissertation conducts rigorous research on workflow execution dynamics in resource-sharing environments and explores the interactions between workflow mapping and task scheduling on various computing platforms. A workflow optimization architecture is developed to seamlessly integrate three interrelated technical components, i.e., resource allocation, job mapping, and task scheduling. Cloud computing provides a cost-effective computing platform for big-data workflows, where moldable parallel computing models are widely applied to meet stringent performance requirements. Based on the moldable parallel computing performance model, a big-data workflow mapping model is constructed and a workflow mapping problem is formulated to minimize workflow makespan under a budget constraint in public clouds.
This dissertation shows this problem to be strongly NP-complete and designs i) a fully polynomial-time approximation scheme for a special case with a pipeline-structured workflow executed on virtual machines of a single class, and ii) a heuristic for a generalized problem with an arbitrary directed acyclic graph-structured workflow executed on virtual machines of multiple classes. The performance superiority of the proposed solution is illustrated by extensive simulation-based results in Hadoop/YARN in comparison with existing workflow mapping models and algorithms. Considering that large-scale workflows for big data analytics have become a main consumer of energy in data centers, this dissertation also delves into the problem of static workflow mapping to minimize the dynamic energy consumption of a workflow request under a deadline constraint in Hadoop clusters, which is shown to be strongly NP-hard. A fully polynomial-time approximation scheme is designed for a special case with a pipeline-structured workflow on a homogeneous cluster and a heuristic is designed for the generalized problem with an arbitrary directed acyclic graph-structured workflow on a heterogeneous cluster. This problem is further extended to a dynamic version with deadline-constrained MapReduce workflows to minimize dynamic energy consumption in Hadoop clusters. This dissertation proposes a semi-dynamic online scheduling algorithm based on adaptive task partitioning to reduce dynamic energy consumption while meeting performance requirements from a global perspective, and also develops corresponding system modules for algorithm implementation in the Hadoop ecosystem. 
The performance superiority of the proposed solutions in terms of dynamic energy savings and deadline miss rate is illustrated by extensive simulation results in comparison with existing algorithms, and further validated through real-life workflow implementation and experiments using the Oozie workflow engine in Hadoop/YARN systems.
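    For the pipeline-structured special case, the flavor of the pseudo-polynomial dynamic program that such an FPTAS would round and scale can be sketched as follows. The stage options, names, and instance are hypothetical illustrations, not the dissertation's formulation: each pipeline stage has candidate (time, cost) configurations, and we minimize total time subject to an integer budget.

```python
def pipeline_min_makespan(stages, budget):
    """stages: one list of (time, cost) configuration options per pipeline
    stage, with integer costs.  DP over the spent budget returns the
    minimum total completion time achievable within the budget."""
    INF = float("inf")
    best = [INF] * (budget + 1)    # best[c] = min time using exactly cost c
    best[0] = 0.0
    for options in stages:
        nxt = [INF] * (budget + 1)
        for spent in range(budget + 1):
            if best[spent] < INF:
                for t, c in options:
                    if spent + c <= budget:
                        nxt[spent + c] = min(nxt[spent + c], best[spent] + t)
        best = nxt
    return min(best)

# two stages; paying more buys a faster configuration
result = pipeline_min_makespan([[(4.0, 1), (2.0, 3)], [(3.0, 1), (1.0, 2)]], budget=4)
```

    An FPTAS would additionally discretize the cost (or time) axis so the table size becomes polynomial in the input size and $1/\epsilon$.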

    Group-based optimization for parallel job scheduling in clusters via heuristic search

    Job scheduling for parallel processing typically makes scheduling decisions on a per-job basis due to the dynamic arrival of jobs. Such decision making provides limited options for finding globally best schedules. Most research uses off-line optimization, which is not realistic. We propose an optimization based on limited-size dynamic job grouping per priority class. We apply heuristic, domain-knowledge-based high-level search and branch-and-bound methods to heavy workload traces to capture good schedules. Special plan-based conservative backfilling and shifting policies are used to augment the search. Our objective is to minimize the average relative response times of the long and medium job classes while keeping utilization high. The scheduling algorithm extends the SCOJO-PECT coarse-grain pre-emptive time-sharing scheduler. The proposed scheduler was evaluated using real traces and the Lublin-Feitelson synthetic workload model, with comparisons made against the conservative SCOJO-PECT scheduler. The results are promising: the average relative response times were improved by 18-32% while the loss of utilization was contained within 2%.
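    The branch-and-bound idea can be illustrated on a drastically simplified single-resource version: choose an order for one job group to minimize the total completion time, pruning with a bound that finishes the remaining jobs in shortest-processing-time (SPT) order. All names and the simplification are this sketch's assumptions, not the SCOJO-PECT extension.

```python
def bnb_best_order(durations):
    """Branch and bound over orderings of one job group on a single
    resource, minimizing total completion time.  The lower bound
    completes the remaining jobs in SPT order, which is optimal for
    this simplified objective, so pruning is safe."""
    best = [float("inf"), None]    # [best total, best order]

    def bound(t, total, remaining):
        tt, b = t, total
        for d in sorted(remaining):    # SPT completion of the rest
            tt += d
            b += tt
        return b

    def branch(order, t, total, remaining):
        if not remaining:
            if total < best[0]:
                best[0], best[1] = total, order
            return
        if bound(t, total, remaining) >= best[0]:
            return                     # prune: cannot beat incumbent
        for i, d in enumerate(remaining):
            branch(order + [d], t + d, total + t + d,
                   remaining[:i] + remaining[i + 1:])

    branch([], 0.0, 0.0, list(durations))
    return best[0], best[1]

cost, order = bnb_best_order([3.0, 1.0, 2.0])
```

    The real scheduler searches over far richer decisions (grouping, backfilling, shifting) and optimizes relative response times, but the enumerate-bound-prune skeleton is the same.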

    Improved Scheduling with a Shared Resource

    We consider the following shared-resource scheduling problem: Given a set of jobs $J$, for each $j\in J$ we must schedule a job-specific processing volume of $v_j>0$. A total resource of $1$ is available at any time. Jobs have a resource requirement $r_j\in[0,1]$, and the resources assigned to them may vary over time; however, assigning them less causes a proportional slowdown. We consider two settings. In the first, we seek to minimize the makespan in an online setting: the resource assignment of a job must be fixed before the next job arrives. Here we give an optimal $e/(e-1)$-competitive algorithm with runtime $\mathcal{O}(n\log n)$. In the second, we aim to minimize the total completion time. We use a continuous linear programming (CLP) formulation for the fractional total completion time and combine it with a previously known dominance property from malleable job scheduling to obtain a lower bound on the total completion time. We extract structural properties by considering a geometrical representation of a CLP's primal-dual pair. We combine the CLP schedule with a greedy schedule to obtain a $(3/2+\varepsilon)$-approximation for this setting. This improves upon the so far best known approximation factor of $2$.
    Comment: Submitted to COCOA 2023, full version.
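    Under one plausible reading of the proportional-slowdown model (a job with requirement $r_j$ that receives a constant share $a_j \le r_j$ runs at speed $a_j/r_j$, so it needs $v_j r_j / a_j$ time; this reading and all names below are assumptions, not the paper's definitions), a fixed-share makespan evaluator looks like:

```python
def makespan_fixed_shares(jobs, shares):
    """jobs: list of (volume v, requirement r) pairs; shares: a constant
    resource share per job with total at most 1.  Assumed model: share a
    of requirement r gives speed a/r, hence completion time v * r / a."""
    assert sum(shares) <= 1.0 + 1e-9, "shares exceed the unit resource"
    times = []
    for (v, r), a in zip(jobs, shares):
        assert 0 < a <= r, "a job never benefits from more than r"
        times.append(v * r / a)
    return max(times)

# two unit-volume jobs, each needing half the resource at full speed
full = makespan_fixed_shares([(1.0, 0.5), (1.0, 0.5)], [0.5, 0.5])
# starving the first job of half its requirement doubles its running time
starved = makespan_fixed_shares([(1.0, 0.5), (1.0, 0.5)], [0.25, 0.5])
```

    The actual algorithms in the paper let assignments vary over time, which is exactly what the CLP formulation captures; this sketch only evaluates the static case.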