5 research outputs found

    Backfilling with fairness and slack for parallel job scheduling

    Parallel jobs have different runtimes and different numbers of threads/processes; scheduling parallel jobs is therefore a packing problem. If jobs are packed as tightly as possible, utilization improves; otherwise, some resources stay idle. The common way to deal with idle resources is backfilling, which schedules smaller jobs submitted later to execute earlier, as long as they do not postpone the first job (or any of the previous jobs) in the waiting queue. Traditionally, backfilling fills idle resources with the first fitting job in submission order, which can miss better packings. We therefore propose an algorithm that looks further ahead in the queue when doing so significantly improves utilization. At the same time, this could be unfair to jobs ahead in the queue, so we use a delay factor as a constraint to limit unfairness. We propose a branch-and-bound algorithm that selects jobs for backfilling so as to keep utilization high while staying close to First-Come-First-Served (FCFS) order. We evaluate relative response time and utilization and compare against other backfilling approaches. The selection of jobs for backfilling that optimizes for high utilization and low delay is implemented as an extension of the existing Scojo-PECT preemptive scheduler.
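
    As a rough illustration of the idea (not the paper's branch-and-bound algorithm), the Python sketch below greedily scans past the head of the queue for a backfill candidate that fits the idle nodes and does not delay the head job's reservation, discounting candidates by queue position as a stand-in for the delay-factor fairness constraint; all names and the scoring rule are assumptions for illustration.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Job:
    name: str
    nodes: int        # processors requested
    runtime: float    # estimated runtime

def pick_backfill_job(queue: List[Job], free_nodes: int,
                      head_start_delay: float,
                      delay_factor: float = 1.5) -> Optional[Job]:
    # Scan past the head of the queue for a job that fits in the idle
    # nodes and finishes before the head job's reserved start time.
    # Among candidates, prefer the one contributing the most node-time,
    # discounted by queue position so the choice stays close to FCFS.
    best, best_score = None, 0.0
    for position, job in enumerate(queue[1:], start=1):
        if job.nodes <= free_nodes and job.runtime <= head_start_delay:
            score = job.nodes * job.runtime / (1.0 + position / delay_factor)
            if score > best_score:
                best, best_score = job, score
    return best

# Example: 4 idle nodes, head job reserved to start in 10 time units.
queue = [Job("head", 8, 20), Job("a", 2, 12), Job("b", 4, 8), Job("c", 1, 5)]
print(pick_backfill_job(queue, free_nodes=4, head_start_delay=10))   # Job 'b'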

    Coarse-grain time sharing with advantageous overhead minimization for parallel job scheduling

    Parallel job scheduling on cluster computers uses several strategies to maximize both hardware utilization and the throughput at which jobs are processed. Another consideration is response time, i.e., how quickly a job finishes after submission. One approach toward these goals is preemption. Preemptive scheduling techniques incur an overhead cost, typically associated with swapping jobs in and out of memory, and this overhead grows as memory and data sets grow. We present a technique for reducing the overhead incurred by swapping jobs in and out of memory as a result of preemption, in the context of the Scojo-PECT preemptive scheduler. We also present a design for extending the existing Cluster Simulator to support analysis of scheduling overhead in preemptive scheduling techniques. We show a reduction in the overhead incurred through preemptive scheduling by applying standard fitting algorithms in a multi-state job allocation heuristic.
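
    A minimal sketch of the underlying idea, assuming a per-node memory budget: preempted jobs that still fit in memory next to the incoming time slice stay resident and avoid swap overhead, while the rest are swapped out. The fit-decreasing rule and all names are illustrative, not the paper's multi-state heuristic.

from typing import Dict, List, Tuple

def plan_swaps(node_memory: float, incoming_mem: float,
               preempted: Dict[str, float]) -> Tuple[List[str], List[str]]:
    # Fit preempted jobs into the memory left after loading the incoming
    # time slice (largest first); jobs that fit stay resident and avoid
    # swap overhead, the rest are swapped out to disk.
    free = node_memory - incoming_mem
    kept, swapped = [], []
    for name, mem in sorted(preempted.items(), key=lambda kv: -kv[1]):
        if mem <= free:
            kept.append(name)
            free -= mem
        else:
            swapped.append(name)
    return kept, swapped

# Example: 64 GB node, incoming time slice needs 24 GB.
print(plan_swaps(64, 24, {"j1": 30, "j2": 20, "j3": 8}))
# -> (['j1', 'j3'], ['j2'])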

    Extending Scojo-PECT by migration based on system-level checkpointing

    In recent years, a significant amount of research has been done on job scheduling in high-performance computing. Parallel jobs have different running times and require different numbers of processors, so jobs need to be scheduled and packed to improve system utilization. Scojo-PECT is a job scheduler which provides service guarantees by using coarse-grain time sharing; however, it does not provide process migration. We extend Scojo-PECT by migrating parallel jobs based on system-level checkpointing. We investigate the cases in the Scojo-PECT scheduling algorithm where migration based on system-level checkpointing can improve resource utilization and reduce job response time. Our experimental results show a reduction in relative response times for medium jobs compared with the original Scojo-PECT scheduler, while long jobs do not suffer any disadvantage.
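
    A hedged sketch of the kind of decision involved: checkpoint and migrate a preempted job to idle nodes only when the checkpoint/restart overhead is smaller than the wait it would otherwise incur. The cost model and names below are assumptions for illustration, not Scojo-PECT's actual logic.

def should_migrate(remaining_work: float, checkpoint_cost: float,
                   restart_cost: float, wait_if_not_moved: float) -> bool:
    # Checkpoint on the source nodes and restart on idle nodes only if
    # the migration overhead is smaller than the time the job would
    # otherwise sit preempted, and smaller than its remaining work.
    overhead = checkpoint_cost + restart_cost
    return overhead < wait_if_not_moved and overhead < remaining_work

# Example: 5 time units to checkpoint, 3 to restart, 20 expected wait.
print(should_migrate(remaining_work=60, checkpoint_cost=5,
                     restart_cost=3, wait_if_not_moved=20))   # True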

    Group-based optimization for parallel job scheduling in clusters via heuristic search

    Job scheduling for parallel processing typically makes decisions on a per-job basis because jobs arrive dynamically. Such decision making provides limited options for finding globally good schedules, and most research relies on off-line optimization, which is not realistic. We propose an optimization based on limited-size dynamic job grouping per priority class. We apply heuristic, domain-knowledge-based high-level search and branch-and-bound methods to heavy workload traces to capture good schedules. Special plan-based conservative backfilling and shifting policies are used to augment the search. Our objective is to minimize average relative response times for the long and medium job classes while keeping utilization high. The scheduling algorithm extends the SCOJO-PECT coarse-grain preemptive time-sharing scheduler. The proposed scheduler was evaluated using real traces and the Lublin-Feitelson synthetic workload model, and was compared with the conservative SCOJO-PECT scheduler. The results are promising: average relative response times improved by 18-32% while the loss of utilization was contained within 2%.
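
    The sketch below illustrates the group-based idea in miniature: for a small job group, exhaustively evaluate orderings and keep the one with the lowest average relative response time (a branch-and-bound would prune this search). The single-resource model and the names are simplifications, not the paper's scheduler.

from itertools import permutations
from typing import List, Tuple

def best_group_order(group: List[Tuple[str, float]]):
    # Exhaustively evaluate all orderings of a small (name, runtime)
    # group on a single resource and keep the one with the lowest
    # average relative response time (completion time / runtime).
    best_avg, best_order = float("inf"), None
    for order in permutations(group):
        t, total = 0.0, 0.0
        for _, runtime in order:
            t += runtime              # completion time in this ordering
            total += t / runtime      # relative response time of the job
        avg = total / len(order)
        if avg < best_avg:
            best_avg, best_order = avg, order
    return best_avg, best_order

# Example: three queued jobs of one priority class.
print(best_group_order([("long", 10.0), ("med", 4.0), ("short", 1.0)]))
# -> (1.25, (('short', 1.0), ('med', 4.0), ('long', 10.0)))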

    G-LOMARC-TS: Lookahead group matchmaking for time/space sharing on multi-core parallel machines

    Parallel machines with multi-core nodes are becoming increasingly popular. However, the performance gains of applications running on these machines are limited by resource contention within each node. Researchers have found that coscheduling applications with complementary resource characteristics on the same set of nodes (semi time sharing) may improve performance. We propose a scheduling algorithm, G-LOMARC-TS, which incorporates both space sharing and semi time sharing and, where possible, matches groups of jobs for coscheduling. Since matchmaking may select jobs further down the waiting queue, and jobs at the front of the queue may consequently be delayed, fairness for each individual job is monitored and the delay is kept within a limited bound. Several heuristics are used to tackle the NP-complete problem of forming groups. Our experimental results show both utilization gains and average relative response time improvements of G-LOMARC-TS over several other scheduling policies.
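
    As a simplified illustration of the matchmaking step (not G-LOMARC-TS itself), the sketch below looks a bounded distance down the queue for a partner job with a complementary CPU/memory profile and the same node count; the lookahead and positional bounds stand in for the paper's fairness/delay limit, and all names and fields are hypothetical.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Job:
    name: str
    nodes: int
    cpu_share: float      # 0..1: fraction of time CPU-bound vs. memory-bound
    queue_pos: int

def match_for_coscheduling(head: Job, queue: List[Job],
                           max_lookahead: int = 10,
                           max_delay_positions: int = 5) -> Optional[Job]:
    # Look a bounded distance down the queue for a partner needing the
    # same number of nodes whose resource profile complements the head
    # job most strongly (CPU-bound paired with memory-bound), so the two
    # can share nodes with little slowdown.
    best, best_gap = None, 0.0
    for job in queue[:max_lookahead]:
        if job.nodes != head.nodes:
            continue
        if job.queue_pos - head.queue_pos > max_delay_positions:
            continue
        gap = abs(job.cpu_share - head.cpu_share)
        if gap > best_gap:
            best, best_gap = job, gap
    return best

# Example: a CPU-bound head job pairs best with the memory-bound job C.
head = Job("A", 8, 0.9, 0)
waiting = [Job("B", 8, 0.8, 1), Job("C", 8, 0.2, 2), Job("D", 4, 0.1, 3)]
print(match_for_coscheduling(head, waiting))   # Job 'C'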