
    Performance Analysis of Modified SRPT in Multiple-Processor Multitask Scheduling

    In this paper we study the multiple-processor multitask scheduling problem in both deterministic and stochastic models. We consider and analyze the Modified Shortest Remaining Processing Time (M-SRPT) scheduling algorithm, a simple modification of SRPT which always schedules jobs according to SRPT whenever possible, while processing tasks in an arbitrary order. The M-SRPT algorithm is proved to achieve a competitive ratio of Θ(log α + β) for minimizing response time, where α denotes the ratio between the maximum and minimum job workload, and β represents the ratio between the maximum non-preemptive task workload and the minimum job workload. In addition, the competitive ratio achieved is shown to be optimal (up to a constant factor) when there is a constant number of machines. We further consider the problem under Poisson arrivals and a general workload distribution (i.e., an M/GI/N system), and show that M-SRPT achieves asymptotically optimal mean response time when the traffic intensity ρ approaches 1, provided the job size distribution has finite support. Beyond finite job workloads, the asymptotic optimality of M-SRPT also holds for infinite job size distributions under certain probabilistic assumptions, for example, an M/M/N system with finite task workloads.
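
    The M-SRPT rule described above is simple enough to sketch in code. The following Python snippet is only an illustrative reading of the rule, not the paper's analyzed implementation: jobs are ordered by total remaining work (SRPT), and the tasks of the selected jobs are handed to machines in an arbitrary order. The Job class, the per-task workload representation, and the time-slot interface are assumptions made for this example.

```python
class Job:
    """A job is a bag of tasks; its SRPT priority is its total remaining work."""

    def __init__(self, job_id, task_sizes):
        self.job_id = job_id
        self.remaining_tasks = list(task_sizes)  # remaining work per task

    @property
    def remaining_work(self):
        return sum(self.remaining_tasks)


def m_srpt_assign(jobs, num_machines):
    """Pick the tasks to run in the next time slot.

    Jobs are ordered by shortest remaining processing time (SRPT);
    within each selected job, tasks are taken in arbitrary order.
    """
    # SRPT order: incomplete jobs sorted by total remaining work.
    active = sorted((j for j in jobs if j.remaining_tasks),
                    key=lambda j: j.remaining_work)
    assignment = []  # (machine, job_id, task_index) triples
    machine = 0
    for job in active:
        for task_index in range(len(job.remaining_tasks)):  # arbitrary task order
            if machine == num_machines:
                return assignment
            assignment.append((machine, job.job_id, task_index))
            machine += 1
    return assignment


# Example: three jobs on two machines; the job with the least remaining
# work ("c") is scheduled first, then job "a" fills the second machine.
jobs = [Job("a", [3.0, 2.0]), Job("b", [6.0]), Job("c", [1.0])]
print(m_srpt_assign(jobs, num_machines=2))
```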

    Towards Optimality in Parallel Scheduling

    To keep pace with Moore's law, chip designers have focused on increasing the number of cores per chip rather than single-core performance. In turn, modern jobs are often designed to run on any number of cores. However, to effectively leverage these multi-core chips, one must address the question of how many cores to assign to each job. Given that jobs receive sublinear speedups from additional cores, there is an obvious tradeoff: allocating more cores to an individual job reduces the job's runtime, but in turn decreases the efficiency of the overall system. We ask how the system should schedule jobs across cores so as to minimize the mean response time over a stream of incoming jobs. To answer this question, we develop an analytical model of jobs running on a multi-core machine. We prove that EQUI, a policy which continuously divides cores evenly across jobs, is optimal when all jobs follow a single speedup curve and have exponentially distributed sizes. EQUI requires jobs to change their level of parallelization while they run. Since this is not possible for all workloads, we consider a class of "fixed-width" policies, which choose a single level of parallelization, k, to use for all jobs. We prove that, surprisingly, it is possible to achieve EQUI's performance without requiring jobs to change their levels of parallelization, by using the optimal fixed level of parallelization, k*. We also show how to analytically derive the optimal k* as a function of the system load, the speedup curve, and the job size distribution. In the case where jobs may follow different speedup curves, finding a good scheduling policy is even more challenging. We find that policies like EQUI which performed well in the case of a single speedup function now perform poorly. We propose a very simple policy, GREEDY*, which performs near-optimally when compared to the numerically derived optimal policy.
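
    As a rough illustration of the policies compared above, the sketch below contrasts EQUI's even division of cores with a fixed-width policy that runs every job on exactly k cores. The sublinear speedup curve s(k) = k**p and the brute-force search for a good width are assumptions made for this example; the paper derives the optimal k* analytically from the load, the speedup curve, and the job size distribution.

```python
def speedup(k, p=0.5):
    """Illustrative sublinear speedup curve s(k) = k**p; s(1) = 1 and
    extra cores give diminishing returns (an assumption, not the paper's model)."""
    return k ** p


def equi_allocation(num_jobs, num_cores):
    """EQUI: divide the cores evenly across all jobs currently in the system."""
    return num_cores / num_jobs if num_jobs else 0.0


def fixed_width_rate(k, num_cores, num_jobs):
    """Fixed-width policy: every job runs on exactly k cores, so at most
    num_cores // k jobs are in service at once; the rest wait in queue."""
    in_service = min(num_jobs, num_cores // k)
    return in_service * speedup(k)  # total work completed per unit time


def best_fixed_width(num_cores, num_jobs):
    """Toy stand-in for k*: the width maximizing the instantaneous service
    rate (the paper instead derives k* from load, speedup curve, and job sizes)."""
    return max(range(1, num_cores + 1),
               key=lambda k: fixed_width_rate(k, num_cores, num_jobs))


# Example: with 16 cores and 8 jobs in the system, EQUI gives each job
# 2 cores, while the toy search picks the fixed width with the best rate.
print(equi_allocation(num_jobs=8, num_cores=16))   # 2.0
print(best_fixed_width(num_cores=16, num_jobs=8))  # some k between 1 and 16
```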

    Service-Level-Driven Load Scheduling and Balancing in Multi-Tier Cloud Computing

    Cloud computing environments often deal with random-arrival computational workloads that vary in resource requirements and demand high Quality of Service (QoS) obligations. A Service Level Agreement (SLA) is employed to govern the QoS obligations of the cloud service provider to the client. A service provider conundrum revolves around the desire to maintain a balance between the limited resources available for computing and the high QoS requirements of the varying random computing demands. Any imbalance in managing these conflicting objectives may result in either dissatisfied clients that can incur potentially significant commercial penalties, or an over-sourced cloud computing environment that can be significantly costly to acquire and operate. To optimize response to such client demands, cloud service providers organize the cloud computing environment as a multi-tier architecture. Each tier executes its designated tasks and passes them to the next tier, in a fashion similar, but not identical, to traditional job-shop environments. Each tier consists of multiple computing resources, and an optimization process must take place to assign and schedule the appropriate tasks of the job on the resources of the tier so as to meet the job’s QoS expectations. Thus, scheduling the clients’ workloads as they arrive at the multi-tier cloud environment to ensure their timely execution has been a central issue in cloud computing. Various approaches have been presented in the literature to address this problem: Join-Shortest-Queue (JSQ), Join-Idle-Queue (JIQ), enhanced Round Robin (RR) and Least Connection (LC), as well as enhanced MinMin and MaxMin, to name a few. This thesis presents a service-level-driven load scheduling and balancing framework for multi-tier cloud computing. A model is used to quantify the penalty a cloud service provider incurs as a function of the jobs’ total waiting time and QoS violations. This model facilitates penalty mitigation in situations of high demand and resource shortage. The framework accounts for multi-tier job execution dependencies in capturing QoS violation penalties as the client jobs progress through subsequent tiers, thus optimizing performance at the multi-tier level. Scheduling and balancing operations are employed to distribute client jobs on resources such that the total waiting time and, hence, SLA violations of client jobs are minimized. Optimal job allocation and scheduling is an NP-hard combinatorial problem. The dynamics of random job arrivals make the optimality goal even harder to achieve and maintain as new jobs arrive at the environment. Thus, the thesis proposes queue virtualization as an abstraction that allows jobs to migrate between resources within a given tier, as well as altering the sequencing of job execution within a given resource, during the optimization process. Given the computational complexity of the job allocation and scheduling problem, a genetic algorithm is proposed to seek optimal solutions. The queue virtualization is proposed as a basis for defining the chromosome structure and operations. As computing jobs tend to vary with respect to delay tolerance, two SLA scenarios are tackled, that is, equal cost of time delays and differentiated cost of time delays. Experimental work is conducted to investigate the performance of the proposed approach at both the tier and system levels.
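
    A minimal sketch of the kind of penalty-driven genetic search described above is given below, assuming a single tier, jobs specified as (size, deadline) pairs, a flat charge per deadline miss, and a chromosome that maps each job to a resource and a queue position (a simplified stand-in for the thesis's queue virtualization). All parameter values and helper names are illustrative, not taken from the thesis.

```python
import random


def penalty(schedule, jobs, deadline_cost=10.0, wait_cost=1.0):
    """Toy SLA penalty: accumulated waiting time plus a flat charge per
    deadline miss. 'schedule' maps each job id to a (resource, position)
    pair, a simplified stand-in for the thesis's virtualized queues."""
    queues = {}
    for job_id, (resource, position) in schedule.items():
        queues.setdefault(resource, []).append((position, job_id))
    total = 0.0
    for entries in queues.values():
        clock = 0.0
        for _, job_id in sorted(entries):  # serve in virtual-queue order
            size, deadline = jobs[job_id]
            total += wait_cost * clock     # waiting time before service starts
            clock += size
            if clock > deadline:           # QoS (deadline) violation
                total += deadline_cost
    return total


def random_schedule(jobs, num_resources):
    """Random chromosome: each job gets a resource and a queue position."""
    return {job_id: (random.randrange(num_resources), random.random())
            for job_id in jobs}


def genetic_search(jobs, num_resources, population=30, generations=50,
                   mutation_rate=0.2):
    """Minimal GA: keep the fitter half of the population, refill it by
    mutating the survivors, and return the best schedule found."""
    pop = [random_schedule(jobs, num_resources) for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=lambda s: penalty(s, jobs))
        survivors = pop[:population // 2]
        children = []
        for parent in survivors:
            child = dict(parent)
            for job_id in child:
                if random.random() < mutation_rate:  # mutate this gene
                    child[job_id] = (random.randrange(num_resources),
                                     random.random())
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda s: penalty(s, jobs))


# Example: four jobs given as (size, deadline), scheduled on two resources.
jobs = {"j1": (4.0, 6.0), "j2": (2.0, 3.0), "j3": (5.0, 12.0), "j4": (1.0, 2.0)}
best = genetic_search(jobs, num_resources=2)
print(best, penalty(best, jobs))
```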