
    Task assignment in server farms under realistic workload conditions

    Server farms have become very popular in recent years because they effectively address the problem of large delays faced by many organisations whose systems receive high volumes of traffic. These server farms are now widely used in two main areas, namely Web hosting and scientific computing. The performance of such server farms relies heavily on the underlying task assignment policy, the set of rules that defines how incoming tasks are assigned to and processed at hosts. The aim of a task assignment policy is to optimise certain performance criteria, such as the expected waiting time and slowdown. One of the key factors affecting the performance of these policies is the service time distribution of tasks. There is extensive evidence that the service times of modern computer workloads closely follow heavy-tailed distributions with high variance. However, in certain environments the service time distributions of tasks are unknown, and imposing parametric assumptions in such cases can lead to inaccurate and unreliable inferences. Considerable effort has been made in recent years to devise efficient policies. Although these policies perform well under specific workload conditions, they have several major limitations: the assumption of known service times, the inability to efficiently assign tasks in time-sharing server farms, poor performance under changing workload conditions, and poor performance across multiple server farms. This thesis proposes novel task assignment policies for server farms under two main classes of realistic workload conditions, namely heavy-tailed and arbitrary service time distributions; arbitrary service time distributions are assumed for cases where the underlying service time distribution of tasks is unknown. First, we investigate ways to optimise the performance of a time-sharing server.
    We concentrate on a particular scheduling policy called the multi-level time sharing policy (MLTP). We provide an extensive performance analysis of MLTP and show that it can yield significant performance improvements under certain traffic conditions. Second, we investigate how to improve performance in time-sharing server farms using MLTP, proposing three task assignment policies for such farms. Third, we investigate how to design efficient task assignment policies for multiple server farms. We propose MCTPM, which is based on a multi-tier host architecture; MCTPM supports preemptive task migration and controls the traffic flow into server farms via a global dispatching device so as to optimise performance. Finally, we investigate ways to design adaptive task assignment policies that make no assumptions regarding the underlying service time distribution of tasks. We propose a novel task assignment policy, called ADAPT-POLICY, which is based on a set of static task assignment policies for the server farm and adaptively switches between them to suit the most recent traffic conditions. An experimental performance analysis shows that ADAPT-POLICY outperforms other policies under a range of traffic conditions.
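    The abstract does not specify the thesis's policies in enough detail to reproduce them, but the general idea of size-based task assignment in a server farm can be illustrated with a classic size-interval (SITA-style) dispatcher, sketched below; the cutoff values are hypothetical and would in practice be tuned to the workload's service time distribution.

```python
def sita_dispatch(job_size, cutoffs):
    """Size-Interval Task Assignment (illustrative sketch).

    Host i serves jobs whose size falls in [cutoffs[i-1], cutoffs[i]);
    the last host serves everything at or above the final cutoff. This
    isolates short jobs from long ones, which matters under the
    heavy-tailed service time distributions discussed above.
    """
    for host, cutoff in enumerate(cutoffs):
        if job_size < cutoff:
            return host
    return len(cutoffs)  # largest jobs go to the last host
```

For example, with cutoffs `[1.0, 10.0]` a three-host farm sends jobs smaller than 1.0 to host 0, medium jobs to host 1, and the heavy tail to host 2, so a single huge job cannot delay the stream of small ones.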

    PSBS: Practical Size-Based Scheduling

    Size-based schedulers have very desirable performance properties: optimal or near-optimal response time can be coupled with strong fairness guarantees. Despite this, such schedulers are rarely implemented in practical settings, because they require knowing a priori the amount of work needed to complete jobs, an assumption that is very difficult to satisfy in concrete systems. It is far more feasible to provide the system with an estimate of each job's size, but existing studies point to rather pessimistic results when existing scheduling policies are driven by imprecise job size estimates. Our goal is to design scheduling policies that explicitly cope with inexact job sizes. First, we show that existing size-based schedulers can perform badly with inexact job size information when job sizes are heavily skewed, and that this issue, and the pessimistic results in the literature, stem from problematic behavior when large jobs are underestimated. Once the problem is identified, existing size-based schedulers can be amended to solve it. We generalize FSP, a fair and efficient size-based scheduling policy, to solve the problem highlighted above; in addition, our solution handles different job weights (which can be assigned to a job independently of its size). We provide an efficient implementation of the resulting protocol, which we call the Practical Size-Based Scheduler (PSBS). Through simulations on synthetic and real workloads, we show that PSBS has near-optimal performance in a large variety of cases with inaccurate size information, that it performs fairly, and that it handles job weights correctly. We believe this work shows that PSBS is indeed practical, and we maintain that it could inspire the design of schedulers in a wide array of real-world use cases.
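    The failure mode described above, and a simplified version of the fix, can be sketched as follows. This is not PSBS itself (the paper's policy generalizes FSP and handles weights); it only illustrates the core idea: under naive SRPT-on-estimates, an underestimated large job reaches estimated remaining size zero and keeps winning the scheduler's comparison, while the amended rule demotes jobs that have exceeded their estimate so that on-time jobs run first.

```python
def pick_next(jobs):
    """Pick the next job to run (simplified sketch, not the PSBS policy).

    Each job is a dict with 'estimate' (a priori size estimate) and
    'attained' (service received so far); true sizes are unknown.
    Jobs whose attained service has reached their estimate are 'late'
    (underestimated) and only run when no on-time job is waiting,
    avoiding the starvation caused by naive SRPT-on-estimates.
    """
    on_time = [j for j in jobs if j['attained'] < j['estimate']]
    pool = on_time if on_time else jobs
    # Among the chosen pool, schedule by smallest estimated remaining work.
    return min(pool, key=lambda j: j['estimate'] - j['attained'])
```

With this rule, a large job whose estimate has been exhausted no longer monopolizes the server: a freshly arrived small job is on time and is scheduled ahead of it.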

    Towards Fast, Adaptive, and Hardware-Assisted User-Space Scheduling

    Modern datacenter applications are prone to high tail latencies because their requests typically follow highly-dispersive distributions. Delivering fast interrupts is essential to reducing tail latency. Prior work has proposed both OS- and system-level solutions that reduce tail latencies for microsecond-scale workloads through better scheduling. Unfortunately, existing approaches such as customized dataplane OSes require significant OS changes, suffer scalability limitations, or do not reach the full performance capabilities the hardware offers. The emergence of new hardware features like UINTR has exposed new opportunities to rethink the design paradigms and abstractions of traditional scheduling systems. We propose LibPreemptible, a preemptive user-level threading library that is flexible, lightweight, and adaptive. LibPreemptible is built on a set of optimizations: LibUtimer for scalability, a deadline-oriented API for flexible policies, and a time-quantum controller for adaptiveness. Compared to the prior state-of-the-art scheduling system, Shinjuku, our system achieves significant tail latency and throughput improvements for various workloads without modifying the kernel. We also demonstrate the flexibility of LibPreemptible across scheduling policies for real applications experiencing varying load levels and characteristics.
    Comment: Accepted by HPCA202
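    The abstract does not reproduce LibPreemptible's actual interface, but the notion of a deadline-oriented scheduling API can be sketched generically: a run queue that always hands out the task with the earliest deadline (EDF order). The class name and methods below are hypothetical, chosen only to illustrate the concept.

```python
import heapq

class DeadlineRunQueue:
    """Illustrative deadline-ordered run queue (not the LibPreemptible API).

    Tasks are stored in a min-heap keyed by deadline; a monotonically
    increasing sequence number breaks ties in submission (FIFO) order.
    """

    def __init__(self):
        self._heap = []
        self._seq = 0

    def submit(self, task, deadline):
        """Enqueue a task with an absolute deadline."""
        heapq.heappush(self._heap, (deadline, self._seq, task))
        self._seq += 1

    def next_task(self):
        """Dequeue the earliest-deadline task, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

In a preemptive user-level scheduler, a timer interrupt at the end of each time quantum would return control to the scheduler, which consults such a queue to decide whether the running task should be preempted in favor of one with an earlier deadline.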

    SEH: Size Estimate Hedging for Single-Server Queues

    For a single-server system, Shortest Remaining Processing Time (SRPT) is the optimal size-based policy for mean response time. In this paper, we discuss scheduling a single-server system when exact information about jobs' processing times is not available. When the SRPT policy uses estimated processing times, the underestimation of large jobs can significantly degrade performance. We propose a simple heuristic, Size Estimate Hedging (SEH), that uses only jobs' estimated processing times for scheduling decisions. A job's priority is increased dynamically according to an SRPT rule until it is determined to be underestimated, at which time its priority is frozen. Numerical results suggest that SEH has desirable performance when estimation errors are not unreasonably large.
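    The rule described in the abstract can be sketched as a priority index (smaller is better, SRPT-style). The exact frozen value SEH uses is a detail of the paper not given in the abstract; the sketch below simply stops the index from changing once attained service reaches the estimate, which is when the job is known to be underestimated.

```python
def seh_priority(estimate, attained):
    """Priority index for a job (sketch of the rule in the abstract).

    While attained < estimate, the index follows SRPT on the estimate:
    it shrinks as the job receives service, so the job's priority rises.
    Once attained service reaches the estimate, the job is deemed
    underestimated and the index is frozen (here at zero, an
    illustrative choice, not necessarily the paper's).
    """
    return max(estimate - attained, 0.0)
```

Scheduling then amounts to serving the job with the smallest index; freezing prevents the index of an underestimated job from drifting without bound as it continues to accumulate service.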

    Scheduling Heterogeneous HPC Applications in Next-Generation Exascale Systems

    Next-generation HPC applications will increasingly time-share system resources with emerging workloads such as in-situ analytics, resilience tasks, runtime adaptation services, and power management activities. HPC systems must carefully schedule these co-located codes in order to reduce their impact on application performance. Among the techniques traditionally used to mitigate the performance effects of time-shared systems is gang scheduling. This approach, however, leverages global synchronization and time-agreement mechanisms that will become hard to support as systems grow in size. Alternative performance-interference mitigation approaches must be explored for future HPC systems. This dissertation evaluates the impact of workload concurrency in future HPC systems. It uses simulation and modeling techniques to study the performance impact of existing and emerging interference sources on a selection of HPC benchmarks, mini-applications, and applications. It also quantifies the costs and benefits of different approaches to scheduling co-located workloads, studies performance-interference mitigation solutions based on gang scheduling, and examines their synchronization requirements. To do so, this dissertation presents and leverages a new Extreme Value Theory-based model to characterize interference sources and investigate their impact on Bulk Synchronous Parallel (BSP) applications. It demonstrates how this model can be used to analyze the interference-attenuation effects of alternative fine-grained OS scheduling approaches based on periodic real-time schedulers. This analysis can, in turn, guide the design of those mitigation techniques by providing tools to understand the tradeoffs of selecting scheduling parameters.
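    The connection between BSP applications and Extreme Value Theory can be made concrete with a toy model (not the dissertation's model): each iteration ends at a global barrier, so the iteration takes as long as the slowest process, i.e. the maximum of per-process work plus interference. As the process count grows, that maximum is drawn from more noise samples, which is exactly the regime EVT characterizes.

```python
def bsp_iteration_time(p, work, noise):
    """Toy BSP iteration model (illustrative, not the dissertation's model).

    p     -- number of processes synchronizing at the barrier
    work  -- deterministic per-process compute time
    noise -- zero-argument callable sampling one interference delay

    The barrier makes the iteration as slow as the slowest process, so
    iteration time is the max over p samples of (work + interference);
    expected slowdown therefore grows with p.
    """
    return max(work + noise() for _ in range(p))
```

For instance, with exponential interference delays, running `bsp_iteration_time` for increasing `p` shows the mean iteration time climbing even though per-process work is constant, which is why fine-grained, synchronized OS scheduling can attenuate interference at scale.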