5,476 research outputs found

    Overcommitment in Cloud Services -- Bin packing with Chance Constraints

    Full text link
    This paper considers a traditional problem of resource allocation, scheduling jobs on machines. One such recent application is cloud computing, where jobs arrive in an online fashion with capacity requirements and need to be immediately scheduled on physical machines in data centers. It is often observed that the requested capacities are not fully utilized, hence offering an opportunity to employ an overcommitment policy, i.e., selling resources beyond capacity. Setting the right overcommitment level can induce a significant cost reduction for the cloud provider, while only inducing a very low risk of violating capacity constraints. We introduce and study a model that quantifies the value of overcommitment by modeling the problem as a bin packing with chance constraints. We then propose an alternative formulation that transforms each chance constraint into a submodular function. We show that our model captures the risk pooling effect and can guide scheduling and overcommitment decisions. We also develop a family of online algorithms that are intuitive, easy to implement and provide a constant factor guarantee from optimal. Finally, we calibrate our model using realistic workload data, and test our approach in a practical setting. Our analysis and experiments illustrate the benefit of overcommitment in cloud services, and suggest a cost reduction of 1.5% to 17% depending on the provider's risk tolerance

    Fairness in overloaded parallel queues

    Full text link
    Maximizing throughput for heterogeneous parallel server queues has received quite a bit of attention from the research community and the stability region for such systems is well understood. However, many real-world systems have periods where they are temporarily overloaded. Under such scenarios, the unstable queues often starve limited resources. This work examines what happens during periods of temporary overload. Specifically, we look at how to fairly distribute stress. We explore the dynamics of the queue workloads under the MaxWeight scheduling policy during long periods of stress and discuss how to tune this policy in order to achieve a target fairness ratio across these workloads

    Asymptotic optimality of maximum pressure policies in stochastic processing networks

    Full text link
    We consider a class of stochastic processing networks. Assume that the networks satisfy a complete resource pooling condition. We prove that each maximum pressure policy asymptotically minimizes the workload process in a stochastic processing network in heavy traffic. We also show that, under each quadratic holding cost structure, there is a maximum pressure policy that asymptotically minimizes the holding cost. A key to the optimality proofs is to prove a state space collapse result and a heavy traffic limit theorem for the network processes under a maximum pressure policy. We extend a framework of Bramson [Queueing Systems Theory Appl. 30 (1998) 89--148] and Williams [Queueing Systems Theory Appl. 30 (1998b) 5--25] from the multiclass queueing network setting to the stochastic processing network setting to prove the state space collapse result and the heavy traffic limit theorem. The extension can be adapted to other studies of stochastic processing networks.Comment: Published in at http://dx.doi.org/10.1214/08-AAP522 the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Modeling and optimization of high-performance many-core systems for energy-efficient and reliable computing

    Full text link
    Thesis (Ph.D.)--Boston UniversityMany-core systems, ranging from small-scale many-core processors to large-scale high performance computing (HPC) data centers, have become the main trend in computing system design owing to their potential to deliver higher throughput per watt. However, power densities and temperatures increase following the growth in the performance capacity, and bring major challenges in energy efficiency, cooling costs, and reliability. These challenges require a joint assessment of performance, power, and temperature tradeoffs as well as the design of runtime optimization techniques that monitor and manage the interplay among them. This thesis proposes novel modeling and runtime management techniques that evaluate and optimize the performance, energy, and reliability of many-core systems. We first address the energy and thermal challenges in 3D-stacked many-core processors. 3D processors with stacked DRAM have the potential to dramatically improve performance owing to lower memory access latency and higher bandwidth. However, the performance increase may cause 3D systems to exceed the power budgets or create thermal hot spots. In order to provide an accurate analysis and enable the design of efficient management policies, this thesis introduces a simulation framework to jointly analyze performance, power, and temperature for 3D systems. We then propose a runtime optimization policy that maximizes the system performance by characterizing the application behavior and predicting the operating points that satisfy the power and thermal constraints. Our policy reduces the energy-delay product (EDP) by up to 61.9% compared to existing strategies. Performance, cooling energy, and reliability are also critical aspects in HPC data centers. In addition to causing reliability degradation, high temperatures increase the required cooling energy. Communication cost, on the other hand, has a significant impact on system performance in HPC data centers. This thesis proposes a topology-aware technique that maximizes system reliability by selecting between workload clustering and balancing. Our policy improves the system reliability by up to 123.3% compared to existing temperature balancing approaches. We also introduce a job allocation methodology to simultaneously optimize the communication cost and the cooling energy in a data center. Our policy reduces the cooling cost by 40% compared to cooling-aware and performance-aware policies, while achieving comparable performance to performance-aware policy
    corecore