Overcommitment in Cloud Services -- Bin packing with Chance Constraints
This paper considers a traditional problem of resource allocation, scheduling
jobs on machines. One such recent application is cloud computing, where jobs
arrive in an online fashion with capacity requirements and need to be
immediately scheduled on physical machines in data centers. It is often
observed that the requested capacities are not fully utilized, hence offering
an opportunity to employ an overcommitment policy, i.e., selling resources
beyond capacity. Setting the right overcommitment level can induce a
significant cost reduction for the cloud provider, while only inducing a very
low risk of violating capacity constraints. We introduce and study a model that
quantifies the value of overcommitment by modeling the problem as a bin packing
with chance constraints. We then propose an alternative formulation that
transforms each chance constraint into a submodular function. We show that our
model captures the risk pooling effect and can guide scheduling and
overcommitment decisions. We also develop a family of online algorithms that
are intuitive, easy to implement and provide a constant factor guarantee from
optimal. Finally, we calibrate our model using realistic workload data, and
test our approach in a practical setting. Our analysis and experiments
illustrate the benefit of overcommitment in cloud services, and suggest a cost
reduction of 1.5% to 17% depending on the provider's risk tolerance.
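As a rough illustration of bin packing with a chance constraint, the sketch below places jobs online with a first-fit rule. It is not the paper's algorithm. It assumes that each job's actual usage is independent with a known mean and variance, and that the total usage on a machine is approximately Gaussian, so the constraint P(total usage <= capacity) >= 1 - eps becomes sum of means + z * sqrt(sum of variances) <= capacity. The names `first_fit_chance`, `capacity`, and `eps` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def first_fit_chance(jobs, capacity, eps):
    """Assign jobs online to machines under a Gaussian chance constraint.

    jobs: iterable of (mean, variance) usage pairs, in arrival order.
    capacity: physical capacity of each machine.
    eps: tolerated probability of exceeding capacity on a machine.
    Returns a list of machines, each a list of job indices.

    Assumption (not from the source): usages are independent and their
    sum on a machine is approximately normal.
    """
    z = NormalDist().inv_cdf(1.0 - eps)  # safety factor for risk level eps
    machines = []  # each entry: [sum of means, sum of variances, job indices]
    for idx, (mean, var) in enumerate(jobs):
        for m in machines:
            # The sqrt term grows sublinearly in the number of jobs,
            # which is where the risk pooling effect shows up.
            if m[0] + mean + z * sqrt(m[1] + var) <= capacity:
                m[0] += mean
                m[1] += var
                m[2].append(idx)
                break
        else:
            # No open machine can take the job: open a new one.
            # (Assumes a single job always fits on an empty machine.)
            machines.append([mean, var, [idx]])
    return [m[2] for m in machines]


if __name__ == "__main__":
    jobs = [(1.0, 0.25)] * 12
    for eps in (0.5, 0.1, 0.01):
        print(eps, len(first_fit_chance(jobs, capacity=10.0, eps=eps)))
```

In this toy model a larger `eps` (a higher risk tolerance) lets each machine hold more jobs, which mirrors the trade-off between overcommitment level and risk described in the abstract.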
Fairness in overloaded parallel queues
Maximizing throughput for heterogeneous parallel server queues has received
quite a bit of attention from the research community and the stability region
for such systems is well understood. However, many real-world systems have
periods where they are temporarily overloaded. Under such scenarios, the
unstable queues often starve limited resources. This work examines what happens
during periods of temporary overload. Specifically, we look at how to fairly
distribute stress. We explore the dynamics of the queue workloads under the
MaxWeight scheduling policy during long periods of stress and discuss how to
tune this policy in order to achieve a target fairness ratio across these
workloads.
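The idea of tuning MaxWeight can be sketched on a toy model that is much simpler than the heterogeneous parallel-server setting of the paper: a single server, deterministic fluid arrivals, and a weighted MaxWeight rule that serves the queue with the largest index w_i * mu_i * q_i. The function names and the `weights` parameter are hypothetical. In this toy model, under sustained overload the indices tend to equalize, so the weights steer the ratio of the backlogs.

```python
def maxweight_pick(queues, rates, weights):
    """Return the index of the queue with the largest w_i * mu_i * q_i."""
    return max(range(len(queues)),
               key=lambda i: weights[i] * rates[i] * queues[i])


def simulate_overload(arrivals, rates, weights, steps):
    """Deterministic discrete-time fluid simulation of one shared server.

    arrivals[i]: work arriving to queue i per step.
    rates[i]: work removed from queue i in a step when it is served.
    weights[i]: tuning weight of queue i (hypothetical fairness knob).
    Returns the queue lengths after `steps` steps.
    """
    queues = [0.0] * len(arrivals)
    for _ in range(steps):
        for i, a in enumerate(arrivals):
            queues[i] += a
        i = maxweight_pick(queues, rates, weights)
        queues[i] = max(0.0, queues[i] - rates[i])
    return queues


if __name__ == "__main__":
    # Load is 1/1.5 + 1/1.5 > 1, so the system is overloaded.
    q = simulate_overload([1.0, 1.0], [1.5, 1.5], [1.0, 2.0], steps=3000)
    print(q, q[0] / q[1])
```

With equal service rates, doubling the weight of the second queue leaves it with roughly half the backlog of the first, so in this sketch the weight ratio sets the backlog ratio.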
Asymptotic optimality of maximum pressure policies in stochastic processing networks
We consider a class of stochastic processing networks. Assume that the
networks satisfy a complete resource pooling condition. We prove that each
maximum pressure policy asymptotically minimizes the workload process in a
stochastic processing network in heavy traffic. We also show that, under each
quadratic holding cost structure, there is a maximum pressure policy that
asymptotically minimizes the holding cost. A key to the optimality proofs is to
prove a state space collapse result and a heavy traffic limit theorem for the
network processes under a maximum pressure policy. We extend a framework of
Bramson [Queueing Systems Theory Appl. 30 (1998) 89--148] and Williams
[Queueing Systems Theory Appl. 30 (1998b) 5--25] from the multiclass queueing
network setting to the stochastic processing network setting to prove the state
space collapse result and the heavy traffic limit theorem. The extension can be
adapted to other studies of stochastic processing networks.
Comment: Published in the Annals of Applied Probability
(http://www.imstat.org/aap/) at http://dx.doi.org/10.1214/08-AAP522 by the
Institute of Mathematical Statistics (http://www.imstat.org).
Modeling and optimization of high-performance many-core systems for energy-efficient and reliable computing
Thesis (Ph.D.)--Boston University
Many-core systems, ranging from small-scale many-core processors to large-scale high performance computing (HPC) data centers, have become the main trend in computing system design owing to their potential to deliver higher throughput per watt. However, power densities and temperatures increase following the growth in the performance capacity, and bring major challenges in energy efficiency, cooling costs, and reliability. These challenges require a joint assessment of performance, power, and temperature tradeoffs as well as the design of runtime optimization techniques that monitor and manage the interplay among them. This thesis proposes novel modeling and runtime management techniques that evaluate and optimize the performance, energy, and reliability of many-core systems.
We first address the energy and thermal challenges in 3D-stacked many-core processors. 3D processors with stacked DRAM have the potential to dramatically improve performance owing to lower memory access latency and higher bandwidth. However, the performance increase may cause 3D systems to exceed the power budgets or create thermal hot spots. In order to provide an accurate analysis and enable the design of efficient management policies, this thesis introduces a simulation framework to jointly analyze performance, power, and temperature for 3D systems. We then propose a runtime optimization policy that maximizes the system performance by characterizing the application behavior and predicting the operating points that satisfy the power and thermal constraints. Our policy reduces the energy-delay product (EDP) by up to 61.9% compared to existing strategies.
Performance, cooling energy, and reliability are also critical aspects in HPC data centers. In addition to causing reliability degradation, high temperatures increase the required cooling energy. Communication cost, on the other hand, has a significant impact on system performance in HPC data centers. This thesis proposes a topology-aware technique that maximizes system reliability by selecting between workload clustering and balancing. Our policy improves the system reliability by up to 123.3% compared to existing temperature balancing approaches. We also introduce a job allocation methodology to simultaneously optimize the communication cost and the cooling energy in a data center. Our policy reduces the cooling cost by 40% compared to cooling-aware and performance-aware policies, while achieving performance comparable to that of the performance-aware policy.