Characterizing Public Cloud Resource Contention to Support Virtual Machine Co-residency Prediction

Abstract

Hypervisors used to implement virtual machines (VMs) for infrastructure-as-a-service (IaaS) cloud platforms have improved continually over the past decade. VM components including CPU, memory, network, and storage I/O have evolved from full software emulation, to paravirtualization, to hardware virtualization. While these innovations have helped reduce the performance overhead of virtualization, considerable performance loss is still possible in the public cloud from resource contention among co-located VMs. In this paper, we investigate the extent of performance degradation from resource contention by running well-known benchmarks in parallel across three generations of virtualization hypervisors. Using a Python-based test harness, we orchestrate execution of CPU, disk, and network I/O-bound benchmarks across up to 48 VMs sharing the same Amazon Web Services dedicated host server. We found that executing benchmarks on hosts with many idle Linux VMs produced unexpected performance degradation. As public cloud users are interested in avoiding resource contention from co-located VMs, we next used our dedicated host performance measurements as independent variables to train models that predict the number of co-resident VMs. We evaluated multiple linear regression and random forest models using test data from independent benchmark runs across 96-vCPU dedicated hosts running up to 48 x 2-vCPU VMs, where we controlled VM placements. Multiple linear regression over normalized data achieved R² = 0.942, with a mean absolute error of ±1.61 VMs for co-residency predictions. We then applied our models to infer VM co-residency among a set of 50 VMs on the public cloud, where co-location data is unavailable. Here the models' predictions cannot be independently verified, but the results indicate the relative occupancy level of public cloud hosts, enabling users to infer when their VMs reside on busy hosts.
Our results characterize how recent hypervisor and hardware advancements are addressing resource contention, while demonstrating the potential to leverage co-located benchmarks for VM co-residency prediction in a public cloud.
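
As a rough illustration of the modeling step described above, the following minimal sketch fits a multiple linear regression over min-max normalized benchmark measurements and scores it with R² and mean absolute error on an independent run. It is not the paper's harness or data: the three features (`cpu_runtime_s`, `disk_mb_s`, `net_mbit_s`), their slopes and noise levels, and all function names are hypothetical, and the measurements are synthetic. Only the Python standard library is used.

```python
import random

random.seed(0)  # fixed seed so the synthetic example is repeatable


def synthetic_run(max_vms=48, repeats=3):
    """Made-up benchmark results for 1..max_vms co-resident VMs (assumption)."""
    rows, counts = [], []
    for n in range(1, max_vms + 1):
        for _ in range(repeats):
            cpu_runtime_s = 10.0 + 0.2 * n + random.gauss(0, 0.3)
            disk_mb_s = 500.0 - 5.0 * n + random.gauss(0, 5.0)
            net_mbit_s = 900.0 - 3.0 * n + random.gauss(0, 6.0)
            rows.append([cpu_runtime_s, disk_mb_s, net_mbit_s])
            counts.append(float(n))
    return rows, counts


def fit_scaler(rows):
    """Return per-column minima and maxima of the training data."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def normalize(rows, lo, hi):
    """Min-max scale each column using the training minima and maxima."""
    return [[(v - l) / (h - l) for v, l, h in zip(r, lo, hi)] for r in rows]


def fit_linear(X, y):
    """Ordinary least squares via the normal equations; intercept comes first."""
    A = [[1.0] + list(r) for r in X]
    n = len(A[0])
    # Augmented matrix [A^T A | A^T y]
    M = [
        [sum(a[i] * a[j] for a in A) for j in range(n)]
        + [sum(a[i] * t for a, t in zip(A, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][n] for i in range(n)]


def predict(coef, X):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]


def mean_absolute_error(y, y_hat):
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def r_squared(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Train on one run, evaluate on a separate, independent run.
train_X, train_y = synthetic_run()
test_X, test_y = synthetic_run()

lo, hi = fit_scaler(train_X)
coef = fit_linear(normalize(train_X, lo, hi), train_y)
pred = predict(coef, normalize(test_X, lo, hi))

score = r_squared(test_y, pred)
err = mean_absolute_error(test_y, pred)
print(f"R^2 = {score:.3f}, MAE = {err:.2f} VMs")
```

The scaler is fitted on the training run only and then reused for the test run, so the evaluation does not leak information from the held-out measurements. The scores printed here reflect the synthetic noise levels chosen above and say nothing about the results reported in the paper.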
