Carbon Containers: A System-level Facility for Managing Application-level Carbon Emissions
To reduce their environmental impact, cloud datacenters are increasingly
focused on optimizing applications' carbon-efficiency, or work done per mass of
carbon emitted. To facilitate such optimizations, we present Carbon Containers,
a simple system-level facility that extends prior work on power containers and
automatically regulates applications' carbon emissions in response to
variations in both their workload's intensity and their energy's
carbon-intensity. Specifically, Carbon Containers enable applications to
specify a maximum carbon emissions rate (in gCO2e/hr), and then
transparently enforce this rate via a combination of vertical scaling,
container migration, and suspend/resume while maximizing either
energy-efficiency or performance.
Carbon Containers are especially useful for applications that i) must
continue running even during high-carbon periods, and ii) execute in regions
with few variations in carbon-intensity. These low-variability regions also
tend to have high average carbon-intensity, which increases the importance of
regulating carbon emissions. We implement a Carbon Containers prototype by
extending Linux Containers to incorporate the mechanisms above and evaluate it
using real workload traces and carbon-intensity data from multiple regions. We
compare Carbon Containers with prior work that regulates carbon emissions by
suspending/resuming applications during high/low carbon periods. We show that
Carbon Containers are more carbon-efficient and improve performance while
maintaining similar carbon emissions.
Comment: ACM Symposium on Cloud Computing (SoCC)
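The enforcement policy described in the abstract (vertical scaling, then migration, then suspend/resume, driven by the energy's carbon-intensity) can be sketched as a simple control loop. The names and the minimum-useful-power threshold below are illustrative assumptions, not the paper's implementation:

```python
# Hypothetical sketch of a Carbon Containers-style controller.
# All names and the notion of a "minimum useful" power level are
# illustrative assumptions, not the paper's API.

def required_power_cap(max_rate_gco2e_per_hr, intensity_gco2e_per_kwh):
    """Power cap (watts) that keeps emissions at or under the target rate.

    rate [g/hr] = power [kW] * intensity [g/kWh], so power = rate / intensity.
    """
    return (max_rate_gco2e_per_hr / intensity_gco2e_per_kwh) * 1000.0

class CarbonController:
    def __init__(self, max_rate_gco2e_per_hr, min_useful_watts):
        self.max_rate = max_rate_gco2e_per_hr  # application's emissions budget
        self.min_watts = min_useful_watts      # below this, scaling down is futile

    def decide(self, local_intensity, remote_intensity=None):
        """Pick a mechanism: vertical scaling, migration, or suspend."""
        cap_w = required_power_cap(self.max_rate, local_intensity)
        if cap_w >= self.min_watts:
            return ("scale", cap_w)            # cap power locally
        if remote_intensity is not None and \
           required_power_cap(self.max_rate, remote_intensity) >= self.min_watts:
            return ("migrate", remote_intensity)  # a lower-carbon region works
        return ("suspend", 0.0)                # wait out the high-carbon period
```

For example, with a 50 gCO2e/hr budget and a local intensity of 500 gCO2e/kWh, the controller caps the container at 100 W; when even migration cannot meet the budget at a useful power level, it falls back to suspend/resume.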
How does it function? Characterizing long-term trends in production serverless workloads
This paper releases and analyzes two new Huawei cloud serverless traces. The traces span a period of over 7 months with over 1.4 trillion function invocations combined. The first trace is derived from Huawei's internal workloads and contains detailed per-second statistics for 200 functions running across multiple Huawei cloud data centers. The second trace is a representative workload from Huawei's public FaaS platform. This trace contains per-minute arrival rates for over 5000 functions running in a single Huawei data center.
We present the internals of a production FaaS platform by characterizing resource consumption, cold-start times, programming languages used, periodicity, per-second versus per-minute burstiness, correlations, and popularity.
Our findings show that there is considerable diversity in how serverless functions behave: requests vary by up to 9 orders of magnitude across functions, with some functions executed over 1 billion times per day; scheduling time, execution time, and cold-start distributions vary across 2 to 4 orders of magnitude and have very long tails; and function invocation counts demonstrate strong periodicity for many individual functions and on an aggregate level.
Our analysis also highlights the need for further research in estimating resource reservations and time-series prediction to account for the huge diversity in how serverless functions behave.
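As a rough illustration of the kind of analysis such a per-minute trace supports, the two statistics below (coefficient of variation for burstiness, lag-1440 autocorrelation for daily periodicity) are standard measures, sketched here with hypothetical function names rather than the authors' tooling:

```python
# Illustrative sketch (not Huawei's analysis code): summarizing a
# per-minute arrival-rate series for one serverless function.
from statistics import mean, pstdev

def burstiness_cv(arrivals):
    """Coefficient of variation of per-minute arrival counts."""
    m = mean(arrivals)
    return pstdev(arrivals) / m if m else 0.0

def daily_autocorr(arrivals, lag=1440):
    """Autocorrelation at `lag` minutes; lag=1440 checks daily periodicity."""
    m = mean(arrivals)
    num = sum((arrivals[i] - m) * (arrivals[i + lag] - m)
              for i in range(len(arrivals) - lag))
    den = sum((a - m) ** 2 for a in arrivals)
    return num / den if den else 0.0
```

A perfectly daily-periodic series scores high at lag 1440, while a constant-rate function has zero burstiness under the CV measure.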
A methodology for full-system power modeling in heterogeneous data centers
The need for energy-awareness in current data centers has encouraged the use of power modeling to estimate their power consumption. However, existing models present noticeable limitations, which make them application-dependent, platform-dependent, inaccurate, or computationally complex. In this paper, we propose a platform- and application-agnostic methodology for full-system power modeling in heterogeneous data centers that overcomes those limitations. It derives a single model per platform, which works with high accuracy for heterogeneous applications with different patterns of resource usage and energy consumption, by systematically selecting a minimum set of resource usage indicators and extracting complex relations among them that capture the impact on energy consumption of all the resources in the system. We demonstrate our methodology by generating power models for heterogeneous platforms with very different power consumption profiles. Our validation experiments with real Cloud applications show that such models provide high accuracy (around 5% average estimation error).
This work is supported by the Spanish Ministry of Economy and Competitiveness under contract TIN2015-65316-P, by the Generalitat de Catalunya under contract 2014-SGR-1051, and by the European Commission under FP7-SMARTCITIES-2013 contract 608679 (RenewIT) and FP7-ICT-2013-10 contracts 610874 (ASCETiC) and 610456 (EuroServer).
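The core modeling step, fitting a single per-platform model from a set of resource-usage indicators, can be approximated with ordinary least squares. This is only a linear sketch under assumed names; the paper's methodology additionally performs systematic indicator selection and captures more complex relations:

```python
# Linear-regression sketch of full-system power modeling from resource
# usage indicators. Function names are hypothetical; this is not the
# paper's actual model-building pipeline.
import numpy as np

def fit_power_model(usage, power_watts):
    """Fit power ~ intercept + coefficients . usage.

    usage: (n_samples, n_indicators) resource-usage matrix
    power_watts: measured full-system power for each sample
    Returns [idle_power, coef_1, ..., coef_k].
    """
    X = np.hstack([np.ones((usage.shape[0], 1)), usage])
    coef, *_ = np.linalg.lstsq(X, power_watts, rcond=None)
    return coef

def predict_power(coef, usage):
    """Estimate full-system power for new resource-usage samples."""
    X = np.hstack([np.ones((usage.shape[0], 1)), usage])
    return X @ coef
```

In this formulation the intercept plays the role of idle power, and each coefficient captures one resource's marginal contribution to consumption.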
End-to-End Application Cloning for Distributed Cloud Microservices with Ditto
We present Ditto, an automated framework for cloning end-to-end cloud
applications, both monolithic and microservices, which captures I/O and network
activity, as well as kernel operations, in addition to application logic. Ditto
takes a hierarchical approach to application cloning, starting with capturing
the dependency graph across distributed services, to recreating each tier's
control/data flow, and finally generating system calls and assembly that mimics
the individual applications. Ditto does not reveal the logic of the original
application, facilitating publicly sharing clones of production services with
hardware vendors, cloud providers, and the research community.
We show that across a diverse set of single- and multi-tier applications,
Ditto accurately captures their CPU and memory characteristics as well as their
high-level performance metrics, is portable across platforms, and facilitates a
wide range of system studies.
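Ditto's first hierarchical step, recovering the dependency graph across distributed services, can be illustrated by ordering services so that each one is recreated after the tiers it calls. The code below is a hypothetical sketch, not Ditto's implementation:

```python
# Hypothetical sketch: build a service dependency graph from observed
# caller/callee RPC pairs and emit a callees-first cloning order.
from collections import defaultdict

def dependency_order(rpc_edges):
    """rpc_edges: (caller, callee) pairs. Returns callees-first order."""
    deps = defaultdict(set)   # service -> services it calls
    nodes = set()
    for caller, callee in rpc_edges:
        deps[caller].add(callee)
        nodes.update((caller, callee))
    order, seen = [], set()
    def visit(n):
        if n in seen:
            return
        seen.add(n)
        for d in sorted(deps[n]):  # clone dependencies first
            visit(d)
        order.append(n)
    for n in sorted(nodes):
        visit(n)
    return order
```

With this ordering, a leaf tier such as a database is cloned before the mid-tier services that call it, and the frontend comes last.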
Karma: Resource Allocation for Dynamic Demands
The classical max-min fairness algorithm for resource allocation provides
many desirable properties, e.g., Pareto efficiency, strategy-proofness and
fairness. This paper builds upon the observation that max-min fairness
guarantees these properties under a strong assumption -- user demands being
static over time -- and that, for the realistic case of dynamic user demands,
max-min fairness loses one or more of these properties.
We present Karma, a generalization of max-min fairness for dynamic user
demands. The key insight in Karma is to introduce "memory" into max-min
fairness -- when allocating resources, Karma takes users' past allocations into
account: in each quantum, users donate their unused resources and are assigned
credits when other users borrow these resources; Karma carefully orchestrates
exchange of credits across users (based on their instantaneous demands, donated
resources and borrowed resources), and performs prioritized resource allocation
based on users' credits. We prove theoretically that Karma guarantees Pareto
efficiency, online strategy-proofness, and optimal fairness for dynamic user
demands (without future knowledge of user demands). Empirical evaluations over
production workloads show that these properties translate well into practice:
Karma is able to reduce disparity in performance across users to a bare minimum
while maintaining Pareto-optimal system-wide performance.
Comment: Accepted for publication in USENIX OSDI 202
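Karma's credit mechanism can be illustrated with a toy single-resource quantum: users under their fair share donate spare units and earn credits, while users over it borrow against the credits they hold. This is a deliberately simplified sketch; the paper's credit exchange and prioritized allocation are more involved:

```python
# Toy sketch of a Karma-style quantum for one divisible resource.
# Simplified relative to the paper's algorithm.

def karma_quantum(demands, credits, fair_share):
    """Allocate one quantum. Mutates `credits`; returns per-user allocation."""
    # Every user first receives up to its fair share.
    alloc = {u: min(d, fair_share) for u, d in demands.items()}
    spare = sum(fair_share - a for a in alloc.values())
    # Borrowers are served highest-credits-first, and each can borrow
    # at most as many units as it holds credits.
    borrowers = sorted((u for u, d in demands.items() if d > fair_share),
                       key=lambda u: -credits[u])
    for u in borrowers:
        take = min(demands[u] - fair_share, spare, credits[u])
        alloc[u] += take
        credits[u] -= take
        spare -= take
    # Donors earn the credits spent by borrowers, in proportion to
    # how much of their fair share they donated.
    borrowed = sum(a - fair_share for a in alloc.values() if a > fair_share)
    donated = {u: fair_share - d for u, d in demands.items() if d < fair_share}
    total_donated = sum(donated.values())
    for u, d in donated.items():
        credits[u] += borrowed * d / total_donated
    return alloc
```

The "memory" shows up in the credit balances carried across quanta: a user that donates during a quiet period is prioritized, and able to borrow more, when its demand later spikes.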