5,105 research outputs found
Datacenter Traffic Control: Understanding Techniques and Trade-offs
Datacenters provide cost-effective and flexible access to scalable compute
and storage resources necessary for today's cloud computing needs. A typical
datacenter is made up of thousands of servers connected with a large network
and usually managed by one operator. To provide quality access to the variety
of applications and services hosted on datacenters and maximize performance, it
deems necessary to use datacenter networks effectively and efficiently.
Datacenter traffic is often a mix of several classes with different priorities
and requirements. This includes user-generated interactive traffic, traffic
with deadlines, and long-running traffic. To this end, custom transport
protocols and traffic management techniques have been developed to improve
datacenter network performance.
In this tutorial paper, we review the general architecture of datacenter
networks, various topologies proposed for them, their traffic properties,
general traffic control challenges in datacenters and general traffic control
objectives. The purpose of this paper is to bring out the important
characteristics of traffic control in datacenters and not to survey all
existing solutions (as it is virtually impossible due to massive body of
existing research). We hope to provide readers with a wide range of options and
factors while considering a variety of traffic control mechanisms. We discuss
various characteristics of datacenter traffic control including management
schemes, transmission control, traffic shaping, prioritization, load balancing,
multipathing, and traffic scheduling. Next, we point to several open challenges
as well as new and interesting networking paradigms. At the end of this paper,
we briefly review inter-datacenter networks that connect geographically
dispersed datacenters which have been receiving increasing attention recently
and pose interesting and novel research problems.Comment: Accepted for Publication in IEEE Communications Surveys and Tutorial
Technical Report: A Trace-Based Performance Study of Autoscaling Workloads of Workflows in Datacenters
To improve customer experience, datacenter operators offer support for
simplifying application and resource management. For example, running workloads
of workflows on behalf of customers is desirable, but requires increasingly
more sophisticated autoscaling policies, that is, policies that dynamically
provision resources for the customer. Although selecting and tuning autoscaling
policies is a challenging task for datacenter operators, so far relatively few
studies investigate the performance of autoscaling for workloads of workflows.
Complementing previous knowledge, in this work we propose the first
comprehensive performance study in the field. Using trace-based simulation, we
compare state-of-the-art autoscaling policies across multiple application
domains, workload arrival patterns (e.g., burstiness), and system utilization
levels. We further investigate the interplay between autoscaling and regular
allocation policies, and the complexity cost of autoscaling. Our quantitative
study focuses not only on traditional performance metrics and on
state-of-the-art elasticity metrics, but also on time- and memory-related
autoscaling-complexity metrics. Our main results give strong and quantitative
evidence about previously unreported operational behavior, for example, that
autoscaling policies perform differently across application domains and by how
much they differ.Comment: Technical Report for the CCGrid 2018 submission "A Trace-Based
Performance Study of Autoscaling Workloads of Workflows in Datacenters
dReDBox: Materializing a full-stack rack-scale system prototype of a next-generation disaggregated datacenter
Current datacenters are based on server machines, whose mainboard and hardware components form the baseline, monolithic building block that the rest of the system software, middleware and application stack are built upon. This leads to the following limitations: (a) resource proportionality of a multi-tray system is bounded by the basic building block (mainboard), (b) resource allocation to processes or virtual machines (VMs) is bounded by the available resources within the boundary of the mainboard, leading to spare resource fragmentation and inefficiencies, and (c) upgrades must be applied to each and every server even when only a specific component needs to be upgraded. The dRedBox project (Disaggregated Recursive Datacentre-in-a-Box) addresses the above limitations, and proposes the next generation, low-power, across form-factor datacenters, departing from the paradigm of the mainboard-as-a-unit and enabling the creation of function-block-as-a-unit. Hardware-level disaggregation and software-defined wiring of resources is supported by a full-fledged Type-1 hypervisor that can execute commodity virtual machines, which communicate over a low-latency and high-throughput software-defined optical network. To evaluate its novel approach, dRedBox will demonstrate application execution in the domains of network functions virtualization, infrastructure analytics, and real-time video surveillance.This work has been supported in part by EU H2020 ICTproject dRedBox, contract #687632.Peer ReviewedPostprint (author's final draft
- …