Application-driven Bandwidth Guarantees in Datacenters
Providing bandwidth guarantees to specific applications is becoming increasingly important as applications compete for shared cloud network resources. We present CloudMirror, a solution that provides bandwidth guarantees to cloud applications based on a new network abstraction and workload placement algorithm. An effective network abstraction would enable applications to easily and accurately specify their requirements, while simultaneously enabling the infrastructure to provision resources efficiently for deployed applications. Prior research has approached the bandwidth guarantee specification by using abstractions that resemble physical network topologies. We present a contrasting approach of deriving a network abstraction based on application communication structure, called Tenant Application Graph or TAG. CloudMirror also incorporates a new workload placement algorithm that efficiently meets bandwidth requirements specified by TAGs while factoring in high availability considerations. Extensive simulations using real application traces and datacenter topologies show that CloudMirror can handle 40% more bandwidth demand than the state of the art (e.g., the Oktopus system), while improving high availability from 20% to 70%.
Enabling Work-conserving Bandwidth Guarantees for Multi-tenant Datacenters via Dynamic Tenant-Queue Binding
Today's cloud networks are shared among many tenants. Bandwidth guarantees
and work conservation are two key properties to ensure predictable performance
for tenant applications and high network utilization for providers. Despite
significant efforts, very little prior work actually achieves both properties
simultaneously, even though some of it claims to.
In this paper, we present QShare, an in-network based solution to achieve
bandwidth guarantees and work conservation simultaneously. QShare leverages
weighted fair queuing on commodity switches to slice network bandwidth for
tenants, and solves the challenge of queue scarcity through balanced tenant
placement and dynamic tenant-queue binding. QShare is readily implementable
with existing switching chips. We have implemented a QShare prototype and
evaluated it via both testbed experiments and simulations. Our results show
that QShare ensures bandwidth guarantees while driving network utilization to
over 91% even under unpredictable traffic demands.
Comment: The initial work is published in IEEE INFOCOM 201
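The combination of guarantees and work conservation that the QShare abstract describes can be illustrated with a generic weighted max-min computation, which is the allocation that weighted fair queuing approximates on a single link. This is a minimal sketch, not QShare's mechanism: the function name `weighted_max_min`, the tenant names, and the use of guarantees as queue weights are all assumptions made for illustration, and tenant-queue binding is not modeled.

```python
def weighted_max_min(capacity, weights, demands):
    """Share `capacity` among tenants in proportion to `weights`,
    never giving a tenant more than its demand (hypothetical helper)."""
    alloc = {t: 0.0 for t in weights}
    active = {t for t in weights if demands[t] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_w = sum(weights[t] for t in active)
        # Weighted share of the remaining capacity for each active tenant.
        share = {t: remaining * weights[t] / total_w for t in active}
        # Tenants whose leftover demand fits inside their share.
        satisfied = {t for t in active if demands[t] - alloc[t] <= share[t]}
        if not satisfied:
            # Everyone is bottlenecked: hand out the shares and stop.
            for t in active:
                alloc[t] += share[t]
            break
        for t in satisfied:
            remaining -= demands[t] - alloc[t]
            alloc[t] = float(demands[t])
        # Unused share of satisfied tenants is redistributed next round.
        active -= satisfied
    return alloc


# Tenant "a" needs little, so its unused share flows to "b" and "c".
alloc = weighted_max_min(10, {"a": 1, "b": 1, "c": 2}, {"a": 1, "b": 10, "c": 10})
```

The redistribution step is what makes the allocation work-conserving: capacity that a lightly loaded tenant does not use is split among the remaining tenants by weight instead of sitting idle.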
DCCast: Efficient Point to Multipoint Transfers Across Datacenters
Using multiple datacenters allows for higher availability, load balancing and
reduced latency to customers of cloud services. To distribute multiple copies
of data, cloud providers depend on inter-datacenter WANs that ought to be used
efficiently considering their limited capacity and the ever-increasing data
demands. In this paper, we focus on applications that transfer objects from one
datacenter to several datacenters over dedicated inter-datacenter networks. We
present DCCast, a centralized Point to Multi-Point (P2MP) algorithm that uses
forwarding trees to efficiently deliver an object from a source datacenter to
required destination datacenters. With low computational overhead, DCCast
selects forwarding trees that minimize bandwidth usage and balance load across
all links. With simulation experiments on Google's GScale network, we show that
DCCast can reduce total bandwidth usage and tail Transfer Completion Times
(TCT) by up to compared to delivering the same objects via independent
point-to-point (P2P) transfers.
Comment: 9th USENIX Workshop on Hot Topics in Cloud Computing,
https://www.usenix.org/conference/hotcloud17/program/presentation/noormohammadpou
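The idea of picking a forwarding tree that uses little bandwidth while steering away from loaded links can be sketched with a simple greedy heuristic. The code below is an illustration under stated assumptions, not the DCCast algorithm: it weights each directed link by its current load plus the transfer volume (an assumed weighting), then attaches destinations one at a time via the cheapest path from the partial tree. The names `shortest_path` and `pick_forwarding_tree` are hypothetical.

```python
import heapq


def shortest_path(adj, weight, sources, target):
    """Multi-source Dijkstra: cheapest path from any node in `sources` to `target`."""
    dist = {s: 0 for s in sources}
    prev = {}
    heap = [(0, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == target:
            break
        for v in adj[u]:
            nd = d + weight[(u, v)]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path, node = [], target
    while node in prev:
        path.append((prev[node], node))
        node = prev[node]
    return path[::-1]


def pick_forwarding_tree(edges, load, source, destinations, volume):
    """Greedily grow a tree from `source` to all `destinations`; updates `load`."""
    adj, weight = {}, {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, [])
        # Assumed weighting: busier links cost more.
        weight[(u, v)] = load.get((u, v), 0) + volume
    tree_nodes, tree_edges = {source}, []
    for dst in destinations:
        path = shortest_path(adj, weight, tree_nodes, dst)
        if path is None:
            raise ValueError(f"{dst} is unreachable from {source}")
        for edge in path:
            tree_edges.append(edge)
            tree_nodes.update(edge)
    for edge in tree_edges:
        load[edge] = load.get(edge, 0) + volume
    return tree_edges


edges = [("A", "B"), ("B", "C"), ("B", "D"), ("A", "C")]
load = {("A", "C"): 100}  # the direct A->C link is already busy
tree = pick_forwarding_tree(edges, load, "A", ["C", "D"], volume=10)
```

In this toy topology the object crosses the link from A to B once and is replicated at B, whereas two independent point-to-point transfers would each consume capacity on their own path from A.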
RCD: Rapid Close to Deadline Scheduling for Datacenter Networks
Datacenter-based Cloud Computing services provide a flexible, scalable and
yet economical infrastructure to host online services such as multimedia
streaming, email and bulk storage. Many such services perform geo-replication
to provide the necessary quality of service and reliability to users, resulting in
frequent large inter-datacenter transfers. In order to meet tenant service
level agreements (SLAs), these transfers have to be completed prior to a
deadline. In addition, WAN resources are quite scarce and costly, meaning they
should be fully utilized. Several recently proposed schemes, such as B4,
TEMPUS, and SWAN, have focused on improving the utilization of inter-datacenter
transfers through centralized scheduling; however, they fail to provide a
mechanism to guarantee that admitted requests meet their deadlines. A more
recent system, Amoeba, allows tenants to define deadlines and guarantees that
the specified deadlines are met; however, to admit new traffic, it has to
modify the allocation of already admitted transfers. In this paper, we propose
Rapid Close to Deadline
Scheduling (RCD), a close to deadline traffic allocation technique that is fast
and efficient. Through simulations, we show that RCD is up to 15 times faster
than Amoeba, provides high link utilization along with deadline guarantees, and
is able to make quick decisions on whether a new request can be fully satisfied
before its deadline.
Comment: World Automation Congress (WAC), IEEE, 201
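The close-to-deadline idea can be illustrated on a single link divided into discrete timeslots. The sketch below is an assumption-laden simplification rather than the authors' algorithm: it fills free capacity backwards from the deadline, so earlier slots stay open for later requests, and it accepts or rejects a request without touching earlier allocations. The function name `allocate_close_to_deadline` and the per-slot `residual` list are hypothetical.

```python
def allocate_close_to_deadline(residual, volume, deadline):
    """Place `volume` units as late as possible at or before slot `deadline`.

    `residual[t]` is the free capacity in timeslot t and is updated only
    when the request is admitted. Returns {slot: amount} or None if the
    request cannot be fully satisfied before its deadline.
    """
    last = min(deadline, len(residual) - 1)
    window = range(last, -1, -1)  # walk backwards from the deadline
    # Admission check: a single pass over the free capacity.
    if sum(residual[t] for t in window) < volume:
        return None
    plan = {}
    remaining = volume
    for t in window:
        if remaining <= 0:
            break
        take = min(residual[t], remaining)
        if take > 0:
            plan[t] = take
            residual[t] -= take
            remaining -= take
    return plan


residual = [10, 10, 10, 10]            # free capacity per timeslot
first = allocate_close_to_deadline(residual, 15, deadline=2)
second = allocate_close_to_deadline(residual, 20, deadline=2)
# `first` uses slots 2 and 1; `second` is rejected and changes nothing.
```

Because admitted transfers are never rearranged, the admission decision only needs the residual capacity up to the deadline, which is what keeps it quick in this simplified setting.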
Datacenter Traffic Control: Understanding Techniques and Trade-offs
Datacenters provide cost-effective and flexible access to scalable compute
and storage resources necessary for today's cloud computing needs. A typical
datacenter is made up of thousands of servers connected with a large network
and usually managed by one operator. To provide quality access to the variety
of applications and services hosted on datacenters and maximize performance, it
is necessary to use datacenter networks effectively and efficiently.
Datacenter traffic is often a mix of several classes with different priorities
and requirements. This includes user-generated interactive traffic, traffic
with deadlines, and long-running traffic. To this end, custom transport
protocols and traffic management techniques have been developed to improve
datacenter network performance.
In this tutorial paper, we review the general architecture of datacenter
networks, various topologies proposed for them, their traffic properties,
general traffic control challenges in datacenters and general traffic control
objectives. The purpose of this paper is to bring out the important
characteristics of traffic control in datacenters and not to survey all
existing solutions (which is virtually impossible given the massive body of
existing research). We hope to provide readers with a wide range of options and
factors while considering a variety of traffic control mechanisms. We discuss
various characteristics of datacenter traffic control including management
schemes, transmission control, traffic shaping, prioritization, load balancing,
multipathing, and traffic scheduling. Next, we point to several open challenges
as well as new and interesting networking paradigms. At the end of this paper,
we briefly review inter-datacenter networks that connect geographically
dispersed datacenters, which have been receiving increasing attention recently
and pose interesting and novel research problems.
Comment: Accepted for Publication in IEEE Communications Surveys and Tutorial
Towards Hybrid Cloud-assisted Crowdsourced Live Streaming: Measurement and Analysis
Crowdsourced Live Streaming (CLS), most notably Twitch.tv, has seen explosive
growth in popularity in the past few years. In such systems, any user can
broadcast live video content of interest to others, e.g., from a game player
to many online viewers. To fulfill the demands from both massive and
heterogeneous broadcasters and viewers, expensive server clusters have been
deployed to provide video ingesting and transcoding services. Despite the
existence of highly popular channels, a significant portion of the channels is
indeed unpopular. Yet as our measurement shows, these broadcasters are
consuming considerable system resources; in particular, 25% (resp. 30%) of
bandwidth (resp. computation) resources are used by the broadcasters who do not
have any viewers at all. In this paper, we closely examine the challenge of
handling unpopular live-broadcasting channels in CLS systems and present a
comprehensive solution for service partitioning on hybrid cloud. The
trace-driven evaluation shows that our hybrid cloud-assisted design can smartly
assign ingesting and transcoding tasks to the elastic cloud virtual machines,
providing flexible system deployment cost-effectively.