13,133 research outputs found
Cloudbus Toolkit for Market-Oriented Cloud Computing
This keynote paper: (1) presents the 21st century vision of computing and
identifies various IT paradigms promising to deliver computing as a utility;
(2) defines the architecture for creating market-oriented Clouds and computing
atmosphere by leveraging technologies such as virtual machines; (3) provides
thoughts on market-based resource management strategies that encompass both
customer-driven service management and computational risk management to sustain
SLA-oriented resource allocation; (4) presents the work carried out as part of
our new Cloud Computing initiative, called Cloudbus: (i) Aneka, a Platform as a
Service software system containing SDK (Software Development Kit) for
construction of Cloud applications and deployment on private or public Clouds,
in addition to supporting market-oriented resource management; (ii)
internetworking of Clouds for dynamic creation of federated computing
environments for scaling of elastic applications; (iii) creation of 3rd party
Cloud brokering services for building content delivery networks and e-Science
applications and their deployment on capabilities of IaaS providers such as
Amazon along with Grid mashups; (iv) CloudSim supporting modelling and
simulation of Clouds for performance studies; (v) Energy Efficient Resource
Allocation Mechanisms and Techniques for creation and management of Green
Clouds; and (vi) pathways for future research.Comment: 21 pages, 6 figures, 2 tables, Conference pape
Learning Scheduling Algorithms for Data Processing Clusters
Efficiently scheduling data processing jobs on distributed compute clusters
requires complex algorithms. Current systems, however, use simple generalized
heuristics and ignore workload characteristics, since developing and tuning a
scheduling policy for each workload is infeasible. In this paper, we show that
modern machine learning techniques can generate highly-efficient policies
automatically. Decima uses reinforcement learning (RL) and neural networks to
learn workload-specific scheduling algorithms without any human instruction
beyond a high-level objective such as minimizing average job completion time.
Off-the-shelf RL techniques, however, cannot handle the complexity and scale of
the scheduling problem. To build Decima, we had to develop new representations
for jobs' dependency graphs, design scalable RL models, and invent RL training
methods for dealing with continuous stochastic job arrivals. Our prototype
integration with Spark on a 25-node cluster shows that Decima improves the
average job completion time over hand-tuned scheduling heuristics by at least
21%, achieving up to 2x improvement during periods of high cluster load
Datacenter Traffic Control: Understanding Techniques and Trade-offs
Datacenters provide cost-effective and flexible access to scalable compute
and storage resources necessary for today's cloud computing needs. A typical
datacenter is made up of thousands of servers connected with a large network
and usually managed by one operator. To provide quality access to the variety
of applications and services hosted on datacenters and maximize performance, it
deems necessary to use datacenter networks effectively and efficiently.
Datacenter traffic is often a mix of several classes with different priorities
and requirements. This includes user-generated interactive traffic, traffic
with deadlines, and long-running traffic. To this end, custom transport
protocols and traffic management techniques have been developed to improve
datacenter network performance.
In this tutorial paper, we review the general architecture of datacenter
networks, various topologies proposed for them, their traffic properties,
general traffic control challenges in datacenters and general traffic control
objectives. The purpose of this paper is to bring out the important
characteristics of traffic control in datacenters and not to survey all
existing solutions (as it is virtually impossible due to massive body of
existing research). We hope to provide readers with a wide range of options and
factors while considering a variety of traffic control mechanisms. We discuss
various characteristics of datacenter traffic control including management
schemes, transmission control, traffic shaping, prioritization, load balancing,
multipathing, and traffic scheduling. Next, we point to several open challenges
as well as new and interesting networking paradigms. At the end of this paper,
we briefly review inter-datacenter networks that connect geographically
dispersed datacenters which have been receiving increasing attention recently
and pose interesting and novel research problems.Comment: Accepted for Publication in IEEE Communications Surveys and Tutorial
- …