6,155 research outputs found
Resource provisioning in Science Clouds: Requirements and challenges
Cloud computing has permeated into the information technology industry in the
last few years, and it is emerging nowadays in scientific environments. Science
user communities are demanding a broad range of computing power to satisfy the
needs of high-performance applications, such as local clusters,
high-performance computing systems, and computing grids. Different workloads
are needed from different computational models, and the cloud is already
considered as a promising paradigm. The scheduling and allocation of resources
is always a challenging matter in any form of computation and clouds are not an
exception. Science applications have unique features that differentiate their
workloads, hence, their requirements have to be taken into consideration to be
fulfilled when building a Science Cloud. This paper will discuss what are the
main scheduling and resource allocation challenges for any Infrastructure as a
Service provider supporting scientific applications
SLA-Oriented Resource Provisioning for Cloud Computing: Challenges, Architecture, and Solutions
Cloud computing systems promise to offer subscription-oriented,
enterprise-quality computing services to users worldwide. With the increased
demand for delivering services to a large number of users, they need to offer
differentiated services to users and meet their quality expectations. Existing
resource management systems in data centers are yet to support Service Level
Agreement (SLA)-oriented resource allocation, and thus need to be enhanced to
realize cloud computing and utility computing. In addition, no work has been
done to collectively incorporate customer-driven service management,
computational risk management, and autonomic resource management into a
market-based resource management system to target the rapidly changing
enterprise requirements of Cloud computing. This paper presents vision,
challenges, and architectural elements of SLA-oriented resource management. The
proposed architecture supports integration of marketbased provisioning policies
and virtualisation technologies for flexible allocation of resources to
applications. The performance results obtained from our working prototype
system shows the feasibility and effectiveness of SLA-based resource
provisioning in Clouds.Comment: 10 pages, 7 figures, Conference Keynote Paper: 2011 IEEE
International Conference on Cloud and Service Computing (CSC 2011, IEEE
Press, USA), Hong Kong, China, December 12-14, 201
Reproducible and Portable Big Data Analytics in the Cloud
Cloud computing has become a major approach to help reproduce computational
experiments because it supports on-demand hardware and software resource
provisioning. Yet there are still two main difficulties in reproducing big data
applications in the cloud. The first is how to automate end-to-end execution of
analytics including environment provisioning, analytics pipeline description,
pipeline execution, and resource termination. The second is that an application
developed for one cloud is difficult to be reproduced in another cloud, a.k.a.
vendor lock-in problem. To tackle these problems, we leverage serverless
computing and containerization techniques for automated scalable execution and
reproducibility, and utilize the adapter design pattern to enable application
portability and reproducibility across different clouds. We propose and develop
an open-source toolkit that supports 1) fully automated end-to-end execution
and reproduction via a single command, 2) automated data and configuration
storage for each execution, 3) flexible client modes based on user preferences,
4) execution history query, and 5) simple reproduction of existing executions
in the same environment or a different environment. We did extensive
experiments on both AWS and Azure using four big data analytics applications
that run on virtual CPU/GPU clusters. The experiments show our toolkit can
achieve good execution performance, scalability, and efficient reproducibility
for cloud-based big data analytics
Checkpointing as a Service in Heterogeneous Cloud Environments
A non-invasive, cloud-agnostic approach is demonstrated for extending
existing cloud platforms to include checkpoint-restart capability. Most cloud
platforms currently rely on each application to provide its own fault
tolerance. A uniform mechanism within the cloud itself serves two purposes: (a)
direct support for long-running jobs, which would otherwise require a custom
fault-tolerant mechanism for each application; and (b) the administrative
capability to manage an over-subscribed cloud by temporarily swapping out jobs
when higher priority jobs arrive. An advantage of this uniform approach is that
it also supports parallel and distributed computations, over both TCP and
InfiniBand, thus allowing traditional HPC applications to take advantage of an
existing cloud infrastructure. Additionally, an integrated health-monitoring
mechanism detects when long-running jobs either fail or incur exceptionally low
performance, perhaps due to resource starvation, and proactively suspends the
job. The cloud-agnostic feature is demonstrated by applying the implementation
to two very different cloud platforms: Snooze and OpenStack. The use of a
cloud-agnostic architecture also enables, for the first time, migration of
applications from one cloud platform to another.Comment: 20 pages, 11 figures, appears in CCGrid, 201
ENORM: A Framework For Edge NOde Resource Management
Current computing techniques using the cloud as a centralised server will
become untenable as billions of devices get connected to the Internet. This
raises the need for fog computing, which leverages computing at the edge of the
network on nodes, such as routers, base stations and switches, along with the
cloud. However, to realise fog computing the challenge of managing edge nodes
will need to be addressed. This paper is motivated to address the resource
management challenge. We develop the first framework to manage edge nodes,
namely the Edge NOde Resource Management (ENORM) framework. Mechanisms for
provisioning and auto-scaling edge node resources are proposed. The feasibility
of the framework is demonstrated on a PokeMon Go-like online game use-case. The
benefits of using ENORM are observed by reduced application latency between 20%
- 80% and reduced data transfer and communication frequency between the edge
node and the cloud by up to 95\%. These results highlight the potential of fog
computing for improving the quality of service and experience.Comment: 14 pages; accepted to IEEE Transactions on Services Computing on 12
September 201
Notes on Cloud computing principles
This letter provides a review of fundamental distributed systems and economic
Cloud computing principles. These principles are frequently deployed in their
respective fields, but their inter-dependencies are often neglected. Given that
Cloud Computing first and foremost is a new business model, a new model to sell
computational resources, the understanding of these concepts is facilitated by
treating them in unison. Here, we review some of the most important concepts
and how they relate to each other
- …