10,142 research outputs found
HPC Cloud for Scientific and Business Applications: Taxonomy, Vision, and Research Challenges
High Performance Computing (HPC) clouds are becoming an alternative to
on-premise clusters for executing scientific applications and business
analytics services. Most research efforts in HPC cloud aim to understand the
cost-benefit of moving resource-intensive applications from on-premise
environments to public cloud platforms. Industry trends show hybrid
environments are the natural path to get the best of the on-premise and cloud
resources---steady (and sensitive) workloads can run on on-premise resources
and peak demand can leverage remote resources in a pay-as-you-go manner.
Nevertheless, there are plenty of questions to be answered in HPC cloud, which
range from how to extract the best performance of an unknown underlying
platform to what services are essential to make its usage easier. Moreover, the
discussion on the right pricing and contractual models to fit small and large
users is relevant for the sustainability of HPC clouds. This paper brings a
survey and taxonomy of efforts in HPC cloud and a vision on what we believe is
ahead of us, including a set of research challenges that, once tackled, can
help advance businesses and scientific discoveries. This becomes particularly
relevant due to the fast increasing wave of new HPC applications coming from
big data and artificial intelligence.Comment: 29 pages, 5 figures, Published in ACM Computing Surveys (CSUR
The Blacklisting Memory Scheduler: Balancing Performance, Fairness and Complexity
In a multicore system, applications running on different cores interfere at
main memory. This inter-application interference degrades overall system
performance and unfairly slows down applications. Prior works have developed
application-aware memory schedulers to tackle this problem. State-of-the-art
application-aware memory schedulers prioritize requests of applications that
are vulnerable to interference, by ranking individual applications based on
their memory access characteristics and enforcing a total rank order.
In this paper, we observe that state-of-the-art application-aware memory
schedulers have two major shortcomings. First, such schedulers trade off
hardware complexity in order to achieve high performance or fairness, since
ranking applications with a total order leads to high hardware complexity.
Second, ranking can unfairly slow down applications that are at the bottom of
the ranking stack. To overcome these shortcomings, we propose the Blacklisting
Memory Scheduler (BLISS), which achieves high system performance and fairness
while incurring low hardware complexity, based on two observations. First, we
find that, to mitigate interference, it is sufficient to separate applications
into only two groups. Second, we show that this grouping can be efficiently
performed by simply counting the number of consecutive requests served from
each application.
We evaluate BLISS across a wide variety of workloads/system configurations
and compare its performance and hardware complexity, with five state-of-the-art
memory schedulers. Our evaluations show that BLISS achieves 5% better system
performance and 25% better fairness than the best-performing previous scheduler
while greatly reducing critical path latency and hardware area cost of the
memory scheduler (by 79% and 43%, respectively), thereby achieving a good
trade-off between performance, fairness and hardware complexity
Resource provisioning in Science Clouds: Requirements and challenges
Cloud computing has permeated into the information technology industry in the
last few years, and it is emerging nowadays in scientific environments. Science
user communities are demanding a broad range of computing power to satisfy the
needs of high-performance applications, such as local clusters,
high-performance computing systems, and computing grids. Different workloads
are needed from different computational models, and the cloud is already
considered as a promising paradigm. The scheduling and allocation of resources
is always a challenging matter in any form of computation and clouds are not an
exception. Science applications have unique features that differentiate their
workloads, hence, their requirements have to be taken into consideration to be
fulfilled when building a Science Cloud. This paper will discuss what are the
main scheduling and resource allocation challenges for any Infrastructure as a
Service provider supporting scientific applications
Energy-Efficient Management of Data Center Resources for Cloud Computing: A Vision, Architectural Elements, and Open Challenges
Cloud computing is offering utility-oriented IT services to users worldwide.
Based on a pay-as-you-go model, it enables hosting of pervasive applications
from consumer, scientific, and business domains. However, data centers hosting
Cloud applications consume huge amounts of energy, contributing to high
operational costs and carbon footprints to the environment. Therefore, we need
Green Cloud computing solutions that can not only save energy for the
environment but also reduce operational costs. This paper presents vision,
challenges, and architectural elements for energy-efficient management of Cloud
computing environments. We focus on the development of dynamic resource
provisioning and allocation algorithms that consider the synergy between
various data center infrastructures (i.e., the hardware, power units, cooling
and software), and holistically work to boost data center energy efficiency and
performance. In particular, this paper proposes (a) architectural principles
for energy-efficient management of Clouds; (b) energy-efficient resource
allocation policies and scheduling algorithms considering quality-of-service
expectations, and devices power usage characteristics; and (c) a novel software
technology for energy-efficient management of Clouds. We have validated our
approach by conducting a set of rigorous performance evaluation study using the
CloudSim toolkit. The results demonstrate that Cloud computing model has
immense potential as it offers significant performance gains as regards to
response time and cost saving under dynamic workload scenarios.Comment: 12 pages, 5 figures,Proceedings of the 2010 International Conference
on Parallel and Distributed Processing Techniques and Applications (PDPTA
2010), Las Vegas, USA, July 12-15, 201
- …