EPOBF: Energy Efficient Allocation of Virtual Machines in High Performance Computing Cloud
Cloud computing has become increasingly popular for provisioning computing
resources under the virtual machine (VM) abstraction, letting high performance
computing (HPC) users run their applications. An HPC cloud is such a cloud
computing environment. One of the challenges of energy-efficient resource
allocation for VMs in an HPC cloud is the tradeoff between minimizing the total
energy consumption of physical machines (PMs) and satisfying Quality of Service
(e.g. performance).
On one hand, cloud providers want to maximize their profit by reducing the
power cost (e.g. using the smallest number of running PMs). On the other hand,
cloud customers (users) want the highest performance for their applications. In
this paper, we focus on the scenario in which the scheduler has no global
information about future user jobs and user applications. Users
request short-term resources with fixed start times and uninterrupted durations.
We then propose a new allocation heuristic, Energy-aware and Performance
per watt oriented Bestfit (EPOBF), that uses a performance-per-watt metric to
choose the most energy-efficient PM for mapping each VM (e.g. the maximum MIPS
per Watt). Using information from Feitelson's Parallel Workload Archive to
model HPC jobs, we compare the proposed EPOBF to state-of-the-art heuristics on
heterogeneous PMs (each PM has a multicore CPU). Simulations show that EPOBF
can significantly reduce total energy consumption in comparison with
state-of-the-art allocation heuristics.
Comment: 10 pages, in Proceedings of International Conference on Advanced
Computing and Applications, Journal of Science and Technology, Vietnamese
Academy of Science and Technology, ISSN 0866-708X, Vol. 51, No. 4B, 201
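The abstract above describes choosing, for each VM, the PM with the best performance per watt. A minimal sketch of that idea follows, assuming a simplified model that is not taken from the paper: each PM is described only by a total MIPS capacity and a peak power draw, and a VM requests a single MIPS amount. The names PM, mips_per_watt, and place_vm are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PM:
    """Hypothetical physical machine model (not from the paper)."""
    name: str
    total_mips: float   # assumed aggregate CPU capacity
    peak_watts: float   # assumed power draw at full load
    used_mips: float = 0.0

    @property
    def free_mips(self) -> float:
        return self.total_mips - self.used_mips

    @property
    def mips_per_watt(self) -> float:
        # Performance-per-watt metric used to rank machines.
        return self.total_mips / self.peak_watts


def place_vm(pms: List[PM], vm_mips: float) -> Optional[PM]:
    """Place a VM on the feasible PM with the highest MIPS per watt.

    Returns the chosen PM, or None when no PM has enough free capacity.
    """
    candidates = [pm for pm in pms if pm.free_mips >= vm_mips]
    if not candidates:
        return None
    best = max(candidates, key=lambda pm: pm.mips_per_watt)
    best.used_mips += vm_mips  # reserve capacity on the chosen PM
    return best


pms = [
    PM("a", total_mips=10000, peak_watts=250),  # 40 MIPS per watt
    PM("b", total_mips=8000, peak_watts=160),   # 50 MIPS per watt
    PM("c", total_mips=4000, peak_watts=120),   # about 33 MIPS per watt
]
first = place_vm(pms, 3000)
second = place_vm(pms, 3000)
third = place_vm(pms, 3000)   # "b" is now too full, so the next best PM is used
too_big = place_vm(pms, 20000)
```

The sketch ignores start times, durations, and multicore mapping, all of which the abstract mentions; it only illustrates the ranking step.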
Resource provisioning in Science Clouds: Requirements and challenges
Cloud computing has permeated the information technology industry in the
last few years, and it is now emerging in scientific environments. Science
user communities are demanding a broad range of computing power, such as local
clusters, high-performance computing systems, and computing grids, to satisfy
the needs of high-performance applications. Different workloads
are needed from different computational models, and the cloud is already
considered as a promising paradigm. The scheduling and allocation of resources
is always a challenging matter in any form of computation and clouds are not an
exception. Science applications have unique features that differentiate their
workloads, hence, their requirements have to be taken into consideration to be
fulfilled when building a Science Cloud. This paper discusses the
main scheduling and resource allocation challenges for any Infrastructure as a
Service provider supporting scientific applications.
HPC Cloud for Scientific and Business Applications: Taxonomy, Vision, and Research Challenges
High Performance Computing (HPC) clouds are becoming an alternative to
on-premise clusters for executing scientific applications and business
analytics services. Most research efforts in HPC cloud aim to understand the
cost-benefit of moving resource-intensive applications from on-premise
environments to public cloud platforms. Industry trends show hybrid
environments are the natural path to get the best of the on-premise and cloud
resources---steady (and sensitive) workloads can run on on-premise resources
and peak demand can leverage remote resources in a pay-as-you-go manner.
Nevertheless, there are plenty of questions to be answered in HPC cloud, which
range from how to extract the best performance of an unknown underlying
platform to what services are essential to make its usage easier. Moreover, the
discussion on the right pricing and contractual models to fit small and large
users is relevant for the sustainability of HPC clouds. This paper brings a
survey and taxonomy of efforts in HPC cloud and a vision on what we believe is
ahead of us, including a set of research challenges that, once tackled, can
help advance businesses and scientific discoveries. This becomes particularly
relevant due to the fast increasing wave of new HPC applications coming from
big data and artificial intelligence.
Comment: 29 pages, 5 figures, Published in ACM Computing Surveys (CSUR
Dynamic Virtualized Deployment of Particle Physics Environments on a High Performance Computing Cluster
The NEMO High Performance Computing Cluster at the University of Freiburg has
been made available to researchers of the ATLAS and CMS experiments. Users
access the cluster from external machines connected to the World-wide LHC
Computing Grid (WLCG). This paper describes how the full software environment
of the WLCG is provided in a virtual machine image. The interplay between the
schedulers for NEMO and for the external clusters is coordinated through the
ROCED service. A cloud computing infrastructure is deployed at NEMO to
orchestrate the simultaneous usage by bare metal and virtualized jobs. Through
the setup, resources are provided to users in a transparent, automated, and
on-demand way. The performance of the virtualized environment has been
evaluated for particle physics applications.
funcX: A Federated Function Serving Fabric for Science
Exploding data volumes and velocities, new computational methods and
platforms, and ubiquitous connectivity demand new approaches to computation in
the sciences. These new approaches must enable computation to be mobile, so
that, for example, it can occur near data, be triggered by events (e.g.,
arrival of new data), be offloaded to specialized accelerators, or run remotely
where resources are available. They also require new design approaches in which
monolithic applications can be decomposed into smaller components that may in
turn be executed separately and on the most suitable resources. To address
these needs we present funcX---a distributed function as a service (FaaS)
platform that enables flexible, scalable, and high performance remote function
execution. funcX's endpoint software can transform existing clouds, clusters,
and supercomputers into function serving systems, while funcX's cloud-hosted
service provides transparent, secure, and reliable function execution across a
federated ecosystem of endpoints. We motivate the need for funcX with several
scientific case studies, present our prototype design and implementation, show
optimizations that deliver throughput in excess of 1 million functions per
second, and demonstrate, via experiments on two supercomputers, that funcX can
scale to more than 130000 concurrent workers.
Comment: Accepted to ACM Symposium on High-Performance Parallel and
Distributed Computing (HPDC 2020). arXiv admin note: substantial text overlap
with arXiv:1908.0490
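The function-as-a-service pattern described in the abstract above (register a function once, then run it on a chosen endpoint) can be illustrated with a toy, in-process sketch. This is not the funcX SDK or its API: ToyFunctionService and all of its methods are hypothetical names, and a local thread pool stands in for a remote cluster or supercomputer endpoint.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict


class ToyFunctionService:
    """Hypothetical stand-in for a function-serving service (not funcX)."""

    def __init__(self) -> None:
        self._functions: Dict[str, Callable[..., Any]] = {}
        self._endpoints: Dict[str, ThreadPoolExecutor] = {}

    def register_function(self, fn: Callable[..., Any]) -> str:
        # Store the function and hand back an opaque identifier.
        function_id = str(uuid.uuid4())
        self._functions[function_id] = fn
        return function_id

    def register_endpoint(self, name: str, workers: int) -> None:
        # Assumption: a thread pool plays the role of a remote endpoint.
        self._endpoints[name] = ThreadPoolExecutor(max_workers=workers)

    def run(self, function_id: str, endpoint: str, *args: Any, **kwargs: Any) -> Future:
        # Dispatch a registered function to the named endpoint.
        fn = self._functions[function_id]
        return self._endpoints[endpoint].submit(fn, *args, **kwargs)

    def shutdown(self) -> None:
        for pool in self._endpoints.values():
            pool.shutdown(wait=True)


def square(x: int) -> int:
    return x * x


service = ToyFunctionService()
service.register_endpoint("cluster", workers=2)
square_id = service.register_function(square)
result = service.run(square_id, "cluster", 7).result()
service.shutdown()
```

A real deployment would add serialization, authentication, and network transport between the service and its endpoints, none of which this sketch attempts.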