3,003 research outputs found

    EPOBF: Energy Efficient Allocation of Virtual Machines in High Performance Computing Cloud

    Full text link
    Cloud computing has become more popular in provision of computing resources under virtual machine (VM) abstraction for high performance computing (HPC) users to run their applications. A HPC cloud is such cloud computing environment. One of challenges of energy efficient resource allocation for VMs in HPC cloud is tradeoff between minimizing total energy consumption of physical machines (PMs) and satisfying Quality of Service (e.g. performance). On one hand, cloud providers want to maximize their profit by reducing the power cost (e.g. using the smallest number of running PMs). On the other hand, cloud customers (users) want highest performance for their applications. In this paper, we focus on the scenario that scheduler does not know global information about user jobs and user applications in the future. Users will request shortterm resources at fixed start times and non interrupted durations. We then propose a new allocation heuristic (named Energy-aware and Performance per watt oriented Bestfit (EPOBF)) that uses metric of performance per watt to choose which most energy-efficient PM for mapping each VM (e.g. maximum of MIPS per Watt). Using information from Feitelson's Parallel Workload Archive to model HPC jobs, we compare the proposed EPOBF to state of the art heuristics on heterogeneous PMs (each PM has multicore CPU). Simulations show that the EPOBF can reduce significant total energy consumption in comparison with state of the art allocation heuristics.Comment: 10 pages, in Procedings of International Conference on Advanced Computing and Applications, Journal of Science and Technology, Vietnamese Academy of Science and Technology, ISSN 0866-708X, Vol. 51, No. 4B, 201

    Resource provisioning in Science Clouds: Requirements and challenges

    Full text link
    Cloud computing has permeated into the information technology industry in the last few years, and it is emerging nowadays in scientific environments. Science user communities are demanding a broad range of computing power to satisfy the needs of high-performance applications, such as local clusters, high-performance computing systems, and computing grids. Different workloads are needed from different computational models, and the cloud is already considered as a promising paradigm. The scheduling and allocation of resources is always a challenging matter in any form of computation and clouds are not an exception. Science applications have unique features that differentiate their workloads, hence, their requirements have to be taken into consideration to be fulfilled when building a Science Cloud. This paper will discuss what are the main scheduling and resource allocation challenges for any Infrastructure as a Service provider supporting scientific applications

    HPC Cloud for Scientific and Business Applications: Taxonomy, Vision, and Research Challenges

    Full text link
    High Performance Computing (HPC) clouds are becoming an alternative to on-premise clusters for executing scientific applications and business analytics services. Most research efforts in HPC cloud aim to understand the cost-benefit of moving resource-intensive applications from on-premise environments to public cloud platforms. Industry trends show hybrid environments are the natural path to get the best of the on-premise and cloud resources---steady (and sensitive) workloads can run on on-premise resources and peak demand can leverage remote resources in a pay-as-you-go manner. Nevertheless, there are plenty of questions to be answered in HPC cloud, which range from how to extract the best performance of an unknown underlying platform to what services are essential to make its usage easier. Moreover, the discussion on the right pricing and contractual models to fit small and large users is relevant for the sustainability of HPC clouds. This paper brings a survey and taxonomy of efforts in HPC cloud and a vision on what we believe is ahead of us, including a set of research challenges that, once tackled, can help advance businesses and scientific discoveries. This becomes particularly relevant due to the fast increasing wave of new HPC applications coming from big data and artificial intelligence.Comment: 29 pages, 5 figures, Published in ACM Computing Surveys (CSUR

    Dynamic Virtualized Deployment of Particle Physics Environments on a High Performance Computing Cluster

    Full text link
    The NEMO High Performance Computing Cluster at the University of Freiburg has been made available to researchers of the ATLAS and CMS experiments. Users access the cluster from external machines connected to the World-wide LHC Computing Grid (WLCG). This paper describes how the full software environment of the WLCG is provided in a virtual machine image. The interplay between the schedulers for NEMO and for the external clusters is coordinated through the ROCED service. A cloud computing infrastructure is deployed at NEMO to orchestrate the simultaneous usage by bare metal and virtualized jobs. Through the setup, resources are provided to users in a transparent, automatized, and on-demand way. The performance of the virtualized environment has been evaluated for particle physics applications

    funcX: A Federated Function Serving Fabric for Science

    Full text link
    Exploding data volumes and velocities, new computational methods and platforms, and ubiquitous connectivity demand new approaches to computation in the sciences. These new approaches must enable computation to be mobile, so that, for example, it can occur near data, be triggered by events (e.g., arrival of new data), be offloaded to specialized accelerators, or run remotely where resources are available. They also require new design approaches in which monolithic applications can be decomposed into smaller components, that may in turn be executed separately and on the most suitable resources. To address these needs we present funcX---a distributed function as a service (FaaS) platform that enables flexible, scalable, and high performance remote function execution. funcX's endpoint software can transform existing clouds, clusters, and supercomputers into function serving systems, while funcX's cloud-hosted service provides transparent, secure, and reliable function execution across a federated ecosystem of endpoints. We motivate the need for funcX with several scientific case studies, present our prototype design and implementation, show optimizations that deliver throughput in excess of 1 million functions per second, and demonstrate, via experiments on two supercomputers, that funcX can scale to more than more than 130000 concurrent workers.Comment: Accepted to ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC 2020). arXiv admin note: substantial text overlap with arXiv:1908.0490
    • …
    corecore