708 research outputs found

    Empirical Performance Analysis of High Performance Computing Benchmarks Across Variations in Cloud Computing

    Get PDF
    High Performance Computing (HPC) applications are data-intensive scientific software requiring significant CPU and data storage capabilities. Researchers have examined the performance of Amazon Elastic Compute Cloud (EC2) environment across several HPC benchmarks; however, an extensive HPC benchmark study and a comparison between Amazon EC2 and Windows Azure (Microsoft’s cloud computing platform), with metrics such as memory bandwidth, Input/Output (I/O) performance, and communication computational performance, are largely absent. The purpose of this study is to perform an exhaustive HPC benchmark comparison on EC2 and Windows Azure platforms. We implement existing benchmarks to evaluate and analyze performance of two public clouds spanning both IaaS and PaaS types. We use Amazon EC2 and Windows Azure as platforms for hosting HPC benchmarks with variations such as instance types, number of nodes, hardware and software. This is accomplished by running benchmarks including STREAM, IOR and NPB benchmarks on these platforms on varied number of nodes for small and medium instance types. These benchmarks measure the memory bandwidth, I/O performance, communication and computational performance. Benchmarking cloud platforms provides useful objective measures of their worthiness for HPC applications in addition to assessing their consistency and predictability in supporting them

    HPC Cloud for Scientific and Business Applications: Taxonomy, Vision, and Research Challenges

    Full text link
    High Performance Computing (HPC) clouds are becoming an alternative to on-premise clusters for executing scientific applications and business analytics services. Most research efforts in HPC cloud aim to understand the cost-benefit of moving resource-intensive applications from on-premise environments to public cloud platforms. Industry trends show hybrid environments are the natural path to get the best of the on-premise and cloud resources---steady (and sensitive) workloads can run on on-premise resources and peak demand can leverage remote resources in a pay-as-you-go manner. Nevertheless, there are plenty of questions to be answered in HPC cloud, which range from how to extract the best performance of an unknown underlying platform to what services are essential to make its usage easier. Moreover, the discussion on the right pricing and contractual models to fit small and large users is relevant for the sustainability of HPC clouds. This paper brings a survey and taxonomy of efforts in HPC cloud and a vision on what we believe is ahead of us, including a set of research challenges that, once tackled, can help advance businesses and scientific discoveries. This becomes particularly relevant due to the fast increasing wave of new HPC applications coming from big data and artificial intelligence.Comment: 29 pages, 5 figures, Published in ACM Computing Surveys (CSUR

    Early Observations on Performance of Google Compute Engine for Scientific Computing

    Full text link
    Although Cloud computing emerged for business applications in industry, public Cloud services have been widely accepted and encouraged for scientific computing in academia. The recently available Google Compute Engine (GCE) is claimed to support high-performance and computationally intensive tasks, while little evaluation studies can be found to reveal GCE's scientific capabilities. Considering that fundamental performance benchmarking is the strategy of early-stage evaluation of new Cloud services, we followed the Cloud Evaluation Experiment Methodology (CEEM) to benchmark GCE and also compare it with Amazon EC2, to help understand the elementary capability of GCE for dealing with scientific problems. The experimental results and analyses show both potential advantages of, and possible threats to applying GCE to scientific computing. For example, compared to Amazon's EC2 service, GCE may better suit applications that require frequent disk operations, while it may not be ready yet for single VM-based parallel computing. Following the same evaluation methodology, different evaluators can replicate and/or supplement this fundamental evaluation of GCE. Based on the fundamental evaluation results, suitable GCE environments can be further established for case studies of solving real science problems.Comment: Proceedings of the 5th International Conference on Cloud Computing Technologies and Science (CloudCom 2013), pp. 1-8, Bristol, UK, December 2-5, 201

    On Evaluating Commercial Cloud Services: A Systematic Review

    Full text link
    Background: Cloud Computing is increasingly booming in industry with many competing providers and services. Accordingly, evaluation of commercial Cloud services is necessary. However, the existing evaluation studies are relatively chaotic. There exists tremendous confusion and gap between practices and theory about Cloud services evaluation. Aim: To facilitate relieving the aforementioned chaos, this work aims to synthesize the existing evaluation implementations to outline the state-of-the-practice and also identify research opportunities in Cloud services evaluation. Method: Based on a conceptual evaluation model comprising six steps, the Systematic Literature Review (SLR) method was employed to collect relevant evidence to investigate the Cloud services evaluation step by step. Results: This SLR identified 82 relevant evaluation studies. The overall data collected from these studies essentially represent the current practical landscape of implementing Cloud services evaluation, and in turn can be reused to facilitate future evaluation work. Conclusions: Evaluation of commercial Cloud services has become a world-wide research topic. Some of the findings of this SLR identify several research gaps in the area of Cloud services evaluation (e.g., the Elasticity and Security evaluation of commercial Cloud services could be a long-term challenge), while some other findings suggest the trend of applying commercial Cloud services (e.g., compared with PaaS, IaaS seems more suitable for customers and is particularly important in industry). This SLR study itself also confirms some previous experiences and reveals new Evidence-Based Software Engineering (EBSE) lessons

    The Strategy of the Commons: Modelling the Annual Cost of Successful ICT Services for European Research

    Get PDF
    The provision of ICT services for research is increasingly using Cloud services to complement the traditional federation of computing centres. Due to the complex funding structure and differences in the basic business model, comparing the cost-effectiveness of these options requires a new approach to cost assessment. This paper presents a cost assessment method addressing the limitations of the standard methods and some of the initial results of the study. This acts as an illustration of the kind of cost assessment issues high-utilisation rate ICT services should consider when choosing between different infrastructure options. The research is co-funded by the European Commission Seventh Framework Programme through the e-FISCAL project (contract number RI-283449)

    On a Catalogue of Metrics for Evaluating Commercial Cloud Services

    Full text link
    Given the continually increasing amount of commercial Cloud services in the market, evaluation of different services plays a significant role in cost-benefit analysis or decision making for choosing Cloud Computing. In particular, employing suitable metrics is essential in evaluation implementations. However, to the best of our knowledge, there is not any systematic discussion about metrics for evaluating Cloud services. By using the method of Systematic Literature Review (SLR), we have collected the de facto metrics adopted in the existing Cloud services evaluation work. The collected metrics were arranged following different Cloud service features to be evaluated, which essentially constructed an evaluation metrics catalogue, as shown in this paper. This metrics catalogue can be used to facilitate the future practice and research in the area of Cloud services evaluation. Moreover, considering metrics selection is a prerequisite of benchmark selection in evaluation implementations, this work also supplements the existing research in benchmarking the commercial Cloud services.Comment: 10 pages, Proceedings of the 13th ACM/IEEE International Conference on Grid Computing (Grid 2012), pp. 164-173, Beijing, China, September 20-23, 201

    Cloud benchmarking and performance analysis of an HPC application in Amazon EC2

    Get PDF
    Cloud computing platforms have been continuously evolving. Features such as the Elastic Fabric Adapter (EFA) in the Amazon Web Services (AWS) platform have brought yet another revolution in the High Performance Computing (HPC) world, further accelerating the convergence of HPC and cloud computing. Other public clouds also support similar features further fueling this change. In this paper, we show how and why the performance of a large-scale computational fluid dynamics (CFD) HPC application on AWS competes very closely with the one on Beskow - a Cray XC40 supercomputer at the PDC Center for High-Performance Computing - in terms of cost-efficiency with strong scaling up to 2304 processes. We perform an extensive set of micro and macro bench- marks in both environments and conduct a comparative analysis. Until as recently as 2020 these benchmarks have notoriously yielded unsatisfactory results for the cloud platforms compared with on-premise infrastructures. Our aim is to access the HPC capabilities of the cloud, and in general to demonstrate how researchers can scale and evaluate the performance of their application in the cloud.ENABL

    Performance analysis of HPC applications in the cloud

    Get PDF
    [Abstract] The scalability of High Performance Computing (HPC) applications depends heavily on the efficient support of network communications in virtualized environments. However, Infrastructure as a Service (IaaS) providers are more focused on deploying systems with higher computational power interconnected via high-speed networks rather than improving the scalability of the communication middleware. This paper analyzes the main performance bottlenecks in HPC application scalability on the Amazon EC2 Cluster Compute platform: (1) evaluating the communication performance on shared memory and a virtualized 10 Gigabit Ethernet network; (2) assessing the scalability of representative HPC codes, the NAS Parallel Benchmarks, using an important number of cores, up to 512; (3) analyzing the new cluster instances (CC2), both in terms of single instance performance, scalability and cost-efficiency of its use; (4) suggesting techniques for reducing the impact of the virtualization overhead in the scalability of communication-intensive HPC codes, such as the direct access of the Virtual Machine to the network and reducing the number of processes per instance; and (5) proposing the combination of message-passing with multithreading as the most scalable and cost-effective option for running HPC applications on the Amazon EC2 Cluster Compute platform.Ministerio de Ciencia e Innovación; TIN2010-16735Ministerio de Economía y Competitividad; AP2010-4348

    Dynamic HPC Clusters within Amazon Web Services (AWS)

    Get PDF
    Amazon Web Services (AWS) provides public cloud computing resources and services and is one of the largest cloud computing providers in the world. However, in order to get started using AWS, one must spend many hours overcoming the steep learning curve and terminology associated with AWS. This is especially true for researchers looking to create and utilize a High Performance Computing (HPC) cluster within AWS. This is due to the massive amount of AWS services and AWS resources that must be created and linked together in order to create a fully functional HPC cluster with AWS. The Dynamic AWS HPC Cluster Project aims to help simplify the steps needed to create a fully functional dynamic HPC cluster within AWS. The user simply completes a simple wizard that specifies the details of the HPC cluster that they want: the size and type of the shared filesystem, the type of HPC scheduler, the number of Compute Instances, what IP addresses they want the cluster to be accessible from, and the number of Login/Head Instances required. After all this has been specified, the Dynamic AWS HPC Cluster project makes the required calls to the AWS APIs in order to create all the required AWS resources. After the resources have been created, they are all automatically configured, networked together, and have the usernames and passwords pushed out to all of the cluster instances for SSH login. The user can then run their jobs and when they have no more jobs left to run they can pause the cluster, which means they do not pay for compute charges, and then when they have more jobs to run resume the cluster and run their jobs. This allows users to only pay for the cluster when they need it which can help save them money
    corecore