Towards a Taxonomy of Performance Evaluation of Commercial Cloud Services
Cloud Computing, as one of the most promising computing paradigms, has become
increasingly accepted in industry. Numerous commercial providers have started
to supply public Cloud services, and corresponding performance evaluation is
then inevitably required for Cloud provider selection or cost-benefit analysis.
Unfortunately, inaccurate and confusing evaluation implementations can often be
seen in the context of commercial Cloud Computing, which could severely
interfere with and spoil evaluation-related comprehension and communication. This
paper introduces a taxonomy to help profile and standardize the details of
performance evaluation of commercial Cloud services. Through a systematic
literature review, we constructed the taxonomy along two dimensions by
arranging the atomic elements of Cloud-related performance evaluation. As such,
this proposed taxonomy can be employed both to analyze existing evaluation
practices through decomposition into elements and to design new experiments
through composing elements for evaluating performance of commercial Cloud
services. Moreover, through smooth expansion, we can continually adapt this
taxonomy to the more general area of evaluation of Cloud Computing.
Comment: 8 pages, Proceedings of the 5th International Conference on Cloud
Computing (IEEE CLOUD 2012), pp. 344-351, Honolulu, Hawaii, USA, June 24-29,
201
On a Catalogue of Metrics for Evaluating Commercial Cloud Services
Given the continually increasing number of commercial Cloud services in the
market, evaluation of different services plays a significant role in
cost-benefit analysis or decision making for choosing Cloud Computing. In
particular, employing suitable metrics is essential in evaluation
implementations. However, to the best of our knowledge, there is not any
systematic discussion about metrics for evaluating Cloud services. By using the
method of Systematic Literature Review (SLR), we have collected the de facto
metrics adopted in the existing Cloud services evaluation work. The collected
metrics were arranged following different Cloud service features to be
evaluated, which essentially constructed an evaluation metrics catalogue, as
shown in this paper. This metrics catalogue can be used to facilitate the
future practice and research in the area of Cloud services evaluation.
Moreover, considering metrics selection is a prerequisite of benchmark
selection in evaluation implementations, this work also supplements the
existing research in benchmarking the commercial Cloud services.
Comment: 10 pages, Proceedings of the 13th ACM/IEEE International Conference
on Grid Computing (Grid 2012), pp. 164-173, Beijing, China, September 20-23,
201
Topology-aware GPU scheduling for learning workloads in cloud environments
Recent advances in hardware, such as systems with multiple GPUs and their availability in the cloud, are enabling deep learning in various domains including health care, autonomous vehicles, and Internet of Things. Multi-GPU systems exhibit complex connectivity among GPUs and between GPUs and CPUs. Workload schedulers must consider hardware topology and workload communication requirements in order to allocate CPU and GPU resources for optimal execution time and improved utilization in shared cloud environments.
This paper presents a new topology-aware workload placement strategy to schedule deep learning jobs on multi-GPU systems. The placement strategy is evaluated with a prototype on a Power8 machine with Tesla P100 cards, showing speedups of up to ≈1.30x compared to state-of-the-art strategies; the proposed algorithm achieves this result by allocating GPUs that satisfy workload requirements while preventing interference. Additionally, a large-scale simulation shows that the proposed strategy provides higher resource utilization and performance in cloud systems.
This project is supported by the IBM/BSC Technology Center for Supercomputing
collaboration agreement. It has also received funding from the European Research Council (ERC) under the European Union's Horizon
2020 research and innovation programme (grant agreement No 639595). It is
also partially supported by the Ministry of Economy of Spain under contract
TIN2015-65316-P and Generalitat de Catalunya under contract 2014SGR1051,
by the ICREA Academia program, and by the BSC-CNS Severo Ochoa program
(SEV-2015-0493). We thank our IBM Research colleagues Alaa Youssef
and Asser Tantawi for the valuable discussions. We also thank SC17 committee
member Blair Bethwaite of Monash University for his constructive feedback on the earlier drafts of this paper.
Scalable architectures for platform-as-a-service clouds: performance and cost analysis
Scalability is a significant feature of cloud computing, which addresses the need to increase or decrease the capacity of allocated virtual resources at the application, platform, database and infrastructure levels on demand. We investigate scalable architecture solutions for cloud PaaS that allow services to utilize resources dynamically and effectively without directly affecting users. We have implemented scalable architectures with different session state management solutions, deploying an online shopping cart application in a PaaS solution and measuring the performance and cost under three server-side session state providers: caching, SQL database and NoSQL database. A commercial solution with its supporting state management components has been used. Particularly when re-architecting software for the cloud, the trade-off between performance, scalability and cost implications needs to be discussed.
An Analysis of Distributed Systems Syllabi With a Focus on Performance-Related Topics
We analyze a dataset of 51 current (2019-2020) Distributed Systems syllabi
from top Computer Science programs, focusing on finding the prevalence and
context in which topics related to performance are being taught in these
courses. We also study the scale of the infrastructure mentioned in DS courses,
from small client-server systems to cloud-scale, peer-to-peer, global-scale
systems. We make eight main findings, covering goals such as performance, and
scalability and its variant elasticity; activities such as performance
benchmarking and monitoring; eight selected performance-enhancing techniques
(replication, caching, sharding, load balancing, scheduling, streaming,
migrating, and offloading); and control issues such as trade-offs that include
performance and performance variability.
Comment: Accepted for publication at WEPPE 2021, to be held in conjunction
with ACM/SPEC ICPE 2021: https://doi.org/10.1145/3447545.3451197 This article
is a follow-up of our prior ACM SIGCSE publication, arXiv:2012.0055