
    On Evaluating Commercial Cloud Services: A Systematic Review

    Background: Cloud computing is booming in industry, with many competing providers and services, so evaluation of commercial Cloud services is necessary. However, existing evaluation studies are relatively chaotic: there is tremendous confusion and a gap between practice and theory in Cloud services evaluation. Aim: To help relieve this chaos, this work synthesizes the existing evaluation implementations to outline the state of the practice and to identify research opportunities in Cloud services evaluation. Method: Based on a conceptual evaluation model comprising six steps, the Systematic Literature Review (SLR) method was employed to collect relevant evidence and investigate Cloud services evaluation step by step. Results: This SLR identified 82 relevant evaluation studies. The data collected from these studies represent the current practical landscape of Cloud services evaluation and can be reused to facilitate future evaluation work. Conclusions: Evaluation of commercial Cloud services has become a worldwide research topic. Some findings of this SLR identify research gaps in Cloud services evaluation (e.g., elasticity and security evaluation of commercial Cloud services could be a long-term challenge), while others suggest trends in the adoption of commercial Cloud services (e.g., compared with PaaS, IaaS seems more suitable for customers and is particularly important in industry). The SLR itself also confirms previous experiences and reveals new Evidence-Based Software Engineering (EBSE) lessons.

    Multi-elastic Datacenters: Auto-scaled Virtual Clusters on Energy-Aware Physical Infrastructures

    Computer clusters are widely used platforms for executing different computational workloads. Indeed, the advent of virtualization and Cloud computing has paved the way to deploy virtual elastic clusters on top of Cloud infrastructures, which are typically backed by physical computing clusters. In turn, advances in Green computing have fostered the ability to dynamically power on the nodes of physical clusters as required. This paper therefore introduces an open-source framework to deploy elastic virtual clusters running on elastic physical clusters, where the computing capabilities of the virtual clusters are dynamically changed to satisfy the user application's computing requirements while minimising the amount of energy consumed by the underlying physical cluster that supports an on-premises Cloud. To that end, we integrate: i) an elasticity manager at both the infrastructure level (power management) and the virtual infrastructure level (horizontal elasticity); ii) an automatic Virtual Machine (VM) consolidation agent that reduces the number of powered-on physical nodes using live migration; and iii) a vertical elasticity manager that dynamically and transparently changes the memory allocated to VMs, thus fostering enhanced consolidation. A case study based on real datasets, executed on a production infrastructure, is used to validate the proposed solution. The results show that a multi-elastic virtualized datacenter provides users with the ability to deploy customized scalable computing clusters while reducing its energy footprint.

    The results of this work have been partially supported by ATMOSPHERE (Adaptive, Trustworthy, Manageable, Orchestrated, Secure, Privacy-assuring Hybrid, Ecosystem for Resilient Cloud Computing), funded by the European Commission under the Cooperation Programme, Horizon 2020 grant agreement No 777154.

    Alfonso Laguna, C.D.; Caballer Fernández, M.; Calatrava Arroyo, A.; Moltó, G.; Blanquer Espert, I. (2018). Multi-elastic Datacenters: Auto-scaled Virtual Clusters on Energy-Aware Physical Infrastructures. Journal of Grid Computing 17(1):191-204. https://doi.org/10.1007/s10723-018-9449-z
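
    The vertical elasticity manager described above resizes VM memory to track actual usage, which is what lets the consolidation agent pack more VMs per powered-on host. Below is a minimal sketch of such a resizing rule, with assumed names and thresholds (not the framework's actual code):

```python
# Minimal sketch of a vertical elasticity rule, under assumed names and
# thresholds: resize each VM's memory toward its observed working set
# plus a safety margin, enabling tighter consolidation.

def target_memory_mb(used_mb, margin=0.2, min_mb=512, max_mb=16384):
    """Memory allocation to request for a VM.

    used_mb -- memory the guest is actually using (from monitoring)
    margin  -- headroom kept above the working set to absorb bursts
    """
    desired = used_mb * (1.0 + margin)
    return int(min(max(desired, min_mb), max_mb))

# A consolidation agent can then sum the per-VM targets on each host,
# live-migrate VMs away from hosts left underloaded, and power those
# nodes off to save energy.
if __name__ == "__main__":
    working_sets = [900, 2400, 7100]  # observed usage per VM, in MB
    print([target_memory_mb(mb) for mb in working_sets])  # [1080, 2880, 8520]
```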

    No Provisioned Concurrency: Fast RDMA-codesigned Remote Fork for Serverless Computing

    Serverless platforms face a fundamental tradeoff between container startup time and provisioned concurrency (i.e., cached instances), which is further exacerbated by the frequent need for remote container initialization. This paper presents MITOSIS, an operating system primitive that provides fast remote fork through a deep codesign of the OS kernel with RDMA. By leveraging RDMA's fast remote read capability and partial state transfer across serverless containers, MITOSIS bridges the performance gap between local and remote container initialization. MITOSIS is the first system to fork over 10,000 new containers from one instance across multiple machines within a second, while allowing the new containers to efficiently transfer the pre-materialized state of the forked instance. We have implemented MITOSIS on Linux and integrated it with Fn, a popular serverless platform. Under load spikes in real-world serverless workloads, MITOSIS reduces function tail latency by 89% with orders-of-magnitude lower memory usage. For serverless workflows that require state transfer, MITOSIS improves execution time by 86%. Comment: To appear in OSDI'22.
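
    The core idea is a fork that crosses machine boundaries: a parent container registers its pre-materialized state once, and children on other machines resume from it instead of cold-starting. Here is a toy sketch of that usage pattern, with hypothetical prepare/resume names and an in-process registry standing in for the RDMA-backed kernel path (this is not MITOSIS's API):

```python
import copy

class RemoteForkRegistry:
    """Toy stand-in for the kernel-side metadata: maps fork handles to
    parent state. In the real system the state stays on the parent
    machine and children fetch it lazily over RDMA; here a deep copy
    stands in for the transfer."""

    def __init__(self):
        self._states = {}
        self._next_handle = 0

    def prepare(self, state):
        """Parent side: capture pre-materialized state, return a handle."""
        self._next_handle += 1
        self._states[self._next_handle] = state
        return self._next_handle

    def resume(self, handle):
        """Child side (possibly on another machine): start from the
        parent's captured state instead of cold-initializing."""
        return copy.deepcopy(self._states[handle])

registry = RemoteForkRegistry()
parent_state = {"runtime": "warm", "jit_cache": ["handler_a", "handler_b"]}
handle = registry.prepare(parent_state)

child_state = registry.resume(handle)   # warm start: inherits the cache
print(child_state["jit_cache"])         # ['handler_a', 'handler_b']
```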

    From detection to optimization: impact of soft errors on high-performance computing applications

    As high-performance computing (HPC) continues to progress, constraints on HPC system design force the handling of errors to higher levels in the software stack. Among the types of errors facing HPC, soft errors that silently corrupt system or application state are among the most severe. Understanding the behavior of HPC applications in the presence of soft errors is critical for the effective utilization of HPC systems, and this understanding can guide algorithm-based error detection informed by application characteristics drawn from fault injection and error propagation studies. Furthermore, the realization that applications tolerate small errors enables optimizations such as lossy compression of high-cost data transfers. Lossy compression introduces small, user-controllable amounts of error when compressing data, reducing data size before expensive data transfers and thereby saving time. This dissertation investigates and improves the resiliency of HPC applications to soft errors, and explores lossy compression as a new form of optimization for expensive, time-consuming data transfers.
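
    Fault injection studies of this kind typically perturb a single bit of application state and observe how the error propagates. A minimal sketch of that primitive, assuming IEEE-754 doubles (the helper name is ours):

```python
import random
import struct

def flip_random_bit(x, rng):
    """Inject a single-bit soft error into a 64-bit float: reinterpret
    the double as an unsigned integer, XOR one randomly chosen bit,
    and reinterpret back."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    bits ^= 1 << rng.randrange(64)
    (corrupted,) = struct.unpack("<d", struct.pack("<Q", bits))
    return corrupted

rng = random.Random(42)
x = 3.141592653589793
y = flip_random_bit(x, rng)
# How much the value changes depends on which bit flipped: a low
# mantissa bit is a tiny perturbation, an exponent bit is catastrophic.
print(x, "->", y)
```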

    Feluca: A two-stage graph coloring algorithm with color-centric paradigm on GPU

    In this paper, we propose a two-stage high-performance graph coloring algorithm, called Feluca, that combines the recursion-based method with the sequential spread-based method. In the first stage, Feluca uses a recursive routine to color the majority of the vertices in the graph. It then switches to the sequential spread method to color the remaining vertices, avoiding the conflicts inherent in the recursive algorithm. Moreover, the following techniques are proposed to further improve graph coloring performance: i) a new method to eliminate cycles in the graph; ii) a top-down scheme to avoid the atomic operations originally required for color selection; and iii) a novel color-centric coloring paradigm that improves the degree of parallelism of the sequential spread stage. These newly developed techniques, together with further GPU-specific optimizations such as coalesced memory access, make Feluca an efficient parallel graph coloring solution. We have conducted extensive experiments on NVIDIA GPUs. The results show that Feluca achieves a 1.76x to 12.98x speedup over the state-of-the-art algorithms.
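
    A rough CPU sketch of the two-stage structure, loosely inspired by the description above (the real Feluca is a GPU algorithm; the tie-breaking rule and names here are ours): optimistic rounds color most vertices, and a sequential sweep repairs the remaining conflicts.

```python
# Rough CPU sketch of a two-stage coloring, loosely inspired by Feluca;
# tie-breaking rule and names are ours, not the paper's.

def two_stage_coloring(n, edges, rounds=2):
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    color = [0] * n

    # Stage 1: optimistic rounds over a snapshot of colors, mimicking
    # parallel recoloring. Only the highest-ID vertex in a conflict
    # recolors, which avoids the ping-pong of fully synchronous rounds.
    for _ in range(rounds):
        snapshot = color[:]
        for v in range(n):
            conflicted = [u for u in adj[v] if snapshot[u] == snapshot[v]]
            if conflicted and all(v > u for u in conflicted):
                taken = {snapshot[u] for u in adj[v]}
                c = 0
                while c in taken:
                    c += 1
                color[v] = c

    # Stage 2: sequential sweep that repairs any remaining conflicts,
    # guaranteeing a proper coloring on termination.
    for v in range(n):
        taken = {color[u] for u in adj[v]}
        if color[v] in taken:
            c = 0
            while c in taken:
                c += 1
            color[v] = c
    return color

# Triangle (0-1-2) plus a pendant vertex 3 attached to 2.
print(two_stage_coloring(4, [(0, 1), (1, 2), (0, 2), (2, 3)]))  # e.g. [1, 0, 2, 1]
```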

    Gunrock: GPU Graph Analytics

    For large-scale graph analytics on the GPU, the irregularity of data access and control flow and the complexity of programming GPUs have been two significant challenges for developing a programmable high-performance graph library. Gunrock, our graph-processing system designed specifically for the GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on operations on a vertex or edge frontier. Gunrock achieves a balance between performance and expressiveness by coupling high-performance GPU computing primitives and optimization strategies with a high-level programming model that allows programmers to quickly develop new graph primitives with small code size and minimal GPU programming knowledge. We characterize the performance of various optimization strategies and evaluate Gunrock's overall performance on different GPU architectures across a wide range of graph primitives, from traversal-based and ranking algorithms to triangle counting and bipartite-graph-based algorithms. The results show that on a single GPU, Gunrock has on average at least an order of magnitude speedup over Boost and PowerGraph, comparable performance to the fastest GPU hardwired primitives and CPU shared-memory graph libraries such as Ligra and Galois, and better performance than any other GPU high-level graph library. Comment: 52 pages; invited paper to ACM Transactions on Parallel Computing (TOPC), an extended version of the PPoPP'16 paper "Gunrock: A High-Performance Graph Processing Library on the GPU".
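
    The frontier-centric abstraction is easiest to see with BFS: each iteration advances the current frontier along its edges and filters out already-visited vertices. The sketch below borrows Gunrock's operator vocabulary (advance, filter) but is an illustrative CPU rendering, not Gunrock's API:

```python
# Illustrative CPU rendering of the frontier-centric BFS pattern.
# "advance" and "filter" mirror Gunrock's operator names only.

def bfs_levels(adj, source):
    """adj: list of neighbor lists; returns the BFS level per vertex."""
    n = len(adj)
    level = [-1] * n          # -1 marks unvisited vertices
    level[source] = 0
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        # advance: expand the current frontier along all outgoing edges
        candidates = [v for u in frontier for v in adj[u]]
        # filter: keep each unvisited vertex once and assign its level
        next_frontier = []
        for v in candidates:
            if level[v] == -1:
                level[v] = depth
                next_frontier.append(v)
        frontier = next_frontier
    return level

adj = [[1, 2], [0, 3], [0, 3], [1, 2]]   # a 4-cycle
print(bfs_levels(adj, 0))                # [0, 1, 1, 2]
```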

    An enhanced dynamic replica creation and eviction mechanism in data grid federation environment

    A Data Grid Federation system is an infrastructure that connects several grid systems, facilitating the sharing of large amounts of data as well as storage and computing resources. Existing data replication mechanisms determine file values based on the number of file accesses when deciding which file to replicate, and place new replicas at locations that provide minimum read cost. This thesis presents an enhanced data replication strategy, the Dynamic Replica Creation and Eviction Mechanism (DRCEM), that utilizes data grid resources by allocating appropriate replica sites around the federation. In contrast to existing mechanisms, DRCEM determines file values based on logical dependencies when deciding which file to replicate, and allocates new replicas at locations that provide minimum replica placement cost. The proposed mechanism uses three schemes: 1) a Dynamic Replica Evaluation and Creation Scheme, 2) a Replica Placement Scheme, and 3) a Dynamic Replica Eviction Scheme. DRCEM was evaluated using the OptorSim network simulator on four performance metrics: 1) job completion times, 2) effective network usage, 3) storage element usage, and 4) computing element usage. DRCEM outperforms the ELALW and DRCM mechanisms by 30% and 26%, respectively, in terms of job completion times, and consumes 42% and 40% less storage than ELALW and DRCM. However, DRCEM shows lower performance than existing mechanisms regarding computing element usage, due to the additional computation of files' logical dependencies. Overall, the results revealed better job completion times with lower resource consumption than existing approaches. This research produces three replication schemes embodied in one mechanism that enhances the performance of Data Grid Federation environments. It improves on the existing mechanism by deciding to either create or evict more than one file at a particular time, and by integrating files' logical dependencies into the replica creation scheme to evaluate data files more accurately.
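
    The two decisions DRCEM makes, which file to replicate and where to place the replica, can be sketched as follows; the value function and cost model here are illustrative assumptions, not DRCEM's actual formulas:

```python
# Illustrative sketch of replica-creation decisions under assumed
# inputs: a file's value grows with its logical dependencies (not just
# raw access counts), and a new replica goes to the site with minimum
# placement cost. Field names and the cost model are ours, not DRCEM's.

def file_value(access_count, dependents):
    """Value of a file: accesses weighted up by how many other files
    logically depend on it."""
    return access_count * (1 + len(dependents))

def best_site(sites, replica_size_gb):
    """Pick the site minimizing an assumed placement cost: transfer
    cost for the replica, with sites lacking free storage ruled out."""
    def cost(name):
        site = sites[name]
        if site["free_gb"] < replica_size_gb:
            return float("inf")      # cannot hold the replica at all
        return site["transfer_cost_per_gb"] * replica_size_gb
    return min(sites, key=cost)

sites = {
    "siteA": {"free_gb": 50, "transfer_cost_per_gb": 0.8},
    "siteB": {"free_gb": 5,  "transfer_cost_per_gb": 0.2},
}
print(file_value(120, {"f2", "f3"}))   # 360: two dependents triple the value
print(best_site(sites, 10))            # siteA (siteB is cheaper but lacks space)
```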