205 research outputs found

    START: Straggler Prediction and Mitigation for Cloud Computing Environments using Encoder LSTM Networks

    A common performance problem in large-scale cloud systems is dealing with straggler tasks: slow-running instances that increase the overall response time. Such tasks affect the system's QoS and the SLA. There is a need for automatic straggler detection and mitigation mechanisms that execute jobs without violating the SLA. Prior work typically builds reactive models that focus first on detection and then on mitigation of straggler tasks, which leads to delays. Other works use prediction-based proactive mechanisms, but ignore volatile task characteristics. We propose a Straggler Prediction and Mitigation Technique (START) that is able to predict which tasks might be stragglers and dynamically adapt scheduling to achieve lower response times. START analyzes all tasks and hosts based on compute and network resource consumption, using an Encoder LSTM network to predict and mitigate expected straggler tasks. This reduces the SLA violation rate and execution time without compromising QoS. Specifically, we use the CloudSim toolkit to simulate START and compare it with IGRU-SD, SGC, Dolly, GRASS, NearestFit and Wrangler in terms of QoS parameters. Experiments show that START reduces execution time, resource contention, energy and SLA violations by 13%, 11%, 16% and 19%, respectively, compared to the state-of-the-art.

    Predicting global usages of resources endowed with local policies

    The effective use of computational resources is a primary concern of up-to-date distributed applications. In this paper, we present a methodology to reason about resource usages (acquisition, release, revision, ...), which makes it possible to predict bad usages of resources. Keeping in mind the interplay between local and global information occurring in the application-resource interactions, we model resources as entities with local policies and global properties governing the overall interactions. Formally, our model takes the shape of an extension of pi-calculus with primitives to manage resources. We develop a Control Flow Analysis that computes a static approximation of process behaviour and therefore of the resource usages. Comment: In Proceedings FOCLASA 2011, arXiv:1107.584

    Multi-elastic Datacenters: Auto-scaled Virtual Clusters on Energy-Aware Physical Infrastructures

    Computer clusters are widely used platforms to execute different computational workloads. Indeed, the advent of virtualization and Cloud computing has paved the way to deploy virtual elastic clusters on top of Cloud infrastructures, which are typically backed by physical computing clusters. In turn, the advances in Green computing have fostered the ability to dynamically power on the nodes of physical clusters as required. Therefore, this paper introduces an open-source framework to deploy elastic virtual clusters running on elastic physical clusters where the computing capabilities of the virtual clusters are dynamically changed to satisfy both the user application's computing requirements and to minimise the amount of energy consumed by the underlying physical cluster that supports an on-premises Cloud. For that, we integrate: i) an elasticity manager both at the infrastructure level (power management) and at the virtual infrastructure level (horizontal elasticity); ii) an automatic Virtual Machine (VM) consolidation agent that reduces the amount of powered on physical nodes using live migration and iii) a vertical elasticity manager to dynamically and transparently change the memory allocated to VMs, thus fostering enhanced consolidation. A case study based on real datasets executed on a production infrastructure is used to validate the proposed solution. The results show that a multi-elastic virtualized datacenter provides users with the ability to deploy customized scalable computing clusters while reducing its energy footprint. The results of this work have been partially supported by ATMOSPHERE (Adaptive, Trustworthy, Manageable, Orchestrated, Secure, Privacy-assuring Hybrid, Ecosystem for Resilient Cloud Computing), funded by the European Commission under the Cooperation Programme, Horizon 2020 grant agreement No 777154. Alfonso Laguna, CD.; Caballer Fernández, M.; Calatrava Arroyo, A.; Moltó, G.; Blanquer Espert, I. (2018). Multi-elastic Datacenters: Auto-scaled Virtual Clusters on Energy-Aware Physical Infrastructures. Journal of Grid Computing. 17(1):191-204. https://doi.org/10.1007/s10723-018-9449-z

    Critical analysis of vendor lock-in and its impact on cloud computing migration: a business perspective

    Vendor lock-in is a major barrier to the adoption of cloud computing, due to the lack of standardization. Current solutions and efforts tackling the vendor lock-in problem are predominantly technology-oriented. Few studies analyse and highlight the complexity of the vendor lock-in problem in the cloud environment. Consequently, most customers are unaware of proprietary standards which inhibit interoperability and portability of applications when taking services from vendors. This paper provides a critical analysis of the vendor lock-in problem from a business perspective. A survey based on qualitative and quantitative approaches conducted in this study has identified the main risk factors that give rise to lock-in situations. The analysis of our survey of 114 participants shows that, as computing resources migrate from on-premise to the cloud, the vendor lock-in problem is exacerbated. Furthermore, the findings exemplify the importance of interoperability, portability and standards in cloud computing. A number of strategies are proposed on how to avoid and mitigate lock-in risks when migrating to cloud computing. The strategies relate to contracts, selection of vendors that support standardised formats and protocols regarding standard data structures and APIs, and developing awareness of commonalities and dependencies among cloud-based solutions. We strongly believe that the implementation of these strategies has great potential to reduce the risks of vendor lock-in.

    Networking - A Statistical Physics Perspective

    Efficient networking has a substantial economic and societal impact in a broad range of areas including transportation systems, wired and wireless communications and a range of Internet applications. As transportation and communication networks become increasingly complex, the ever-increasing demand for congestion control, higher traffic capacity, quality of service, robustness and reduced energy consumption requires new tools and methods to meet these conflicting requirements. The new methodology should serve for gaining a better understanding of the properties of networking systems at the macroscopic level, as well as for the development of new principled optimization and management algorithms at the microscopic level. Methods of statistical physics seem best placed to provide new approaches, as they have been developed specifically to deal with non-linear large-scale systems. This paper aims at presenting an overview of tools and methods that have been developed within the statistical physics community and that can be readily applied to address the emerging problems in networking. These include diffusion processes, methods from disordered systems and polymer physics, and probabilistic inference, which have direct relevance to network routing, file and frequency distribution, the exploration of network structures and vulnerability, and various other practical networking applications. Comment: (Review article) 71 pages, 14 figures

    AI augmented Edge and Fog computing: trends and challenges

    In recent years, the landscape of computing paradigms has witnessed a gradual yet remarkable shift from monolithic computing to distributed and decentralized paradigms such as Internet of Things (IoT), Edge, Fog, Cloud, and Serverless. The frontiers of these computing technologies have been boosted by a shift from manually encoded algorithms to Artificial Intelligence (AI)-driven autonomous systems for optimum and reliable management of distributed computing resources. Prior work focuses on improving existing systems using AI across a wide range of domains, such as efficient resource provisioning, application deployment, task placement, and service management. This survey reviews the evolution of data-driven AI-augmented technologies and their impact on computing systems. We demystify new techniques and draw key insights in Edge, Fog and Cloud resource management-related uses of AI methods and also look at how AI can innovate traditional applications for enhanced Quality of Service (QoS) in the presence of a continuum of resources. We present the latest trends and impact areas such as optimizing AI models that are deployed on or for computing systems. We lay out a roadmap for future research directions in areas such as resource management for QoS optimization and service reliability. Finally, we discuss blue-sky ideas and envision this work as an anchor point for future research on AI-driven computing systems.

    Enhancing Federated Cloud Management with an Integrated Service Monitoring Approach

    Cloud Computing enables the construction and provisioning of virtualized service-based applications through simple and cost-effective outsourcing to dynamic service environments. Cloud Federations envisage a distributed, heterogeneous environment consisting of various cloud infrastructures by aggregating different IaaS provider capabilities coming from both the commercial and the academic area. In this paper, we introduce a federated cloud management solution that operates the federation by utilizing cloud-brokers for various IaaS providers. In order to enable enhanced provider selection and inter-cloud service executions, an integrated monitoring approach is proposed which is capable of measuring the availability and reliability of the provisioned services in different providers. To this end, a minimal metric monitoring service has been designed and used together with a service monitoring solution to measure cloud performance. The transparent and cost-effective operation on commercial clouds and the capability to simultaneously monitor both private and public clouds were the major design goals of this integrated cloud monitoring approach. Finally, the evaluation of our proposed solution is presented on different private IaaS systems participating in federations. © 2013 Springer Science+Business Media Dordrecht.

    Public cloud data auditing with practical key update and zero knowledge privacy

    Data integrity is extremely important for cloud-based storage services, where cloud users no longer have physical possession of their outsourced files. A number of data auditing mechanisms have been proposed to solve this problem. However, how to update a cloud user's private auditing key (as well as the authenticators those keys are associated with) without the user's re-possession of the data remains an open problem. In this paper, we propose a key-updating and authenticator-evolving mechanism with zero-knowledge privacy of the stored files for secure cloud data auditing, which incorporates zero-knowledge proof systems, proxy re-signatures and homomorphic linear authenticators. We instantiate our proposal with the state-of-the-art Shacham-Waters auditing scheme. When the cloud user needs to update his key, instead of downloading the entire file and re-generating all the authenticators, the user can just download and update the authenticators. This approach dramatically reduces the communication and computation cost while maintaining the desirable security. We formalize the security model of zero-knowledge data privacy for auditing schemes in the key-updating context and prove the soundness and zero-knowledge privacy of the proposed construction. Finally, we analyze the complexity of communication, computation and storage costs of the improved protocol, which demonstrates the practicality of the proposal.