64 research outputs found
On Reliability-Aware Server Consolidation in Cloud Datacenters
In the past few years, datacenter (DC) energy consumption has become an
important issue in the technology world. Server consolidation using virtualization
and virtual machine (VM) live migration allows cloud DCs to improve resource
utilization and hence energy efficiency. To save energy, consolidation
techniques try to turn off idle servers; however, because of workload
fluctuations, these offline servers must be turned back on to support
increased resource demands. These repeated on-off cycles can affect the
hardware reliability and wear-and-tear of servers and, as a result, increase
maintenance and replacement costs. In this paper we propose a holistic
mathematical model for reliability-aware server consolidation with the
objective of minimizing total DC costs, including energy and reliability costs.
Specifically, we try to minimize the number of active physical machines (PMs)
and racks in a reliability-aware manner. We formulate the problem as a Mixed
Integer Linear Programming (MILP) model; the problem is NP-complete. Finally, we evaluate
the performance of our approach in different scenarios using extensive
numerical MATLAB simulations.
Comment: International Symposium on Parallel and Distributed Computing
(ISPDC), Innsbruck, Austria, 201
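As a rough illustration of the consolidation objective described in this abstract (minimizing the number of active PMs), the sketch below uses first-fit decreasing, a standard bin-packing heuristic. It is not the paper's MILP model, and it ignores racks and reliability costs. The function name, the single integer capacity dimension, and the sample demands are all assumptions made for illustration.

```python
def first_fit_decreasing(vm_demands, pm_capacity):
    """Pack VM demands onto PMs using the first-fit decreasing heuristic.

    Hypothetical helper: one resource dimension, identical PMs.
    Returns a list of PMs, each a list of the VM demands placed on it.
    """
    if any(d > pm_capacity for d in vm_demands):
        raise ValueError("a VM demand exceeds the PM capacity")
    pms = []   # VM demands placed on each active PM
    free = []  # remaining capacity of each active PM
    # Place the largest VMs first; each goes on the first PM with room.
    for demand in sorted(vm_demands, reverse=True):
        for i, room in enumerate(free):
            if demand <= room:
                pms[i].append(demand)
                free[i] -= demand
                break
        else:
            # No active PM has room, so switch on another one.
            pms.append([demand])
            free.append(pm_capacity - demand)
    return pms


# Illustrative demands in percent of one PM's capacity (made-up values).
placement = first_fit_decreasing([50, 20, 70, 30, 40, 10], pm_capacity=100)
print(len(placement), placement)
```

A reliability-aware variant would additionally charge a cost each time a PM is switched on or off, which this heuristic does not attempt to model.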
Cloud computing: survey on energy efficiency
Cloud computing is today's most emphasized Information and Communications Technology (ICT) paradigm and is used, directly or indirectly, by almost every online user. However, this significance rests on a large infrastructure that includes data centers comprising thousands of server units and other supporting equipment. These data centers account for between 1.1% and 1.5% of total electricity use worldwide, a share that is projected to rise further. Such alarming numbers demand rethinking the energy efficiency of such infrastructures. However, before making any changes to infrastructure, an analysis of the current status is required. In this article, we perform a comprehensive analysis of an infrastructure supporting the cloud computing paradigm with regard to energy efficiency. First, we define a systematic approach for analyzing the energy efficiency of the most important data center domains, including server and network equipment, as well as cloud management systems and appliances consisting of software used by end users. Second, we use this approach to analyze the available scientific and industrial literature on state-of-the-art practices in data centers and their equipment. Finally, we extract existing challenges and highlight future research directions.
ON OPTIMIZATIONS OF VIRTUAL MACHINE LIVE STORAGE MIGRATION FOR THE CLOUD
Virtual Machine (VM) live storage migration is widely performed in the data centers of the Cloud, for the purposes of load balancing, reliability, availability, hardware maintenance and system upgrade. It entails moving all the state information of the VM being migrated, including memory state, network state and storage state, from one physical server to another within the same data center or across different data centers. To minimize its performance impact, this migration process is required to be transparent to applications running within the migrating VM, meaning that applications keep running inside the VM as if there were no migration operations at all.
In this dissertation, a thorough literature review is conducted to provide a big picture of the VM live storage migration process, its problems and existing solutions. After an in-depth examination, we observe that severe IO interference exists between the VM IO threads and the migration IO threads and causes both types of IO threads to suffer from performance degradation. This interference stems from the fact that both types of IO threads share the same critical IO path by reading from and writing to the same shared storage system. Owing to IO resource contention and request interference between the two types of IO requests, not only does the IO request queue in the storage system lengthen, but time-consuming disk seek operations also become more frequent. Based on this fundamental observation, this dissertation research presents three related but orthogonal solutions that tackle the IO interference problem in order to improve VM live storage migration performance.
First, we introduce the Workload-Aware IO Outsourcing scheme, called WAIO, to improve VM live storage migration efficiency. Second, we propose a novel scheme, called SnapMig, that improves VM live storage migration efficiency and eliminates its performance impact on user applications at the source server by effectively leveraging the existing VM snapshots in the backup servers. Third, we propose the IOFollow scheme to improve both VM performance and migration performance simultaneously. Finally, we outline directions for future research.
Advisor: Hong Jian
Optimization and Management Techniques for Geo-distributed SDN-enabled Cloud Datacenters' Provisioning
Cloud computing has become a business reality that impacts technology users around the world. It has become a cornerstone for emerging technologies and an enabler of future Internet services as it provides on-demand IT services delivery via geographically distributed data centers. At the core of cloud computing, virtualization technology has played a crucial role by allowing resource sharing, which in turn allows cloud service providers to offer computing services without discrepancies in platform compatibility.
At the same time, a trend has emerged in which enterprises are adopting a software-based network infrastructure with paradigms, such as software-defined networking, gaining further attention for large-scale networks. This trend is due to the flexibility and agility offered to networks by such paradigms. Software-defined networks allow for network resource sharing by facilitating network virtualization. Hence, combining cloud computing with a software-defined network architecture promises to enhance the quality of services that are delivered to clients and reduces the operational costs to service providers. However, this combined architecture introduces several challenges to cloud service providers, including resource management, energy efficiency, virtual network provisioning, and controller placement.
This thesis tackles these challenges by proposing innovative resource provisioning techniques and developing novel frameworks to improve resource utilization, power efficiency, and quality of service performance. These metrics have a direct impact on the capital and operational expenditure of service providers.
In this thesis, the problem of virtual computing and network provisioning in geographically distributed software-defined network-enabled cloud data centers is modeled and formulated. The thesis proposes optimal and sub-optimal heuristic solutions and evaluates them to validate their efficiency. To address the energy efficiency of cloud environments that are enabled for software-defined networks, this thesis presents an innovative architecture and develops a comprehensive power consumption model that accurately describes the power consumption behavior of such environments. To address the challenge of determining the number and locations of software-defined network controllers, a sub-optimal solution is proposed that uses unsupervised hierarchical clustering. Finally, betweenness centrality is proposed as an efficient solution to the controller placement problem.
Cloud-scale VM Deflation for Running Interactive Applications On Transient Servers
Transient computing has become popular in public cloud environments for
running delay-insensitive batch and data processing applications at low cost.
Since transient cloud servers can be revoked at any time by the cloud provider,
they are considered unsuitable for running interactive applications such as web
services. In this paper, we present VM deflation as an alternative mechanism to
server preemption for reclaiming resources from transient cloud servers under
resource pressure. Using real traces from top-tier cloud providers, we show the
feasibility of using VM deflation as a resource reclamation mechanism for
interactive applications in public clouds. We show how current hypervisor
mechanisms can be used to implement VM deflation and present cluster deflation
policies for resource management of transient and on-demand cloud VMs.
Experimental evaluation of our deflation system on a Linux cluster shows that
microservice-based applications can be deflated by up to 50% with negligible
performance overhead. Our cluster-level deflation policies allow overcommitment
levels as high as 50%, with less than a 1% decrease in application
throughput, and can enable cloud platforms to increase revenue by 30%.
Comment: To appear at ACM HPDC 202
Dynamic Workload Consolidation for a Web Server Cluster by Classifying Server Workload Using a Backpropagation Neural Network Approach
Growing demand from users of WWW applications has led to a corresponding increase in the resource usage of web server clusters. This study examines web server resource provisioning based on a server workload parameter (CPU load average). The data used are accesses to the web server that serves the Student Academic Information System of Universitas Brawijaya (SIAM-UB). Maximum server resource usage (peak load) occurs during the student registration period, when more than 65,000 students access the SIAM server simultaneously. The number of requests served by the server in one day can reach 1.7 million. In this study, CPU workload consolidation in the web server cluster is predicted (classified) in order to provision server resources optimally. The predicted server workload consolidation is classified into 3 classes: Min (0-2), Medium (3-6), and Maximum (n > 7). A backpropagation neural network (BNN) is used to predict the server workload consolidation class from the input parameters CPU usage, memory, network (throughput), and the number of accessing IP addresses. A BNN architecture with 32 inputs, 2 hidden layers with 512 (h1) and 32 (h2) neurons, 3 outputs, and a learning rate of 0.00001 produces weights that can classify CPU workload consolidation with a precision of 90%, a sensitivity of 0.9, and an accuracy of 93%.
Transiency-driven Resource Management for Cloud Computing Platforms
Modern distributed server applications are hosted on enterprise or cloud data centers that provide computing, storage, and networking capabilities to these applications. These applications are built on the implicit assumption that the underlying servers will be stable and normally available, barring occasional faults. In many emerging scenarios, however, data centers and clouds only provide transient, rather than continuous, availability of their servers. Transiency in modern distributed systems arises in many contexts, such as green data centers powered by intermittent renewable sources, and cloud platforms that provide lower-cost transient servers which can be unilaterally revoked by the cloud operator.
Transient computing resources are increasingly important, and existing fault-tolerance and resource management techniques are inadequate for transient servers because applications typically assume continuous resource availability. This thesis presents research in distributed systems design that treats transiency as a first-class design principle. I show that combining transiency-specific fault-tolerance mechanisms with resource management policies suited to application characteristics and requirements can yield significant cost and performance benefits. These mechanisms and policies have been implemented and prototyped as part of software systems, which allow a wide range of applications, such as interactive services and distributed data processing, to be deployed on transient servers, and can reduce cloud computing costs by up to 90%.
This thesis makes contributions to four areas of computer systems research: transiency-specific fault-tolerance, resource allocation, abstractions, and resource reclamation. For reducing the impact of transient server revocations, I develop two fault-tolerance techniques that are tailored to transient server characteristics and application requirements. For interactive applications, I build a derivative cloud platform that masks revocations by transparently moving application-state between servers of different types. Similarly, for distributed data processing applications, I investigate the use of application level periodic checkpointing to reduce the performance impact of server revocations. For managing and reducing the risk of server revocations, I investigate the use of server portfolios that allow transient resource allocation to be tailored to application requirements.
Finally, I investigate how resource providers (such as cloud platforms) can provide transient resource availability without revocation, by looking into alternative resource reclamation techniques. I develop resource deflation, wherein a server's resources are fractionally reclaimed, allowing the application to continue execution albeit with fewer resources. Resource deflation generalizes revocation, and the deflation mechanisms and cluster-wide policies can yield both high cluster utilization and low application performance degradation.
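To make the idea of fractional reclamation concrete, here is a minimal sketch of one possible deflation policy, in which every VM gives up the same fraction of its allocation when the host is overcommitted. This proportional rule is an assumption chosen for illustration, not necessarily a policy from the thesis. The function name, the VM names, and the resource units are hypothetical.

```python
def deflate_proportionally(allocations, capacity):
    """Scale all allocations by a common factor so they fit within capacity.

    Hypothetical policy: each VM keeps the same fraction of its resources.
    Allocations that already fit are returned unchanged.
    """
    total = sum(allocations.values())
    if total <= capacity:
        return dict(allocations)
    factor = capacity / total  # fraction of its allocation each VM keeps
    return {vm: alloc * factor for vm, alloc in allocations.items()}


# Made-up example: 16 vCPUs requested on a host that can offer only 12.
vms = {"web": 8, "cache": 4, "batch": 4}
deflated = deflate_proportionally(vms, capacity=12)
print(deflated)
```

Under this rule no VM is revoked outright; each continues with 75% of its original allocation in the example above.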