
    Fog-supported delay-constrained energy-saving live migration of VMs over MultiPath TCP/IP 5G connections

    The coming era of fifth-generation, fog-computing-supported radio access networks (5G FOGRANs for short) aims at exploiting computing/networking resource virtualization in order to augment the limited resources of wireless devices through the seamless live migration of virtual machines (VMs) toward nearby fog data centers. For this purpose, the bandwidths of the multiple wireless network interface cards of the wireless devices may be aggregated under the control of the emerging MultiPath TCP (MPTCP) protocol. However, due to fading and mobility-induced phenomena, the energy consumption of current state-of-the-art VM migration techniques may still offset their expected benefits. Motivated by these considerations, in this paper we analytically characterize, implement in software, and numerically test the optimal minimum-energy settable-complexity bandwidth manager (SCBM) for the live migration of VMs over 5G FOGRAN MPTCP connections. The key features of the proposed SCBM are that: 1) its implementation complexity is settable on-line on the basis of the target energy-consumption-versus-implementation-complexity tradeoff; 2) it minimizes the network energy consumed by the wireless device for sustaining the migration process under hard constraints on the tolerated migration times and downtimes; and 3) by leveraging a suitably designed adaptive mechanism, it is capable of quickly reacting to (possibly unpredicted) fading and/or mobility-induced abrupt changes of the wireless environment without requiring forecasting. The actual effectiveness of the proposed SCBM is supported by extensive energy-versus-delay performance comparisons that cover: 1) a number of heterogeneous 3G/4G/WiFi FOGRAN scenarios; 2) synthetic and real-world workloads; and 3) MPTCP and wireless connections.
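
    The paper's SCBM formulation is not reproduced above, but the flavor of the underlying optimization can be illustrated with a toy model: split the migration traffic across subflows so that a hard migration deadline is met at minimum energy, given a convex per-subflow power-rate curve. The quadratic power model, the link parameters, and all names below are illustrative assumptions, not the authors' algorithm.

```python
# Toy minimum-energy bandwidth split for a VM migration over multiple
# wireless subflows (illustrative sketch only, not the paper's SCBM).
# Assumed model: per-subflow power p_i(r) = a_i * r^2 (convex),
# hard deadline t_max_s for moving volume_mb of VM state.

def min_energy_split(volume_mb, t_max_s, caps, coeffs):
    """Per-subflow rates (MB/s) that meet the deadline at minimum
    total power under the assumed quadratic power model."""
    required = volume_mb / t_max_s            # aggregate rate needed
    if required > sum(caps):
        raise ValueError("deadline infeasible even at full capacity")
    # KKT condition: r_i = min(cap_i, lam / (2 * a_i)); bisect on lam.
    lo, hi = 0.0, 2 * max(coeffs) * sum(caps)
    for _ in range(100):
        lam = (lo + hi) / 2
        rates = [min(c, lam / (2 * a)) for c, a in zip(caps, coeffs)]
        if sum(rates) < required:
            lo = lam
        else:
            hi = lam
    return [min(c, hi / (2 * a)) for c, a in zip(caps, coeffs)]

if __name__ == "__main__":
    # Hypothetical 3G/4G/WiFi subflows: capacities in MB/s and
    # per-link energy coefficients (3G most expensive per byte).
    rates = min_energy_split(volume_mb=512, t_max_s=30,
                             caps=[2.0, 8.0, 20.0],
                             coeffs=[3.0, 1.5, 0.5])
    print([round(r, 2) for r in rates])
```

    With a convex cost the familiar water-filling structure emerges: the cheap WiFi link is loaded hardest, and the expensive 3G link contributes only what the deadline forces it to.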

    A Generic Checkpoint-Restart Mechanism for Virtual Machines

    It is common today to deploy complex software inside a virtual machine (VM). Snapshots provide rapid deployment, migration between hosts, dependability (fault tolerance), and security (insulating a guest VM from the host). Yet the snapshot code is laboriously developed on a per-VM basis. This work demonstrates a generic checkpoint-restart mechanism for virtual machines. The mechanism is based on a plugin on top of an unmodified user-space checkpoint-restart package, DMTCP. Checkpoint-restart is demonstrated for three virtual machines: Lguest, user-space QEMU, and KVM/QEMU. The plugins for Lguest and KVM/QEMU require just 200 lines of code. The Lguest kernel driver API is augmented by 40 lines of code. DMTCP checkpoints user-space QEMU without any new code. KVM/QEMU, user-space QEMU, and DMTCP itself need no modification. The design benefits from other DMTCP features and plugins. Experiments demonstrate checkpoint and restart in 0.2 seconds using forked checkpointing, mmap-based fast restart, and incremental Btrfs-based snapshots.
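
    As a rough sketch of what the checkpoint/restart cycle looks like from the outside, the snippet below drives DMTCP's standard front-end tools (dmtcp_launch, dmtcp_command, dmtcp_restart) from Python. The target binary, the sleep interval, and the checkpoint image pattern are placeholders; the paper's plugins presumably make this flow work unmodified for the supported VMs.

```python
# Minimal driver around DMTCP's command-line tools. The guest command
# and timings are hypothetical; DMTCP writes checkpoint images named
# ckpt_*.dmtcp into the working directory by default.
import glob
import subprocess
import time

TARGET = ["./my-vm-binary"]   # placeholder for e.g. a QEMU invocation

# 1. Launch the process under DMTCP control (a coordinator is
#    spawned automatically if none is running).
proc = subprocess.Popen(["dmtcp_launch"] + TARGET)
time.sleep(10)                # let the guest do some work

# 2. Ask the coordinator to checkpoint the whole process tree.
subprocess.run(["dmtcp_command", "--checkpoint"], check=True)

# 3. Tear down, then restart from the checkpoint image, possibly
#    on a different host that shares the filesystem.
proc.terminate()
proc.wait()
images = glob.glob("ckpt_*.dmtcp")
subprocess.run(["dmtcp_restart"] + images, check=True)
```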

    Enabling virtualization technologies for enhanced cloud computing

    Cloud computing is a ubiquitous technology that offers various services for individual users, small businesses, and large-scale organizations. Data-center owners maintain clusters of thousands of machines and lease out resources like CPU, memory, network bandwidth, and storage to clients. For organizations, cloud computing provides the means to offload server infrastructure and obtain resources on demand, which reduces setup costs as well as maintenance overheads. For individuals, cloud computing offers platforms, resources, and services that would otherwise be unavailable to them. At the core of cloud computing are various virtualization technologies and the resulting Virtual Machines (VMs). Virtualization enables cloud providers to host multiple VMs on a single Physical Machine (PM). The hallmark of VMs is the inability of the end-user to distinguish them from actual PMs. VMs allow cloud owners such essential features as live migration, which is the process of moving a VM from one PM to another while the VM is running. Features of the cloud such as fault tolerance, geographical server placement, energy management, resource management, big data processing, and parallel computing depend heavily on virtualization technologies. Improvements and breakthroughs in these technologies directly lead to new possibilities in the cloud. This thesis identifies and proposes innovations for such underlying VM technologies and tests their performance on a cluster of 16 machines with real-world benchmarks. Specifically, the issues of server load prediction, VM consolidation, live migration, and memory sharing are addressed. First, a unique VM resource load prediction mechanism based on Chaos Theory is introduced that predicts server workloads with high accuracy. Based on these predictions, VMs are dynamically and autonomously relocated to different PMs in the cluster in an attempt to conserve energy. Experimental evaluations with a prototype on real-world data-center load traces show that up to 80% of the unused PMs can be freed up and repurposed, with Service Level Objective (SLO) violations as low as 3%. Second, issues in live migration of VMs are analyzed, based on which a new distributed approach is presented that allows network-efficient live migration of VMs. The approach amortizes the transfer of memory pages over the life of the VM, thus reducing network traffic during critical live migration. The prototype reduces network usage by up to 45% and lowers the required time by up to 40% for live migration on various real-world loads. Finally, a memory sharing and management approach called ACE-M is demonstrated that enables VMs to share and utilize all the memory available in the cluster remotely. Along with predictions on network and memory, this approach allows VMs to run applications with memory requirements much higher than what is physically available locally. It is experimentally shown that ACE-M reduces memory performance degradation by about 75% and achieves a 40% lower network response time for memory-intensive VMs. A combination of these innovations to virtualization technologies can minimize performance degradation of various VM attributes, which will ultimately lead to a better end-user experience.
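
    As a hedged illustration of the consolidation step described above (not the thesis's Chaos-Theory predictor or its actual placement policy), the sketch below packs VMs onto as few physical machines as possible according to predicted loads, keeping headroom against prediction error; the PMs left empty could then be suspended to save energy.

```python
# First-fit-decreasing consolidation over *predicted* VM loads.
# Capacity, headroom, and the input format are assumptions.

PM_CAPACITY = 1.0    # normalized CPU capacity of one physical machine
HEADROOM = 0.15      # spare capacity kept against prediction error

def consolidate(predicted_load):
    """predicted_load: dict vm_id -> predicted CPU share (0..1).
    Returns pm_index -> [vm_id, ...]; PMs not in the result are idle
    and can be suspended."""
    budget = PM_CAPACITY - HEADROOM
    placement, free = {}, []
    for vm, load in sorted(predicted_load.items(),
                           key=lambda kv: kv[1], reverse=True):
        for i, spare in enumerate(free):
            if load <= spare:          # fits on an already-used PM
                free[i] -= load
                placement[i].append(vm)
                break
        else:                          # open a new PM
            placement[len(free)] = [vm]
            free.append(budget - load)
    return placement

print(consolidate({"vm1": 0.5, "vm2": 0.3, "vm3": 0.2, "vm4": 0.1}))
# {0: ['vm1', 'vm2'], 1: ['vm3', 'vm4']}
```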

    A Hybrid Local Storage Transfer Scheme for Live Migration of I/O Intensive Workloads

    Live migration of virtual machines (VMs) is a key feature of virtualization that is extensively leveraged in IaaS cloud environments: it is the basic building block of several important features, such as load balancing, pro-active fault tolerance, power management, and online maintenance. While most live migration efforts concentrate on how to transfer the memory from source to destination during the migration process, comparatively little attention has been devoted to the transfer of storage. This problem is gaining increasing importance: for performance reasons, virtual machines that run large-scale, data-intensive applications tend to rely on local storage, which poses a difficult challenge for live migration: it needs to handle storage transfer in addition to memory transfer. This paper proposes a memory-migration-independent approach that addresses this challenge. It relies on a hybrid active-push / prioritized-prefetch strategy, which makes it highly resilient to the rapid changes of disk state exhibited by I/O-intensive workloads. At the same time, it is minimally intrusive, in order to ensure maximum portability across a wide range of hypervisors. Large-scale experiments that involve multiple simultaneous migrations of both synthetic benchmarks and a real scientific application show improvements of up to 10x faster migration time, 10x less bandwidth consumption, and 8x less performance degradation over the state of the art.
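
    A minimal sketch of the "prioritized prefetch" half of such a hybrid scheme, under assumed data structures: at the destination, missing disk chunks are fetched hottest-first, while chunks the source has already pushed are skipped. Chunk naming and the frequency heuristic are illustrative, not the paper's implementation.

```python
# Destination-side prefetch queue: pull the hottest missing chunks
# first; the source's background "active push" marks chunks done.
import heapq

class PrefetchQueue:
    def __init__(self, access_counts):
        # max-heap on access frequency (negated for heapq's min-heap)
        self._heap = [(-n, chunk) for chunk, n in access_counts.items()]
        heapq.heapify(self._heap)
        self._done = set()

    def mark_pushed(self, chunk):
        """The source actively pushed this chunk; skip it here."""
        self._done.add(chunk)

    def next_chunk(self):
        """Next chunk to prefetch, or None once everything arrived."""
        while self._heap:
            _, chunk = heapq.heappop(self._heap)
            if chunk not in self._done:
                self._done.add(chunk)
                return chunk
        return None

q = PrefetchQueue({"c0": 12, "c1": 3, "c2": 40})
q.mark_pushed("c1")                          # arrived via active push
print(q.next_chunk(), q.next_chunk(), q.next_chunk())  # c2 c0 None
```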

    Adaptive live VM migration over a WAN: modeling and implementation

    Recent advances in virtualization technology enable high mobility of virtual machines and resource provisioning at the data-center level. To streamline the migration process, various migration strategies have been proposed for VM live migration over a local-area network (LAN). The most common solution uses memory pre-copying and assumes the storage is shared on the LAN. When applied to a wide-area network (WAN), VM live migration algorithms need a new design philosophy to address the challenges of long latency, limited bandwidth, unstable network conditions, and the movement of storage. This paper proposes a three-phase fractional hybrid pre-copy and post-copy solution for both memory and storage to achieve highly adaptive migration over a WAN. In this hybrid solution, we selectively migrate an important fraction of memory and storage in the pre-copy and freeze-and-copy phases, while the rest (the non-critical data set) is migrated during post-copying. We propose a new metric called performance restoration agility, which considers both the downtime and the VM speed degradation during the post-copy phase, to evaluate the migration process. We also develop a profiling framework and a novel probabilistic prediction model to adaptively find a predictably optimal combination of the memory and storage fractions to migrate. This model-based hybrid solution is implemented on Xen and evaluated in an emulated WAN environment. Experimental results show that our solution wins over all others in adaptiveness for various applications over a WAN, while retaining the responsiveness of post-copy algorithms.
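
    A toy version of the fraction-selection idea (not the paper's probabilistic prediction model): sweep candidate pre-copy fractions and score each one by a weighted mix of expected downtime and post-copy degradation, under assumed bandwidth and dirty-rate parameters.

```python
# Pick the memory fraction to pre-copy by minimizing a toy cost.
# The f**2 downtime term reflects that only already-sent pages can be
# re-dirtied and need resending during the freeze phase. All numbers
# and the cost model itself are illustrative assumptions.

def expected_cost(f, mem_mb, bw, dirty, alpha=0.5):
    precopy_t = f * mem_mb / bw            # pre-copy duration (s)
    downtime = dirty * precopy_t * f / bw  # resend re-dirtied pages
    degradation = (1 - f) * mem_mb / bw    # demand-fetch after resume
    return alpha * downtime + (1 - alpha) * degradation

def best_fraction(mem_mb=4096, bw=50, dirty=60):   # MB, MB/s, MB/s
    return min((i / 10 for i in range(11)),
               key=lambda f: expected_cost(f, mem_mb, bw, dirty))

print(best_fraction())   # ~0.4 with these illustrative parameters
```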

    An overview of virtual machine live migration techniques

    In cloud computing, live migration of virtual machines is the process of moving a running virtual machine from a source physical machine to a destination machine, taking into account its CPU, memory, network, and storage states. Several performance metrics are affected when a virtual machine is migrated, such as downtime, total migration time, performance degradation, and the amount of migrated data. This paper presents an overview of virtual machine live migration techniques and of the different works in the literature that consider this issue, which may help professionals and researchers to further explore the challenges and provide optimal solutions.
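
    The metrics listed above are commonly reasoned about with the standard iterative pre-copy model, in which round i retransmits the pages dirtied during round i-1: V_0 = M, V_i = D * t_{i-1}, t_i = V_i / B, stopping when the remaining working set is small enough to send during the freeze. A small sketch with illustrative parameters (not drawn from any particular surveyed paper):

```python
# Standard iterative pre-copy model: memory M (MB), bandwidth B
# (MB/s), dirty rate D (MB/s). Converges when D < B.

def precopy_metrics(mem_mb, bw, dirty, max_rounds=30, stop_mb=8):
    sent, total_t, v = 0.0, 0.0, float(mem_mb)
    for _ in range(max_rounds):
        t = v / bw                 # duration of this copy round
        sent += v
        total_t += t
        v = dirty * t              # pages dirtied in the meantime
        if v <= stop_mb:           # small enough: stop and copy
            break
    downtime = v / bw              # final stop-and-copy transfer
    return {"total_time_s": round(total_t + downtime, 2),
            "downtime_s": round(downtime, 3),
            "data_sent_mb": round(sent + v, 1)}

print(precopy_metrics(mem_mb=2048, bw=100, dirty=25))
# {'total_time_s': 27.28, 'downtime_s': 0.08, 'data_sent_mb': 2728.0}
```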

    ๊ฐ€์ƒํ™” ํ™˜๊ฒฝ์„ ์œ„ํ•œ ์›๊ฒฉ ๋ฉ”๋ชจ๋ฆฌ

    Doctoral dissertation (Ph.D.), Seoul National University, Department of Electrical and Computer Engineering, August 2021. Advisor: Bernhard Egger. The rising importance of big data and artificial intelligence (AI) has led to an unprecedented shift in moving local computation into the cloud. One of the key drivers behind this transformation was the exploding cost of owning and maintaining large computing systems powerful enough to process these new workloads. Customers experience a reduced cost by renting only the required resources and only when needed, while data center operators benefit from efficiency at scale. A key factor in operating a profitable data center is a high overall utilization of its resources.
Due to the scale of modern data centers, small improvements in efficiency translate to significant savings in the total cost of ownership (TCO). There are many important elements that constitute an efficient data center, such as its location, architecture, cooling system, or the employed hardware. In this thesis, we focus on software-related aspects, namely the utilization of computational and memory resources. Reports from data centers operated by Alibaba and Google show that the overall resource utilization has stagnated at a level of around 50 to 60 percent over the past decade. This low average utilization is mostly attributable to peak-demand-driven resource allocation despite the high variability of modern workloads in their resource usage. In other words, data centers today lack an efficient way to put idle resources that are reserved but not used to work. In this dissertation we present RackMem, a software-based solution to address the problem of low resource utilization through two main contributions. First, we introduce a disaggregated memory system tailored for virtual environments. We observe that virtual machines can use remote memory without noticeable performance degradation under moderate memory pressure on modern networking infrastructure. We implement a specialized remote paging system for QEMU/KVM that reduces the remote paging tail latency by 98.2% in comparison to the state of the art. A job processing simulation at rack scale shows that the total makespan can be reduced by 40.9% under our memory system. While seamless disaggregated memory helps to balance memory usage across nodes, individual nodes can still suffer overloaded resources if co-located workloads exhibit high resource usage at the same time. In a second contribution, we present a novel live migration technique for machines running on top of our remote paging system. Under this instant live migration technique, entire virtual machines can be migrated in as little as 100 milliseconds. An evaluation with in-memory key-value database workloads shows that the presented migration technique improves the state of the art by a wide margin in all key performance metrics. The presented software-based solutions lay the technical foundations that allow data center operators to significantly improve the utilization of their computational and memory resources. As future work, we propose new job schedulers and load balancers to make full use of these new technical foundations.
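
    A toy illustration of why migration on top of disaggregated memory can be near-instant: when guest pages already live on remote memory servers, migrating a VM amounts to handing a small region table (metadata) to the destination host instead of copying the pages themselves. The structures below are illustrative, not RackMem's actual design.

```python
# "Instant" migration as a metadata handover: ownership of remote
# memory regions moves; the page contents stay on the memory servers
# and are demand-fetched by the destination as the guest touches them.
from dataclasses import dataclass, field

@dataclass
class Host:
    name: str
    # region id -> (memory server, remote offset); tiny compared to RAM
    regions: dict = field(default_factory=dict)

    def demand_fetch(self, region):
        server, off = self.regions[region]
        return f"{self.name}: fetch region {region} from {server}@{off:#x}"

def instant_migrate(src: Host, dst: Host):
    """Ship the region table, not the data."""
    dst.regions = dict(src.regions)
    src.regions.clear()

a, b = Host("src"), Host("dst")
a.regions = {0: ("memsrv1", 0x0), 1: ("memsrv2", 0x200000)}
instant_migrate(a, b)
print(b.demand_fetch(1))   # dst: fetch region 1 from memsrv2@0x200000
```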

    Investigating Emerging Security Threats in Clouds and Data Centers

    Data centers have been growing rapidly in recent years to meet the surging demand of cloud services. However, the expanding scale of a data center also brings new security threats. This dissertation studies emerging security issues in clouds and data centers from different aspects, including low-level cooling infrastructures and different virtualization techniques such as container and virtual machine (VM). We first unveil a new vulnerability called reduced cooling redundancy that might be exploited to launch thermal attacks, resulting in severely worsened thermal conditions in a data center. Such a vulnerability is caused by the wide adoption of aggressive cooling energy saving policies. We conduct thermal measurements and uncover effective thermal attack vectors at the server, rack, and data center levels. We also present damage assessments of thermal attacks. Our results demonstrate that thermal attacks can negatively impact the thermal conditions and reliability of victim servers, significantly raise the cooling cost, and even lead to cooling failures. Finally, we propose effective defenses to mitigate thermal attacks. We then perform a systematic study to understand the security implications of the information leakage in multi-tenancy container cloud services. Due to the incomplete implementation of system resource isolation mechanisms in the Linux kernel, a spectrum of system-wide host information is exposed to the containers, including host-system state information and individual process execution information. By exploiting such leaked host information, malicious adversaries can easily launch advanced attacks that can seriously affect the reliability of cloud services. Additionally, we discuss the root causes of the containers' information leakage and propose a two-stage defense approach. The experimental results show that our defense is effective and incurs trivial performance overhead. Finally, we investigate security issues in the existing VM live migration approaches, especially the post-copy approach. While the entire live migration process relies upon reliable TCP connectivity for the transfer of the VM state, we demonstrate that the loss of TCP reliability leads to VM live migration failure. By intentionally aborting the TCP connection, attackers can cause unrecoverable memory inconsistency for post-copy, significantly increase service downtime, and degrade the running VM's performance. From the offensive side, we present detailed techniques to reset the migration connection under heavy networking traffic. From the defensive side, we also propose effective protection to secure the live migration procedure.
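
    A small, purely defensive model of the post-copy fragility described above (no attack code): once the VM has resumed at the destination, the only up-to-date copy of each not-yet-fetched page resides at the source, so tearing down the migration channel strands those pages and turns later page faults on them into unrecoverable errors. All numbers are illustrative.

```python
# Why an aborted post-copy channel is fatal: faults on pages that
# were never fetched cannot be served once the connection is gone.
import random

def postcopy_abort(pages, fetched_before_abort, faults=16):
    missing = set(range(fetched_before_abort, pages))  # stranded pages
    touched = random.sample(range(pages), faults)      # guest accesses
    unrecoverable = [p for p in touched if p in missing]
    return {"missing_pages": len(missing),
            "unrecoverable_faults": len(unrecoverable)}

random.seed(1)
print(postcopy_abort(pages=1024, fetched_before_abort=300))
```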

    Energy Efficiency through Virtual Machine Redistribution in Telecommunication Infrastructure Nodes

    Energy efficiency is one of the key factors impacting the green behavior and operational expenses of telecommunication core network operations. This thesis aims to identify possible techniques for reducing energy consumption in telecommunication infrastructure nodes. The study concentrates on traffic management operations (e.g., media stream control, ATM adaptation) within network processors [LeJ03], categorized as the control plane. The control plane of a telecommunication infrastructure node is a custom-built high-performance cluster consisting of multiple GPPs (General Purpose Processors) interconnected by a high-speed, low-latency network. Due to the application configurations of particular GPP units and redundancy requirements, energy usage is not optimal. In this thesis, our approach is to gain elastic capacity within the control plane cluster to reduce power consumption: certain GPP units are scaled down or woken up depending on the traffic load situation. For elasticity, our study adopts the virtual machine (VM) migration technique in the control plane cluster through system virtualization, with the traffic load situation triggering VM migration on demand. Virtual machine live migration brings the benefit of enhanced performance and resiliency of the control plane cluster. We compare the state of the art in power-aware computing resource scheduling in cluster-based nodes with the VM migration technique. Our research does not propose any change in the data plane architecture, as we are mainly concentrating on the control plane. This study shows that VM migration can be an efficient approach to significantly reduce energy consumption in the control plane of cluster-based telecommunication infrastructure nodes without sacrificing performance/throughput, while guaranteeing full connectivity and maximum link utilization.
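
    A hedged sketch of the elasticity loop described above: estimate how many control-plane GPP units the current traffic requires (plus a redundancy margin), live-migrate VMs off the surplus units before suspending them, and wake units again as load rises. Unit capacity, the thresholds, and the migrate/suspend/wake hooks are all placeholders.

```python
# Traffic-driven scale-down/scale-up of control-plane GPP units.
import math

UNIT_CAPACITY = 100.0   # normalized traffic one GPP unit can handle
REDUNDANCY = 1          # spare units kept for failover

def required_units(traffic_load):
    return max(1, math.ceil(traffic_load / UNIT_CAPACITY)) + REDUNDANCY

def rebalance(active, traffic_load, migrate, suspend, wake):
    need = required_units(traffic_load)
    if need < len(active):                  # scale down
        for unit in active[need:]:
            migrate(unit, targets=active[:need])  # move VMs off first
            suspend(unit)
        return active[:need]
    extra = [f"gpp{i}" for i in range(len(active), need)]
    for unit in extra:                      # scale up
        wake(unit)
    return active + extra

units = rebalance(["gpp0", "gpp1", "gpp2", "gpp3"], traffic_load=150,
                  migrate=lambda u, targets: print("migrate VMs off", u),
                  suspend=lambda u: print("suspend", u),
                  wake=lambda u: print("wake", u))
print(units)   # ['gpp0', 'gpp1', 'gpp2']
```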