
    Fog-supported delay-constrained energy-saving live migration of VMs over MultiPath TCP/IP 5G connections

    The coming era of fifth-generation, fog-computing-supported radio access networks (5G FOGRANs for short) aims at exploiting computing/networking resource virtualization in order to augment the limited resources of wireless devices through the seamless live migration of virtual machines (VMs) toward nearby fog data centers. For this purpose, the bandwidths of the multiple wireless network interface cards of the wireless devices may be aggregated under the control of the emerging MultiPath TCP (MPTCP) protocol. However, due to fading and mobility-induced phenomena, the energy consumption of current state-of-the-art VM migration techniques may still offset their expected benefits. Motivated by these considerations, in this paper we analytically characterize, implement in software, and numerically test the optimal minimum-energy settable-complexity bandwidth manager (SCBM) for the live migration of VMs over 5G FOGRAN MPTCP connections. The key features of the proposed SCBM are that: 1) its implementation complexity is settable on-line on the basis of the target energy-consumption-versus-implementation-complexity tradeoff; 2) it minimizes the network energy consumed by the wireless device for sustaining the migration process under hard constraints on the tolerated migration times and downtimes; and 3) by leveraging a suitably designed adaptive mechanism, it is capable of quickly reacting to (possibly unpredicted) fading and/or mobility-induced abrupt changes of the wireless environment without requiring forecasting. The actual effectiveness of the proposed SCBM is supported by extensive energy-versus-delay performance comparisons that cover: 1) a number of heterogeneous 3G/4G/WiFi FOGRAN scenarios; 2) synthetic and real-world workloads; and 3) MPTCP and wireless connections.
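
    The paper's SCBM formulation is not reproduced above, but the flavor of the underlying optimization can be illustrated with a toy model: split the migration traffic across subflows so that a hard migration deadline is met at minimum energy, given a convex per-subflow power-rate curve. The quadratic power model, the link parameters, and all names below are illustrative assumptions, not the authors' algorithm.

```python
# Toy minimum-energy bandwidth split for a VM migration over multiple
# wireless subflows (illustrative sketch only, not the paper's SCBM).
# Assumed model: per-subflow power p_i(r) = a_i * r^2 (convex),
# hard deadline t_max_s for moving volume_mb of VM state.

def min_energy_split(volume_mb, t_max_s, caps, coeffs):
    """Per-subflow rates (MB/s) that meet the deadline at minimum
    total power under the assumed quadratic power model."""
    required = volume_mb / t_max_s            # aggregate rate needed
    if required > sum(caps):
        raise ValueError("deadline infeasible even at full capacity")
    # KKT condition: r_i = min(cap_i, lam / (2 * a_i)); bisect on lam.
    lo, hi = 0.0, 2 * max(coeffs) * sum(caps)
    for _ in range(100):
        lam = (lo + hi) / 2
        rates = [min(c, lam / (2 * a)) for c, a in zip(caps, coeffs)]
        if sum(rates) < required:
            lo = lam
        else:
            hi = lam
    return [min(c, hi / (2 * a)) for c, a in zip(caps, coeffs)]

if __name__ == "__main__":
    # Hypothetical 3G/4G/WiFi subflows: capacities in MB/s and
    # per-link energy coefficients (3G most expensive per byte).
    rates = min_energy_split(volume_mb=512, t_max_s=30,
                             caps=[2.0, 8.0, 20.0],
                             coeffs=[3.0, 1.5, 0.5])
    print([round(r, 2) for r in rates])
```

    With a convex cost the familiar water-filling structure emerges: the cheap WiFi link is loaded hardest, and the expensive 3G link contributes only what the deadline forces it to.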

    A Generic Checkpoint-Restart Mechanism for Virtual Machines

    It is common today to deploy complex software inside a virtual machine (VM). Snapshots provide rapid deployment, migration between hosts, dependability (fault tolerance), and security (insulating a guest VM from the host). Yet the snapshot code is laboriously developed on a per-VM basis. This work demonstrates a generic checkpoint-restart mechanism for virtual machines. The mechanism is based on a plugin on top of an unmodified user-space checkpoint-restart package, DMTCP. Checkpoint-restart is demonstrated for three virtual machines: Lguest, user-space QEMU, and KVM/QEMU. The plugins for Lguest and KVM/QEMU require just 200 lines of code. The Lguest kernel driver API is augmented by 40 lines of code. DMTCP checkpoints user-space QEMU without any new code. KVM/QEMU, user-space QEMU, and DMTCP itself need no modification. The design benefits from other DMTCP features and plugins. Experiments demonstrate checkpoint and restart in 0.2 seconds using forked checkpointing, mmap-based fast restart, and incremental Btrfs-based snapshots.
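
    As a rough sketch of what the checkpoint/restart cycle looks like from the outside, the snippet below drives DMTCP's standard front-end tools (dmtcp_launch, dmtcp_command, dmtcp_restart) from Python. The target binary, the sleep interval, and the checkpoint image pattern are placeholders; the paper's plugins presumably make this flow work unmodified for the supported VMs.

```python
# Minimal driver around DMTCP's command-line tools. The guest command
# and timings are hypothetical; DMTCP writes checkpoint images named
# ckpt_*.dmtcp into the working directory by default.
import glob
import subprocess
import time

TARGET = ["./my-vm-binary"]   # placeholder for e.g. a QEMU invocation

# 1. Launch the process under DMTCP control (a coordinator is
#    spawned automatically if none is running).
proc = subprocess.Popen(["dmtcp_launch"] + TARGET)
time.sleep(10)                # let the guest do some work

# 2. Ask the coordinator to checkpoint the whole process tree.
subprocess.run(["dmtcp_command", "--checkpoint"], check=True)

# 3. Tear down, then restart from the checkpoint image, possibly
#    on a different host that shares the filesystem.
proc.terminate()
proc.wait()
images = glob.glob("ckpt_*.dmtcp")
subprocess.run(["dmtcp_restart"] + images, check=True)
```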

    Enabling virtualization technologies for enhanced cloud computing

    Cloud computing is a ubiquitous technology that offers various services for individual users, small businesses, and large-scale organizations. Data-center owners maintain clusters of thousands of machines and lease out resources like CPU, memory, network bandwidth, and storage to clients. For organizations, cloud computing provides the means to offload server infrastructure and obtain resources on demand, which reduces setup costs as well as maintenance overheads. For individuals, cloud computing offers platforms, resources, and services that would otherwise be unavailable to them. At the core of cloud computing are various virtualization technologies and the resulting Virtual Machines (VMs). Virtualization enables cloud providers to host multiple VMs on a single Physical Machine (PM). The hallmark of VMs is the inability of the end-user to distinguish them from actual PMs. VMs allow cloud owners such essential features as live migration, which is the process of moving a VM from one PM to another while the VM is running. Features of the cloud such as fault tolerance, geographical server placement, energy management, resource management, big data processing, and parallel computing depend heavily on virtualization technologies. Improvements and breakthroughs in these technologies directly lead to new possibilities in the cloud. This thesis identifies and proposes innovations for such underlying VM technologies and tests their performance on a cluster of 16 machines with real-world benchmarks. Specifically, the issues of server load prediction, VM consolidation, live migration, and memory sharing are addressed. First, a unique VM resource load prediction mechanism based on Chaos Theory is introduced that predicts server workloads with high accuracy. Based on these predictions, VMs are dynamically and autonomously relocated to different PMs in the cluster in an attempt to conserve energy. Experimental evaluations with a prototype on real-world data-center load traces show that up to 80% of the unused PMs can be freed up and repurposed, with Service Level Objective (SLO) violations as low as 3%. Second, issues in live migration of VMs are analyzed, based on which a new distributed approach is presented that allows network-efficient live migration of VMs. The approach amortizes the transfer of memory pages over the life of the VM, thus reducing network traffic during critical live migration. The prototype reduces network usage by up to 45% and lowers the required time by up to 40% for live migration on various real-world loads. Finally, a memory sharing and management approach called ACE-M is demonstrated that enables VMs to share and utilize all the memory available in the cluster remotely. Along with predictions on network and memory, this approach allows VMs to run applications with memory requirements much higher than what is physically available locally. It is experimentally shown that ACE-M reduces memory performance degradation by about 75% and achieves a 40% lower network response time for memory-intensive VMs. A combination of these innovations to virtualization technologies can minimize performance degradation of various VM attributes, which will ultimately lead to a better end-user experience.
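
    As a hedged illustration of the consolidation step described above (not the thesis's Chaos-Theory predictor or its actual placement policy), the sketch below packs VMs onto as few physical machines as possible according to predicted loads, keeping headroom against prediction error; the PMs left empty could then be suspended to save energy.

```python
# First-fit-decreasing consolidation over *predicted* VM loads.
# Capacity, headroom, and the input format are assumptions.

PM_CAPACITY = 1.0    # normalized CPU capacity of one physical machine
HEADROOM = 0.15      # spare capacity kept against prediction error

def consolidate(predicted_load):
    """predicted_load: dict vm_id -> predicted CPU share (0..1).
    Returns pm_index -> [vm_id, ...]; PMs not in the result are idle
    and can be suspended."""
    budget = PM_CAPACITY - HEADROOM
    placement, free = {}, []
    for vm, load in sorted(predicted_load.items(),
                           key=lambda kv: kv[1], reverse=True):
        for i, spare in enumerate(free):
            if load <= spare:          # fits on an already-used PM
                free[i] -= load
                placement[i].append(vm)
                break
        else:                          # open a new PM
            placement[len(free)] = [vm]
            free.append(budget - load)
    return placement

print(consolidate({"vm1": 0.5, "vm2": 0.3, "vm3": 0.2, "vm4": 0.1}))
# {0: ['vm1', 'vm2'], 1: ['vm3', 'vm4']}
```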

    A Hybrid Local Storage Transfer Scheme for Live Migration of I/O Intensive Workloads

    Live migration of virtual machines (VMs) is a key feature of virtualization that is extensively leveraged in IaaS cloud environments: it is the basic building block of several important features, such as load balancing, pro-active fault tolerance, power management, and online maintenance. While most live migration efforts concentrate on how to transfer the memory from source to destination during the migration process, comparatively little attention has been devoted to the transfer of storage. This problem is gaining increasing importance: for performance reasons, virtual machines that run large-scale, data-intensive applications tend to rely on local storage, which poses a difficult challenge for live migration: it needs to handle storage transfer in addition to memory transfer. This paper proposes a memory-migration-independent approach that addresses this challenge. It relies on a hybrid active-push / prioritized-prefetch strategy, which makes it highly resilient to the rapid changes of disk state exhibited by I/O-intensive workloads. At the same time, it is minimally intrusive, in order to ensure maximum portability across a wide range of hypervisors. Large-scale experiments that involve multiple simultaneous migrations of both synthetic benchmarks and a real scientific application show improvements of up to 10x faster migration time, 10x less bandwidth consumption, and 8x less performance degradation over the state of the art.
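
    A minimal sketch of the "prioritized prefetch" half of such a hybrid scheme, under assumed data structures: at the destination, missing disk chunks are fetched hottest-first, while chunks the source has already pushed are skipped. Chunk naming and the frequency heuristic are illustrative, not the paper's implementation.

```python
# Destination-side prefetch queue: pull the hottest missing chunks
# first; the source's background "active push" marks chunks done.
import heapq

class PrefetchQueue:
    def __init__(self, access_counts):
        # max-heap on access frequency (negated for heapq's min-heap)
        self._heap = [(-n, chunk) for chunk, n in access_counts.items()]
        heapq.heapify(self._heap)
        self._done = set()

    def mark_pushed(self, chunk):
        """The source actively pushed this chunk; skip it here."""
        self._done.add(chunk)

    def next_chunk(self):
        """Next chunk to prefetch, or None once everything arrived."""
        while self._heap:
            _, chunk = heapq.heappop(self._heap)
            if chunk not in self._done:
                self._done.add(chunk)
                return chunk
        return None

q = PrefetchQueue({"c0": 12, "c1": 3, "c2": 40})
q.mark_pushed("c1")                          # arrived via active push
print(q.next_chunk(), q.next_chunk(), q.next_chunk())  # c2 c0 None
```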

    Adaptive live VM migration over a WAN: modeling and implementation

    Recent advances in virtualization technology enable high mobility of virtual machines and resource provisioning at the data-center level. To streamline the migration process, various migration strategies have been proposed for VM live migration over a local-area network (LAN). The most common solution uses memory pre-copying and assumes the storage is shared on the LAN. When applied to a wide-area network (WAN), VM live migration algorithms need a new design philosophy to address the challenges of long latency, limited bandwidth, unstable network conditions, and the movement of storage. This paper proposes a three-phase fractional hybrid pre-copy and post-copy solution for both memory and storage to achieve highly adaptive migration over a WAN. In this hybrid solution, we selectively migrate an important fraction of memory and storage in the pre-copy and freeze-and-copy phases, while the rest (the non-critical data set) is migrated during post-copying. We propose a new metric called performance restoration agility, which considers both the downtime and the VM speed degradation during the post-copy phase, to evaluate the migration process. We also develop a profiling framework and a novel probabilistic prediction model to adaptively find a predictably optimal combination of the memory and storage fractions to migrate. This model-based hybrid solution is implemented on Xen and evaluated in an emulated WAN environment. Experimental results show that our solution wins over all others in adaptiveness for various applications over a WAN, while retaining the responsiveness of post-copy algorithms.
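
    A toy version of the fraction-selection idea (not the paper's probabilistic prediction model): sweep candidate pre-copy fractions and score each one by a weighted mix of expected downtime and post-copy degradation, under assumed bandwidth and dirty-rate parameters.

```python
# Pick the memory fraction to pre-copy by minimizing a toy cost.
# The f**2 downtime term reflects that only already-sent pages can be
# re-dirtied and need resending during the freeze phase. All numbers
# and the cost model itself are illustrative assumptions.

def expected_cost(f, mem_mb, bw, dirty, alpha=0.5):
    precopy_t = f * mem_mb / bw            # pre-copy duration (s)
    downtime = dirty * precopy_t * f / bw  # resend re-dirtied pages
    degradation = (1 - f) * mem_mb / bw    # demand-fetch after resume
    return alpha * downtime + (1 - alpha) * degradation

def best_fraction(mem_mb=4096, bw=50, dirty=60):   # MB, MB/s, MB/s
    return min((i / 10 for i in range(11)),
               key=lambda f: expected_cost(f, mem_mb, bw, dirty))

print(best_fraction())   # ~0.4 with these illustrative parameters
```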

    An overview of virtual machine live migration techniques

    In cloud computing, live migration of virtual machines is the process of moving a running virtual machine from a source physical machine to a destination machine, taking into account its CPU, memory, network, and storage states. Several performance metrics are affected when a virtual machine is migrated, such as downtime, total migration time, performance degradation, and the amount of migrated data. This paper presents an overview of virtual machine live migration techniques and of the different works in the literature that consider this issue, which may help professionals and researchers to further explore the challenges and provide optimal solutions.
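
    The metrics listed above are commonly reasoned about with the standard iterative pre-copy model, in which round i retransmits the pages dirtied during round i-1: V_0 = M, V_i = D * t_{i-1}, t_i = V_i / B, stopping when the remaining working set is small enough to send during the freeze. A small sketch with illustrative parameters (not drawn from any particular surveyed paper):

```python
# Standard iterative pre-copy model: memory M (MB), bandwidth B
# (MB/s), dirty rate D (MB/s). Converges when D < B.

def precopy_metrics(mem_mb, bw, dirty, max_rounds=30, stop_mb=8):
    sent, total_t, v = 0.0, 0.0, float(mem_mb)
    for _ in range(max_rounds):
        t = v / bw                 # duration of this copy round
        sent += v
        total_t += t
        v = dirty * t              # pages dirtied in the meantime
        if v <= stop_mb:           # small enough: stop and copy
            break
    downtime = v / bw              # final stop-and-copy transfer
    return {"total_time_s": round(total_t + downtime, 2),
            "downtime_s": round(downtime, 3),
            "data_sent_mb": round(sent + v, 1)}

print(precopy_metrics(mem_mb=2048, bw=100, dirty=25))
# {'total_time_s': 27.28, 'downtime_s': 0.08, 'data_sent_mb': 2728.0}
```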

    ๊ฐ€์ƒํ™” ํ™˜๊ฒฝ์„ ์œ„ํ•œ ์›๊ฒฉ ๋ฉ”๋ชจ๋ฆฌ

    Doctoral dissertation (Ph.D.), Seoul National University, Department of Electrical and Computer Engineering, August 2021. Advisor: Bernhard Egger. The rising importance of big data and artificial intelligence (AI) has led to an unprecedented shift in moving local computation into the cloud. One of the key drivers behind this transformation was the exploding cost of owning and maintaining large computing systems powerful enough to process these new workloads. Customers experience a reduced cost by renting only the required resources and only when needed, while data center operators benefit from efficiency at scale. A key factor in operating a profitable data center is a high overall utilization of its resources.
Due to the scale of modern data centers, small improvements in efficiency translate to significant savings in the total cost of ownership (TCO). There are many important elements that constitute an efficient data center, such as its location, architecture, cooling system, or the employed hardware. In this thesis, we focus on software-related aspects, namely the utilization of computational and memory resources. Reports from data centers operated by Alibaba and Google show that the overall resource utilization has stagnated at a level of around 50 to 60 percent over the past decade. This low average utilization is mostly attributable to peak-demand-driven resource allocation despite the high variability of modern workloads in their resource usage. In other words, data centers today lack an efficient way to put idle resources that are reserved but not used to work. In this dissertation we present RackMem, a software-based solution to address the problem of low resource utilization through two main contributions. First, we introduce a disaggregated memory system tailored for virtual environments. We observe that virtual machines can use remote memory without noticeable performance degradation under moderate memory pressure on modern networking infrastructure. We implement a specialized remote paging system for QEMU/KVM that reduces the remote paging tail latency by 98.2% in comparison to the state of the art. A job processing simulation at rack scale shows that the total makespan can be reduced by 40.9% under our memory system. While seamless disaggregated memory helps to balance memory usage across nodes, individual nodes can still suffer overloaded resources if co-located workloads exhibit high resource usage at the same time. In a second contribution, we present a novel live migration technique for machines running on top of our remote paging system. Under this instant live migration technique, entire virtual machines can be migrated in as little as 100 milliseconds. An evaluation with in-memory key-value database workloads shows that the presented migration technique improves the state of the art by a wide margin in all key performance metrics. The presented software-based solutions lay the technical foundations that allow data center operators to significantly improve the utilization of their computational and memory resources. As future work, we propose new job schedulers and load balancers to make full use of these new technical foundations.
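
    A toy illustration of why migration on top of disaggregated memory can be near-instant: when guest pages already live on remote memory servers, migrating a VM amounts to handing a small region table (metadata) to the destination host instead of copying the pages themselves. The structures below are illustrative, not RackMem's actual design.

```python
# "Instant" migration as a metadata handover: ownership of remote
# memory regions moves; the page contents stay on the memory servers
# and are demand-fetched by the destination as the guest touches them.
from dataclasses import dataclass, field

@dataclass
class Host:
    name: str
    # region id -> (memory server, remote offset); tiny compared to RAM
    regions: dict = field(default_factory=dict)

    def demand_fetch(self, region):
        server, off = self.regions[region]
        return f"{self.name}: fetch region {region} from {server}@{off:#x}"

def instant_migrate(src: Host, dst: Host):
    """Ship the region table, not the data."""
    dst.regions = dict(src.regions)
    src.regions.clear()

a, b = Host("src"), Host("dst")
a.regions = {0: ("memsrv1", 0x0), 1: ("memsrv2", 0x200000)}
instant_migrate(a, b)
print(b.demand_fetch(1))   # dst: fetch region 1 from memsrv2@0x200000
```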

    Investigating Emerging Security Threats in Clouds and Data Centers

    Data centers have been growing rapidly in recent years to meet the surging demand of cloud services. However, the expanding scale of a data center also brings new security threats. This dissertation studies emerging security issues in clouds and data centers from different aspects, including low-level cooling infrastructures and different virtualization techniques such as container and virtual machine (VM). We first unveil a new vulnerability called reduced cooling redundancy that might be exploited to launch thermal attacks, resulting in severely worsened thermal conditions in a data center. Such a vulnerability is caused by the wide adoption of aggressive cooling energy saving policies. We conduct thermal measurements and uncover effective thermal attack vectors at the server, rack, and data center levels. We also present damage assessments of thermal attacks. Our results demonstrate that thermal attacks can negatively impact the thermal conditions and reliability of victim servers, significantly raise the cooling cost, and even lead to cooling failures. Finally, we propose effective defenses to mitigate thermal attacks. We then perform a systematic study to understand the security implications of the information leakage in multi-tenancy container cloud services. Due to the incomplete implementation of system resource isolation mechanisms in the Linux kernel, a spectrum of system-wide host information is exposed to the containers, including host-system state information and individual process execution information. By exploiting such leaked host information, malicious adversaries can easily launch advanced attacks that can seriously affect the reliability of cloud services. Additionally, we discuss the root causes of the containers' information leakage and propose a two-stage defense approach. The experimental results show that our defense is effective and incurs trivial performance overhead. Finally, we investigate security issues in the existing VM live migration approaches, especially the post-copy approach. While the entire live migration process relies upon reliable TCP connectivity for the transfer of the VM state, we demonstrate that the loss of TCP reliability leads to VM live migration failure. By intentionally aborting the TCP connection, attackers can cause unrecoverable memory inconsistency for post-copy, significantly increase service downtime, and degrade the running VM's performance. From the offensive side, we present detailed techniques to reset the migration connection under heavy networking traffic. From the defensive side, we also propose effective protection to secure the live migration procedure.
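
    A small, purely defensive model of the post-copy fragility described above (no attack code): once the VM has resumed at the destination, the only up-to-date copy of each not-yet-fetched page resides at the source, so tearing down the migration channel strands those pages and turns later page faults on them into unrecoverable errors. All numbers are illustrative.

```python
# Why an aborted post-copy channel is fatal: faults on pages that
# were never fetched cannot be served once the connection is gone.
import random

def postcopy_abort(pages, fetched_before_abort, faults=16):
    missing = set(range(fetched_before_abort, pages))  # stranded pages
    touched = random.sample(range(pages), faults)      # guest accesses
    unrecoverable = [p for p in touched if p in missing]
    return {"missing_pages": len(missing),
            "unrecoverable_faults": len(unrecoverable)}

random.seed(1)
print(postcopy_abort(pages=1024, fetched_before_abort=300))
```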

    Energy Efficiency through Virtual Machine Redistribution in Telecommunication Infrastructure Nodes

    Energy efficiency is one of the key factors impacting the green behavior and operational expenses of telecommunication core network operations. This thesis aims to identify possible techniques for reducing energy consumption in telecommunication infrastructure nodes. The study concentrates on traffic management operations (e.g., media stream control, ATM adaptation) within network processors [LeJ03], categorized as the control plane. The control plane of a telecommunication infrastructure node is a custom-built high-performance cluster consisting of multiple GPPs (General Purpose Processors) interconnected by a high-speed, low-latency network. Due to the application configurations of particular GPP units and redundancy requirements, energy usage is not optimal. In this thesis, our approach is to gain elastic capacity within the control plane cluster to reduce power consumption: certain GPP units are scaled down or woken up depending on the traffic load situation. For elasticity, our study adopts the virtual machine (VM) migration technique in the control plane cluster through system virtualization, with the traffic load situation triggering VM migration on demand. Virtual machine live migration brings the benefit of enhanced performance and resiliency of the control plane cluster. We compare the state of the art in power-aware computing resource scheduling in cluster-based nodes with the VM migration technique. Our research does not propose any change in the data plane architecture, as we are mainly concentrating on the control plane. This study shows that VM migration can be an efficient approach to significantly reduce energy consumption in the control plane of cluster-based telecommunication infrastructure nodes without sacrificing performance/throughput, while guaranteeing full connectivity and maximum link utilization.
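
    A hedged sketch of the elasticity loop described above: estimate how many control-plane GPP units the current traffic requires (plus a redundancy margin), live-migrate VMs off the surplus units before suspending them, and wake units again as load rises. Unit capacity, the thresholds, and the migrate/suspend/wake hooks are all placeholders.

```python
# Traffic-driven scale-down/scale-up of control-plane GPP units.
import math

UNIT_CAPACITY = 100.0   # normalized traffic one GPP unit can handle
REDUNDANCY = 1          # spare units kept for failover

def required_units(traffic_load):
    return max(1, math.ceil(traffic_load / UNIT_CAPACITY)) + REDUNDANCY

def rebalance(active, traffic_load, migrate, suspend, wake):
    need = required_units(traffic_load)
    if need < len(active):                  # scale down
        for unit in active[need:]:
            migrate(unit, targets=active[:need])  # move VMs off first
            suspend(unit)
        return active[:need]
    extra = [f"gpp{i}" for i in range(len(active), need)]
    for unit in extra:                      # scale up
        wake(unit)
    return active + extra

units = rebalance(["gpp0", "gpp1", "gpp2", "gpp3"], traffic_load=150,
                  migrate=lambda u, targets: print("migrate VMs off", u),
                  suspend=lambda u: print("suspend", u),
                  wake=lambda u: print("wake", u))
print(units)   # ['gpp0', 'gpp1', 'gpp2']
```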