Glider: A GPU Library Driver for Improved System Security
Legacy device drivers implement both device resource management and
isolation. This results in a large code base with a wide high-level interface
making the driver vulnerable to security attacks. This is particularly
problematic for increasingly popular accelerators like GPUs that have large,
complex drivers. We solve this problem with library drivers, a new driver
architecture. A library driver implements resource management as an untrusted
library in the application process address space, and implements isolation as a
kernel module that is smaller and has a narrower lower-level interface (i.e.,
closer to hardware) than a legacy driver. We articulate a set of device and
platform hardware properties that are required to retrofit a legacy driver into
a library driver. To demonstrate the feasibility and superiority of library
drivers, we present Glider, a library driver implementation for two GPUs of
popular brands, Radeon and Intel. Glider reduces the TCB size and attack
surface by about 35% and 84% respectively for a Radeon HD 6450 GPU and by about
38% and 90% respectively for an Intel Ivy Bridge GPU. Moreover, it incurs no
performance cost. Indeed, Glider outperforms a legacy driver for applications
requiring intensive interactions with the device driver, such as applications
using the OpenGL immediate mode API.
Summary Administration of Virtual Machines in the OVirt Project
This thesis addresses virtualization and virtual machines. The theoretical part covers the basics of virtualization from several angles: it introduces the concept of virtualization and the architectures used to achieve it, and examines the most popular implementations of those architectures as well as the commercially available virtualization solutions built on them. The practical part aims to design and implement a desktop application that gives administrators an overview of a virtual environment running on oVirt. The main goal is to solve some of oVirt's known issues regarding the accessibility of its data.
Virtualization Components of the Modern Hypervisor
Virtualization is the foundation on which cloud services build their business. It supports the infrastructure of the largest companies around the globe and is a key component for scaling software in the ever-growing technology industry. Companies that decide to use virtualization as part of their infrastructure need a quick and reliable way to choose a virtualization technology and tune its performance to fit their intended usage. Unfortunately, while many papers discuss and test the performance of various virtualization systems, most of these performance tests do not take into account components that can be configured to improve performance for certain scenarios. This study compares how three hypervisors (VMware vSphere, Citrix XenServer, and KVM) currently perform under different sets of configurations and identifies which system workloads would be ideal for these configurations. It also provides a means of comparing different configurations with each other, so that implementers of these technologies can make informed decisions about which components should be enabled for their current or future systems.
Distributed Shared Memory based Live VM Migration
Cloud computing is the new trend in computing services and the IT industry. This computing paradigm offers numerous benefits for utilizing IT infrastructure resources and reducing service costs. The key feature of cloud computing is the mobility and scalability of computing resources, achieved by managing virtual machines. Virtualization decouples the software from the hardware and manages software and hardware resources easily, without interrupting services. Live virtual machine migration is an essential tool for dynamic resource management in current data centers. It is defined as the process of moving a running virtual machine or application between different physical machines without disconnecting the client or application. Many techniques have been developed to achieve this goal, based on several metrics (total migration time, downtime, size of data sent, and application performance) that are used to measure the performance of live migration. These metrics measure the quality of the VM services that clients care about, because the clients' main goal is to maintain application performance with minimum service interruption.
Pre-copy live VM migration proceeds in four phases: preparation, iterative migration, stop and copy, and resume and commitment. During the preparation phase, the source and destination physical servers are selected, resources on the destination server are reserved, and the critical VM to be migrated is selected. The cloud manager is responsible for all of these decisions. During the iterative migration phase, the VM's memory state is transferred to the target node while the migrating VM continues to execute and dirty its memory. In the stop and copy phase, the VM's virtual CPU is stopped and the processor and network states are transferred to the destination host. Service downtime results from stopping VM execution and moving the VM's CPU and network states. Finally, in the resume and commitment phase, the migrated VM resumes running on the destination host, and the destination machine pulls the remaining memory pages from the source machine. The source machine's resources are then released.
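To make the trade-off between the iterative phase and the downtime concrete, the following is a minimal Python sketch of a pre-copy round loop. It is not the thesis's implementation. The function name, the parameters (`dirty_rate`, `bandwidth`, `max_rounds`, `stop_threshold`) and the constant dirty-rate model are all assumptions made for illustration. The sketch also uses the common textbook simplification that the last dirty pages are sent while the VM is paused, whereas the description above pulls the remaining pages after resume.

```python
def simulate_precopy(memory_pages, dirty_rate, bandwidth,
                     max_rounds=30, stop_threshold=50):
    """Toy model of pre-copy migration (all parameters are hypothetical).

    memory_pages   -- number of memory pages of the VM (int)
    dirty_rate     -- pages dirtied per second while the VM runs (int)
    bandwidth      -- pages transferred per second over the network (int)
    max_rounds     -- upper bound on iterative rounds
    stop_threshold -- switch to stop-and-copy once this few pages remain
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")

    to_send = memory_pages      # the first round sends the whole memory
    pages_sent = 0
    rounds = 0

    # Iterative migration: the VM keeps running and dirties pages
    # while the previous set of pages is in flight.
    while rounds < max_rounds and to_send > stop_threshold:
        pages_sent += to_send
        # Pages dirtied during this round = dirty_rate * (to_send / bandwidth),
        # computed with integer arithmetic and capped at the memory size.
        dirtied = to_send * dirty_rate // bandwidth
        to_send = min(memory_pages, dirtied)
        rounds += 1

    # Stop and copy (simplified): the VM is paused and the remaining
    # dirty pages are sent; this interval is the service downtime.
    downtime = to_send / bandwidth
    pages_sent += to_send

    return {
        "rounds": rounds,
        "pages_sent": pages_sent,
        "total_time": pages_sent / bandwidth,
        "downtime": downtime,
    }
```

In this model, the loop converges quickly when the dirty rate is well below the bandwidth. When it is not, the round limit forces a long stop-and-copy, which is the situation the downtime and data-sent metrics are meant to expose.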
In this thesis, pre-copy live VM migration using the Distributed Shared Memory (DSM) computing model is proposed. The setup uses two identical computation nodes to host all services of the proposed environment: the virtualization infrastructure (XenServer 6.2 hypervisor), the shared storage server (a network file system), and the DSM and High Performance Computing (HPC) cluster. The custom DSM framework is based on Grappa, which provides low-latency memory updates. Moreover, the HPC cluster parallelizes the workload across the CPUs of the computation nodes. The HPC cluster uses the Open MPI and MPI libraries to support parallelization and auto-parallelization. The DSM allows the cluster CPUs to access the same memory pages, which results in fewer memory data updates and reduces the amount of data transferred through the network.
The proposed model noticeably improves the live VM migration metrics. Downtime is reduced by 50% for an idle Windows VM workload and by 66.6% for an idle Ubuntu Linux workload. In general, the proposed model not only reduces the downtime and the total amount of data sent, but also does not degrade other metrics such as the total migration time and application performance.
New Container Architectures for Mobile, Drone, and Cloud Computing
Containers are increasingly used across many different types of computing to isolate and control apps while efficiently sharing computing resources. By using lightweight operating system virtualization, they can provide apps with a virtual computing abstraction while imposing minimal hardware requirements and a small footprint. My thesis is that new container architectures can provide additional functionality, better resource utilization, and stronger security for mobile, drone, and cloud computing. To demonstrate this, we introduce three new container architectures that enable new mobile app migration functionality, a new notion of virtual drones and efficient utilization of drone hardware, and stronger security for cloud computing by protecting containers against untrusted operating systems.
First, we introduce Flux to support multi-surface apps, apps that seamlessly run across multiple user devices, through app migration. Flux introduces two key mechanisms to overcome the device heterogeneity and residual dependencies that complicate app migration. Selective Record/Adaptive Replay records just those device-agnostic app calls that lead to the generation of app-specific device-dependent state in services and replays them on the target. Checkpoint/Restore in Android (CRIA) transitions an app into a state in which the device-specific information it contains can be safely discarded, then checkpoints the app and restores it within a containerized environment on the new device.
Second, we introduce AnDrone, a drone-as-a-service solution that makes drones accessible in the cloud. AnDrone provides a drone virtualization architecture to leverage the fact that computational costs are cheap compared to the operational and energy costs of putting a drone in the air. This enables multiple virtual drones to run simultaneously on the same physical drone at very little additional cost. To enable multiple virtual drones to run in an isolated and secure manner, each virtual drone runs its own containerized operating system instance. AnDrone introduces a new device container architecture, providing virtual drones with secure access to a full range of drone hardware devices, including sensors such as cameras and geofenced flight control.
Finally, we introduce BlackBox, a new container architecture that provides fine-grain protection of application data confidentiality and integrity without the need to trust the operating system. BlackBox introduces a container security monitor, a small trusted computing base that creates separate and independent physical address spaces for each container, such that there is no direct information flow from a container to the operating system or to other containers' physical address spaces. Containerized apps do not need to be modified and can still make full use of operating system services via system calls, yet their CPU and memory state are isolated and protected from other containers and the operating system.
Machine Learning Models for Live Migration Metrics Prediction
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2019. 2. Egger, Bernhard.
Live migration of Virtual Machines (VMs) is an important technique in today's data centers. In existing data center management frameworks, complex algorithms are used to determine when, where, and to which host a migration of a VM is to be performed. However, very little attention is paid to selecting the right migration technique, even though migration performance can vary greatly depending on that choice. This performance fluctuation is caused by the different live migration algorithms, the different workloads that each VM is executing, and the state of the destination and source hosts. Choosing the right migration technique is a crucial decision that has to be made quickly and precisely. A performance model is therefore well suited to this task.
In this thesis, we propose various machine learning models for predicting live migration metrics of virtual machines. We predict seven different metrics for twelve distinct migration algorithms. Our models achieve much higher accuracy than existing work. For each target metric and algorithm, an input feature evaluation is conducted and a dedicated model is generated, leading to 84 different trained machine learning models. These models can easily be integrated into a live migration framework. Using the target metric predictions for each migration algorithm, a framework can easily choose the right migration algorithm, which can reduce downtime and total migration time and lead to fewer service-level agreement violations.
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction and Motivation
Chapter 2 Background
2.1 Virtualization
2.2 Live Migration
2.3 SLA and SLO
2.4 Live Migration Techniques
2.4.1 Pre-copy (PRE)
2.4.2 Post-copy (POST)
2.4.3 Hybrid Migration Techniques
2.5 Live Migration Performance Metrics
2.6 Artificial Neural Networks
2.6.1 Feedforward Neural Network (FNN)
2.6.2 Deep Neural Network (DNN)
2.6.3 Convolution Neural Network (CNN)
Chapter 3 Related Work
Chapter 4 Overview and Design
Chapter 5 Implementation
5.1 Deep Neural Network design
5.2 Convolutional Neural Network design
Chapter 6 Evaluation metrics
6.1 Geometric Mean Absolute Error (GMAE)
6.2 Geometric Mean Relative Error (GMRE)
6.3 Mean Absolute Error (MAE)
6.4 Weighted Absolute Percentage Error (WAPE)
Chapter 7 Results
7.1 Deep Neural Network
7.2 SVR with bagging
7.3 DNN vs. SVR comparison
7.4 Overhead
Chapter 8 Conclusion and Future Work
8.1 Conclusion
8.2 Future Work
Appendices
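The outline above lists four error metrics in Chapter 6 without giving their formulas. As a hedged illustration only, the Python sketch below computes them under the usual textbook definitions. MAE and WAPE are standard. The forms used for GMAE and GMRE (geometric means of the absolute and relative errors) are assumptions and may differ from the thesis. The function names and the `eps` guard against zero errors are also hypothetical choices.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def wape(actual, predicted):
    """Weighted Absolute Percentage Error, returned as a fraction:
    sum of absolute errors divided by the sum of |actual| values."""
    total_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return total_error / sum(abs(a) for a in actual)


def _geometric_mean(values):
    # exp(mean(log(v))) avoids overflow from multiplying many values.
    return math.exp(sum(math.log(v) for v in values) / len(values))


def gmae(actual, predicted, eps=1e-12):
    """Geometric Mean Absolute Error (assumed definition).
    Zero errors are clamped to eps so the logarithm stays defined."""
    return _geometric_mean(
        [max(abs(a - p), eps) for a, p in zip(actual, predicted)])


def gmre(actual, predicted, eps=1e-12):
    """Geometric Mean Relative Error (assumed definition):
    geometric mean of |actual - predicted| / |actual|."""
    return _geometric_mean(
        [max(abs(a - p) / abs(a), eps) for a, p in zip(actual, predicted)])
```

Geometric means are less dominated by a few very large errors than arithmetic means. That is one plausible reason to report both kinds for migration metrics, whose values can span several orders of magnitude.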
Improving energy efficiency of virtualized datacenters
Many organizations are increasingly adopting cloud computing. More specifically, as customers, these organizations outsource the management of their physical infrastructure to datacenters (or cloud computing platforms). Energy consumption is a primary concern for datacenter (DC) management. Its cost represents about 80% of the total cost of ownership, and it is estimated that in 2020 US DCs alone will spend about $13 billion on energy bills. Datacenter servers are generally built to achieve high energy efficiency at high utilization. For a low cost per computation, all datacenter servers should therefore push utilization as high as possible. To fight historically low utilization, cloud computing adopted server virtualization, which allows a physical server to execute multiple virtual servers (called virtual machines) in isolation. With virtualization, the cloud provider can pack (consolidate) the entire set of virtual machines (VMs) onto a small set of physical servers and thereby reduce the number of active servers. Even so, datacenter servers rarely reach utilizations higher than 50%, which means that they operate with sets of long-term unused resources (called 'holes'). My first contribution is a cloud management system that dynamically splits and merges VMs so that they can better fill the holes. This solution is effective only for elastic applications, i.e. applications that can be executed and reconfigured over an arbitrary number of VMs. However, datacenter resource fragmentation stems from a more fundamental problem. Over time, cloud applications demand more and more memory, while physical servers provide more and more CPU. In today's datacenters, the two resources are strongly coupled because they are bound to a physical server. My second contribution is a practical way to decouple the CPU-memory tuple that can be applied to a commodity server, so that the two resources can vary independently, depending on demand. My third and fourth contributions present practical systems that exploit the second contribution.
The underutilization observed on physical servers also holds for virtual machines. It has been shown that VMs consume only a small fraction of their allocated resources because cloud customers are unable to correctly estimate the amount of resources their applications need. My third contribution is a system that estimates the memory consumption (i.e. the working set size) of a VM with low overhead and high accuracy. VMs can then be consolidated based on their working set size rather than their booked memory. The drawback of this approach is the risk of memory starvation: if one or more VMs have a sharp increase in memory demand, the physical server may run out of memory. This event is undesirable because the cloud platform is then unable to provide the client with the booked memory. My fourth contribution is a system that allows a VM to use remote memory provided by a different rack server. In the case of a peak in memory demand, my system thus allows the VM to allocate memory on a remote physical server.
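The effect of consolidating by working set size rather than booked memory can be illustrated with a small packing sketch in Python. The first-fit-decreasing heuristic, the function name, the 64 GB server capacity and the VM sizes are all assumptions chosen for the example. The abstract does not say which placement algorithm the thesis uses.

```python
def servers_needed(demands_gb, capacity_gb):
    """Pack memory demands onto servers with first-fit decreasing
    (an illustrative heuristic) and return the number of servers used."""
    free = []  # remaining free memory on each opened server
    for demand in sorted(demands_gb, reverse=True):
        if demand > capacity_gb:
            raise ValueError("a VM does not fit on an empty server")
        for i, remaining in enumerate(free):
            if demand <= remaining:
                free[i] -= demand      # place on the first server that fits
                break
        else:
            free.append(capacity_gb - demand)  # open a new server
    return len(free)


# Hypothetical VMs as (booked memory, estimated working set size) in GB.
vms = [(32, 8), (32, 10), (16, 4), (16, 12), (32, 6), (16, 5)]

by_booked = servers_needed([booked for booked, _ in vms], capacity_gb=64)
by_wss = servers_needed([wss for _, wss in vms], capacity_gb=64)
```

With these made-up numbers, packing by booked memory needs three servers and packing by working set size needs one. The denser packing leaves less headroom, which is exactly the memory starvation risk that the remote memory contribution addresses.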
Gestão e engenharia de CAP na nuvem híbrida
Doctorate in Informatics
The evolution and maturation of Cloud Computing created an opportunity for the emergence of new Cloud applications. High-Performance Computing (HPC), a class of computing devoted to solving complex problems, arises as a new business consumer, taking advantage of the Cloud's benefits and leaving behind expensive datacenter management and difficult grid development.
Now at an advanced stage of maturity, today's Cloud has shed many of its drawbacks, becoming more and more efficient and widespread. Performance enhancements, price drops due to massification, and customizable on-demand services have drawn heightened attention from other markets.
HPC, despite being a very well-established field, traditionally has a narrow frontier concerning its deployment: it runs on dedicated datacenters or large-scale grid computing. The main problem with this usual kind of installation is the initial cost and the inability to use resources full time, which not all research labs can afford.
The main objective of this work was to investigate new technical solutions to allow the deployment of HPC applications on the Cloud, with particular emphasis on private on-premise resources, the lower end of the chain, where costs can be reduced. The work includes many experiments and analyses to identify obstacles and technology limitations. The feasibility of the objective was tested with new modeling, architecture, and the migration of several applications.
The final application integrates a simplified incorporation of both public and private Cloud resources, as well as HPC application scheduling, deployment, and management. It uses a well-defined user role strategy based on federated authentication, and a seamless procedure for daily usage that balances low cost and performance.
Live migration of user environments across wide area networks
A complex challenge in mobile computing is to allow the user to migrate her highly customised environment while moving to a different location and to continue work without interruption. I motivate why this is a highly desirable capability and conduct a survey of the current approaches towards this goal and explain their limitations. I then propose a new architecture to support user mobility by live migration of a user’s operating system instance over the network. Previous work includes the Collective and Internet Suspend/Resume projects that have addressed migration of a user’s environment by suspending the running state and resuming it at a later time. In contrast to previous work, this work addresses live migration of a user’s operating system instance across wide area links. Live migration is done by performing most of the migration while the operating system is still running, achieving very little downtime and preserving all network connectivity.
I developed an initial proof of concept of this solution. It relies on migrating whole operating systems using the Xen virtual machine and provides a way to perform live migration of persistent storage as well as of network connections across subnets. These challenges have not been addressed previously in this scenario. In a virtual machine environment, persistent storage is provided by virtual block devices. The architecture supports decentralized virtual block device replication across wide area network links, as well as migrating network connections across subnetworks using the Host Identity Protocol. The proposed architecture is compared against existing solutions and an initial performance evaluation of the prototype implementation is presented, showing that such a solution is a promising step towards true seamless mobility of fully fledged computing environments.