
    Practical study of bare metal virtualization solutions

    With the hardware breakthroughs accomplished through the years, the idea of software-defined hardware has become a reality. Hypervisors such as KVM, Xen, Hyper-V and ESXi enable today's cloud, with hardware consolidation bringing a reduction in operating costs. In this scope, it is imperative to assess the performance of the different virtualization implementations in order to discover potential bottlenecks and bugs. In this work, the performance of the prominent Type-1 virtualization platforms is analyzed using guests representative of the Windows NT and Linux kernels, in the form of Windows 10 LTSB and Ubuntu Server 16.04 LTS. The effectiveness of each hypervisor's CPU scheduler is put to the test, as well as storage backend performance under multiple scenarios (iSCSI, NFS and local). In short, this project provides a snapshot of the current state of the virtualization market, covering CPU, memory, and 2D & 3D graphics performance of oVirt, Proxmox, XenServer, Hyper-V and VMware vSphere. All benchmarks were executed with each platform's default settings and driven by automation scripts, in order to accelerate the process and exclude variability as much as possible. The selected benchmarks were PassMark PerformanceTest 9 for Windows performance, UnixBench to gauge the performance of Linux guests, and (ez)FIO for in-depth analysis of filesystem performance across platforms. A few generalizations can be made from the information gathered: XenServer, oVirt and Proxmox require the presence of xentools/virtio in order to provide good I/O throughput; GPU passthrough provides native performance as long as there is no resource overcommitment; VMware vSphere provides impressive CPU performance, edging out the competition with 98% of native performance; Hyper-V offers mediocre 2D desktop performance (28% of native), so it should not be used for VMs that provide interactive desktops; similarly, Hyper-V's performance plunges in memory-related workloads compared to the remaining platforms and bare metal, reaching a mere 83%; the remote I/O results crown iSCSI as the best performer, with double the performance of NFS; and all the open-source platforms (Proxmox, oVirt and XenServer) display impressive remote I/O performance over both iSCSI and NFS.
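    A minimal sketch of the kind of automation such runs rely on is shown below; it is not the authors' tooling, and the scenario names, mount points and fio job parameters are illustrative assumptions. It simply drives fio with a fixed job description against each storage backend and extracts one headline metric from the JSON output.

```python
# Illustrative sketch (not the authors' scripts): run the same fio job against
# several storage backends and collect one metric per scenario.
import json
import subprocess

# Hypothetical mount points for the three storage scenarios.
SCENARIOS = {
    "local": "/mnt/local/testfile",
    "nfs": "/mnt/nfs/testfile",
    "iscsi": "/mnt/iscsi/testfile",
}

def run_fio(filename: str, rw: str = "randread", runtime_s: int = 60) -> dict:
    """Run a single fio job with fixed parameters and return its parsed JSON output."""
    cmd = [
        "fio", "--name=bench", f"--filename={filename}", "--size=1G",
        f"--rw={rw}", "--bs=4k", "--iodepth=32", "--direct=1",
        "--time_based", f"--runtime={runtime_s}", "--output-format=json",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(out.stdout)

if __name__ == "__main__":
    for name, path in SCENARIOS.items():
        result = run_fio(path)
        iops = result["jobs"][0]["read"]["iops"]
        print(f"{name}: {iops:.0f} read IOPS")
```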

    Glider: A GPU Library Driver for Improved System Security

    Legacy device drivers implement both device resource management and isolation. This results in a large code base with a wide high-level interface, making the driver vulnerable to security attacks. This is particularly problematic for increasingly popular accelerators like GPUs, which have large, complex drivers. We solve this problem with library drivers, a new driver architecture. A library driver implements resource management as an untrusted library in the application's address space, and implements isolation as a kernel module that is smaller and has a narrower, lower-level interface (i.e., closer to hardware) than a legacy driver. We articulate a set of device and platform hardware properties that are required to retrofit a legacy driver into a library driver. To demonstrate the feasibility and superiority of library drivers, we present Glider, a library driver implementation for GPUs of two popular brands, Radeon and Intel. Glider reduces the TCB size and attack surface by about 35% and 84%, respectively, for a Radeon HD 6450 GPU, and by about 38% and 90%, respectively, for an Intel Ivy Bridge GPU. Moreover, it incurs no performance cost. Indeed, Glider outperforms a legacy driver for applications requiring intensive interactions with the device driver, such as applications using the OpenGL immediate mode API.
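    To make the architectural split more concrete, the following toy model (purely illustrative; Glider itself is kernel and user-space C code for the Radeon and Intel drivers) contrasts a small kernel-side isolation module with an untrusted per-process resource-management library. All class and method names are assumptions, not Glider's API.

```python
# Toy model of the library-driver split described above; not Glider's code.
# The trusted kernel part only enforces isolation (here: GPU page ownership),
# while all resource management lives in an untrusted per-process library.

class IsolationModule:
    """Stand-in for the small kernel module with its narrow, low-level interface."""
    def __init__(self):
        self.page_owner = {}                # GPU page number -> channel id

    def create_channel(self, pid: int) -> int:
        return pid                          # toy: one hardware channel per process

    def map_pages(self, channel: int, pages: list[int]) -> None:
        # Isolation is the only policy enforced here: a page may not be
        # mapped into two different channels.
        for p in pages:
            if self.page_owner.get(p, channel) != channel:
                raise PermissionError(f"page {p} belongs to another channel")
        for p in pages:
            self.page_owner[p] = channel


class ResourceManagerLibrary:
    """Stand-in for the untrusted user-space library: buffer bookkeeping and
    command construction happen here, outside the trusted computing base."""
    def __init__(self, kernel: IsolationModule, pid: int, page_base: int):
        self.kernel = kernel
        self.channel = kernel.create_channel(pid)
        self.next_page = page_base          # toy per-process allocator

    def allocate_buffer(self, num_pages: int) -> list[int]:
        pages = list(range(self.next_page, self.next_page + num_pages))
        self.next_page += num_pages
        self.kernel.map_pages(self.channel, pages)   # the only kernel crossing
        return pages


# Two processes manage their own resources; only page mapping goes through
# the kernel-side isolation module.
kernel = IsolationModule()
app_a = ResourceManagerLibrary(kernel, pid=101, page_base=0)
app_b = ResourceManagerLibrary(kernel, pid=202, page_base=4096)
print(app_a.allocate_buffer(4))   # [0, 1, 2, 3]
print(app_b.allocate_buffer(2))   # [4096, 4097]
```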

    GPrioSwap: Towards a Swapping Policy for GPUs

    Over the last few years, Graphics Processing Units (GPUs) have become popular in computing and have found their way into a number of cloud platforms. However, integrating a GPU into a cloud environment requires the cloud provider to virtualize the GPU efficiently. While several research projects have addressed this challenge in the past, few of them attempt to properly enable sharing of GPU memory between multiple clients: to date, GPUswap is the only project that enables sharing of GPU memory without inducing unnecessary application overhead, while maintaining both fairness and high utilization of GPU memory. However, GPUswap includes only a rudimentary swapping policy and therefore induces a rather large application overhead. In this paper, we work towards a practicable swapping policy for GPUs. To that end, we analyze the behavior of various GPU applications to determine their memory access patterns. Based on our insights into these patterns, we derive a swapping policy that includes a developer-assigned priority for each GPU buffer in its swapping decisions. Experiments with our prototype implementation show that a swapping policy based on buffer priorities can significantly reduce the swapping overhead.
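    As a rough illustration of such a policy (not GPrioSwap's actual kernel implementation; the buffer names, sizes and priority values below are made up), victim selection could simply free GPU pages in ascending order of the developer-assigned priority:

```python
# Sketch of priority-driven victim selection for swapping GPU buffers to
# system RAM; purely illustrative, not the GPUswap/GPrioSwap kernel code.
from dataclasses import dataclass

@dataclass
class GpuBuffer:
    name: str
    pages: int
    priority: int      # developer-assigned: lower value = swap out first

def choose_victims(buffers: list[GpuBuffer], pages_needed: int) -> list[GpuBuffer]:
    """Pick buffers to evict until enough GPU pages are freed,
    preferring low-priority (rarely accessed) buffers."""
    victims, freed = [], 0
    for buf in sorted(buffers, key=lambda b: b.priority):
        if freed >= pages_needed:
            break
        victims.append(buf)
        freed += buf.pages
    return victims

# Example: constant data the kernels rarely touch gets a low priority,
# frequently accessed working buffers a high one.
buffers = [
    GpuBuffer("lookup_table", pages=512, priority=0),
    GpuBuffer("weights", pages=2048, priority=5),
    GpuBuffer("activations", pages=1024, priority=9),
]
print([b.name for b in choose_victims(buffers, pages_needed=600)])
# -> ['lookup_table', 'weights']
```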

    LoGA: Low-Overhead GPU Accounting Using Events

    Over the last few years, GPUs have become common in computing. However, current GPUs are not designed for a shared environment like a cloud, creating a number of challenges whenever a GPU must be multiplexed between multiple users. In particular, the round-robin scheduling used by today's GPUs does not distribute the available GPU computation time fairly among applications. Most of the previous work addressing this problem resorted to scheduling all GPU computation in software, which induces high overhead. While there is a GPU scheduler called NEON which reduces the scheduling overhead compared to previous work, NEON's accounting mechanism frequently disables GPU access for all but one application, resulting in considerable overhead if that application does not saturate the GPU by itself. In this paper, we present LoGA, a novel accounting mechanism for GPU computation time. LoGA monitors the GPU's state to detect GPU-internal context switches, and infers the amount of GPU computation time consumed by each process from the time between these context switches. This method allows LoGA to measure GPU computation time consumed by applications while keeping all applications running concurrently. As a result, LoGA achieves a lower accounting overhead than previous work, especially for applications that do not saturate the GPU by themselves. We have developed a prototype which combines LoGA with the pre-existing NEON scheduler. Experiments with that prototype have shown that LoGA induces no accounting overhead while still delivering accurate measurements of applications' consumed GPU computation time.
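    The accounting idea can be sketched as follows: given a time-ordered trace of detected GPU-internal context switches, the interval between consecutive switches is charged to the process whose context was active. The event format and numbers here are illustrative assumptions, not LoGA's actual interface to the GPU.

```python
# Sketch of event-based GPU time accounting; illustrative only.
from collections import defaultdict

def account_gpu_time(switch_events):
    """switch_events: list of (timestamp_us, pid) pairs, one per detected
    GPU context switch, sorted by timestamp. Returns pid -> busy time (us)."""
    busy = defaultdict(int)
    for (t_prev, pid_prev), (t_next, _) in zip(switch_events, switch_events[1:]):
        # The process whose context became active at t_prev ran until t_next.
        busy[pid_prev] += t_next - t_prev
    return dict(busy)

# Example trace: process 42 runs for 300 us, then 77 for 150 us, then 42 again.
events = [(1000, 42), (1300, 77), (1450, 42), (1700, 77)]
print(account_gpu_time(events))   # {42: 550, 77: 150}
```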

    ์—ฃ์ง€ ํด๋ผ์šฐ๋“œ ํ™˜๊ฒฝ์„ ์œ„ํ•œ ์—ฐ์‚ฐ ์˜คํ”„๋กœ๋”ฉ ์‹œ์Šคํ…œ

    Ph.D. dissertation, Department of Electrical and Computer Engineering, College of Engineering, Seoul National University, February 2020. Advisor: Soo-Mook Moon. The purpose of my dissertation is to build lightweight edge computing systems which provide seamless offloading services even when users move across multiple edge servers. I focus on two specific application domains: 1) web applications and 2) DNN applications. First, I propose an edge computing system which offloads computations from web-supported devices to edge servers. The proposed system exploits the portability of web apps, i.e., that they are distributed as source code and runnable without installation, when migrating the execution state of web apps. This significantly reduces the complexity of state migration, allowing a web app to migrate within a few seconds. The proposed system also supports offloading of WebAssembly, a standard low-level instruction format for web apps, achieving up to an 8.4x speedup compared to offloading of pure JavaScript code. Second, I propose incremental offloading of neural networks (IONN), which offloads DNN execution while the DNN model is still being deployed, thus reducing the overhead of DNN model deployment. I also extend IONN to support large-scale edge server environments by proactively migrating DNN layers to the edge servers that mobile users are predicted to visit. Simulations with an open-source mobility dataset show that the proposed system can significantly reduce the overhead of deploying a DNN model.
    Contents: Introduction; Seamless Offloading of Web App Computations; IONN: Incremental Offloading of Neural Network Computations; PerDNN: Offloading DNN Computations to Pervasive Edge Servers; Related Works; Conclusion.
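    A heavily simplified sketch of the partitioning decision behind such offloading is given below, assuming a linear chain of DNN layers: pick the split point that minimizes estimated query latency, i.e., client-side execution up to the split, plus transfer of the intermediate data, plus server-side execution of the remainder. IONN additionally weighs the cost of uploading the model itself and handles DNNs with multiple paths; all per-layer numbers here are made-up assumptions.

```python
# Illustrative DNN partition-point selection for client/edge offloading.
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    client_ms: float      # execution time on the mobile device
    server_ms: float      # execution time on the edge server
    out_bytes: int        # size of the layer's output activation

def best_split(layers: list[Layer], input_bytes: int,
               uplink_bytes_per_ms: float) -> int:
    """Return k such that layers[:k] run on the client and layers[k:] on the
    edge server (k == 0: fully offloaded, k == len(layers): fully local)."""
    best_k, best_latency = 0, float("inf")
    for k in range(len(layers) + 1):
        client = sum(l.client_ms for l in layers[:k])
        server = sum(l.server_ms for l in layers[k:])
        if k == len(layers):
            transfer = 0.0                                    # nothing sent
        elif k == 0:
            transfer = input_bytes / uplink_bytes_per_ms      # send raw input
        else:
            transfer = layers[k - 1].out_bytes / uplink_bytes_per_ms
        latency = client + transfer + server
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k

layers = [
    Layer("conv1_pool", client_ms=15, server_ms=2, out_bytes=80_000),
    Layer("conv2",      client_ms=70, server_ms=7, out_bytes=200_000),
    Layer("fc",         client_ms=30, server_ms=3, out_bytes=4_000),
]
# With a 600 KB input and a ~2 MB/s uplink, splitting after conv1_pool wins.
print(best_split(layers, input_bytes=600_000, uplink_bytes_per_ms=2_000))  # -> 1
```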

    A Performance Comparison of VMware GPU Virtualization Techniques in Cloud Gaming

    Cloud gaming is an application deployment scenario in which an interactive gaming application runs remotely in a cloud, receives commands from a thin client, and streams the rendered scenes back to the client as a video sequence over the Internet; it is of interest to both the research community and industry. The academic community has developed open-source cloud gaming systems such as GamingAnywhere for research, while industrial pioneers such as OnLive and Gaikai have gained a large user base in the cloud gaming market. Graphics Processing Unit (GPU) virtualization plays an important role in such an environment, as it is the critical component that allows virtual machines to run 3D applications with performance guarantees. Currently, GPU pass-through and GPU sharing are the two main techniques of GPU virtualization. The former enables a single virtual machine to access a physical GPU directly and exclusively, while the latter makes a physical GPU shareable by multiple virtual machines. VMware Inc., one of the most popular virtualization solution vendors, provides concrete implementations of both: a GPU pass-through solution called Virtual Dedicated Graphics Acceleration (vDGA) and a GPU-sharing solution called Virtual Shared Graphics Acceleration (vSGA). VMware also recently announced another GPU-sharing solution called vGPU. Nevertheless, the feasibility and performance of these solutions in cloud gaming have not been studied yet. In this work, an experimental study is conducted to evaluate the feasibility and performance of the GPU pass-through and GPU-sharing solutions offered by VMware in cloud gaming scenarios. The results confirm that the vDGA and vGPU techniques can meet the demands of cloud gaming. In particular, these two solutions achieved good performance in the tested graphics card benchmarks, and delivered acceptable image quality and response delay for the tested games.