306 research outputs found

    Flash Cache Hybrid Storage System For Video-On-Demand Server

    A video-on-demand (VOD) system allows users to access any video at any time and to view videos in real-time streaming mode for extended durations. One of the key components of a VOD system is the hybrid storage server, which combines HDD, SSD, and RAM to fulfil the requirement of fast data access together with large-scale data distribution to numerous simultaneous users. However, current hybrid storage systems still face several challenges: (1) the integration and roles of the HDD, SSD, and RAM are relatively weak at optimizing fast access prior to streaming to a large number of simultaneous users, and (2) the HDD and SSD exhibit poor data layout and streaming control for producing a high number of simultaneous streams. This thesis proposes (1) an enhanced hybrid storage system (EHSS) for VOD servers, in which the HDD, SSD, and RAM are managed and integrated by an effective mechanism that exploits the top 10% most popular videos to achieve fast data access and service to a large number of simultaneous users. (2) The flash cache hybrid storage system (FCHSS) is also proposed. Unlike the EHSS architecture, the FCHSS architecture has no RAM; it is removed and replaced by a flash-based SSD, which serves as a cache for the HDD to achieve fast data access and service to a large number of simultaneous users. (3) The new data layout stores thousands of video segments in the HDD and SSD
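
    The popularity-driven placement described above can be illustrated with a minimal sketch. Only the top-10% threshold comes from the abstract; the request-log popularity signal and the function name pick_cache_set are assumptions for illustration, not the thesis's actual controller:

        from collections import Counter

        def pick_cache_set(request_log, cache_fraction=0.10):
            """Return the set of video IDs that should live on the fast tier (SSD/RAM).

            request_log    : iterable of video IDs, one entry per playback request.
            cache_fraction : fraction of the catalogue treated as "popular"
                             (0.10 mirrors the top-10% policy in the abstract).
            """
            popularity = Counter(request_log)                      # requests per video
            n_cached = max(1, int(len(popularity) * cache_fraction))
            # Most-requested videos first; these are staged on the SSD/RAM tier.
            return {vid for vid, _ in popularity.most_common(n_cached)}

        # Example: three videos, "v1" dominates the request stream.
        requests = ["v1", "v2", "v1", "v3", "v1", "v1", "v2"] * 10
        hot_set = pick_cache_set(requests)
        print(hot_set)   # {'v1'} is served from the fast tier, the rest stay on HDD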

    SSD์˜ ๊ธด ๊ผฌ๋ฆฌ ์ง€์—ฐ์‹œ๊ฐ„ ๋ฌธ์ œ ์™„ํ™”๋ฅผ ์œ„ํ•œ ๊ฐ•ํ™”ํ•™์Šต์˜ ์ ์šฉ

    ํ•™์œ„๋…ผ๋ฌธ(๋ฐ•์‚ฌ)--์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› :๊ณต๊ณผ๋Œ€ํ•™ ์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€,2020. 2. ์œ ์Šน์ฃผ.NAND flash memory is widely used in a variety of systems, from realtime embedded systems to high-performance enterprise server systems. Flash memory has (1) erase-before-write (write-once) and (2) endurance problems. To handle the erase-before-write feature, apply a flash-translation layer (FTL). Currently, the page-level mapping method is mainly used to reduce the latency increase caused by the write-once and block erase characteristics of flash memory. Garbage collection (GC) is one of the leading causes of long-tail latency, which increases more than 100 times the average latency at 99th percentile. Therefore, real-time systems or quality-critical systems cannot satisfy given requirements such as QoS restrictions. As flash memory capacity increases, GC latency also tends to increase. This is because the block size (the number of pages included in one block) of the flash memory increases as the capacity of the flash memory increases. GC latency is determined by valid page copy and block erase time. Therefore, as block size increases, GC latency also increases. Especially, the block size gets increased from 2D to 3D NAND flash memory, e.g., 256 pages/block in 2D planner NAND flash memory and 768 pages/block in 3D NAND flash memory. Even in 3D NAND flash memory, the block size is expected to continue to increase. Thus, the long write latency problem incurred by GC can become more serious in 3D NAND flash memory-based storage. In this dissertation, we propose three versions of the novel GC scheduling method based on reinforcement learning. The purpose of this method is to reduce the long tail latency caused by GC by utilizing the idle time of the storage system. Also, we perform a quantitative analysis for the RL-assisted GC solution. RL-assisted GC scheduling technique was proposed which learns the storage access behavior online and determines the number of GC operations to exploit the idle time. We also presented aggressive methods, which helps in further reducing the long tail latency by aggressively performing fine-grained GC operations. We also proposed a technique that dynamically manages key states in RL-assisted GC to reduce the long-tail latency. This technique uses many fine-grained pieces of information as state candidates and manages key states that suitably represent the characteristics of the workload using a relatively small amount of memory resource. Thus, the proposed method can reduce the long-tail latency even further. In addition, we presented a Q-value prediction network that predicts the initial Q-value of a newly inserted state in the Q-table cache. The integrated solution of the Q-table cache and Q-value prediction network can exploit the short-term history of the system with a low-cost Q-table cache. It is also equipped with a small network called Q-value prediction network to make use of the long-term history and provide good Q-value initialization for the Q-table cache. The experiments show that our proposed method reduces by 25%-37% the long tail latency compared to the state-of-the-art method.๋‚ธ๋“œ ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ๋Š” ์‹ค์‹œ๊ฐ„ ์ž„๋ฒ ๋””๋“œ ์‹œ์Šคํ…œ์œผ๋กœ๋ถ€ํ„ฐ ๊ณ ์„ฑ๋Šฅ์˜ ์—”ํ„ฐํ”„๋ผ์ด์ฆˆ ์„œ๋ฒ„ ์‹œ์Šคํ…œ๊นŒ์ง€ ๋‹ค์–‘ํ•œ ์‹œ์Šคํ…œ์—์„œ ๋„๋ฆฌ ์‚ฌ์šฉ ๋˜๊ณ  ์žˆ๋‹ค. ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ๋Š” (1) erase-before-write (write-once)์™€ (2) endurance ๋ฌธ์ œ๋ฅผ ๊ฐ–๊ณ  ์žˆ๋‹ค. 
Erase-before-write ํŠน์„ฑ์„ ๋‹ค๋ฃจ๊ธฐ ์œ„ํ•ด flash-translation layer (FTL)์„ ์ ์šฉ ํ•œ๋‹ค. ํ˜„์žฌ ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ์˜ write-once ํŠน์„ฑ๊ณผ block eraseํŠน์„ฑ์œผ๋กœ ์ธํ•œ latency ์ฆ๊ฐ€๋ฅผ ๊ฐ์†Œ ์‹œํ‚ค๊ธฐ ์œ„ํ•˜์—ฌ page-level mapping๋ฐฉ์‹์ด ์ฃผ๋กœ ์‚ฌ์šฉ ๋œ๋‹ค. Garbage collection (GC)์€ 99th percentile์—์„œ ํ‰๊ท  ์ง€์—ฐ์‹œ๊ฐ„์˜ 100๋ฐฐ ์ด์ƒ ์ฆ๊ฐ€ํ•˜๋Š” long tail latency๋ฅผ ์œ ๋ฐœ์‹œํ‚ค๋Š” ์ฃผ์š” ์›์ธ ์ค‘ ํ•˜๋‚˜์ด๋‹ค. ๋”ฐ๋ผ์„œ ์‹ค์‹œ๊ฐ„ ์‹œ์Šคํ…œ์ด๋‚˜ quality-critical system์—์„œ๋Š” Quality of Service (QoS) ์ œํ•œ๊ณผ ๊ฐ™์€ ์ฃผ์–ด์ง„ ์š”๊ตฌ ์กฐ๊ฑด์„ ๋งŒ์กฑ ์‹œํ‚ฌ ์ˆ˜ ์—†๋‹ค. ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ์˜ ์šฉ๋Ÿ‰์ด ์ฆ๊ฐ€ํ•จ์— ๋”ฐ๋ผ GC latency๋„ ์ฆ๊ฐ€ํ•˜๋Š” ๊ฒฝํ–ฅ์„ ๋ณด์ธ๋‹ค. ์ด๊ฒƒ์€ ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ์˜ ์šฉ๋Ÿ‰์ด ์ฆ๊ฐ€ ํ•จ์— ๋”ฐ๋ผ ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ์˜ ๋ธ”๋ก ํฌ๊ธฐ (ํ•˜๋‚˜์˜ ๋ธ”๋ก์ด ํฌํ•จํ•˜๊ณ  ์žˆ๋Š” ํŽ˜์ด์ง€์˜ ์ˆ˜)๊ฐ€ ์ฆ๊ฐ€ ํ•˜๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค. GC latency๋Š” valid page copy์™€ block erase ์‹œ๊ฐ„์— ์˜ํ•ด ๊ฒฐ์ • ๋œ๋‹ค. ๋”ฐ๋ผ์„œ, ๋ธ”๋ก ํฌ๊ธฐ๊ฐ€ ์ฆ๊ฐ€ํ•˜๋ฉด, GC latency๋„ ์ฆ๊ฐ€ ํ•œ๋‹ค. ํŠนํžˆ, ์ตœ๊ทผ 2D planner ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ์—์„œ 3D vertical ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ ๊ตฌ์กฐ๋กœ ์ „ํ™˜๋จ์— ๋”ฐ๋ผ ๋ธ”๋ก ํฌ๊ธฐ๋Š” ์ฆ๊ฐ€ ํ•˜์˜€๋‹ค. ์‹ฌ์ง€์–ด 3D vertical ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ์—์„œ๋„ ๋ธ”๋ก ํฌ๊ธฐ๊ฐ€ ์ง€์†์ ์œผ๋กœ ์ฆ๊ฐ€ ํ•˜๊ณ  ์žˆ๋‹ค. ๋”ฐ๋ผ์„œ 3D vertical ํ”Œ๋ž˜์‹œ ๋ฉ”๋ชจ๋ฆฌ์—์„œ long tail latency ๋ฌธ์ œ๋Š” ๋”์šฑ ์‹ฌ๊ฐํ•ด ์ง„๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ ์šฐ๋ฆฌ๋Š” ๊ฐ•ํ™”ํ•™์Šต(Reinforcement learning, RL)์„ ์ด์šฉํ•œ ์„ธ ๊ฐ€์ง€ ๋ฒ„์ „์˜ ์ƒˆ๋กœ์šด GC scheduling ๊ธฐ๋ฒ•์„ ์ œ์•ˆํ•˜์˜€๋‹ค. ์ œ์•ˆ๋œ ๊ธฐ์ˆ ์˜ ๋ชฉ์ ์€ ์Šคํ† ๋ฆฌ์ง€ ์‹œ์Šคํ…œ์˜ idle ์‹œ๊ฐ„์„ ํ™œ์šฉํ•˜์—ฌ GC์— ์˜ํ•ด ๋ฐœ์ƒ๋œ long tail latency๋ฅผ ๊ฐ์†Œ ์‹œํ‚ค๋Š” ๊ฒƒ์ด๋‹ค. ๋˜ํ•œ, ์šฐ๋ฆฌ๋Š” RL-assisted GC ์†”๋ฃจ์…˜์„ ์œ„ํ•œ ์ •๋Ÿ‰ ๋ถ„์„ ํ•˜์˜€๋‹ค. ์šฐ๋ฆฌ๋Š” ์Šคํ† ๋ฆฌ์ง€์˜ access behavior๋ฅผ ์˜จ๋ผ์ธ์œผ๋กœ ํ•™์Šตํ•˜๊ณ , idle ์‹œ๊ฐ„์„ ํ™œ์šฉํ•  ์ˆ˜ ์žˆ๋Š” GC operation์˜ ์ˆ˜๋ฅผ ๊ฒฐ์ •ํ•˜๋Š” RL-assisted GC scheduling ๊ธฐ์ˆ ์„ ์ œ์•ˆ ํ•˜์˜€๋‹ค. ์ถ”๊ฐ€์ ์œผ๋กœ ์šฐ๋ฆฌ๋Š” ๊ณต๊ฒฉ์ ์ธ ๋ฐฉ๋ฒ•์„ ์ œ์‹œ ํ•˜์˜€๋‹ค. ์ด ๋ฐฉ๋ฒ•์€ ์ž‘์€ ๋‹จ์œ„์˜ GC operation๋“ค์„ ๊ณต๊ฒฉ์ ์œผ๋กœ ์ˆ˜ํ–‰ ํ•จ์œผ๋กœ์จ, long tail latency๋ฅผ ๋”์šฑ ๊ฐ์†Œ ์‹œํ‚ฌ ์ˆ˜ ์žˆ๋„๋ก ๋„์›€์„ ์ค€๋‹ค. ๋˜ํ•œ ์šฐ๋ฆฌ๋Š” long tail latency๋ฅผ ๋”์šฑ ๊ฐ์†Œ์‹œํ‚ค๊ธฐ ์œ„ํ•˜์—ฌ RL-assisted GC์˜ key state๋“ค์„ ๋™์ ์œผ๋กœ ๊ด€๋ฆฌํ•  ์ˆ˜ ์žˆ๋Š” Q-table cache ๊ธฐ์ˆ ์„ ์ œ์•ˆ ํ•˜์˜€๋‹ค. ์ด ๊ธฐ์ˆ ์€ state ํ›„๋ณด๋กœ ๋งค์šฐ ๋งŽ์€ ์ˆ˜์˜ ์„ธ๋ฐ€ํ•œ ์ •๋ณด๋“ค์„ ์‚ฌ์šฉ ํ•˜๊ณ , ์ƒ๋Œ€์ ์œผ๋กœ ์ž‘์€ ๋ฉ”๋ชจ๋ฆฌ ๊ณต๊ฐ„์„ ์ด์šฉํ•˜์—ฌ workload์˜ ํŠน์„ฑ์„ ์ ์ ˆํ•˜๊ฒŒ ํ‘œํ˜„ ํ•  ์ˆ˜ ์žˆ๋Š” key state๋“ค์„ ๊ด€๋ฆฌ ํ•œ๋‹ค. ๋”ฐ๋ผ์„œ, ์ œ์•ˆ๋œ ๋ฐฉ๋ฒ•์€ long tail latency๋ฅผ ๋”์šฑ ๊ฐ์†Œ ์‹œํ‚ฌ ์ˆ˜ ์žˆ๋‹ค. ์ถ”๊ฐ€์ ์œผ๋กœ, ์šฐ๋ฆฌ๋Š” Q-table cache์— ์ƒˆ๋กญ๊ฒŒ ์ถ”๊ฐ€๋˜๋Š” state์˜ ์ดˆ๊ธฐ๊ฐ’์„ ์˜ˆ์ธกํ•˜๋Š” Q-value prediction network (QP Net)๋ฅผ ์ œ์•ˆ ํ•˜์˜€๋‹ค. Q-table cache์™€ QP Net์˜ ํ†ตํ•ฉ ์†”๋ฃจ์…˜์€ ์ € ๋น„์šฉ์˜ Q-table cache๋ฅผ ์ด์šฉํ•˜์—ฌ ๋‹จ๊ธฐ๊ฐ„์˜ ๊ณผ๊ฑฐ ์ •๋ณด๋ฅผ ํ™œ์šฉ ํ•  ์ˆ˜ ์žˆ๋‹ค. ๋˜ํ•œ ์ด๊ฒƒ์€ QP Net์ด๋ผ๊ณ  ๋ถ€๋ฅด๋Š” ์ž‘์€ ์‹ ๊ฒฝ๋ง์„ ์ด์šฉํ•˜์—ฌ ํ•™์Šตํ•œ ์žฅ๊ธฐ๊ฐ„์˜ ๊ณผ๊ฑฐ ์ •๋ณด๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ Q-table cache์— ์ƒˆ๋กญ๊ฒŒ ์‚ฝ์ž…๋˜๋Š” state์— ๋Œ€ํ•ด ์ข‹์€ Q-value ์ดˆ๊ธฐ๊ฐ’์„ ์ œ๊ณตํ•œ๋‹ค. 
์‹คํ—˜๊ฒฐ๊ณผ๋Š” ์ œ์•ˆํ•œ ๋ฐฉ๋ฒ•์ด state-of-the-art ๋ฐฉ๋ฒ•์— ๋น„๊ตํ•˜์—ฌ 25%-37%์˜ long tail latency๋ฅผ ๊ฐ์†Œ ์‹œ์ผฐ์Œ์„ ๋ณด์—ฌ์ค€๋‹ค.Chapter 1 Introduction 1 Chapter 2 Background 6 2.1 System Level Tail Latency 6 2.2 Solid State Drive 10 2.2.1 Flash Storage Architecture and Garbage Collection 10 2.3 Reinforcement Learning 13 Chapter 3 Related Work 17 Chapter 4 Small Q-table based Solution to Reduce Long Tail Latency 23 4.1 Problem and Motivation 23 4.1.1 Long Tail Problem in Flash Storage Access Latency 23 4.1.2 Idle Time in Flash Storage 24 4.2 Design and Implementation 26 4.2.1 Solution Overview 26 4.2.2 RL-assisted Garbage Collection Scheduling 27 4.2.3 Aggressive RL-assisted Garbage Collection Scheduling 33 4.3 Evaluation 35 4.3.1 Evaluation Setup 35 4.3.2 Results and Discussion 39 Chapter 5 Q-table Cache to Exploit a Large Number of States at Small Cost 52 5.1 Motivation 52 5.2 Design and Implementation 56 5.2.1 Solution Overview 56 5.2.2 Dynamic Key States Management 61 5.3 Evaluation 67 5.3.1 Evaluation Setup 67 5.3.2 Results and Discussion 67 Chapter 6 Combining Q-table cache and Neural Network to Exploit both Long and Short-term History 73 6.1 Motivation and Problem 73 6.1.1 More State Information can Further Reduce Long Tail Latency 73 6.1.2 Locality Behavior of Workload 74 6.1.3 Zero Initialization Problem 75 6.2 Design and Implementation 77 6.2.1 Solution Overview 77 6.2.2 Q-table Cache for Action Selection 80 6.2.3 Q-value Prediction 83 6.3 Evaluation 87 6.3.1 Evaluation Setup 87 6.3.2 Storage-Intensive Workloads 89 6.3.3 Latency Comparison: Overall 92 6.3.4 Q-value Prediction Network Effects on Latency 97 6.3.5 Q-table Cache Analysis 110 6.3.6 Immature State Analysis 113 6.3.7 Miscellaneous Analysis 116 6.3.8 Multi Channel Analysis 121 Chapter 7 Conculsion and Future Work 138 7.1 Conclusion 138 7.2 Future Work 140 Bibliography 143 ๊ตญ๋ฌธ์ดˆ๋ก 154Docto

    Data Deduplication Technology for Cloud Storage

    With the explosive growth of information, data storage systems have stepped into the cloud storage era. Although the core of a cloud storage system is a distributed file system that solves the problem of mass data storage, large amounts of duplicate data exist in all storage systems. File systems are designed to control how files are stored and retrieved, yet few studies focus on cloud file system deduplication technologies at the application level, especially for the Hadoop distributed file system. In this paper, we design a file deduplication framework on the Hadoop distributed file system for cloud application developers. The proposed RFD-HDFS and FD-HDFS data deduplication solutions perform deduplication online, which improves storage space utilisation and reduces redundancy. At the end of the paper, we test disk utilisation and file upload performance on RFD-HDFS and FD-HDFS, and compare the disk utilisation of the two frameworks with that of plain HDFS. The results show that the two frameworks not only implement the data deduplication function but also effectively reduce the disk utilisation of duplicate files. Thus, the proposed framework can indeed reduce storage space by eliminating redundant HDFS files.
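
    The abstract does not detail how duplicates are detected, so the sketch below assumes the common content-fingerprint approach: a file's SHA-256 hash is looked up in an index before a second physical copy is written. The DedupIndex class and its methods are hypothetical and kept in memory for illustration rather than backed by HDFS:

        import hashlib

        class DedupIndex:
            """Maps SHA-256 fingerprints to the path of the first stored copy."""
            def __init__(self):
                self._index = {}          # fingerprint -> canonical storage path

            @staticmethod
            def fingerprint(data: bytes) -> str:
                return hashlib.sha256(data).hexdigest()

            def put(self, data: bytes, intended_path: str) -> str:
                fp = self.fingerprint(data)
                if fp in self._index:
                    # Duplicate content: record only a reference, no second physical copy.
                    return self._index[fp]
                # First occurrence: store the bytes (here just remembered) and index them.
                self._index[fp] = intended_path
                return intended_path

        index = DedupIndex()
        p1 = index.put(b"same payload", "/user/app/report-jan.bin")
        p2 = index.put(b"same payload", "/user/app/report-feb.bin")
        assert p1 == p2   # the second upload is deduplicated to the first physical copy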

    A (Nearly) Free Lunch: Extending NAND Flash Lifetime by Exploiting Neglected Physical Properties

    NAND flash is a key storage technology in modern computing systems; without it, many devices would probably not exist today or would at least not benefit from as many features. The great success of this technology has motivated massive efforts to scale it down in order to further increase its density. However, NAND flash currently faces physical limitations that prevent it from reaching smaller cell sizes without severely reducing its storage reliability and lifetime. Accordingly, in the present thesis we aim to relieve some constraints from device manufacturing by addressing flash irregularities at a higher level. For example, we acknowledge that process variation, among other factors, renders some regions of a flash device more sensitive than others. This difference usually leads to sensitive regions exhausting their lifetime early, which then causes the device to become unusable while the rest of the device is still healthy yet no longer exploitable. Consequently, we propose to postpone this exhaustion point with new strategies that require minimal resources to implement and effectively extend flash device lifetime. Sometimes our strategies involve unconventional methods of accessing the flash that are not supported by the specification documents and, therefore, should not be used lightly. Hence, we also present thorough characterization experiments on actual NAND flash chips to validate these methods and model their effect on a flash device. Finally, we evaluate the performance of our methods by implementing a trace-driven flash device simulator and executing a large set of realistic disk traces. Overall, we exploit properties that are either neglected or not well understood to propose methods that are nearly free to implement and systematically extend NAND flash lifetime. We are convinced that future NAND flash architectures will regularly bring radical physical changes, which will inevitably come together with a new set of physical properties to investigate and exploit.
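
    The thesis's specific strategies are not spelled out in this abstract; purely as a generic illustration of steering wear away from the more sensitive regions, the sketch below gives each erase block its own endurance budget and always allocates the block with the most budget left. The class name, the budget values, and the allocation policy are assumptions for illustration only:

        import heapq

        class VariationAwareAllocator:
            """Pick the erase block whose remaining endurance budget is largest.

            Blocks in sensitive regions are given smaller budgets, so weak blocks
            are spared from heavy use long before they are exhausted.
            """
            def __init__(self, budgets):
                # max-heap via negated remaining budget: (-remaining, block_id)
                self._heap = [(-b, blk) for blk, b in budgets.items()]
                heapq.heapify(self._heap)

            def allocate(self):
                neg_remaining, blk = heapq.heappop(self._heap)
                remaining = -neg_remaining - 1           # one erase consumed
                if remaining > 0:
                    heapq.heappush(self._heap, (-remaining, blk))
                return blk

        # Block 2 sits in a sensitive region, so it receives a smaller budget.
        alloc = VariationAwareAllocator({0: 3000, 1: 3000, 2: 1200})
        writes = [alloc.allocate() for _ in range(10)]
        print(writes)   # block 2 is not picked until the healthier blocks wear down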

    Dynamic Binary Translation for Embedded Systems with Scratchpad Memory

    Embedded software development has recently changed with advances in computing. Rather than fully co-designing software and hardware to perform a relatively simple task, embedded and mobile devices are nowadays designed as platforms where multiple applications can be run, new applications can be added, and existing applications can be updated. In this scenario, traditional constraints in embedded systems design (i.e., performance, memory and energy consumption, and real-time guarantees) are more difficult to address, while new concerns (e.g., security) have become important and increase software complexity as well. In general-purpose systems, Dynamic Binary Translation (DBT) has been used to address these issues with services such as Just-In-Time (JIT) compilation, dynamic optimization, virtualization, power management and code security. In embedded systems, however, DBT is not usually employed due to its performance, memory and power overhead. This dissertation presents StrataX, a low-overhead DBT framework for embedded systems. StrataX addresses the challenges faced by DBT in embedded systems using novel techniques. To reduce DBT overhead, StrataX loads code from NAND-flash storage and translates it into a Scratchpad Memory (SPM), a software-managed on-chip SRAM with limited capacity. SPM has access latency similar to a hardware cache, but consumes less power and chip area. StrataX manages the SPM as a software instruction cache, and employs victim compression and pinning to reduce retranslation cost and capture frequently executed code in the SPM. To prevent performance loss due to excessive code expansion, StrataX minimizes the amount of code inserted by DBT to maintain control of program execution. When a hardware instruction cache is available, StrataX dynamically partitions translated code between the SPM and main memory. With these techniques, StrataX has low performance overhead relative to native execution for MiBench programs. Further, it simplifies embedded software and hardware design by operating transparently to applications without any special hardware support. StrataX achieves sufficiently low overhead to make it feasible to use DBT in embedded systems to address important design goals and requirements.
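
    In the spirit of the software instruction cache described above, the sketch below keeps translated fragments in a fixed-capacity store, never evicts pinned (hot) fragments, and compresses evicted victims so they can be restored without retranslation. The capacity, the eviction order, and the class and method names are illustrative assumptions, not StrataX internals:

        import zlib
        from collections import OrderedDict

        class SpmCodeCache:
            def __init__(self, capacity_bytes=32 * 1024):
                self.capacity = capacity_bytes
                self.used = 0
                self.entries = OrderedDict()      # addr -> (translated code, pinned flag)
                self.victims = {}                 # addr -> zlib-compressed victim copy

            def lookup(self, addr):
                if addr in self.entries:                         # hit in SPM
                    return self.entries[addr][0]
                if addr in self.victims:                         # hit in compressed store
                    code = zlib.decompress(self.victims.pop(addr))
                    self.insert(addr, code)                      # cheaper than retranslation
                    return code
                return None                                      # miss: must translate

            def insert(self, addr, code, pinned=False):
                while self.used + len(code) > self.capacity:
                    self._evict_one()
                self.entries[addr] = (code, pinned)
                self.used += len(code)

            def _evict_one(self):
                for addr, (code, pinned) in self.entries.items():
                    if not pinned:                               # pinned hot code stays resident
                        del self.entries[addr]
                        self.used -= len(code)
                        self.victims[addr] = zlib.compress(code) # keep a cheap victim copy
                        return
                raise MemoryError("all resident fragments are pinned")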

    ์ƒ๋ณ€ํ™” ๋ฉ”๋ชจ๋ฆฌ ์‹œ์Šคํ…œ์˜ ๊ฐ„์„ญ ์˜ค๋ฅ˜ ์™„ํ™” ๋ฐ RMW ์„ฑ๋Šฅ ํ–ฅ์ƒ ๊ธฐ๋ฒ•

    ํ•™์œ„๋…ผ๋ฌธ(๋ฐ•์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ „๊ธฐยท์ •๋ณด๊ณตํ•™๋ถ€, 2021.8. ์ดํ˜์žฌ.Phase-change memory (PCM) announces the beginning of the new era of memory systems, owing to attractive characteristics. Many memory product manufacturers (e.g., Intel, SK Hynix, and Samsung) are developing related products. PCM can be applied to various circumstances; it is not simply limited to an extra-scale database. For example, PCM has a low standby power due to its non-volatility; hence, computation-intensive applications or mobile applications (i.e., long memory idle time) are suitable to run on PCM-based computing systems. Despite these fascinating features of PCM, PCM is still far from the general commercial market due to low reliability and long latency problems. In particular, low reliability is a painful problem for PCM in past decades. As the semiconductor process technology rapidly scales down over the years, DRAM reaches 10 nm class process technology. In addition, it is reported that the write disturbance error (WDE) would be a serious issue for PCM if it scales down below 54 nm class process technology. Therefore, addressing the problem of WDEs becomes essential to make PCM competitive to DRAM. To overcome this problem, this dissertation proposes a novel approach that can restore meta-stable cells on demand by levering two-level SRAM-based tables, thereby significantly reducing the number WDEs. Furthermore, a novel randomized approach is proposed to implement a replacement policy that originally requires hundreds of read ports on SRAM. The second problem of PCM is a long-latency compared to that of DRAM. In particular, PCM tries to enhance its throughput by adopting a larger transaction unit; however, the different unit size from the general-purpose processor cache line further degrades the system performance due to the introduction of a read-modify-write (RMW) module. Since there has never been any research related to RMW in a PCM-based memory system, this dissertation proposes a novel architecture to enhance the overall system performance and reliability of a PCM-based memory system having an RMW module. The proposed architecture enhances data re-usability without introducing extra storage resources. Furthermore, a novel operation that merges commands regardless of command types is proposed to enhance performance notably. Another problem is the absence of a full simulation platform for PCM. While the announced features of the PCM-related product (i.e., Intel Optane) are scarce due to confidential issues, all priceless information can be integrated to develop an architecture simulator that resembles the available product. To this end, this dissertation tries to scrape up all available features of modules in a PCM controller and implement a dedicated simulator for future research purposes.์ƒ๋ณ€ํ™” ๋ฉ”๋ชจ๋ฆฌ๋Š”(PCM) ๋งค๋ ฅ์ ์ธ ํŠน์„ฑ์„ ํ†ตํ•ด ๋ฉ”๋ชจ๋ฆฌ ์‹œ์Šคํ…œ์˜ ์ƒˆ๋กœ์šด ์‹œ๋Œ€์˜ ์‹œ์ž‘์„ ์•Œ๋ ธ๋‹ค. ๋งŽ์€ ๋ฉ”๋ชจ๋ฆฌ ๊ด€๋ จ ์ œํ’ˆ ์ œ์กฐ์—…์ฒด(์˜ˆ : ์ธํ…”, SK ํ•˜์ด๋‹‰์Šค, ์‚ผ์„ฑ)๊ฐ€ ๊ด€๋ จ ์ œํ’ˆ ๊ฐœ๋ฐœ์— ๋ฐ•์ฐจ๋ฅผ ๊ฐ€ํ•˜๊ณ  ์žˆ๋‹ค. PCM์€ ๋‹จ์ˆœํžˆ ๋Œ€๊ทœ๋ชจ ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค์—๋งŒ ๊ตญํ•œ๋˜์ง€ ์•Š๊ณ  ๋‹ค์–‘ํ•œ ์ƒํ™ฉ์— ์ ์šฉ๋  ์ˆ˜ ์žˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด, PCM์€ ๋น„ํœ˜๋ฐœ์„ฑ์œผ๋กœ ์ธํ•ด ๋Œ€๊ธฐ ์ „๋ ฅ์ด ๋‚ฎ๋‹ค. ๋”ฐ๋ผ์„œ ๊ณ„์‚ฐ ์ง‘์•ฝ์ ์ธ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ๋˜๋Š” ๋ชจ๋ฐ”์ผ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์€(์ฆ‰, ๊ธด ๋ฉ”๋ชจ๋ฆฌ ์œ ํœด ์‹œ๊ฐ„) PCM ๊ธฐ๋ฐ˜ ์ปดํ“จํŒ… ์‹œ์Šคํ…œ์—์„œ ์‹คํ–‰ํ•˜๊ธฐ์— ์ ํ•ฉํ•˜๋‹ค. 
PCM์˜ ์ด๋Ÿฌํ•œ ๋งค๋ ฅ์ ์ธ ํŠน์„ฑ์—๋„ ๋ถˆ๊ตฌํ•˜๊ณ  PCM์€ ๋‚ฎ์€ ์‹ ๋ขฐ์„ฑ๊ณผ ๊ธด ๋Œ€๊ธฐ ์‹œ๊ฐ„์œผ๋กœ ์ธํ•ด ์—ฌ์ „ํžˆ ์ผ๋ฐ˜ ์‚ฐ์—… ์‹œ์žฅ์—์„œ๋Š” DRAM๊ณผ ๋‹ค์†Œ ๊ฒฉ์ฐจ๊ฐ€ ์žˆ๋‹ค. ํŠนํžˆ ๋‚ฎ์€ ์‹ ๋ขฐ์„ฑ์€ ์ง€๋‚œ ์ˆ˜์‹ญ ๋…„ ๋™์•ˆ PCM ๊ธฐ์ˆ ์˜ ๋ฐœ์ „์„ ์ €ํ•ดํ•˜๋Š” ๋ฌธ์ œ๋‹ค. ๋ฐ˜๋„์ฒด ๊ณต์ • ๊ธฐ์ˆ ์ด ์ˆ˜๋…„์— ๊ฑธ์ณ ๋น ๋ฅด๊ฒŒ ์ถ•์†Œ๋จ์— ๋”ฐ๋ผ DRAM์€ 10nm ๊ธ‰ ๊ณต์ • ๊ธฐ์ˆ ์— ๋„๋‹ฌํ•˜์˜€๋‹ค. ์ด์–ด์„œ, ์“ฐ๊ธฐ ๋ฐฉํ•ด ์˜ค๋ฅ˜ (WDE)๊ฐ€ 54nm ๋“ฑ๊ธ‰ ํ”„๋กœ์„ธ์Šค ๊ธฐ์ˆ  ์•„๋ž˜๋กœ ์ถ•์†Œ๋˜๋ฉด PCM์— ์‹ฌ๊ฐํ•œ ๋ฌธ์ œ๊ฐ€ ๋  ๊ฒƒ์œผ๋กœ ๋ณด๊ณ ๋˜์—ˆ๋‹ค. ๋”ฐ๋ผ์„œ, WDE ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๋Š” ๊ฒƒ์€ PCM์ด DRAM๊ณผ ๋™๋“ฑํ•œ ๊ฒฝ์Ÿ๋ ฅ์„ ๊ฐ–์ถ”๋„๋ก ํ•˜๋Š” ๋ฐ ์žˆ์–ด ํ•„์ˆ˜์ ์ด๋‹ค. ์ด ๋ฌธ์ œ๋ฅผ ๊ทน๋ณตํ•˜๊ธฐ ์œ„ํ•ด ์ด ๋…ผ๋ฌธ์—์„œ๋Š” 2-๋ ˆ๋ฒจ SRAM ๊ธฐ๋ฐ˜ ํ…Œ์ด๋ธ”์„ ํ™œ์šฉํ•˜์—ฌ WDE ์ˆ˜๋ฅผ ํฌ๊ฒŒ ์ค„์—ฌ ํ•„์š”์— ๋”ฐ๋ผ ์ค€ ์•ˆ์ • ์…€์„ ๋ณต์›ํ•  ์ˆ˜ ์žˆ๋Š” ์ƒˆ๋กœ์šด ์ ‘๊ทผ ๋ฐฉ์‹์„ ์ œ์•ˆํ•œ๋‹ค. ๋˜ํ•œ, ์›๋ž˜ SRAM์—์„œ ์ˆ˜๋ฐฑ ๊ฐœ์˜ ์ฝ๊ธฐ ํฌํŠธ๊ฐ€ ํ•„์š”ํ•œ ๋Œ€์ฒด ์ •์ฑ…์„ ๊ตฌํ˜„ํ•˜๊ธฐ ์œ„ํ•ด ์ƒˆ๋กœ์šด ๋žœ๋ค ๊ธฐ๋ฐ˜์˜ ๊ธฐ๋ฒ•์„ ์ œ์•ˆํ•œ๋‹ค. PCM์˜ ๋‘ ๋ฒˆ์งธ ๋ฌธ์ œ๋Š” DRAM์— ๋น„ํ•ด ์ง€์—ฐ ์‹œ๊ฐ„์ด ๊ธธ๋‹ค๋Š” ๊ฒƒ์ด๋‹ค. ํŠนํžˆ PCM์€ ๋” ํฐ ํŠธ๋žœ์žญ์…˜ ๋‹จ์œ„๋ฅผ ์ฑ„ํƒํ•˜์—ฌ ๋‹จ์œ„์‹œ๊ฐ„ ๋‹น ๋ฐ์ดํ„ฐ ์ฒ˜๋ฆฌ๋Ÿ‰ ํ–ฅ์ƒ์„ ๋„๋ชจํ•œ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ๋ฒ”์šฉ ํ”„๋กœ์„ธ์„œ ์บ์‹œ ๋ผ์ธ๊ณผ ๋‹ค๋ฅธ ์œ ๋‹› ํฌ๊ธฐ๋Š” ์ฝ๊ธฐ-์ˆ˜์ •-์“ฐ๊ธฐ (RMW) ๋ชจ๋“ˆ์˜ ๋„์ž…์œผ๋กœ ์ธํ•ด ์‹œ์Šคํ…œ ์„ฑ๋Šฅ์„ ์ €ํ•˜ํ•˜๊ฒŒ ๋œ๋‹ค. PCM ๊ธฐ๋ฐ˜ ๋ฉ”๋ชจ๋ฆฌ ์‹œ์Šคํ…œ์—์„œ RMW ๊ด€๋ จ ์—ฐ๊ตฌ๊ฐ€ ์—†์—ˆ๊ธฐ ๋•Œ๋ฌธ์— ๋ณธ ๋…ผ๋ฌธ์€ RMW ๋ชจ๋“ˆ์„ ํƒ‘์žฌ ํ•œ PCM ๊ธฐ๋ฐ˜ ๋ฉ”๋ชจ๋ฆฌ ์‹œ์Šคํ…œ์˜ ์ „๋ฐ˜์ ์ธ ์‹œ์Šคํ…œ ์„ฑ๋Šฅ๊ณผ ์‹ ๋ขฐ์„ฑ์„ ํ–ฅ์ƒํ•˜๊ฒŒ ์‹œํ‚ฌ ์ˆ˜ ์žˆ๋Š” ์ƒˆ๋กœ์šด ์•„ํ‚คํ…์ฒ˜๋ฅผ ์ œ์•ˆํ•œ๋‹ค. ์ œ์•ˆ๋œ ์•„ํ‚คํ…์ฒ˜๋Š” ์ถ”๊ฐ€ ์Šคํ† ๋ฆฌ์ง€ ๋ฆฌ์†Œ์Šค๋ฅผ ๋„์ž…ํ•˜์ง€ ์•Š๊ณ ๋„ ๋ฐ์ดํ„ฐ ์žฌ์‚ฌ์šฉ์„ฑ์„ ํ–ฅ์ƒ์‹œํ‚จ๋‹ค. ๋˜ํ•œ, ์„ฑ๋Šฅ ํ–ฅ์ƒ์„ ์œ„ํ•ด ๋ช…๋ น ์œ ํ˜•๊ณผ ๊ด€๊ณ„์—†์ด ๋ช…๋ น์„ ๋ณ‘ํ•ฉํ•˜๋Š” ์ƒˆ๋กœ์šด ์ž‘์—…์„ ์ œ์•ˆํ•œ๋‹ค. ๋˜ ๋‹ค๋ฅธ ๋ฌธ์ œ๋Š” PCM์„ ์œ„ํ•œ ์™„์ „ํ•œ ์‹œ๋ฎฌ๋ ˆ์ด์…˜ ํ”Œ๋žซํผ์ด ๋ถ€์žฌํ•˜๋‹ค๋Š” ๊ฒƒ์ด๋‹ค. PCM ๊ด€๋ จ ์ œํ’ˆ(์˜ˆ : Intel Optane)์— ๋Œ€ํ•ด ๋ฐœํ‘œ๋œ ์ •๋ณด๋Š” ๋Œ€์™ธ๋น„ ๋ฌธ์ œ๋กœ ์ธํ•ด ๋ถ€์กฑํ•˜๋‹ค. ํ•˜์ง€๋งŒ ์•Œ๋ ค์ ธ ์žˆ๋Š” ์ •๋ณด๋ฅผ ์ ์ ˆํžˆ ์ทจํ•ฉํ•˜๋ฉด ์‹œ์ค‘ ์ œํ’ˆ๊ณผ ์œ ์‚ฌํ•œ ์•„ํ‚คํ…์ฒ˜ ์‹œ๋ฎฌ๋ ˆ์ดํ„ฐ๋ฅผ ๊ฐœ๋ฐœํ•  ์ˆ˜ ์žˆ๋‹ค. 
์ด๋ฅผ ์œ„ํ•ด ๋ณธ ๋…ผ๋ฌธ์€ PCM ๋ฉ”๋ชจ๋ฆฌ ์ปจํŠธ๋กค๋Ÿฌ์— ํ•„์š”ํ•œ ๋ชจ๋“  ๋ชจ๋“ˆ ์ •๋ณด๋ฅผ ํ™œ์šฉํ•˜์—ฌ ํ–ฅํ›„ ์ด์™€ ๊ด€๋ จ๋œ ์—ฐ๊ตฌ์—์„œ ์ถฉ๋ถ„ํžˆ ์‚ฌ์šฉ ๊ฐ€๋Šฅํ•œ ์ „์šฉ ์‹œ๋ฎฌ๋ ˆ์ดํ„ฐ๋ฅผ ๊ตฌํ˜„ํ•˜์˜€๋‹ค.1 INTRODUCTION 1 1.1 Limitation of Traditional Main Memory Systems 1 1.2 Phase-Change Memory as Main Memory 3 1.2.1 Opportunities of PCM-based System 3 1.2.2 Challenges of PCM-based System 4 1.3 Dissertation Overview 7 2 BACKGROUND AND PREVIOUS WORK 8 2.1 Phase-Change Memory 8 2.2 Mitigation Schemes for Write Disturbance Errors 10 2.2.1 Write Disturbance Errors 10 2.2.2 Verification and Correction 12 2.2.3 Lazy Correction 13 2.2.4 Data Encoding-based Schemes 14 2.2.5 Sparse-Insertion Write Cache 16 2.3 Performance Enhancement for Read-Modify-Write 17 2.3.1 Traditional Read-Modify-Write 17 2.3.2 Write Coalescing for RMW 19 2.4 Architecture Simulators for PCM 21 2.4.1 NVMain 21 2.4.2 Ramulator 22 2.4.3 DRAMsim3 22 3 IN-MODULE DISTURBANCE BARRIER 24 3.1 Motivation 25 3.2 IMDB: In Module-Disturbance Barrier 29 3.2.1 Architectural Overview 29 3.2.2 Implementation of Data Structures 30 3.2.3 Modification of Media Controller 36 3.3 Replacement Policy 38 3.3.1 Replacement Policy for IMDB 38 3.3.2 Approximate Lowest Number Estimator 40 3.4 Putting All Together: Case Studies 43 3.5 Evaluation 45 3.5.1 Configuration 45 3.5.2 Architectural Exploration 47 3.5.3 Effectiveness of the Replacement Policy 48 3.5.4 Sensitivity to Main Table Configuration 49 3.5.5 Sensitivity to Barrier Buffer Size 51 3.5.6 Sensitivity to AppLE Group Size 52 3.5.7 Comparison with Other Studies 54 3.6 Discussion 59 3.7 Summary 63 4 INTEGRATION OF AN RMW MODULE IN A PCM-BASED SYSTEM 64 4.1 Motivation 65 4.2 Utilization of DRAM Cache for RMW 67 4.2.1 Architectural Design 67 4.2.2 Algorithm 70 4.3 Typeless Command Merging 73 4.3.1 Architectural Design 73 4.3.2 Algorithm 74 4.4 An Alternative Implementation: SRC-RMW 78 4.4.1 Implementation of SRC-RMW 78 4.4.2 Design Constraint 80 4.5 Case Study 82 4.6 Evaluation 85 4.6.1 Configuration 85 4.6.2 Speedup 88 4.6.3 Read Reliability 91 4.6.4 Energy Consumption: Selecting a Proper Page Size 93 4.6.5 Comparison with Other Studies 95 4.7 Discussion 97 4.8 Summary 99 5 AN ALL-INCLUSIVE SIMULATOR FOR A PCM CONTROLLER 100 5.1 Motivation 101 5.2 PCMCsim: PCM Controller Simulator 103 5.2.1 Architectural Overview 103 5.2.2 Underlying Classes of PCMCsim 104 5.2.3 Implementation of Contention Behavior 108 5.2.4 Modules of PCMCsim 109 5.3 Evaluation 116 5.3.1 Correctness of the Simulator 116 5.3.2 Comparison with Other Simulators 117 5.4 Summary 119 6 Conclusion 120 Abstract (In Korean) 141 Acknowledgment 143๋ฐ•

    Sisรคkkรคiset virtuaaliympรคristรถt

    Virtual machines (VMs) have been a common computation platform in cloud computing for some time now. VMs offer a decent amount of isolation for security and system resources, and from the application perspective they behave much like native environments. Software containers are gaining popularity as a new application delivery technology. Just like VMs, applications started inside containers run in isolated environments, but without the performance overhead caused by virtualization of system resources. This makes containers seem like a more efficient option than VMs. In this thesis, different combinations of containers and VMs are benchmarked. For each benchmark, the host environment is also measured, to understand the overhead caused by the underlying virtual environment technology. The benchmarks used include storage and network access benchmarks, as well as an application benchmark of compiling the Linux kernel. In another part of the thesis, a CPU-intensive workload is run on the virtualization host server and the benchmarks are repeated, in order to determine how much the given workload affects the benchmark scores, and whether this effect can be observed from the virtualization guest side by measuring CPU steal time. The results show that containers are only slightly slower than the host in the application benchmark; the main difference is expected to come from the way Docker handles storage accesses. With the default network configuration, the container also loses to the host in performance. In every benchmark, VMs lost to both the host and the containers in performance.
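
    The guest-side observation mentioned above relies on the standard Linux accounting of stolen CPU time. A minimal sketch (Linux-only, assuming the aggregate "cpu" line of /proc/stat exposes the steal counter in its usual eighth field) samples it twice and reports the stolen share of the interval:

        import time

        def read_cpu_times():
            with open("/proc/stat") as f:
                fields = f.readline().split()        # aggregate "cpu" line:
            values = list(map(int, fields[1:]))      # user nice system idle iowait irq softirq steal ...
            return sum(values), values[7]            # (total ticks, steal ticks)

        def steal_ratio(interval_s=1.0):
            total0, steal0 = read_cpu_times()
            time.sleep(interval_s)
            total1, steal1 = read_cpu_times()
            dt = total1 - total0
            return (steal1 - steal0) / dt if dt else 0.0

        if __name__ == "__main__":
            print(f"steal share over 1 s: {steal_ratio():.2%}")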