
    Dynamic Binary Translation for Embedded Systems with Scratchpad Memory

    Embedded software development has changed with recent advances in computing. Rather than fully co-designing software and hardware to perform a relatively simple task, embedded and mobile devices are now designed as platforms on which multiple applications can run, new applications can be added, and existing applications can be updated. In this scenario, the traditional constraints of embedded systems design (i.e., performance, memory, energy consumption, and real-time guarantees) are more difficult to address, and new concerns (e.g., security) have become important and further increase software complexity. In general-purpose systems, Dynamic Binary Translation (DBT) has been used to address these issues with services such as Just-In-Time (JIT) compilation, dynamic optimization, virtualization, power management, and code security. In embedded systems, however, DBT is not usually employed because of its performance, memory, and power overhead. This dissertation presents StrataX, a low-overhead DBT framework for embedded systems that addresses these challenges with novel techniques. To reduce DBT overhead, StrataX loads code from NAND-Flash storage and translates it into a Scratchpad Memory (SPM), a software-managed on-chip SRAM with limited capacity. SPM has access latency similar to a hardware cache but consumes less power and chip area. StrataX manages the SPM as a software instruction cache, and employs victim compression and pinning to reduce retranslation cost and to capture frequently executed code in the SPM. To prevent performance loss due to excessive code expansion, StrataX minimizes the amount of code inserted by DBT to maintain control of program execution. When a hardware instruction cache is available, StrataX dynamically partitions translated code between the SPM and main memory. With these techniques, StrataX achieves low performance overhead relative to native execution for MiBench programs.
    Further, it simplifies embedded software and hardware design by operating transparently to applications, without any special hardware support. StrataX achieves sufficiently low overhead to make DBT feasible in embedded systems for addressing important design goals and requirements.
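The SPM management scheme described above (a software instruction cache with victim compression and pinning) can be illustrated with a toy model. The class below is a hypothetical simplification, not StrataX's actual implementation; all names and the LRU/zlib choices are assumptions for illustration:

```python
import zlib
from collections import OrderedDict

class SpmCodeCache:
    """Toy model of a scratchpad managed as a software instruction cache:
    evicted blocks are kept compressed (victim compression) so they can be
    decompressed instead of retranslated, and hot blocks can be pinned."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.spm = OrderedDict()   # address -> translated code, in LRU order
        self.victims = {}          # address -> zlib-compressed evicted code
        self.pinned = set()        # blocks exempt from eviction

    def pin(self, addr):
        self.pinned.add(addr)

    def fetch(self, addr, translate):
        if addr in self.spm:                      # SPM hit: no work needed
            self.spm.move_to_end(addr)
            return self.spm[addr]
        if addr in self.victims:                  # victim hit: decompress,
            code = zlib.decompress(self.victims.pop(addr))  # skip retranslation
        else:
            code = translate(addr)                # miss: full (expensive) translation
        self._insert(addr, code)
        return code

    def _insert(self, addr, code):
        while len(self.spm) >= self.capacity:
            for old in self.spm:                  # evict oldest unpinned block
                if old not in self.pinned:
                    self.victims[old] = zlib.compress(self.spm.pop(old))
                    break
            else:
                break                             # everything pinned: overcommit
        self.spm[addr] = code
```

A block evicted from the tiny SPM is recovered from the compressed victim buffer at decompression cost rather than retranslation cost, which is the point of the technique.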

    Linux kernel compaction through cold code swapping

    There is a growing trend to use general-purpose operating systems such as Linux in embedded systems. Previous research has focused on using compaction and specialization techniques to adapt a general-purpose OS to the memory-constrained environment presented by most embedded systems. However, there is still room for improvement: it has been shown that, even after applying the aforementioned techniques, more than 50% of the kernel code remains unexecuted under normal system operation. We introduce a new technique that reduces the Linux kernel code memory footprint through on-demand loading of infrequently executed code, for systems that support virtual memory. In this paper, we describe our general approach and study code placement algorithms that minimize the performance impact of the code loading. A code size reduction of 68% is achieved, with a 2.2% speedup in system-mode execution time, for a case study based on the MediaBench II benchmark suite.
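The hot/cold split underlying on-demand code loading can be sketched minimally. The function, its inputs, and the threshold below are illustrative assumptions, not the paper's actual placement algorithm:

```python
def partition_kernel_code(functions, hot_threshold):
    """Toy hot/cold code partitioning: frequently executed functions stay
    resident, cold ones are left on backing store and loaded on demand.
    `functions` maps name -> (size_bytes, exec_count); names are hypothetical."""
    resident, swappable = {}, {}
    for name, (size, count) in functions.items():
        (resident if count >= hot_threshold else swappable)[name] = size
    total = sum(size for size, _ in functions.values())
    # Fraction of kernel code no longer occupying memory under normal operation.
    reduction = sum(swappable.values()) / total if total else 0.0
    return resident, swappable, reduction
```

With profile data showing most kernel code unexecuted, the resident set shrinks to the hot fraction, which is the footprint-reduction effect the abstract reports.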

    RapidSwap: An Efficient Hierarchical Far Memory

    ํ•™์œ„๋…ผ๋ฌธ(์„์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€, 2021.8. Bernhard Egger.As computation responsibilities are transferred and migrated to cloud computing environments, cloud operators are facing more challenges to accommodate workloads provided by their customers. Modern applications typically require a massive amount of main memory. DRAM allows the robust delivery of data to processing entities in conventional node-centric architectures. However, physically expanding DRAM is impracticable due to hardware limits and cost. In this thesis, we present RapidSwap, an efficient hierarchical far memory that exploits phase-change memory (persistent memory) in data centers to present near-DRAM performance at a significantly lower total cost of ownership (TCO). RapidSwap migrates cold memory contents to slower and cheaper storage devices by exhibiting the memory access frequency of applications. Evaluated with several different real-world cloud benchmark scenarios, RapidSwap achieves a reduction of 20% in operating cost at minimal performance degradation and is 30% more cost-effective than pure DRAM solutions. RapidSwap exemplifies that sophisticated utilization of novel storage technologies can present significant TCO savings in cloud data centers.์ปดํ“จํŒ… ํ™˜๊ฒฝ์ด ํด๋ผ์šฐ๋“œ ํ™˜๊ฒฝ์„ ์ค‘์‹ฌ์œผ๋กœ ๋ณ€ํ™”ํ•˜๊ณ  ์žˆ์–ด ํด๋ผ์šฐ๋“œ ์ œ๊ณต์ž๋Š” ๊ณ ๊ฐ์ด ์ œ๊ณตํ•˜๋Š” ์›Œํฌ๋กœ๋“œ๋ฅผ ์ˆ˜์šฉํ•˜๊ธฐ ์œ„ํ•œ ๋‹ค์–‘ํ•œ ๋ฌธ์ œ์— ์ง๋ฉดํ•˜๊ณ  ์žˆ๋‹ค. ์˜ค๋Š˜๋‚  ์‘์šฉ ํ”„๋กœ๊ทธ๋žจ์€ ์ผ๋ฐ˜์ ์œผ๋กœ ๋งŽ์€ ์–‘์˜ ๋ฉ”์ธ ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ์š”๊ตฌํ•œ๋‹ค. ๊ธฐ์กด ๋…ธ๋“œ ์ค‘์‹ฌ ์•„ํ‚คํ…์ฒ˜์—์„œ DRAM์„ ์‚ฌ์šฉํ•˜๋ฉด ๋น ๋ฅด๊ฒŒ ๋ฐ์ดํ„ฐ๋ฅผ ์ œ๊ณตํ•  ์ˆ˜ ์žˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜, ๋ฌผ๋ฆฌ์ ์œผ๋กœ DRAM์„ ์ผ์ • ์ˆ˜์ค€ ์ด์ƒ ํ™•์žฅํ•˜๋Š” ๊ฒƒ์€ ํ•˜๋“œ์›จ์–ด ์ œํ•œ๊ณผ ๋น„์šฉ์œผ๋กœ ์ธํ•ด ํ˜„์‹ค์ ์œผ๋กœ ๋ถˆ๊ฐ€๋Šฅํ•˜๋‹ค. 
๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” DRAM์— ๊ฐ€๊นŒ์šด ์„ฑ๋Šฅ์„ ์ œ๊ณตํ•˜๋ฉด์„œ๋„ ์ด ์†Œ์œ  ๋น„์šฉ์„ ์ƒ๋‹นํžˆ ๋‚ฎ์ถ”๋Š” ํšจ์œจ์  far memory์ธ RapidSwap์„ ์ œ์‹œํ•˜์˜€๋‹ค. RapidSwap์€ ๋ฐ์ดํ„ฐ์„ผํ„ฐ ํ™˜๊ฒฝ์—์„œ ์ƒ๋ณ€ํ™” ๋ฉ”๋ชจ๋ฆฌ (phase-change memory; persistent memory)๋ฅผ ํ™œ์šฉํ•˜๋ฉฐ ์–ดํ”Œ๋ฆฌ์ผ€์ด์…˜์˜ ๋ฉ”๋ชจ๋ฆฌ ์ ‘๊ทผ ๋นˆ๋„๋ฅผ ์ถ”์ ํ•˜์—ฌ ์ž์ฃผ ์ ‘๊ทผ๋˜์ง€ ์•Š๋Š” ๋ฉ”๋ชจ๋ฆฌ๋ฅผ ๋Š๋ฆฌ๊ณ  ์ €๋ ดํ•œ ์ €์žฅ์žฅ์น˜๋กœ ์ด์†กํ•˜์—ฌ ์ด๋ฅผ ๋‹ฌ์„ฑํ•œ๋‹ค. ์—ฌ๋Ÿฌ ์ €๋ช…ํ•œ ํด๋ผ์šฐ๋“œ ๋ฒค์น˜๋งˆํฌ ์‹œ๋‚˜๋ฆฌ์˜ค๋กœ ํ‰๊ฐ€ํ•œ ๊ฒฐ๊ณผ, RapidSwap์€ ์ˆœ์ˆ˜ DRAM ๋Œ€๋น„ ์•ฝ 20%์˜ ์šด์˜ ๋น„์šฉ์„ ์ ˆ๊ฐํ•˜๋ฉฐ ์•ฝ 30%์˜ ๋น„์šฉ ํšจ์œจ์„ฑ์„ ์ง€๋‹Œ๋‹ค. RapidSwap์€ ์ƒˆ๋กœ์šด ์Šคํ† ๋ฆฌ์ง€ ๊ธฐ์ˆ ์„ ์ •๊ตํ•˜๊ฒŒ ํ™œ์šฉํ•˜๋ฉด ํด๋ผ์šฐ๋“œ ๋ฐ์ดํ„ฐ ์„ผํ„ฐ ํ™˜๊ฒฝ์—์„œ ์šด์˜๋น„์šฉ์„ ์ƒ๋‹นํžˆ ์ €๊ฐํ•  ์ˆ˜ ์žˆ๋‹ค๋Š” ์‚ฌ์‹ค์„ ๋ณด์ธ๋‹ค.Chapter 1 Introduction 1 Chapter 2 Background 4 2.1 Tiered Storage 4 2.2 Trends in Storage Devices 5 2.3 Techniques Proposed to Lower Memory Pressure 5 2.3.1 Transparent Memory Compression 5 2.3.2 Far Memory 6 Chapter 3 Motivation 9 3.1 Limitations of Existing Techniques 9 3.2 Tiered Storage as a Promising Alternative 10 Chapter 4 RapidSwap Design and Implementation 12 4.1 RapidSwap Design 12 4.1.1 Storage Frontend 12 4.1.2 Storage Backend 15 4.2 RapidSwap Implementation 17 4.2.1 Swap Handler 17 4.2.2 Storage Frontend 18 4.2.3 Storage Backend 20 Chapter 5 Results 21 5.1 Experimental Setup 21 5.2 RapidSwap Performance 23 5.2.1 Degradation over DRAM 23 5.2.2 Tiered Storage Utilization 27 5.2.3 Hit/Miss Analysis 28 5.3 Cost of Storage Tier 29 5.4 Cost Effectiveness 30 Chapter 6 Conclusion and Future Work 32 6.1 Conclusion 32 6.2 Future Work 33 Bibliography 34 ์š”์•ฝ 39์„

    Reducing Response Time with Preheated Caches

    CPU performance is increasingly limited by thermal dissipation, and aggressive power management will soon be beneficial for performance. In particular, temporarily idle parts of the chip (including the caches) should be power-gated to reduce leakage power. Current CPUs already lose their cache state whenever the CPU is idle for extended periods, which causes a performance loss when execution resumes, due to the large number of cache misses incurred while the working set is fetched from external memory. In a server system, the first network request after such an idle period suffers from increased response time. We present a technique that reduces this overhead by preheating the caches before the network request arrives at the server: our design predicts the working set of the server application by analyzing the cache contents after similar requests have been processed. As soon as an estimate of the working set is available, a predictable network architecture starts to announce future incoming network packets to the server, which then loads the predicted working set into the cache. Our experiments show that, if this preheating step is complete when the network packet arrives, the response-time overhead is reduced by an average of 80%.
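The working-set prediction step can be sketched as follows. The predictor below is a hypothetical simplification (intersecting the cache lines touched by recent requests of the same type), not the paper's actual design:

```python
from collections import defaultdict, deque

class CachePreheater:
    """Toy working-set predictor: remember which cache lines each request
    type touched, and predict the lines common to all recent requests of
    that type as the set to preload when the next request is announced."""

    def __init__(self, history=3):
        # request type -> sliding window of per-request touched-line sets
        self.history = defaultdict(lambda: deque(maxlen=history))

    def record(self, request_type, touched_lines):
        self.history[request_type].append(frozenset(touched_lines))

    def predict(self, request_type):
        sets = self.history[request_type]
        if not sets:
            return set()            # no history yet: nothing to preheat
        working_set = set(sets[0])
        for s in sets:
            working_set &= s        # keep only lines seen in every recent request
        return working_set
```

The intersection is a conservative estimate: it preloads only lines that similar requests reliably reuse, so preheating time is not wasted on incidental accesses.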

    A Survey of Virtual Machine Migration Techniques in Cloud Computing

    Cloud computing is an emerging technology that maintains computational resources in large data centers, accessed through the Internet rather than hosted on local computers. Virtualization is the technology that gives cloud computing its power. The process of moving running applications or VMs from one physical machine to another is known as VM migration; it enables load balancing, system maintenance, and more. During migration, the processor state, storage, memory, and network connections are moved from one host to another. VM migration techniques can be divided into two categories: the pre-copy and the post-copy approach. The two performance metrics users care about most are downtime and total migration time, because they determine the degree of service degradation and the period during which the service is unavailable. This paper focuses on the analysis of live VM migration techniques in cloud computing. Keywords: Cloud Computing, Virtualization, Virtual Machine, Live Virtual Machine Migration.
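The pre-copy approach and its two key metrics can be illustrated with a toy convergence model: each round retransmits the pages dirtied during the previous round, and the VM is paused only for the final round. All parameters and the termination threshold are assumptions for illustration, not values from the survey:

```python
def precopy_migration(total_pages, dirty_rate, bandwidth,
                      threshold=64, max_rounds=30):
    """Toy pre-copy live-migration model.
    dirty_rate and bandwidth are in pages/second. Returns
    (total_migration_time, downtime, pages_transferred)."""
    to_send = total_pages        # round 1 sends the whole memory image
    total_time = 0.0
    transferred = 0
    for _ in range(max_rounds):
        round_time = to_send / bandwidth
        total_time += round_time
        transferred += to_send
        # Pages dirtied while this round was being sent must be re-sent.
        dirtied = min(total_pages, int(dirty_rate * round_time))
        if dirtied <= threshold or dirtied >= to_send:
            to_send = dirtied    # small enough (or not converging): stop and copy
            break
        to_send = dirtied
    downtime = to_send / bandwidth   # VM paused for the final transfer
    total_time += downtime
    transferred += to_send
    return total_time, downtime, transferred
```

The model shows why the two metrics conflict with dirty rate: a high dirty rate inflates both the iteration count (total migration time) and the final stop-and-copy set (downtime).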

    Proceedings of the NSSDC Conference on Mass Storage Systems and Technologies for Space and Earth Science Applications

    The proceedings of the National Space Science Data Center Conference on Mass Storage Systems and Technologies for Space and Earth Science Applications, held July 23-25, 1991 at the NASA/Goddard Space Flight Center, are presented. The program includes a keynote address, invited technical papers, and selected technical presentations, providing a broad forum for the discussion of a number of important issues in the field of mass storage systems. Topics include magnetic disk and tape technologies, optical disk and tape, software storage and file management systems, and experiences with the use of a large, distributed storage system. The technical presentations describe integrated mass storage systems that are expected to be available commercially. Also included is a series of presentations from Federal Government organizations and research institutions covering their mass storage requirements for the 1990s.

    An Analysis of Hypervisor and Virtual Machine Memory Management

    The goal of this thesis is to test memory optimization and reclamation tools in the most widely used hypervisors: VMware ESXi, Microsoft Hyper-V, KVM, and Xen. The aim is to measure how much memory can be reclaimed and optimized by the different memory management algorithms of the hypervisors mentioned above. The dedicated monitoring tools Zabbix and collectd gather the data to be analyzed. As a result, Hyper-V proves the most effective, with ESXi second and KVM falling somewhat behind in third place. Xen failed to meet a specific criterion (automated memory optimization), which made it impractical to include in the testing process.

    Resource-Efficient Replication and Migration of Virtual Machines.

    Continuous replication and live migration of Virtual Machines (VMs) are two vital tools in a virtualized environment, but they are resource-expensive. Continuously replicating a VM's checkpointed state to a backup host maintains high availability (HA) of the VM despite host failures, but checkpoint replication can generate significant network traffic. Each replicated VM also incurs a 100% memory overhead, since the backup unproductively reserves the same amount of memory to hold the redundant VM state. Live migration, though widely used for load balancing, power saving, etc., can also generate excessive network traffic by transferring VM state iteratively; in addition, it can incur a long completion time and degrade application performance. This thesis explores ways to replicate VMs for HA and to migrate VMs quickly, with minimal execution disruption, while using resources efficiently. First, we investigate the tradeoffs in using different compression methods to reduce the network traffic of checkpoint replication in an HA system. We evaluate gzip, delta, and similarity compression based on metrics that are specifically important in an HA system, and then suggest guidelines for their selection. Next, we propose HydraVM, a storage-based HA approach that eliminates the unproductive memory reservation made on backup hosts. HydraVM maintains a recent image of a protected VM in shared storage by taking and consolidating incremental VM checkpoints. When a failure occurs, HydraVM quickly resumes the execution of the failed VM by loading a small amount of essential VM state from storage; as the VM executes, the state not yet loaded is supplied on demand. Finally, we propose application-assisted live migration, which skips the transfer of VM memory that need not be migrated to execute the running applications at the destination.
    We develop a generic framework for the proposed approach, and then use it to build JAVMM, a system that migrates VMs running Java applications while skipping the transfer of garbage in Java memory. Our evaluation results show that, compared to Xen live migration, which is agnostic of running applications, JAVMM can reduce the completion time, network traffic, and application downtime caused by Java VM migration, all by up to over 90%.
    PhD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/111575/1/karenhou_1.pd
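The delta-compression idea evaluated for checkpoint replication can be sketched as follows. This toy XOR-plus-zlib scheme (assuming a fixed set of equal-sized pages) is illustrative only, not the thesis implementation:

```python
import zlib

def replicate_checkpoint(prev_pages, curr_pages):
    """Toy incremental checkpoint replication: only dirty pages are sent,
    each as a zlib-compressed XOR delta against the backup's last copy.
    Deltas of mostly-unchanged pages are near-zero and compress very well."""
    payload = {}
    for i, (old, new) in enumerate(zip(prev_pages, curr_pages)):
        if new != old:
            delta = bytes(a ^ b for a, b in zip(new, old))
            payload[i] = zlib.compress(delta)
    return payload

def apply_checkpoint(prev_pages, payload):
    """Backup side: reconstruct the current checkpoint from the previous
    image plus the compressed deltas."""
    pages = list(prev_pages)
    for i, blob in payload.items():
        delta = zlib.decompress(blob)
        pages[i] = bytes(a ^ b for a, b in zip(delta, pages[i]))
    return pages
```

Sending compressed deltas of only the dirty pages is one way to cut the replication traffic that the thesis identifies as a major HA cost.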