Dynamic Binary Translation for Embedded Systems with Scratchpad Memory
Embedded software development has recently changed with advances in computing. Rather than fully co-designing software and hardware to perform a relatively simple task, nowadays embedded and mobile devices are designed as platforms where multiple applications can be run, new applications can be added, and existing applications can be updated. In this scenario, traditional constraints in embedded systems design (e.g., performance, memory and energy consumption, and real-time guarantees) are more difficult to address. New concerns (e.g., security) have become important and increase software complexity as well.
In general-purpose systems, Dynamic Binary Translation (DBT) has been used to address these issues with services such as Just-In-Time (JIT) compilation, dynamic optimization, virtualization, power management and code security. In embedded systems, however, DBT is not usually employed due to performance, memory and power overhead.
This dissertation presents StrataX, a low-overhead DBT framework for embedded systems. StrataX addresses the challenges faced by DBT in embedded systems using novel techniques. To reduce DBT overhead, StrataX loads code from NAND-flash storage and translates it into a Scratchpad Memory (SPM), a software-managed on-chip SRAM with limited capacity. SPM has access latency similar to a hardware cache, but consumes less power and chip area.
StrataX manages SPM as a software instruction cache, and employs victim compression and pinning to reduce retranslation cost and capture frequently executed code in the SPM. To prevent performance loss due to excessive code expansion, StrataX minimizes the amount of code inserted by DBT to maintain control of program execution. When a hardware instruction cache is available, StrataX dynamically partitions translated code between the SPM and main memory. With these techniques, StrataX has low performance overhead relative to native execution for MiBench programs. Further, it simplifies embedded software and hardware design by operating transparently to applications without any special hardware support. StrataX achieves sufficiently low overhead to make it feasible to use DBT in embedded systems to address important design goals and requirements.
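The SPM-as-software-instruction-cache idea with pinning and victim compression can be sketched as a toy model (names and sizes are invented, zlib stands in for whatever compressor a real system would use, and this is not the actual StrataX implementation):

```python
import zlib
from collections import OrderedDict

class SpmCodeCache:
    """Toy software instruction cache over a fixed-size SPM.

    Evicted translations are compressed into a main-memory victim
    buffer so they can be reinstalled without a full retranslation;
    pinned fragments are never evicted. Illustrative only.
    """

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.spm = OrderedDict()   # addr -> translated code, in LRU order
        self.pinned = set()
        self.victims = {}          # addr -> compressed evicted translation
        self.retranslations = 0

    def translate(self, addr):
        # Stand-in for the real binary translator.
        self.retranslations += 1
        return b"translated:%d" % addr * 4

    def fetch(self, addr):
        if addr in self.spm:
            self.spm.move_to_end(addr)         # SPM hit
            return self.spm[addr]
        if addr in self.victims:               # decompress, no retranslation
            code = zlib.decompress(self.victims.pop(addr))
        else:
            code = self.translate(addr)
        self._install(addr, code)
        return code

    def _install(self, addr, code):
        # Evict LRU, non-pinned fragments until the new code fits.
        while self.used + len(code) > self.capacity:
            victim, vcode = next(
                (a, c) for a, c in self.spm.items() if a not in self.pinned)
            del self.spm[victim]
            self.used -= len(vcode)
            self.victims[victim] = zlib.compress(vcode)
        self.spm[addr] = code
        self.used += len(code)

    def pin(self, addr):
        self.pinned.add(addr)
```

Refetching an evicted fragment then costs only a decompression, which models how victim compression reduces retranslation cost.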
Executing Hard Real-Time Programs on NAND Flash Memory Considering Read Disturb Errors
Thesis (Master's) -- Department of Computer Science and Engineering, College of Engineering, Seoul National University Graduate School, 2017. 8. Advisor: Chang-Gun Lee.
With the recent surge of interest in IoT and embedded systems, the number of devices that use NAND flash memory is also growing. These devices gain substantial benefits from using NAND flash memory, but unresolved reliability issues remain. This thesis discusses ways to overcome these reliability problems. Because NAND flash memory can physically sustain only a limited number of repeated read operations per page, each page must be reallocated before its read count reaches that limit. This thesis proposes a technique for real-time embedded systems that execute code by reading the read-only pages in which program code is stored: it reduces read disturb errors by reallocating pages while still satisfying real-time constraints. By implementing and evaluating the proposed technique, we show that reallocation is guaranteed to occur before the read limit of the NAND flash memory is reached. We also confirm that the proposed technique reduces the required RAM size by up to 48%.
1 Introduction
2 Related works
3 Background and problem description
3.1 NAND flash memory
3.2 HRT-PLRU
3.3 Reliability issues of the NAND flash memory
3.4 Problem description
3.5 System notation
4 Approach
4.1 Per-task analysis
4.2 Convex optimization
5 Evaluation
6 Future works
7 Conclusion
Summary (Korean)
References
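The read-count-driven reallocation the abstract describes can be illustrated with a toy model (the constants, names, and guard margin are invented, and a real implementation must also schedule the page copy so that real-time deadlines still hold):

```python
READ_DISTURB_LIMIT = 100_000   # illustrative per-page read budget
GUARD = 1_000                  # reallocate this many reads before the limit

class FlashCodeStore:
    """Toy model of read-only code pages on NAND flash.

    Each read increments a per-physical-page counter; when a page
    approaches the read-disturb limit, its contents are copied to a
    fresh physical page and the counter starts over.
    """

    def __init__(self, code_pages):
        self.phys = list(code_pages)                         # physical pages
        self.vmap = {i: i for i in range(len(code_pages))}   # virtual -> physical
        self.reads = {i: 0 for i in range(len(code_pages))}  # per physical page
        self.reallocations = 0

    def read(self, vpage):
        p = self.vmap[vpage]
        self.reads[p] += 1
        if self.reads[p] >= READ_DISTURB_LIMIT - GUARD:
            self._reallocate(vpage)
        return self.phys[self.vmap[vpage]]

    def _reallocate(self, vpage):
        # Copy the page to a fresh physical location before the read
        # count can ever reach the disturb limit.
        old = self.vmap[vpage]
        self.phys.append(self.phys[old])
        new = len(self.phys) - 1
        self.reads[new] = 0
        self.vmap[vpage] = new
        self.reallocations += 1
```

The guard margin is what gives a scheduler slack to perform the copy off the critical path; choosing it, and bounding its interference with deadlines, is the hard part the thesis addresses.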
Elevating commodity storage with the SALSA host translation layer
To satisfy increasing storage demands in both capacity and performance,
industry has turned to multiple storage technologies, including Flash SSDs and
SMR disks. These devices employ a translation layer that conceals the
idiosyncrasies of their mediums and enables random access. Device translation
layers are, however, inherently constrained: resources on the drive are scarce,
they cannot be adapted to application requirements, and lack visibility across
multiple devices. As a result, the performance and durability of many storage
devices are severely degraded.
In this paper, we present SALSA: a translation layer that executes on the
host and allows unmodified applications to better utilize commodity storage.
SALSA supports a wide range of single- and multi-device optimizations and,
because it is implemented in software, can adapt to specific workloads. We
describe SALSA's design, and demonstrate its significant benefits using
microbenchmarks and case studies based on three applications: MySQL, the Swift
object store, and a video server.
Comment: Presented at the 2018 IEEE 26th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS).
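A minimal log-structured translation map illustrates the general idea of a host translation layer (this is a generic sketch, not SALSA's actual design): random logical writes become sequential appends on the medium, with the logical-to-physical map kept in host memory where it can be resized and tuned per workload.

```python
class HostTranslationLayer:
    """Minimal log-structured translation layer sketch.

    Writes are always appended (sequential I/O, friendly to flash
    and SMR media); a host-resident map redirects reads to the
    latest copy of each logical block.
    """

    def __init__(self):
        self.log = []          # append-only model of the physical medium
        self.l2p = {}          # logical block -> physical log index

    def write(self, lba, data):
        self.l2p[lba] = len(self.log)   # remap, then append sequentially
        self.log.append(data)

    def read(self, lba):
        return self.log[self.l2p[lba]]
```

Overwrites leave stale copies behind in the log, so a real host translation layer also needs garbage collection to reclaim them.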
Recommended from our members
Multi-level Hybrid Cache: Impact and Feasibility
Storage class memories, including flash, have been attracting much attention as promising candidates for today's enterprise storage systems. In particular, since the cost and performance characteristics of flash fall between those of DRAM and hard disks, many studies have considered it as a secondary caching layer underneath the main memory cache. However, there has been a lack of study of the correlation and interdependency between DRAM and flash caching. This paper views this problem as a special form of multi-level caching, and tries to understand the benefits of this multi-level hybrid cache hierarchy. We reveal that significant costs could be saved by using flash to reduce the size of the DRAM cache while maintaining the same performance. We also discuss design challenges of using flash in the caching hierarchy and present potential solutions.
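One simple form of the DRAM-plus-flash hierarchy the paper studies can be sketched as follows (the policies here are illustrative only: LRU in both tiers, with DRAM victims demoted to a larger flash tier instead of being discarded, and flash hits promoted back to DRAM):

```python
from collections import OrderedDict

class HybridCache:
    """Toy two-level cache: a small DRAM tier over a larger flash tier."""

    def __init__(self, dram_slots, flash_slots):
        self.dram = OrderedDict()
        self.flash = OrderedDict()
        self.dram_slots, self.flash_slots = dram_slots, flash_slots
        self.hits = {"dram": 0, "flash": 0, "miss": 0}

    def get(self, key, load):
        if key in self.dram:
            self.dram.move_to_end(key)
            self.hits["dram"] += 1
            return self.dram[key]
        if key in self.flash:
            self.hits["flash"] += 1          # served from flash, promote
            value = self.flash.pop(key)
        else:
            self.hits["miss"] += 1           # go to disk
            value = load(key)
        self._insert_dram(key, value)
        return value

    def _insert_dram(self, key, value):
        if len(self.dram) >= self.dram_slots:
            vk, vv = self.dram.popitem(last=False)   # demote the LRU victim
            if len(self.flash) >= self.flash_slots:
                self.flash.popitem(last=False)
            self.flash[vk] = vv
        self.dram[key] = value
```

Counting hits per tier is exactly the kind of measurement needed to quantify how much DRAM can be replaced by flash at equal performance.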
Synergistically Coupling Of Solid State Drives And Hard Disks For Qos-Aware Virtual Memory
With significant advantages in capacity, power consumption, and price, solid state disk (SSD) has good potential to be employed as an extension of dynamic random-access memory, such that applications with large working sets could run efficiently on a modestly configured system. While initial results reported in recent works show promising prospects for this use of SSD by incorporating it into the management of virtual memory, frequent writes from write-intensive programs could quickly wear out SSD, making the idea less practical.
This thesis makes four contributions towards solving this issue. First, we propose a scheme, HybridSwap, that integrates a hard disk with an SSD for virtual memory management, synergistically achieving the advantages of both. In addition, HybridSwap can constrain the performance loss caused by swapping according to user-specified QoS requirements.
Second, we develop an efficient algorithm to record memory access history, identify page access sequences, and evaluate their locality. Using a history of page access patterns, HybridSwap dynamically creates an out-of-memory virtual memory page layout on the swap space spanning the SSD and hard disk, such that random reads are served by the SSD and sequential reads are asynchronously served by the hard disk with high efficiency.
Third, we build a QoS-assurance mechanism into HybridSwap to demonstrate the flexibility of the system in bounding the performance penalty due to swapping. It allows users to specify a bound on the program stall time due to page faults as a percentage of the program's total run time.
Fourth, we have implemented HybridSwap in a recent Linux kernel, version 2.6.35.7. Our evaluation with representative benchmarks, such as Memcached for key-value store, and scientific programs from the ALGLIB cross-platform numerical analysis and data processing library, shows that the number of writes to the SSD can be reduced by 40% with performance comparable to that of pure SSD swapping, and that a swapping-related QoS requirement can be satisfied as long as the I/O resource is sufficient.
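The random-versus-sequential layout decision described above can be sketched as a simple heuristic (the `min_run` threshold is an assumed tuning knob and the function expects a non-empty fault trace; HybridSwap's actual locality analysis is more sophisticated):

```python
def classify_swap_layout(fault_addrs, min_run=4):
    """Partition faulted page numbers between SSD and HDD swap areas.

    Pages faulted in long sequential runs go to the hard disk (fast,
    prefetchable sequential reads); the rest go to the SSD (fast
    random reads). Toy heuristic, for illustration only.
    """
    ssd, hdd = set(), set()
    run = [fault_addrs[0]]
    for addr in fault_addrs[1:] + [None]:   # None flushes the last run
        if addr is not None and addr == run[-1] + 1:
            run.append(addr)
        else:
            (hdd if len(run) >= min_run else ssd).update(run)
            run = [addr]
    return ssd, hdd
```

Routing runs this way is also what lets writes of sequential regions land on the hard disk, reducing SSD wear.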
Optimizing the flash-RAM energy trade-off in deeply embedded systems
Deeply embedded systems often have the tightest constraints on energy
consumption, requiring that they consume tiny amounts of current and run on
batteries for years. However, they typically execute code directly from flash,
instead of from the more energy-efficient RAM. We implement a novel compiler
optimization that exploits the relative efficiency of RAM by statically moving
carefully selected basic blocks from flash to RAM. Our technique uses integer
linear programming, with an energy cost model to select a good set of basic
blocks to place into RAM, without impacting stack or data storage.
We evaluate our optimization on a common ARM microcontroller and succeed in
reducing the average power consumption by up to 41% and reducing energy
consumption by up to 22%, while increasing execution time. A case study is
presented, where an application executes code then sleeps for a period of time.
For this example we show that our optimization could allow the application to
run on battery for up to 32% longer. We also show that for this scenario the
total application energy can be reduced, even if the optimization increases the
execution time of the code.
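A much-simplified stand-in for the block-placement optimization can be written as a 0/1 knapsack solved by dynamic programming (the paper's integer linear program with its energy cost model is richer, and the sizes and savings below are made up):

```python
def place_blocks(blocks, ram_budget):
    """Choose basic blocks to copy from flash to RAM.

    Maximizes modeled energy saving subject to a RAM byte budget.
    blocks: list of (name, size_bytes, energy_saving) tuples.
    Returns (total_saving, frozenset_of_chosen_names).
    """
    best = {0: (0, frozenset())}   # RAM bytes used -> (saving, chosen blocks)
    for name, size, saving in blocks:
        # Snapshot items so each block is considered at most once.
        for used, (sav, chosen) in list(best.items()):
            nu = used + size
            if nu <= ram_budget and (nu not in best
                                     or best[nu][0] < sav + saving):
                best[nu] = (sav + saving, chosen | {name})
    return max(best.values(), key=lambda t: t[0])
```

A real version would derive each block's `energy_saving` from execution counts and per-instruction flash-versus-RAM energy costs, and account for interactions between adjacent blocks that plain knapsack ignores.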
VIRTUAL MEMORY ON A MANY-CORE NOC
Many-core devices are likely to become increasingly common in real-time and embedded systems as computational demands grow and as expectations for higher performance can generally only be met by increasing core numbers rather than relying on higher clock speeds.
Network-on-chip devices, where multiple cores share a single slice of silicon and employ packetised communications, are a widely-deployed many-core option for system designers. As NoCs are expected to run larger and more complex programs, the small amount of fast, on-chip memory available to each core is unlikely to be sufficient for all but the simplest of tasks, and it is necessary to find an efficient, effective, and time-bounded, means of accessing resources stored in off-chip memory, such as DRAM or Flash storage.
The abstraction of paged virtual memory is a familiar technique to manage similar tasks in general computing but has often been shunned by real-time developers because of concern about time predictability. We show it can be a poor choice for a many-core NoC system as, unmodified, it typically uses page sizes optimised for interaction with spinning disks and not solid state media, and transports significant volumes of subsequently unused data across already congested links.
In this work we outline and simulate an efficient partial paging algorithm where only those memory resources that are locally accessed are transported between global and local storage. We further show that smaller page sizes add to efficiency. We examine the factors that lead to timing delays in such systems, and show we can predict worst-case execution times at even safety-critical thresholds by using statistical methods from extreme value theory. We also show these results are applicable to systems with a variety of connections to memory.
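The partial-paging idea can be sketched as follows (the block size and bookkeeping are illustrative; the simulated systems in the work are far more detailed): on a local-memory miss, only the sub-block of the page that was actually touched crosses the network, not the whole page.

```python
PAGE_SIZE = 4096
BLOCK_SIZE = 512   # illustrative sub-page transfer unit

class PartialPager:
    """Toy model of partial paging on a NoC core's local memory.

    Tracks which (page, sub-block) pairs are resident locally and
    counts bytes moved over the interconnect, so the saving over
    whole-page transfers can be compared.
    """

    def __init__(self):
        self.local = set()        # (page, block) pairs resident locally
        self.bytes_moved = 0

    def access(self, addr):
        blk = (addr // PAGE_SIZE, (addr % PAGE_SIZE) // BLOCK_SIZE)
        if blk not in self.local:
            self.local.add(blk)               # fetch only this sub-block
            self.bytes_moved += BLOCK_SIZE
```

For an access pattern that touches a few scattered words, this moves a small multiple of BLOCK_SIZE instead of whole 4 KiB pages, which is exactly the traffic reduction the work measures on congested NoC links.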