62 research outputs found

    Dynamic Binary Translation for Embedded Systems with Scratchpad Memory

    Get PDF
    Embedded software development has recently changed with advances in computing. Rather than fully co-designing software and hardware to perform a relatively simple task, nowadays embedded and mobile devices are designed as a platform where multiple applications can be run, new applications can be added, and existing applications can be updated. In this scenario, traditional constraints in embedded systems design (i.e., performance, memory and energy consumption and real-time guarantees) are more difficult to address. New concerns (e.g., security) have become important and increase software complexity as well. In general-purpose systems, Dynamic Binary Translation (DBT) has been used to address these issues with services such as Just-In-Time (JIT) compilation, dynamic optimization, virtualization, power management and code security. In embedded systems, however, DBT is not usually employed due to performance, memory and power overhead. This dissertation presents StrataX, a low-overhead DBT framework for embedded systems. StrataX addresses the challenges faced by DBT in embedded systems using novel techniques. To reduce DBT overhead, StrataX loads code from NAND-Flash storage and translates it into a Scratchpad Memory (SPM), a software-managed on-chip SRAM with limited capacity. SPM has similar access latency as a hardware cache, but consumes less power and chip area. StrataX manages SPM as a software instruction cache, and employs victim compression and pinning to reduce retranslation cost and capture frequently executed code in the SPM. To prevent performance loss due to excessive code expansion, StrataX minimizes the amount of code inserted by DBT to maintain control of program execution. When a hardware instruction cache is available, StrataX dynamically partitions translated code among the SPM and main memory. With these techniques, StrataX has low performance overhead relative to native execution for MiBench programs. Further, it simplifies embedded software and hardware design by operating transparently to applications without any special hardware support. StrataX achieves sufficiently low overhead to make it feasible to use DBT in embedded systems to address important design goals and requirements

    ์ƒ๋ณ€ํ™” ๋ฉ”๋ชจ๋ฆฌ ์‹œ์Šคํ…œ์˜ ๊ฐ„์„ญ ์˜ค๋ฅ˜ ์™„ํ™” ๋ฐ RMW ์„ฑ๋Šฅ ํ–ฅ์ƒ ๊ธฐ๋ฒ•

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ(๋ฐ•์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ „๊ธฐยท์ •๋ณด๊ณตํ•™๋ถ€, 2021.8. ์ดํ˜์žฌ.Phase-change memory (PCM) announces the beginning of the new era of memory systems, owing to attractive characteristics. Many memory product manufacturers (e.g., Intel, SK Hynix, and Samsung) are developing related products. PCM can be applied to various circumstances; it is not simply limited to an extra-scale database. For example, PCM has a low standby power due to its non-volatility; hence, computation-intensive applications or mobile applications (i.e., long memory idle time) are suitable to run on PCM-based computing systems. Despite these fascinating features of PCM, PCM is still far from the general commercial market due to low reliability and long latency problems. In particular, low reliability is a painful problem for PCM in past decades. As the semiconductor process technology rapidly scales down over the years, DRAM reaches 10 nm class process technology. In addition, it is reported that the write disturbance error (WDE) would be a serious issue for PCM if it scales down below 54 nm class process technology. Therefore, addressing the problem of WDEs becomes essential to make PCM competitive to DRAM. To overcome this problem, this dissertation proposes a novel approach that can restore meta-stable cells on demand by levering two-level SRAM-based tables, thereby significantly reducing the number WDEs. Furthermore, a novel randomized approach is proposed to implement a replacement policy that originally requires hundreds of read ports on SRAM. The second problem of PCM is a long-latency compared to that of DRAM. In particular, PCM tries to enhance its throughput by adopting a larger transaction unit; however, the different unit size from the general-purpose processor cache line further degrades the system performance due to the introduction of a read-modify-write (RMW) module. Since there has never been any research related to RMW in a PCM-based memory system, this dissertation proposes a novel architecture to enhance the overall system performance and reliability of a PCM-based memory system having an RMW module. The proposed architecture enhances data re-usability without introducing extra storage resources. Furthermore, a novel operation that merges commands regardless of command types is proposed to enhance performance notably. Another problem is the absence of a full simulation platform for PCM. While the announced features of the PCM-related product (i.e., Intel Optane) are scarce due to confidential issues, all priceless information can be integrated to develop an architecture simulator that resembles the available product. To this end, this dissertation tries to scrape up all available features of modules in a PCM controller and implement a dedicated simulator for future research purposes.์ƒ๋ณ€ํ™” ๋ฉ”๋ชจ๋ฆฌ๋Š”(PCM) ๋งค๋ ฅ์ ์ธ ํŠน์„ฑ์„ ํ†ตํ•ด ๋ฉ”๋ชจ๋ฆฌ ์‹œ์Šคํ…œ์˜ ์ƒˆ๋กœ์šด ์‹œ๋Œ€์˜ ์‹œ์ž‘์„ ์•Œ๋ ธ๋‹ค. ๋งŽ์€ ๋ฉ”๋ชจ๋ฆฌ ๊ด€๋ จ ์ œํ’ˆ ์ œ์กฐ์—…์ฒด(์˜ˆ : ์ธํ…”, SK ํ•˜์ด๋‹‰์Šค, ์‚ผ์„ฑ)๊ฐ€ ๊ด€๋ จ ์ œํ’ˆ ๊ฐœ๋ฐœ์— ๋ฐ•์ฐจ๋ฅผ ๊ฐ€ํ•˜๊ณ  ์žˆ๋‹ค. PCM์€ ๋‹จ์ˆœํžˆ ๋Œ€๊ทœ๋ชจ ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค์—๋งŒ ๊ตญํ•œ๋˜์ง€ ์•Š๊ณ  ๋‹ค์–‘ํ•œ ์ƒํ™ฉ์— ์ ์šฉ๋  ์ˆ˜ ์žˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด, PCM์€ ๋น„ํœ˜๋ฐœ์„ฑ์œผ๋กœ ์ธํ•ด ๋Œ€๊ธฐ ์ „๋ ฅ์ด ๋‚ฎ๋‹ค. ๋”ฐ๋ผ์„œ ๊ณ„์‚ฐ ์ง‘์•ฝ์ ์ธ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ๋˜๋Š” ๋ชจ๋ฐ”์ผ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์€(์ฆ‰, ๊ธด ๋ฉ”๋ชจ๋ฆฌ ์œ ํœด ์‹œ๊ฐ„) PCM ๊ธฐ๋ฐ˜ ์ปดํ“จํŒ… ์‹œ์Šคํ…œ์—์„œ ์‹คํ–‰ํ•˜๊ธฐ์— ์ ํ•ฉํ•˜๋‹ค. PCM์˜ ์ด๋Ÿฌํ•œ ๋งค๋ ฅ์ ์ธ ํŠน์„ฑ์—๋„ ๋ถˆ๊ตฌํ•˜๊ณ  PCM์€ ๋‚ฎ์€ ์‹ ๋ขฐ์„ฑ๊ณผ ๊ธด ๋Œ€๊ธฐ ์‹œ๊ฐ„์œผ๋กœ ์ธํ•ด ์—ฌ์ „ํžˆ ์ผ๋ฐ˜ ์‚ฐ์—… ์‹œ์žฅ์—์„œ๋Š” DRAM๊ณผ ๋‹ค์†Œ ๊ฒฉ์ฐจ๊ฐ€ ์žˆ๋‹ค. ํŠนํžˆ ๋‚ฎ์€ ์‹ ๋ขฐ์„ฑ์€ ์ง€๋‚œ ์ˆ˜์‹ญ ๋…„ ๋™์•ˆ PCM ๊ธฐ์ˆ ์˜ ๋ฐœ์ „์„ ์ €ํ•ดํ•˜๋Š” ๋ฌธ์ œ๋‹ค. ๋ฐ˜๋„์ฒด ๊ณต์ • ๊ธฐ์ˆ ์ด ์ˆ˜๋…„์— ๊ฑธ์ณ ๋น ๋ฅด๊ฒŒ ์ถ•์†Œ๋จ์— ๋”ฐ๋ผ DRAM์€ 10nm ๊ธ‰ ๊ณต์ • ๊ธฐ์ˆ ์— ๋„๋‹ฌํ•˜์˜€๋‹ค. ์ด์–ด์„œ, ์“ฐ๊ธฐ ๋ฐฉํ•ด ์˜ค๋ฅ˜ (WDE)๊ฐ€ 54nm ๋“ฑ๊ธ‰ ํ”„๋กœ์„ธ์Šค ๊ธฐ์ˆ  ์•„๋ž˜๋กœ ์ถ•์†Œ๋˜๋ฉด PCM์— ์‹ฌ๊ฐํ•œ ๋ฌธ์ œ๊ฐ€ ๋  ๊ฒƒ์œผ๋กœ ๋ณด๊ณ ๋˜์—ˆ๋‹ค. ๋”ฐ๋ผ์„œ, WDE ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๋Š” ๊ฒƒ์€ PCM์ด DRAM๊ณผ ๋™๋“ฑํ•œ ๊ฒฝ์Ÿ๋ ฅ์„ ๊ฐ–์ถ”๋„๋ก ํ•˜๋Š” ๋ฐ ์žˆ์–ด ํ•„์ˆ˜์ ์ด๋‹ค. ์ด ๋ฌธ์ œ๋ฅผ ๊ทน๋ณตํ•˜๊ธฐ ์œ„ํ•ด ์ด ๋…ผ๋ฌธ์—์„œ๋Š” 2-๋ ˆ๋ฒจ SRAM ๊ธฐ๋ฐ˜ ํ…Œ์ด๋ธ”์„ ํ™œ์šฉํ•˜์—ฌ WDE ์ˆ˜๋ฅผ ํฌ๊ฒŒ ์ค„์—ฌ ํ•„์š”์— ๋”ฐ๋ผ ์ค€ ์•ˆ์ • ์…€์„ ๋ณต์›ํ•  ์ˆ˜ ์žˆ๋Š” ์ƒˆ๋กœ์šด ์ ‘๊ทผ ๋ฐฉ์‹์„ ์ œ์•ˆํ•œ๋‹ค. ๋˜ํ•œ, ์›๋ž˜ SRAM์—์„œ ์ˆ˜๋ฐฑ ๊ฐœ์˜ ์ฝ๊ธฐ ํฌํŠธ๊ฐ€ ํ•„์š”ํ•œ ๋Œ€์ฒด ์ •์ฑ…์„ ๊ตฌํ˜„ํ•˜๊ธฐ ์œ„ํ•ด ์ƒˆ๋กœ์šด ๋žœ๋ค ๊ธฐ๋ฐ˜์˜ ๊ธฐ๋ฒ•์„ ์ œ์•ˆํ•œ๋‹ค. PCM์˜ ๋‘ ๋ฒˆ์งธ ๋ฌธ์ œ๋Š” DRAM์— ๋น„ํ•ด ์ง€์—ฐ ์‹œ๊ฐ„์ด ๊ธธ๋‹ค๋Š” ๊ฒƒ์ด๋‹ค. ํŠนํžˆ PCM์€ ๋” ํฐ ํŠธ๋žœ์žญ์…˜ ๋‹จ์œ„๋ฅผ ์ฑ„ํƒํ•˜์—ฌ ๋‹จ์œ„์‹œ๊ฐ„ ๋‹น ๋ฐ์ดํ„ฐ ์ฒ˜๋ฆฌ๋Ÿ‰ ํ–ฅ์ƒ์„ ๋„๋ชจํ•œ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ๋ฒ”์šฉ ํ”„๋กœ์„ธ์„œ ์บ์‹œ ๋ผ์ธ๊ณผ ๋‹ค๋ฅธ ์œ ๋‹› ํฌ๊ธฐ๋Š” ์ฝ๊ธฐ-์ˆ˜์ •-์“ฐ๊ธฐ (RMW) ๋ชจ๋“ˆ์˜ ๋„์ž…์œผ๋กœ ์ธํ•ด ์‹œ์Šคํ…œ ์„ฑ๋Šฅ์„ ์ €ํ•˜ํ•˜๊ฒŒ ๋œ๋‹ค. PCM ๊ธฐ๋ฐ˜ ๋ฉ”๋ชจ๋ฆฌ ์‹œ์Šคํ…œ์—์„œ RMW ๊ด€๋ จ ์—ฐ๊ตฌ๊ฐ€ ์—†์—ˆ๊ธฐ ๋•Œ๋ฌธ์— ๋ณธ ๋…ผ๋ฌธ์€ RMW ๋ชจ๋“ˆ์„ ํƒ‘์žฌ ํ•œ PCM ๊ธฐ๋ฐ˜ ๋ฉ”๋ชจ๋ฆฌ ์‹œ์Šคํ…œ์˜ ์ „๋ฐ˜์ ์ธ ์‹œ์Šคํ…œ ์„ฑ๋Šฅ๊ณผ ์‹ ๋ขฐ์„ฑ์„ ํ–ฅ์ƒํ•˜๊ฒŒ ์‹œํ‚ฌ ์ˆ˜ ์žˆ๋Š” ์ƒˆ๋กœ์šด ์•„ํ‚คํ…์ฒ˜๋ฅผ ์ œ์•ˆํ•œ๋‹ค. ์ œ์•ˆ๋œ ์•„ํ‚คํ…์ฒ˜๋Š” ์ถ”๊ฐ€ ์Šคํ† ๋ฆฌ์ง€ ๋ฆฌ์†Œ์Šค๋ฅผ ๋„์ž…ํ•˜์ง€ ์•Š๊ณ ๋„ ๋ฐ์ดํ„ฐ ์žฌ์‚ฌ์šฉ์„ฑ์„ ํ–ฅ์ƒ์‹œํ‚จ๋‹ค. ๋˜ํ•œ, ์„ฑ๋Šฅ ํ–ฅ์ƒ์„ ์œ„ํ•ด ๋ช…๋ น ์œ ํ˜•๊ณผ ๊ด€๊ณ„์—†์ด ๋ช…๋ น์„ ๋ณ‘ํ•ฉํ•˜๋Š” ์ƒˆ๋กœ์šด ์ž‘์—…์„ ์ œ์•ˆํ•œ๋‹ค. ๋˜ ๋‹ค๋ฅธ ๋ฌธ์ œ๋Š” PCM์„ ์œ„ํ•œ ์™„์ „ํ•œ ์‹œ๋ฎฌ๋ ˆ์ด์…˜ ํ”Œ๋žซํผ์ด ๋ถ€์žฌํ•˜๋‹ค๋Š” ๊ฒƒ์ด๋‹ค. PCM ๊ด€๋ จ ์ œํ’ˆ(์˜ˆ : Intel Optane)์— ๋Œ€ํ•ด ๋ฐœํ‘œ๋œ ์ •๋ณด๋Š” ๋Œ€์™ธ๋น„ ๋ฌธ์ œ๋กœ ์ธํ•ด ๋ถ€์กฑํ•˜๋‹ค. ํ•˜์ง€๋งŒ ์•Œ๋ ค์ ธ ์žˆ๋Š” ์ •๋ณด๋ฅผ ์ ์ ˆํžˆ ์ทจํ•ฉํ•˜๋ฉด ์‹œ์ค‘ ์ œํ’ˆ๊ณผ ์œ ์‚ฌํ•œ ์•„ํ‚คํ…์ฒ˜ ์‹œ๋ฎฌ๋ ˆ์ดํ„ฐ๋ฅผ ๊ฐœ๋ฐœํ•  ์ˆ˜ ์žˆ๋‹ค. ์ด๋ฅผ ์œ„ํ•ด ๋ณธ ๋…ผ๋ฌธ์€ PCM ๋ฉ”๋ชจ๋ฆฌ ์ปจํŠธ๋กค๋Ÿฌ์— ํ•„์š”ํ•œ ๋ชจ๋“  ๋ชจ๋“ˆ ์ •๋ณด๋ฅผ ํ™œ์šฉํ•˜์—ฌ ํ–ฅํ›„ ์ด์™€ ๊ด€๋ จ๋œ ์—ฐ๊ตฌ์—์„œ ์ถฉ๋ถ„ํžˆ ์‚ฌ์šฉ ๊ฐ€๋Šฅํ•œ ์ „์šฉ ์‹œ๋ฎฌ๋ ˆ์ดํ„ฐ๋ฅผ ๊ตฌํ˜„ํ•˜์˜€๋‹ค.1 INTRODUCTION 1 1.1 Limitation of Traditional Main Memory Systems 1 1.2 Phase-Change Memory as Main Memory 3 1.2.1 Opportunities of PCM-based System 3 1.2.2 Challenges of PCM-based System 4 1.3 Dissertation Overview 7 2 BACKGROUND AND PREVIOUS WORK 8 2.1 Phase-Change Memory 8 2.2 Mitigation Schemes for Write Disturbance Errors 10 2.2.1 Write Disturbance Errors 10 2.2.2 Verification and Correction 12 2.2.3 Lazy Correction 13 2.2.4 Data Encoding-based Schemes 14 2.2.5 Sparse-Insertion Write Cache 16 2.3 Performance Enhancement for Read-Modify-Write 17 2.3.1 Traditional Read-Modify-Write 17 2.3.2 Write Coalescing for RMW 19 2.4 Architecture Simulators for PCM 21 2.4.1 NVMain 21 2.4.2 Ramulator 22 2.4.3 DRAMsim3 22 3 IN-MODULE DISTURBANCE BARRIER 24 3.1 Motivation 25 3.2 IMDB: In Module-Disturbance Barrier 29 3.2.1 Architectural Overview 29 3.2.2 Implementation of Data Structures 30 3.2.3 Modification of Media Controller 36 3.3 Replacement Policy 38 3.3.1 Replacement Policy for IMDB 38 3.3.2 Approximate Lowest Number Estimator 40 3.4 Putting All Together: Case Studies 43 3.5 Evaluation 45 3.5.1 Configuration 45 3.5.2 Architectural Exploration 47 3.5.3 Effectiveness of the Replacement Policy 48 3.5.4 Sensitivity to Main Table Configuration 49 3.5.5 Sensitivity to Barrier Buffer Size 51 3.5.6 Sensitivity to AppLE Group Size 52 3.5.7 Comparison with Other Studies 54 3.6 Discussion 59 3.7 Summary 63 4 INTEGRATION OF AN RMW MODULE IN A PCM-BASED SYSTEM 64 4.1 Motivation 65 4.2 Utilization of DRAM Cache for RMW 67 4.2.1 Architectural Design 67 4.2.2 Algorithm 70 4.3 Typeless Command Merging 73 4.3.1 Architectural Design 73 4.3.2 Algorithm 74 4.4 An Alternative Implementation: SRC-RMW 78 4.4.1 Implementation of SRC-RMW 78 4.4.2 Design Constraint 80 4.5 Case Study 82 4.6 Evaluation 85 4.6.1 Configuration 85 4.6.2 Speedup 88 4.6.3 Read Reliability 91 4.6.4 Energy Consumption: Selecting a Proper Page Size 93 4.6.5 Comparison with Other Studies 95 4.7 Discussion 97 4.8 Summary 99 5 AN ALL-INCLUSIVE SIMULATOR FOR A PCM CONTROLLER 100 5.1 Motivation 101 5.2 PCMCsim: PCM Controller Simulator 103 5.2.1 Architectural Overview 103 5.2.2 Underlying Classes of PCMCsim 104 5.2.3 Implementation of Contention Behavior 108 5.2.4 Modules of PCMCsim 109 5.3 Evaluation 116 5.3.1 Correctness of the Simulator 116 5.3.2 Comparison with Other Simulators 117 5.4 Summary 119 6 Conclusion 120 Abstract (In Korean) 141 Acknowledgment 143๋ฐ•

    Flash Cache Hybrid Storage System For Video-On-Demand Server

    Get PDF
    A video-on-demand (VOD) system allows users to access any video at any time. This system enables users to view videos in the real-time streaming mode for extended durations. One of the important components of the VOD systems is the hybrid storage server. This component consists of HDD, SSD, and RAM, collectively fulfilling the requirement of simultaneous fast data access and large data distribution to numerous users. However, current hybrid storage systems still pose numerous challenges. (1) The integration and roles of the HDD, SSD, and RAM are relatively weak in terms of optimizing fast access prior to streaming to a large number of simultaneous users. (2) The HDD and SSD exhibit poor data layout and streaming controller in supporting the production of a high number of simultaneous streams. This thesis proposes (1) an enhanced hybrid storage system (EHSS) for the VOD servers. The HDD, SSD, and RAM are managed and integrated using an effective mechanism, in which the top 10% most popular videos are utilized to achieve fast data access and service to a large number of simultaneous users. (2) The flash cache hybrid storage system (FCHSS) is also proposed. Unlike the EHSS architecture, the FCHSS architecture has no RAM, which is removed and replaced by a flash-based SSD. The flash-based SSD is used instead of the RAM as a cache for the HDD to achieve fast data access and service to a large number of simultaneous users. (3) The new data layout stores thousands of video segments in the HDD and SSD

    ์ƒ๋ณ€ํ™” ๋ฉ”๋ชจ๋ฆฌ ์‹œ์Šคํ…œ์˜ ๊ฐ„์„ญ ์˜ค๋ฅ˜ ์™„ํ™” ๋ฐ RMW ์„ฑ๋Šฅ ํ–ฅ์ƒ ๊ธฐ๋ฒ•

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ(๋ฐ•์‚ฌ)--์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› :๊ณต๊ณผ๋Œ€ํ•™ ์ „๊ธฐยท์ •๋ณด๊ณตํ•™๋ถ€,2021. 8. ์ดํ˜์žฌ.Phase-change memory (PCM) announces the beginning of the new era of memory systems, owing to attractive characteristics. Many memory product manufacturers (e.g., Intel, SK Hynix, and Samsung) are developing related products. PCM can be applied to various circumstances; it is not simply limited to an extra-scale database. For example, PCM has a low standby power due to its non-volatility; hence, computation-intensive applications or mobile applications (i.e., long memory idle time) are suitable to run on PCM-based computing systems. The second problem of PCM is a long-latency compared to that of DRAM. In particular, PCM tries to enhance its throughput by adopting a larger transaction unit; however, the different unit size from the general-purpose processor cache line further degrades the system performance due to the introduction of a read-modify-write (RMW) module. Since there has never been any research related to RMW in a PCM-based memory system, this dissertation proposes a novel architecture to enhance the overall system performance and reliability of a PCM-based memory system having an RMW module. The proposed architecture enhances data re-usability without introducing extra storage resources. Furthermore, a novel operation that merges commands regardless of command types is proposed to enhance performance notably.Despite these fascinating features of PCM, PCM is still far from the general commercial market due to low reliability and long latency problems. In particular, low reliability is a painful problem for PCM in past decades. As the semiconductor process technology rapidly scales down over the years, DRAM reaches 10 nm class process technology. In addition, it is reported that the write disturbance error (WDE) would be a serious issue for PCM if it scales down below 54 nm class process technology. Therefore, addressing the problem of WDEs becomes essential to make PCM competitive to DRAM. To overcome this problem, this dissertation proposes a novel approach that can restore meta-stable cells on demand by levering two-level SRAM-based tables, thereby significantly reducing the number WDEs. Furthermore, a novel randomized approach is proposed to implement a replacement policy that originally requires hundreds of read ports on SRAM.Another problem is the absence of a full simulation platform for PCM. While the announced features of the PCM-related product (i.e., Intel Optane) are scarce due to confidential issues, all priceless information can be integrated to develop an architecture simulator that resembles the available product. To this end, this dissertation tries to scrape up all available features of modules in a PCM controller and implement a dedicated simulator for future research purposes

    Introductory Computer Forensics

    Get PDF
    INTERPOL (International Police) built cybercrime programs to keep up with emerging cyber threats, and aims to coordinate and assist international operations for ?ghting crimes involving computers. Although signi?cant international efforts are being made in dealing with cybercrime and cyber-terrorism, ?nding effective, cooperative, and collaborative ways to deal with complicated cases that span multiple jurisdictions has proven dif?cult in practic
    • โ€ฆ
    corecore