
    Improving Phase Change Memory Performance with Data Content Aware Access

    A prominent characteristic of the write operation in Phase-Change Memory (PCM) is that its latency and energy are sensitive to the data to be written as well as to the content that is overwritten. We observe that overwriting unknown memory content can incur significantly higher latency and energy than overwriting known all-zeros or all-ones content. This is because all-zeros or all-ones content is overwritten by programming the PCM cells in only one direction, i.e., using either SET or RESET operations, but not both. In this paper, we propose data content aware PCM writes (DATACON), a new mechanism that reduces the latency and energy of PCM writes by redirecting these requests to overwrite memory locations containing all-zeros or all-ones. DATACON operates in three steps. First, it estimates how much a PCM write access would benefit from overwriting known content (e.g., all-zeros or all-ones) by comprehensively considering the number of set bits in the data to be written and the energy-latency trade-offs of the SET and RESET operations in PCM. Second, it translates the write address to a physical address within memory that contains the best type of content to overwrite, and records this translation in a table for future accesses; we exploit data access locality in workloads to minimize the address translation overhead. Third, it re-initializes unused memory locations with known all-zeros or all-ones content in a manner that does not interfere with regular read and write accesses. DATACON overwrites unknown content only when it is absolutely necessary to do so. We evaluate DATACON with workloads from state-of-the-art machine learning applications, SPEC CPU2017, and the NAS Parallel Benchmarks. Results demonstrate that DATACON significantly improves system performance and reduces memory system energy consumption compared to the best of the performance-oriented state-of-the-art techniques.
Comment: 18 pages, 21 figures; accepted at the ACM SIGPLAN International Symposium on Memory Management (ISMM).
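The content-selection step above lends itself to a small illustration. The following is a minimal sketch, not DATACON's actual implementation: the per-bit SET/RESET costs, the free-region pools, and names such as ContentAwareWriter are assumptions made purely to show how the set-bit count and the programming-direction asymmetry can drive the redirection decision.

# Illustrative sketch (not the paper's implementation) of data-content-aware
# write redirection: estimate whether a write is cheaper on top of an all-zeros
# or an all-ones region, remap it there, and remember the remapping.

SET_COST = 2.5    # hypothetical per-bit cost of programming 0 -> 1
RESET_COST = 1.0  # hypothetical per-bit cost of programming 1 -> 0

def popcount(data: bytes) -> int:
    """Number of set bits in the data to be written."""
    return sum(bin(b).count("1") for b in data)

def overwrite_cost(data: bytes, target_all_ones: bool) -> float:
    """Cost of overwriting a known region: only one programming direction is needed."""
    ones = popcount(data)
    zeros = 8 * len(data) - ones
    # All-ones target: only the 0-bits must be RESET.
    # All-zeros target: only the 1-bits must be SET.
    return zeros * RESET_COST if target_all_ones else ones * SET_COST

class ContentAwareWriter:
    """Redirects writes to pre-initialized all-zeros/all-ones locations and records
    the logical-to-physical remapping so later accesses find the data."""

    def __init__(self, free_zeros: list, free_ones: list):
        self.free_zeros = free_zeros      # addresses currently holding all-zeros
        self.free_ones = free_ones        # addresses currently holding all-ones
        self.remap = {}                   # logical address -> physical address

    def write(self, addr: int, data: bytes) -> int:
        prefer_ones = overwrite_cost(data, True) < overwrite_cost(data, False)
        pool = self.free_ones if prefer_ones else self.free_zeros
        if not pool:                      # fall back to the other pool if empty
            pool = self.free_zeros if prefer_ones else self.free_ones
        target = pool.pop() if pool else addr   # overwrite unknown content only if unavoidable
        self.remap[addr] = target
        return target                     # physical location actually programmed

# Example: mostly-zero data prefers an all-zeros victim (few SETs are needed).
writer = ContentAwareWriter(free_zeros=[0x100, 0x140], free_ones=[0x200])
print(hex(writer.write(0x10, b"\x01" + b"\x00" * 63)))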

    Design and Implementation of A High Performance Storage Leveraging the DRAM Host Interface

    Doctoral dissertation (Ph.D.) -- Seoul National University Graduate School, Department of Electrical and Computer Engineering, August 2016. Sang Lyul Min.

Storage, together with the CPU and main memory, is a key component that determines the performance of a computer system. As the amount of data that computer systems must handle keeps growing, high-performance computer systems are required, and building them requires high-performance storage. With the adoption of NAND flash memory as the storage medium, storage performance has improved dramatically and has been able to meet this demand. Storage is connected to the host system through a storage interface, which has evolved together with storage so that the host can fully exploit storage performance, and it will keep changing as storage performance improves. In particular, the rapid advance of NAND flash memory and the emergence of high-performance next-generation memories such as PRAM and ReRAM are expected to further accelerate the evolution of storage interfaces. Computer systems also contain an interface for DRAM, which is mainly used as main memory. Because the bandwidth and latency of main memory directly affect overall system performance, the DRAM interface has always provided higher bandwidth and lower latency than other interfaces, including storage interfaces. If this high-performance DRAM interface could be used as a storage interface, it would provide a much faster interface than existing storage interfaces without the need to develop a separate one. However, because storage interfaces and the DRAM interface have evolved independently around the characteristics of their respective media, current storage devices cannot simply be attached to the DRAM interface. This dissertation demonstrates the feasibility of a storage architecture that uses the high-speed DRAM interface as the physical layer of a storage interface by designing, implementing, and evaluating such an architecture. To reuse the existing DRAM interface as the physical layer, the proposed architecture employs a small shared memory buffer that behaves exactly like DRAM; this buffer is provided by the storage device and mapped into the host system's memory address space. Because the DRAM protocol alone cannot drive the proposed storage device, a new software-level storage protocol that operates over the proposed interface is defined. Based on this protocol, a storage device and a host system that use the DRAM (LPDDR3) host interface are designed and implemented. Finally, the feasibility of this work is verified by combining the proposed storage and host system into a complete Android system. A quantitative evaluation of the implemented storage shows that the new storage protocol has very low overhead and that the proposed storage achieves performance comparable to a state-of-the-art UFS 2.0 device, and a qualitative evaluation examines the improvements needed to use the proposed storage effectively.

Storage is a key factor that determines the overall performance of a computer system. In the era of big data, the demand for high-performance computer systems has been ever increasing, and high-performance storage is needed to construct such systems. Storage performance has increased dramatically with the adoption of NAND flash memory. A storage device is connected to a host system via a storage interface, which has evolved to fully exploit the performance of the storage and will continue to evolve as storage advances. However, current computer systems already have a faster interface: the DRAM interface, which provides up to 25.6 GB/s in the latest DDR4 specification. Since the protocols for storage and DRAM are not compatible, the DRAM interface cannot be exploited as a storage interface as is. In this work, a new storage protocol is proposed in order to turn the DRAM interface into a storage interface. It runs on top of the DRAM interface and builds on a small host interface buffer structure mapped into the host system's memory space. Given the protocol, a design of the storage controller and firmware is proposed; the controller natively supports the DRAM (LPDDR3) interface. A new host platform, including both hardware and software, is also proposed, since the proposed storage cannot be connected to conventional computer systems. Finally, the feasibility of this work is proved by constructing a full Android system running on the developed storage and platform. Evaluation results show that the proposed storage architecture has very low protocol-handling overheads and compares favorably to a recent commercial UFS 2.0 storage device. (A host-side sketch of command submission over such a memory-mapped buffer follows the table of contents below.)

Table of contents (page numbers omitted):
I. Introduction: research motivation; research contents; organization of the dissertation.
II. Background and Related Work: storage interfaces (desktop/server, mobile); DRAM (structure and characteristics, operations, DIMMs); NAND flash memory (structure, operations and interface, flash translation layer, flash file systems); NVDIMM devices (NVDIMM-N, NVDIMM-F, NVDIMM-P, Intel DIMM); NVDIMM software (BIOS, Linux support, Microsoft Windows support, NVM Programming Model).
III. Storage Interface Design: physical layer; storage protocol; design considerations (command completion notification, command submission integrity, data transfer load balancing).
IV. Storage Design: controller (host interface block, backbone block, flash memory interface block); firmware (host interface layer, flash translation layer, flash interface layer).
V. Host System Design: host platform hardware; boot loader; device driver; software optimizations (DMA-based data transfer, I/O splitting, minimizing cache invalidation cost, minimizing external fragmentation of data buffers, bypassing the Linux I/O scheduler).
VI. Evaluation: evaluation environment; quantitative evaluation (protocol overhead, interface buffer size and maximum transfer rate, read/write performance, performance comparison); qualitative evaluation (DRAM bus bandwidth usage, flexible DRAM address mapping, physical attachment to the DRAM interface, memory transaction ordering, boot support, high-performance general-purpose DMA).
VII. Conclusion and Future Work. References. Abstract.
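To make the buffer-based protocol concrete, here is a minimal host-side sketch under assumed conventions: the descriptor layout, the offsets, the valid flag, and the polling-based completion are illustrative stand-ins rather than the thesis's actual format, but they show how a command can be submitted and its completion polled purely through loads and stores to a device buffer mapped into the host's memory space.

# Minimal host-side sketch, under assumed conventions, of driving a storage device
# through a small buffer mapped into host memory (loads/stores only, as over a
# DRAM interface).  Offsets, descriptor layout, and polling are illustrative,
# not the thesis's actual protocol.
import struct
import time

CMD_READ, CMD_WRITE = 1, 2

class SharedBufferStorage:
    """Host-side view of the memory-mapped interface buffer."""

    CMD_OFF, VALID_OFF, STATUS_OFF, DATA_OFF = 0x00, 0x20, 0x40, 0x1000  # hypothetical layout

    def __init__(self, mmio: bytearray):
        self.buf = mmio                   # real hardware: an mmap of the device buffer

    def submit(self, opcode: int, lba: int, length: int, payload: bytes = b"") -> None:
        # Payload and descriptor are written first; the valid byte is flipped last
        # so the device never observes a half-written command (submission integrity).
        self.buf[self.DATA_OFF:self.DATA_OFF + len(payload)] = payload
        desc = struct.pack("<IIQ", opcode, length, lba)
        self.buf[self.CMD_OFF:self.CMD_OFF + len(desc)] = desc
        self.buf[self.VALID_OFF] = 1

    def wait_complete(self, timeout_s: float = 1.0) -> bool:
        # Completion is signalled by the device writing a status byte that the host
        # polls, since the plain DRAM protocol provides no interrupt of its own.
        deadline = time.monotonic() + timeout_s
        while time.monotonic() < deadline:
            if self.buf[self.STATUS_OFF] == 1:
                return True
        return False

# Example: submit one 4 KiB write at LBA 8 through a simulated 64 KiB buffer.
dev = SharedBufferStorage(bytearray(64 * 1024))
dev.submit(CMD_WRITE, lba=8, length=4096, payload=b"\xAB" * 4096)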

    Efficient Security in Emerging Memories

    The wide adoption of cloud computing has established the integrity and confidentiality of data in memory as a first-order design concern in modern computing systems. Data integrity is ensured by Merkle Tree (MT) memory authentication. However, in the context of emerging non-volatile memories (NVMs), the increase in cell writes and memory accesses caused by MT memory authentication imposes significant energy, lifetime, and performance overheads. This dissertation presents ASSURE, an Authentication Scheme for SecURE energy-efficient NVMs. ASSURE integrates (i) smart message authentication codes with (ii) multi-root MTs to decrease MT reads and writes, while also reducing the number of cell writes on each MT write.

Whereas data confidentiality is effectively ensured by encryption, memory access patterns can still be exploited as a side channel to obtain confidential data. Oblivious RAM (ORAM) is a secure cryptographic construct that effectively thwarts access-pattern-based attacks. However, in Path ORAM (the state-of-the-art efficient ORAM for main memories) and its variants, each last-level cache miss (read or write) is transformed into a sequence of memory reads and writes (collectively termed the read phase and the write phase, respectively), increasing the number of memory writes due to data re-encryption, increasing the effective latency of memory accesses, and degrading system performance. This dissertation efficiently addresses the challenges of both the read-phase and write-phase operations of an ORAM access. First, it presents ReadPRO (Read Promotion), an efficient ORAM scheduler that leverages runtime identification of read accesses to prioritize the service of read-phase operations on critical-path-bound read accesses, while preserving all data dependencies. Second, it presents LEO (Low overhead Encryption ORAM), which reduces cell writes by opportunistically decreasing the number of block encryptions, while preserving the security guarantees of the baseline Path ORAM.

This dissertation therefore addresses the core challenges of read/write energy and latency, endurance, and system performance for the integration of essential security primitives in emerging memory architectures. Future research directions will focus on (i) exploring efficient solutions for ORAM read-phase optimization and secure ORAM resizing, (ii) investigating the security challenges of emerging processing-in-memory architectures, and (iii) investigating the interplay of security primitives with reliability-enhancing architectures.
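As a point of reference for the integrity mechanism that ASSURE optimizes, the sketch below shows baseline Merkle-tree authentication together with the multi-root partitioning idea: each memory region keeps its own root, so a write recomputes only its region's small tree. SHA-256, the region size, and all names are illustrative assumptions rather than ASSURE's parameters, and a real design would keep the roots in trusted on-chip storage.

# Baseline sketch of Merkle-tree memory authentication plus the multi-root idea
# ASSURE builds on: memory is split into regions with independent roots, so a
# write recomputes only its own region's small tree.
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def merkle_root(blocks: list) -> bytes:
    """Root over the hashes of the given memory blocks."""
    level = [h(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:                           # duplicate the last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

class MultiRootAuthenticatedMemory:
    def __init__(self, blocks: list, region_blocks: int = 4):
        self.region_blocks = region_blocks
        self.blocks = list(blocks)                   # stands in for untrusted NVM contents
        self.roots = [merkle_root(self.blocks[i:i + region_blocks])
                      for i in range(0, len(self.blocks), region_blocks)]

    def write(self, idx: int, data: bytes) -> None:
        self.blocks[idx] = data
        r = idx // self.region_blocks                # only this region's root is updated
        start = r * self.region_blocks
        self.roots[r] = merkle_root(self.blocks[start:start + self.region_blocks])

    def verify_region(self, r: int) -> bool:
        # Recompute over the (untrusted) blocks and compare with the trusted root.
        start = r * self.region_blocks
        return merkle_root(self.blocks[start:start + self.region_blocks]) == self.roots[r]

# Example: 8 blocks in 2 regions; writing block 5 touches only region 1's root.
mem = MultiRootAuthenticatedMemory([bytes([i]) * 64 for i in range(8)])
mem.write(5, b"\x7f" * 64)
assert mem.verify_region(0) and mem.verify_region(1)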

    Memory Systems and Interconnects for Scale-Out Servers

    The information revolution of the last decade has been fueled by the digitization of almost all human activities through a wide range of Internet services. The backbone of this information age is the scale-out datacenter, which must collect, store, and process massive amounts of data. These datacenters distribute vast datasets across a large number of servers, typically into memory-resident shards, so as to maintain strict quality-of-service guarantees. While data is driving the skyrocketing demand for scale-out servers, processor and memory manufacturers have reached fundamental efficiency limits and are no longer able to increase server energy efficiency at a sufficient pace. As a result, energy has emerged as the main obstacle to the scalability of information technology (IT), with huge economic implications. Delivering sustainable IT calls for a paradigm shift in computer system design.

As memory has taken a central role in IT infrastructure, memory-centric architectures are required to fully utilize IT's costly memory investment. In response, processor architects are resorting to manycore architectures to leverage the abundant request-level parallelism found in data-centric applications. Manycore processors fully utilize the available memory resources, thereby increasing IT efficiency by almost an order of magnitude. Because manycore server chips execute a large number of concurrent requests, they exhibit a high incidence of accesses to the last-level cache for fetching instructions (due to large instruction footprints) and to off-chip memory (due to the lack of temporal reuse in on-chip caches) for accessing dataset objects. As a result, on-chip interconnects and the memory system are emerging as major performance and energy-efficiency bottlenecks in servers. This thesis seeks to architect on-chip interconnects and memory systems that are tuned to the requirements of memory-centric scale-out servers. By studying a wide range of data-centric applications, we uncover application phenomena common to data-centric applications and examine their implications for on-chip network and off-chip memory traffic. Finally, we propose specialized on-chip interconnects and memory systems that leverage these common traffic characteristics, thereby improving server throughput and energy efficiency.