140 research outputs found

    ์ดˆ๊ณ ์šฉ๋Ÿ‰ ์†”๋ฆฌ๋“œ ์Šคํ…Œ์ด๋“œ ๋“œ๋ผ์ด๋ธŒ๋ฅผ ์œ„ํ•œ ์‹ ๋ขฐ์„ฑ ํ–ฅ์ƒ ๋ฐ ์„ฑ๋Šฅ ์ตœ์ ํ™” ๊ธฐ์ˆ 

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ(๋ฐ•์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€, 2021.8. ๊น€์ง€ํ™.The development of ultra-large NAND flash storage devices (SSDs) is recently made possible by NAND flash memory semiconductor process scaling and multi-leveling techniques, and NAND package technology, which enables continuous increasing of storage capacity by mounting many NAND flash memory dies in an SSD. As the capacity of an SSD increases, the total cost of ownership of the storage system can be reduced very effectively, however due to limitations of ultra-large SSDs in reliability and performance, there exists some obstacles for ultra-large SSDs to be widely adopted. In order to take advantage of an ultra-large SSD, it is necessary to develop new techniques to improve these reliability and performance issues. In this dissertation, we propose various optimization techniques to solve the reliability and performance issues of ultra-large SSDs. In order to overcome the optimization limitations of the existing approaches, our techniques were designed based on various characteristic evaluation results of NAND flash devices and field failure characteristics analysis results of real SSDs. We first propose a low-stress erase technique for the purpose of reducing the characteristic deviation between wordlines (WLs) in a NAND flash block. By reducing the erase stress on weak WLs, it effectively slows down NAND degradation and improves NAND endurance. From the NAND evaluation results, the conditions that can most effectively guard the weak WLs are defined as the gerase mode. In addition, considering the user workload characteristics, we propose a technique to dynamically select the optimal gerase mode that can maximize the lifetime of the SSD. Secondly, we propose an integrated approach that maximizes the efficiency of copyback operations to improve performance while not compromising data reliability. Based on characterization using real 3D TLC flash chips, we propose a novel per-block error propagation model under consecutive copyback operations. Our model significantly increases the number of successive copybacks by exploiting the aging characteristics of NAND blocks. Furthermore, we devise a resource-efficient error management scheme that can handle successive copybacks where pages move around multiple blocks with different reliability. By utilizing proposed copyback operation for internal data movement, SSD performance can be effectively improved without any reliability issues. Finally, we propose a new recovery scheme, called reparo, for a RAID storage system with ultra-large SSDs. Unlike the existing RAID recovery schemes, reparo repairs a failed SSD at the NAND die granularity without replacing it with a new SSD, thus avoiding most of the inter-SSD data copies during a RAID recovery step. When a NAND die of an SSD fails, reparo exploits a multi-core processor of the SSD controller to identify failed LBAs from the failed NAND die and to recover data from the failed LBAs. Furthermore, reparo ensures no negative post-recovery impact on the performance and lifetime of the repaired SSD. In order to evaluate the effectiveness of the proposed techniques, we implemented them in a storage device prototype, an open NAND flash storage device development environment, and a real SSD environment. And their usefulness was verified using various benchmarks and I/O traces collected the from real-world applications. The experiment results show that the reliability and performance of the ultra-large SSD can be effectively improved through the proposed techniques.๋ฐ˜๋„์ฒด ๊ณต์ •์˜ ๋ฏธ์„ธํ™”, ๋‹ค์น˜ํ™” ๊ธฐ์ˆ ์— ์˜ํ•ด์„œ ์ง€์†์ ์œผ๋กœ ์šฉ๋Ÿ‰์ด ์ฆ๊ฐ€ํ•˜๊ณ  ์žˆ๋Š” ๋‹จ์œ„ ๋‚ธ๋“œ ํ”Œ๋ž˜์‰ฌ ๋ฉ”๋ชจ๋ฆฌ์™€ ํ•˜๋‚˜์˜ ๋‚ธ๋“œ ํ”Œ๋ž˜์‰ฌ ๊ธฐ๋ฐ˜ ์Šคํ† ๋ฆฌ์ง€ ์‹œ์Šคํ…œ ๋‚ด์— ์ˆ˜ ๋งŽ์€ ๋‚ธ๋“œ ํ”Œ๋ž˜์‰ฌ ๋ฉ”๋ชจ๋ฆฌ ๋‹ค์ด๋ฅผ ์‹ค์žฅํ•  ์ˆ˜ ์žˆ๊ฒŒํ•˜๋Š” ๋‚ธ๋“œ ํŒจํ‚ค์ง€ ๊ธฐ์ˆ ๋กœ ์ธํ•ด ํ•˜๋“œ๋””์Šคํฌ๋ณด๋‹ค ํ›จ์”ฌ ๋” ํฐ ์ดˆ๊ณ ์šฉ๋Ÿ‰์˜ ๋‚ธ๋“œ ํ”Œ๋ž˜์‰ฌ ์ €์žฅ์žฅ์น˜์˜ ๊ฐœ๋ฐœ์„ ๊ฐ€๋Šฅํ•˜๊ฒŒ ํ–ˆ๋‹ค. ํ”Œ๋ž˜์‰ฌ ์ €์žฅ์žฅ์น˜์˜ ์šฉ๋Ÿ‰์ด ์ฆ๊ฐ€ํ•  ์ˆ˜๋ก ์Šคํ† ๋ฆฌ์ง€ ์‹œ์Šคํ…œ์˜ ์ด ์†Œ์œ ๋น„์šฉ์„ ์ค„์ด๋Š”๋ฐ ๋งค์šฐ ํšจ๊ณผ์ ์ธ ์žฅ์ ์„ ๊ฐ€์ง€๊ณ  ์žˆ์œผ๋‚˜, ์‹ ๋ขฐ์„ฑ ๋ฐ ์„ฑ๋Šฅ์˜ ์ธก๋ฉด์—์„œ์˜ ํ•œ๊ณ„๋กœ ์ธํ•ด์„œ ์ดˆ๊ณ ์šฉ๋Ÿ‰ ๋‚ธ๋“œ ํ”Œ๋ž˜์‰ฌ ์ €์žฅ์žฅ์น˜๊ฐ€ ๋„๋ฆฌ ์‚ฌ์šฉ๋˜๋Š”๋ฐ ์žˆ์–ด์„œ ์žฅ์• ๋ฌผ๋กœ ์ž‘์šฉํ•˜๊ณ  ์žˆ๋‹ค. ์ดˆ๊ณ ์šฉ๋Ÿ‰ ์ €์žฅ์žฅ์น˜์˜ ์žฅ์ ์„ ํ™œ์šฉํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” ์ด๋Ÿฌํ•œ ์‹ ๋ขฐ์„ฑ ๋ฐ ์„ฑ๋Šฅ์„ ๊ฐœ์„ ํ•˜๊ธฐ ์œ„ํ•œ ์ƒˆ๋กœ์šด ๊ธฐ๋ฒ•์˜ ๊ฐœ๋ฐœ์ด ํ•„์š”ํ•˜๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ์ดˆ๊ณ ์šฉ๋Ÿ‰ ๋‚ธ๋“œ๊ธฐ๋ฐ˜ ์ €์žฅ์žฅ์น˜(SSD)์˜ ๋ฌธ์ œ์ ์ธ ์„ฑ๋Šฅ ๋ฐ ์‹ ๋ขฐ์„ฑ์„ ๊ฐœ์„ ํ•˜๊ธฐ ์œ„ํ•œ ๋‹ค์–‘ํ•œ ์ตœ์ ํ™” ๊ธฐ์ˆ ์„ ์ œ์•ˆํ•œ๋‹ค. ๊ธฐ์กด ๊ธฐ๋ฒ•๋“ค์˜ ์ตœ์ ํ™” ํ•œ๊ณ„๋ฅผ ๊ทน๋ณตํ•˜๊ธฐ ์œ„ํ•ด์„œ, ์šฐ๋ฆฌ์˜ ๊ธฐ์ˆ ์€ ์‹ค์ œ ๋‚ธ๋“œ ํ”Œ๋ž˜์‰ฌ ์†Œ์ž์— ๋Œ€ํ•œ ๋‹ค์–‘ํ•œ ํŠน์„ฑ ํ‰๊ฐ€ ๊ฒฐ๊ณผ์™€ SSD์˜ ํ˜„์žฅ ๋ถˆ๋Ÿ‰ ํŠน์„ฑ ๋ถ„์„๊ฒฐ๊ณผ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ๊ณ ์•ˆ๋˜์—ˆ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด์„œ ๋‚ธ๋“œ์˜ ํ”Œ๋ž˜์‰ฌ ํŠน์„ฑ๊ณผ SSD, ๊ทธ๋ฆฌ๊ณ  ํ˜ธ์ŠคํŠธ ์‹œ์Šคํ…œ์˜ ๋™์ž‘ ํŠน์„ฑ์„ ๊ณ ๋ คํ•œ ์„ฑ๋Šฅ ๋ฐ ์‹ ๋ขฐ์„ฑ์„ ํ–ฅ์ƒ์‹œํ‚ค๋Š” ์ตœ์ ํ™” ๋ฐฉ๋ฒ•๋ก ์„ ์ œ์‹œํ•œ๋‹ค. ์ฒซ์งธ๋กœ, ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ๋‚ธ๋“œ ํ”Œ๋ž˜์‰ฌ ๋ถˆ๋ก๋‚ด์˜ ํŽ˜์ด์ง€๋“ค๊ฐ„์˜ ํŠน์„ฑํŽธ์ฐจ๋ฅผ ์ค„์ด๊ธฐ ์œ„ํ•ด์„œ ๋™์ ์ธ ์†Œ๊ฑฐ ์ŠคํŠธ๋ ˆ์Šค ๊ฒฝ๊ฐ ๊ธฐ๋ฒ•์„ ์ œ์•ˆํ•œ๋‹ค. ์ œ์•ˆ๋œ ๊ธฐ๋ฒ•์€ ๋‚ธ๋“œ ๋ธ”๋ก์˜ ๋‚ด๊ตฌ์„ฑ์„ ๋Š˜๋ฆฌ๊ธฐ ์œ„ํ•ด์„œ ํŠน์„ฑ์ด ์•ฝํ•œ ํŽ˜์ด์ง€๋“ค์— ๋Œ€ํ•ด์„œ ๋” ์ ์€ ์†Œ๊ฑฐ ์ŠคํŠธ๋ ˆ์Šค๊ฐ€ ์ธ๊ฐ€ํ•  ์ˆ˜ ์žˆ๋„๋ก ๋‚ธ๋“œ ํ‰๊ฐ€ ๊ฒฐ๊ณผ๋กœ ๋ถ€ํ„ฐ ์†Œ๊ฑฐ ์ŠคํŠธ๋ ˆ์Šค ๊ฒฝ๊ฐ ๋ชจ๋ธ์„ ๊ตฌ์ถ•ํ•œ๋‹ค. ๋˜ํ•œ ์‚ฌ์šฉ์ž ์›Œํฌ๋กœ๋“œ ํŠน์„ฑ์„ ๊ณ ๋ คํ•˜์—ฌ, ์†Œ๊ฑฐ ์ŠคํŠธ๋ ˆ์Šค ๊ฒฝ๊ฐ ๊ธฐ๋ฒ•์˜ ํšจ๊ณผ๊ฐ€ ์ตœ๋Œ€ํ™” ๋  ์ˆ˜ ์žˆ๋Š” ์ตœ์ ์˜ ๊ฒฝ๊ฐ ์ˆ˜์ค€์„ ๋™์ ์œผ๋กœ ํŒ๋‹จํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•œ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด์„œ ๋‚ธ๋“œ ๋ธ”๋ก์„ ์—ดํ™”์‹œํ‚ค๋Š” ์ฃผ์š” ์›์ธ์ธ ์†Œ๊ฑฐ ๋™์ž‘์„ ํšจ์œจ์ ์œผ๋กœ ์ œ์–ดํ•จ์œผ๋กœ์จ ์ €์žฅ์žฅ์น˜์˜ ์ˆ˜๋ช…์„ ํšจ๊ณผ์ ์œผ๋กœ ํ–ฅ์ƒ์‹œํ‚จ๋‹ค. ๋‘˜์งธ๋กœ, ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ๊ณ ์šฉ๋Ÿ‰ SSD์—์„œ์˜ ๋‚ด๋ถ€ ๋ฐ์ดํ„ฐ ์ด๋™์œผ๋กœ ์ธํ•œ ์„ฑ๋Šฅ ์ €ํ•˜๋ฌธ์ œ๋ฅผ ๊ฐœ์„ ํ•˜๊ธฐ ์œ„ํ•ด์„œ ๋‚ธ๋“œ ํ”Œ๋ž˜์‰ฌ์˜ ์ œํ•œ๋œ ์นดํ”ผ๋ฐฑ(copyback) ๋ช…๋ น์„ ํ™œ์šฉํ•˜๋Š” ์ ์‘ํ˜• ๊ธฐ๋ฒ•์ธ rCPB์„ ์ œ์•ˆํ•œ๋‹ค. rCPB์€ Copyback ๋ช…๋ น์˜ ํšจ์œจ์„ฑ์„ ๊ทน๋Œ€ํ™” ํ•˜๋ฉด์„œ๋„ ๋ฐ์ดํ„ฐ ์‹ ๋ขฐ์„ฑ์— ๋ฌธ์ œ๊ฐ€ ์—†๋„๋ก ๋‚ธ๋“œ์˜ ๋ธ”๋Ÿญ์˜ ๋…ธํ™”ํŠน์„ฑ์„ ๋ฐ˜์˜ํ•œ ์ƒˆ๋กœ์šด copyback ์˜ค๋ฅ˜ ์ „ํŒŒ ๋ชจ๋ธ์„ ๊ธฐ๋ฐ˜์œผ๋กœํ•œ๋‹ค. ์ด์—๋”ํ•ด, ์‹ ๋ขฐ์„ฑ์ด ๋‹ค๋ฅธ ๋ธ”๋Ÿญ๊ฐ„์˜ copyback ๋ช…๋ น์„ ํ™œ์šฉํ•œ ๋ฐ์ดํ„ฐ ์ด๋™์„ ๋ฌธ์ œ์—†์ด ๊ด€๋ฆฌํ•˜๊ธฐ ์œ„ํ•ด์„œ ์ž์› ํšจ์œจ์ ์ธ ์˜ค๋ฅ˜ ๊ด€๋ฆฌ ์ฒด๊ณ„๋ฅผ ์ œ์•ˆํ•œ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด์„œ ์‹ ๋ขฐ์„ฑ์— ๋ฌธ์ œ๋ฅผ ์ฃผ์ง€ ์•Š๋Š” ์ˆ˜์ค€์—์„œ copyback์„ ์ตœ๋Œ€ํ•œ ํ™œ์šฉํ•˜์—ฌ ๋‚ด๋ถ€ ๋ฐ์ดํ„ฐ ์ด๋™์„ ์ตœ์ ํ™” ํ•จ์œผ๋กœ์จ SSD์˜ ์„ฑ๋Šฅํ–ฅ์ƒ์„ ๋‹ฌ์„ฑํ•  ์ˆ˜ ์žˆ๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ, ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ์ดˆ๊ณ ์šฉ๋Ÿ‰ SSD์—์„œ ๋‚ธ๋“œ ํ”Œ๋ž˜์‰ฌ์˜ ๋‹ค์ด(die) ๋ถˆ๋Ÿ‰์œผ๋กœ ์ธํ•œ ๋ ˆ์ด๋“œ(redundant array of independent disks, RAID) ๋ฆฌ๋นŒ๋“œ ์˜ค๋ฒ„ํ—ค๋“œ๋ฅผ ์ตœ์†Œํ™” ํ•˜๊ธฐ์œ„ํ•œ ์ƒˆ๋กœ์šด RAID ๋ณต๊ตฌ ๊ธฐ๋ฒ•์ธ reparo๋ฅผ ์ œ์•ˆํ•œ๋‹ค. Reparo๋Š” SSD์— ๋Œ€ํ•œ ๊ต์ฒด์—†์ด SSD์˜ ๋ถˆ๋Ÿ‰ die์— ๋Œ€ํ•ด์„œ๋งŒ ๋ณต๊ตฌ๋ฅผ ์ˆ˜ํ–‰ํ•จ์œผ๋กœ์จ ๋ณต๊ตฌ ์˜ค๋ฒ„ํ—ค๋“œ๋ฅผ ์ตœ์†Œํ™”ํ•œ๋‹ค. ๋ถˆ๋Ÿ‰์ด ๋ฐœ์ƒํ•œ die์˜ ๋ฐ์ดํ„ฐ๋งŒ ์„ ๋ณ„์ ์œผ๋กœ ๋ณต๊ตฌํ•จ์œผ๋กœ์จ ๋ณต๊ตฌ ๊ณผ์ •์˜ ๋ฆฌ๋นŒ๋“œ ํŠธ๋ž˜ํ”ฝ์„ ์ตœ์†Œํ™”ํ•˜๋ฉฐ, SSD ๋‚ด๋ถ€์˜ ๋ณ‘๋ ฌ๊ตฌ์กฐ๋ฅผ ํ™œ์šฉํ•˜์—ฌ ๋ถˆ๋Ÿ‰ die ๋ณต๊ตฌ ์‹œ๊ฐ„์„ ํšจ๊ณผ์ ์œผ๋กœ ๋‹จ์ถ•ํ•œ๋‹ค. ๋˜ํ•œ die ๋ถˆ๋Ÿ‰์œผ๋กœ ์ธํ•œ ๋ฌผ๋ฆฌ์  ๊ณต๊ฐ„๊ฐ์†Œ์˜ ๋ถ€์ž‘์šฉ์„ ์ตœ์†Œํ™” ํ•จ์œผ๋กœ์จ ๋ณต๊ตฌ ์ดํ›„์˜ ์„ฑ๋Šฅ ์ €ํ•˜ ๋ฐ ์ˆ˜๋ช…์˜ ๊ฐ์†Œ ๋ฌธ์ œ๊ฐ€ ์—†๋„๋ก ํ•œ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ ์ œ์•ˆํ•œ ๊ธฐ๋ฒ•๋“ค์€ ์ €์žฅ์žฅ์น˜ ํ”„๋กœํ† ํƒ€์ž… ๋ฐ ๊ณต๊ฐœ ๋‚ธ๋“œ ํ”Œ๋ž˜์‰ฌ ์ €์žฅ์žฅ์น˜ ๊ฐœ๋ฐœํ™˜๊ฒฝ, ๊ทธ๋ฆฌ๊ณ  ์‹ค์žฅ SSDํ™˜๊ฒฝ์— ๊ตฌํ˜„๋˜์—ˆ์œผ๋ฉฐ, ์‹ค์ œ ์‘์šฉ ํ”„๋กœ๊ทธ๋žจ์„ ๋ชจ์‚ฌํ•œ ๋‹ค์–‘ํ•œ ๋ฒคํŠธ๋งˆํฌ ๋ฐ ์‹ค์ œ I/O ํŠธ๋ ˆ์ด์Šค๋“ค์„ ์ด์šฉํ•˜์—ฌ ๊ทธ ์œ ์šฉ์„ฑ์„ ๊ฒ€์ฆํ•˜์˜€๋‹ค. ์‹คํ—˜ ๊ฒฐ๊ณผ, ์ œ์•ˆ๋œ ๊ธฐ๋ฒ•๋“ค์„ ํ†ตํ•ด์„œ ์ดˆ๊ณ ์šฉ๋Ÿ‰ SSD์˜ ์‹ ๋ขฐ์„ฑ ๋ฐ ์„ฑ๋Šฅ์„ ํšจ๊ณผ์ ์œผ๋กœ ๊ฐœ์„ ํ•  ์ˆ˜ ์žˆ์Œ์„ ํ™•์ธํ•˜์˜€๋‹ค.I Introduction 1 1.1 Motivation 1 1.2 Dissertation Goals 3 1.3 Contributions 5 1.4 Dissertation Structure 8 II Background 11 2.1 Overview of 3D NAND Flash Memory 11 2.2 Reliability Management in NAND Flash Memory 14 2.3 UL SSD architecture 15 2.4 Related Work 17 2.4.1 NAND endurance optimization by utilizing page characteristics difference 17 2.4.2 Performance optimizations using copyback operation 18 2.4.3 Optimizations for RAID Rebuild 19 2.4.4 Reliability improvement using internal RAID 20 III GuardedErase: Extending SSD Lifetimes by Protecting Weak Wordlines 22 3.1 Reliability Characterization of a 3D NAND Flash Block 22 3.1.1 Large Reliability Variations Among WLs 22 3.1.2 Erase Stress on Flash Reliability 26 3.2 GuardedErase: Design Overview and its Endurance Model 28 3.2.1 Basic Idea 28 3.2.2 Per-WL Low-Stress Erase Mode 31 3.2.3 Per-Block Erase Modes 35 3.3 Design and Implementation of LongFTL 39 3.3.1 Overview 39 3.3.2 Weak WL Detector 40 3.3.3 WAF Monitor 42 3.3.4 GErase Mode Selector 43 3.4 Experimental Results 46 3.4.1 Experimental Settings 46 3.4.2 Lifetime Improvement 47 3.4.3 Performance Overhead 49 3.4.4 Effectiveness of Lowest Erase Relief Ratio 50 IV Improving SSD Performance Using Adaptive Restricted- Copyback Operations 52 4.1 Motivations 52 4.1.1 Data Migration in Modern SSD 52 4.1.2 Need for Block Aging-Aware Copyback 53 4.2 RCPB: Copyback with a Limit 55 4.2.1 Error-Propagation Characteristics 55 4.2.2 RCPB Operation Model 58 4.3 Design and Implementation of rcFTL 59 4.3.1 EPM module 60 4.3.2 Data Migration Mode Selection 64 4.4 Experimental Results 65 4.4.1 Experimental Setup 65 4.4.2 Evaluation Results 66 V Reparo: A Fast RAID Recovery Scheme for Ultra- Large SSDs 70 5.1 SSD Failures: Causes and Characteristics 70 5.1.1 SSD Failure Types 70 5.1.2 SSD Failure Characteristics 72 5.2 Impact of UL SSDs on RAID Reliability 74 5.3 RAID Recovery using Reparo 77 5.3.1 Overview of Reparo 77 5.4 Cooperative Die Recovery 82 5.4.1 Identifier: Parallel Search of Failed LBAs 82 5.4.2 Handler: Per-Core Space Utilization Adjustment 83 5.5 Identifier Acceleration Using P2L Mapping Information 89 5.5.1 Page-level P2L Entrustment to Neighboring Die 90 5.5.2 Block-level P2L Entrustment to Neighboring Die 92 5.5.3 Additional Considerations for P2L Entrustment 94 5.6 Experimental Results 95 5.6.1 Experimental Settings 95 5.6.2 Experimental Results 97 VI Conclusions 109 6.1 Summary 109 6.2 Future Work 111 6.2.1 Optimization with Accurate WAF Prediction 111 6.2.2 Maximizing Copyback Threshold 111 6.2.3 Pre-failure Detection 112๋ฐ•

    Reliable Low-Power High Performance Spintronic Memories

    Get PDF
    Moores Gesetz folgend, ist es der Chipindustrie in den letzten fรผnf Jahrzehnten gelungen, ein explosionsartiges Wachstum zu erreichen. Dies hatte ebenso einen exponentiellen Anstieg der Nachfrage von Speicherkomponenten zur Folge, was wiederum zu speicherlastigen Chips in den heutigen Computersystemen fรผhrt. Allerdings stellen traditionelle on-Chip Speichertech- nologien wie Static Random Access Memories (SRAMs), Dynamic Random Access Memories (DRAMs) und Flip-Flops eine Herausforderung in Bezug auf Skalierbarkeit, Verlustleistung und Zuverlรคssigkeit dar. Eben jene Herausforderungen und die รผberwรคltigende Nachfrage nach hรถherer Performanz und Integrationsdichte des on-Chip Speichers motivieren Forscher, nach neuen nichtflรผchtigen Speichertechnologien zu suchen. Aufkommende spintronische Spe- ichertechnologien wie Spin Orbit Torque (SOT) und Spin Transfer Torque (STT) erhielten in den letzten Jahren eine hohe Aufmerksamkeit, da sie eine Reihe an Vorteilen bieten. Dazu gehรถren Nichtflรผchtigkeit, Skalierbarkeit, hohe Bestรคndigkeit, CMOS Kompatibilitรคt und Unan- fรคlligkeit gegenรผber Soft-Errors. In der Spintronik reprรคsentiert der Spin eines Elektrons dessen Information. Das Datum wird durch die Hรถhe des Widerstandes gespeichert, welche sich durch das Anlegen eines polarisierten Stroms an das Speichermedium verรคndern lรคsst. Das Prob- lem der statischen Leistung gehen die Speichergerรคte sowohl durch deren verlustleistungsfreie Eigenschaft, als auch durch ihr Standard- Aus/Sofort-Ein Verhalten an. Nichtsdestotrotz sind noch andere Probleme, wie die hohe Zugriffslatenz und die Energieaufnahme zu lรถsen, bevor sie eine verbreitete Anwendung finden kรถnnen. Um diesen Problemen gerecht zu werden, sind neue Computerparadigmen, -architekturen und -entwurfsphilosophien notwendig. Die hohe Zugriffslatenz der Spintroniktechnologie ist auf eine vergleichsweise lange Schalt- dauer zurรผckzufรผhren, welche die von konventionellem SRAM รผbersteigt. Des Weiteren ist auf Grund des stochastischen Schaltvorgangs der Speicherzelle und des Einflusses der Prozessvari- ation ein nicht zu vernachlรคssigender Zeitraum dafรผr erforderlich. In diesem Zeitraum wird ein konstanter Schreibstrom durch die Bitzelle geleitet, um den Schaltvorgang zu gewรคhrleisten. Dieser Vorgang verursacht eine hohe Energieaufnahme. Fรผr die Leseoperation wird gleicher- maรŸen ein beachtliches Zeitfenster benรถtigt, ebenfalls bedingt durch den Einfluss der Prozess- variation. Dem gegenรผber stehen diverse Zuverlรคssigkeitsprobleme. Dazu gehรถren unter An- derem die Leseintereferenz und andere Degenerationspobleme, wie das des Time Dependent Di- electric Breakdowns (TDDB). Diese Zuverlรคssigkeitsprobleme sind wiederum auf die benรถtigten lรคngeren Schaltzeiten zurรผckzufรผhren, welche in der Folge auch einen รผber lรคngere Zeit an- liegenden Lese- bzw. Schreibstrom implizieren. Es ist daher notwendig, sowohl die Energie, als auch die Latenz zur Steigerung der Zuverlรคssigkeit zu reduzieren, um daraus einen potenziellen Kandidaten fรผr ein on-Chip Speichersystem zu machen. In dieser Dissertation werden wir Entwurfsstrategien vorstellen, welche das Ziel verfolgen, die Herausforderungen des Cache-, Register- und Flip-Flop-Entwurfs anzugehen. Dies erre- ichen wir unter Zuhilfenahme eines Cross-Layer Ansatzes. Fรผr Caches entwickelten wir ver- schiedene Ansรคtze auf Schaltkreisebene, welche sowohl auf der Speicherarchitekturebene, als auch auf der Systemebene in Bezug auf Energieaufnahme, Performanzsteigerung und Zuver- lรคssigkeitverbesserung evaluiert werden. Wir entwickeln eine Selbstabschalttechnik, sowohl fรผr die Lese-, als auch die Schreiboperation von Caches. Diese ist in der Lage, den Abschluss der entsprechenden Operation dynamisch zu ermitteln. Nachdem der Abschluss erkannt wurde, wird die Lese- bzw. Schreiboperation sofort gestoppt, um Energie zu sparen. Zusรคtzlich limitiert die Selbstabschalttechnik die Dauer des Stromflusses durch die Speicherzelle, was wiederum das Auftreten von TDDB und Leseinterferenz bei Schreib- bzw. Leseoperationen re- duziert. Zur Verbesserung der Schreiblatenz heben wir den Schreibstrom an der Bitzelle an, um den magnetischen Schaltprozess zu beschleunigen. Um registerbankspezifische Anforderungen zu berรผcksichtigen, haben wir zusรคtzlich eine Multiport-Speicherarchitektur entworfen, welche eine einzigartige Eigenschaft der SOT-Zelle ausnutzt, um simultan Lese- und Schreiboperatio- nen auszufรผhren. Es ist daher mรถglich Lese/Schreib- Konfilkte auf Bitzellen-Ebene zu lรถsen, was sich wiederum in einer sehr viel einfacheren Multiport- Registerbankarchitektur nieder- schlรคgt. Zusรคtzlich zu den Speicheransรคtzen haben wir ebenfalls zwei Flip-Flop-Architekturen vorgestellt. Die erste ist eine nichtflรผchtige non-Shadow Flip-Flop-Architektur, welche die Speicherzelle als aktive Komponente nutzt. Dies ermรถglicht das sofortige An- und Ausschalten der Versorgungss- pannung und ist daher besonders gut fรผr aggressives Powergating geeignet. Alles in Allem zeigt der vorgestellte Flip-Flop-Entwurf eine รคhnliche Timing-Charakteristik wie die konventioneller CMOS Flip-Flops auf. Jedoch erlaubt er zur selben Zeit eine signifikante Reduktion der statis- chen Leistungsaufnahme im Vergleich zu nichtflรผchtigen Shadow- Flip-Flops. Die zweite ist eine fehlertolerante Flip-Flop-Architektur, welche sich unanfรคllig gegenรผber diversen Defekten und Fehlern verhรคlt. Die Leistungsfรคhigkeit aller vorgestellten Techniken wird durch ausfรผhrliche Simulationen auf Schaltkreisebene verdeutlicht, welche weiter durch detaillierte Evaluationen auf Systemebene untermauert werden. Im Allgemeinen konnten wir verschiedene Techniken en- twickeln, die erhebliche Verbesserungen in Bezug auf Performanz, Energie und Zuverlรคssigkeit von spintronischen on-Chip Speichern, wie Caches, Register und Flip-Flops erreichen

    Investigating ferroelectric and metal-insulator phase transition devices for neuromorphic computing

    Get PDF
    Neuromorphic computing has been proposed to accelerate the computation for deep neural networks (DNNs). The objective of this thesis work is to investigate the ferroelectric and metal-insulator phase transition devices for neuromorphic computing. This thesis proposed and experimentally demonstrated the drain erase scheme in FeFET to enable the individual cell program/erase/inhibition for in-situ training in 3D NAND-like FeFET array. To achieve multi-level states for analog in-memory computing, the ferroelectric thin film needs to be partially switched. This thesis identified a new challenge of ferroelectric partial switching, namely โ€œhistory effectโ€ in minor loop dynamics. The experimental characterization of both FeCap and FeFET validated the history effect, suggesting that the intermediate states programming condition depends on the prior states that the device has gone through. A phase-field model was constructed to understand the origin. Such history effect was then modelled into the FeFET based neural network simulation and analyze its negative impact on the training accuracy and then propose a possible mitigation strategy. Apart from using FeFET as synaptic devices, using metal-insulator phase transition device, as neuron was also explored experimentally. A NbOx metal-insulator phase transition threshold switch was integrated at the edge of the crossbar array as an oscillation neuron. One promising application for FeFET+NbOx neuromorphic system is to implement quantum error correction (QEC) circuitry at 4K. Cryo-NeuroSim, a device-to-system modeling framework that calibrates data at cryogenic temperature was developed to benchmark the performance of the FeFET+NbOx neuromorphic system.Ph.D

    Flash Memory Devices

    Get PDF
    Flash memory devices have represented a breakthrough in storage since their inception in the mid-1980s, and innovation is still ongoing. The peculiarity of such technology is an inherent flexibility in terms of performance and integration density according to the architecture devised for integration. The NOR Flash technology is still the workhorse of many code storage applications in the embedded world, ranging from microcontrollers for automotive environment to IoT smart devices. Their usage is also forecasted to be fundamental in emerging AI edge scenario. On the contrary, when massive data storage is required, NAND Flash memories are necessary to have in a system. You can find NAND Flash in USB sticks, cards, but most of all in Solid-State Drives (SSDs). Since SSDs are extremely demanding in terms of storage capacity, they fueled a new wave of innovation, namely the 3D architecture. Today โ€œ3Dโ€ means that multiple layers of memory cells are manufactured within the same piece of silicon, easily reaching a terabit capacity. So far, Flash architectures have always been based on "floating gate," where the information is stored by injecting electrons in a piece of polysilicon surrounded by oxide. On the contrary, emerging concepts are based on "charge trap" cells. In summary, flash memory devices represent the largest landscape of storage devices, and we expect more advancements in the coming years. This will require a lot of innovation in process technology, materials, circuit design, flash management algorithms, Error Correction Code and, finally, system co-design for new applications such as AI and security enforcement

    High-Density Solid-State Memory Devices and Technologies

    Get PDF
    This Special Issue aims to examine high-density solid-state memory devices and technologies from various standpoints in an attempt to foster their continuous success in the future. Considering that broadening of the range of applications will likely offer different types of solid-state memories their chance in the spotlight, the Special Issue is not focused on a specific storage solution but rather embraces all the most relevant solid-state memory devices and technologies currently on stage. Even the subjects dealt with in this Special Issue are widespread, ranging from process and design issues/innovations to the experimental and theoretical analysis of the operation and from the performance and reliability of memory devices and arrays to the exploitation of solid-state memories to pursue new computing paradigms

    Semiconductor Memory Applications in Radiation Environment, Hardware Security and Machine Learning System

    Get PDF
    abstract: Semiconductor memory is a key component of the computing systems. Beyond the conventional memory and data storage applications, in this dissertation, both mainstream and eNVM memory technologies are explored for radiation environment, hardware security system and machine learning applications. In the radiation environment, e.g. aerospace, the memory devices face different energetic particles. The strike of these energetic particles can generate electron-hole pairs (directly or indirectly) as they pass through the semiconductor device, resulting in photo-induced current, and may change the memory state. First, the trend of radiation effects of the mainstream memory technologies with technology node scaling is reviewed. Then, single event effects of the oxide based resistive switching random memory (RRAM), one of eNVM technologies, is investigated from the circuit-level to the system level. Physical Unclonable Function (PUF) has been widely investigated as a promising hardware security primitive, which employs the inherent randomness in a physical system (e.g. the intrinsic semiconductor manufacturing variability). In the dissertation, two RRAM-based PUF implementations are proposed for cryptographic key generation (weak PUF) and device authentication (strong PUF), respectively. The performance of the RRAM PUFs are evaluated with experiment and simulation. The impact of non-ideal circuit effects on the performance of the PUFs is also investigated and optimization strategies are proposed to solve the non-ideal effects. Besides, the security resistance against modeling and machine learning attacks is analyzed as well. Deep neural networks (DNNs) have shown remarkable improvements in various intelligent applications such as image classification, speech classification and object localization and detection. Increasing efforts have been devoted to develop hardware accelerators. In this dissertation, two types of compute-in-memory (CIM) based hardware accelerator designs with SRAM and eNVM technologies are proposed for two binary neural networks, i.e. hybrid BNN (HBNN) and XNOR-BNN, respectively, which are explored for the hardware resource-limited platforms, e.g. edge devices.. These designs feature with high the throughput, scalability, low latency and high energy efficiency. Finally, we have successfully taped-out and validated the proposed designs with SRAM technology in TSMC 65 nm. Overall, this dissertation paves the paths for memory technologiesโ€™ new applications towards the secure and energy-efficient artificial intelligence system.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

    Error Characterization and Correction Techniques for Reliable STT-RAM Designs

    Get PDF
    The concerns on the continuous scaling of mainstream memory technologies have motivated tremendous investment to emerging memories. Being a promising candidate, spin-transfer torque random access memory (STT-RAM) offers nanosecond access time comparable to SRAM, high integration density close to DRAM, non-volatility as Flash memory, and good scalability. It is well positioned as the replacement of SRAM and DRAM for on-chip cache and main memory applications. However, reliability issue continues being one of the major challenges in STT-RAM memory designs due to the process variations and unique thermal fluctuations, i.e., the stochastic resistance switching property of magnetic devices. In this dissertation, I decoupled the reliability issues as following three-folds: First, the characterization of STT-RAM operation errors often require expensive Monte-Carlo runs with hybrid magnetic-CMOS simulation steps, making it impracticable for architects and system designs; Second, the state of the art does not have sufficiently understanding on the unique reliability issue of STT-RAM, and conventional error correction codes (ECCs) cannot efficiently handle such errors; Third, while the information density of STT-RAM can be boosted by multi-level cell (MLC) design, the more prominent reliability concerns and the complicated access mechanism greatly limit its applications in memory subsystems. Thus, I present a novel through solution set to both characterize and tackle the above reliability challenges in STT-RAM designs. In the first part of the dissertation, I introduce a new characterization method that can accurately and efficiently capture the multi-variable design metrics of STT-RAM cells; Second, a novel ECC scheme, namely, content-dependent ECC (CD-ECC), is developed to combat the characterized asymmetric errors of STT-RAM at 0->1 and 1->0 bit flipping's; Third, I present a circuit-architecture design, namely state-restricted multi-level cell (SR-MLC) STT-RAM design, which simultaneously achieves high information density, good storage reliability and fast write speed, making MLC STT-RAM accessible for system designers under current technology node. Finally, I conclude that efficient robust (or ECC) designs for STT-RAM require a deep holistic understanding on three different levels-device, circuit and architecture. Innovative ECC schemes and their architectural applications, still deserve serious research and investigation in the near future

    A Statistical STT-RAM Design View and Robust Designs at Scaled Technologies

    Get PDF
    Rapidly increased demands for memory in electronic industry and the significant technical scaling challenges of all conventional memory technologies motivated the researches on the next generation memory technology. As one promising candidate, spin-transfer torque random access memory (STT-RAM) features fast access time, high density, non-volatility, and good CMOS process compatibility. In recent years, many researches have been conducted to improve the storage density and enhance the scalability of STT-RAM, such as reducing the write current and switching time of magnetic tunneling junction (MTJ) devices. In parallel with these efforts, the continuous increasing of tunnel magneto-resistance(TMR) ratio of the MTJ inspires the development of multi-level cell (MLC) STT-RAM, which allows multiple data bits be stored in a single memory cell. Two types of MLC STT-RAM cells, namely, parallel MLC and series MLC, were also proposed. However, like all other nanoscale devices, the performance and reliability of STT-RAM cells are severely affected by process variations, intrinsic device operating uncertainties and environmental fluctuations. The storage margin of a MLC STT-RAM cell, i.e., the distinction between the lowest and highest resistance states, is partitioned into multiple segments for multi-level data representation. As a result, the performance and reliability of MLC STT-RAM cells become more sensitive to the MOS and MTJ device variations and the thermal-induced randomness of MTJ switching. In this work, we systematically analyze the impacts of CMOS and MTJ process variations, MTJ resistance switching randomness that induced by intrinsic thermal fluctuations, and working temperature changes on STT-RAM cell designs. The STT-RAM cell reliability issues in both read and write operations are first investigated. A combined circuit and magnetic simulation platform is then established to quantitatively study the persistent and non-persistent errors in STT-RAM cell operations. Then, we analyzed the extension of STT-RAM cell behaviors from SLC (single-level- cell) to MLC (multi-level- cell). On top of that, we also discuss the optimal device parameters of the MLC MTJ for the minimization of the operation error rate of the MLC STT-RAM cells from statistical design perspective. Our simulation results show that under the current available technology, series MLC STT-RAM demonstrates overwhelming benefits in the read and write reliability compared to parallel MLC STT-RAM and could potentially satisfy the requirement of commercial practices. Finally, with the detail analysis study of STT-RAM cells, we proposed several error reduction design, such as ADAMS structure, and FA-STT structure

    Designs for increasing reliability while reducing energy and increasing lifetime

    Get PDF
    In the last decades, the computing technology experienced tremendous developments. For instance, transistors' feature size shrank to half at every two years as consistently from the first time Moore stated his law. Consequently, number of transistors and core count per chip doubles at each generation. Similarly, petascale systems that have the capability of processing more than one billion calculation per second have been developed. As a matter of fact, exascale systems are predicted to be available at year 2020. However, these developments in computer systems face a reliability wall. For instance, transistor feature sizes are getting so small that it becomes easier for high-energy particles to temporarily flip the state of a memory cell from 1-to-0 or 0-to-1. Also, even if we assume that fault-rate per transistor stays constant with scaling, the increase in total transistor and core count per chip will significantly increase the number of faults for future desktop and exascale systems. Moreover, circuit ageing is exacerbated due to increased manufacturing variability and thermal stresses, therefore, lifetime of processor structures are becoming shorter. On the other side, due to the limited power budget of the computer systems such that mobile devices, it is attractive to scale down the voltage. However, when the voltage level scales to beyond the safe margin especially to the ultra-low level, the error rate increases drastically. Nevertheless, new memory technologies such as NAND flashes present only limited amount of nominal lifetime, and when they exceed this lifetime, they can not guarantee storing of the data correctly leading to data retention problems. Due to these issues, reliability became a first-class design constraint for contemporary computing in addition to power and performance. Moreover, reliability even plays increasingly important role when computer systems process sensitive and life-critical information such as health records, financial information, power regulation, transportation, etc. In this thesis, we present several different reliability designs for detecting and correcting errors occurring in processor pipelines, L1 caches and non-volatile NAND flash memories due to various reasons. We design reliability solutions in order to serve three main purposes. Our first goal is to improve the reliability of computer systems by detecting and correcting random and non-predictable errors such as bit flips or ageing errors. Second, we aim to reduce the energy consumption of the computer systems by allowing them to operate reliably at ultra-low voltage level. Third, we target to increase the lifetime of new memory technologies by implementing efficient and low-cost reliability schemes
    • โ€ฆ
    corecore