1,101 research outputs found

    DRAM Bender: An Extensible and Versatile FPGA-based Infrastructure to Easily Test State-of-the-art DRAM Chips

    To understand and improve DRAM performance, reliability, security, and energy efficiency, prior works study characteristics of commodity DRAM chips. Unfortunately, state-of-the-art open source infrastructures capable of conducting such studies are obsolete, poorly supported, or difficult to use, or their inflexibility limits the types of studies they can conduct. We propose DRAM Bender, a new FPGA-based infrastructure that enables experimental studies on state-of-the-art DRAM chips. DRAM Bender offers three key features at the same time. First, DRAM Bender enables directly interfacing with a DRAM chip through its low-level interface. This allows users to issue DRAM commands in arbitrary order and with finer-grained time intervals compared to other open source infrastructures. Second, DRAM Bender exposes easy-to-use C++ and Python programming interfaces, allowing users to quickly and easily develop different types of DRAM experiments. Third, DRAM Bender is easily extensible. The modular design of DRAM Bender allows extending it to (i) support existing and emerging DRAM interfaces, and (ii) run on new commercial or custom FPGA boards with little effort. To demonstrate that DRAM Bender is a versatile infrastructure, we conduct three case studies, two of which lead to new observations about the DRAM RowHammer vulnerability. In particular, we show that data patterns supported by DRAM Bender uncover a larger set of bit-flips on a victim row compared to the data patterns commonly used by prior work. We demonstrate the extensibility of DRAM Bender by implementing it on five different FPGAs with DDR4 and DDR3 support. DRAM Bender is freely and openly available at https://github.com/CMU-SAFARI/DRAM-Bender. Comment: To appear in TCAD 202

    Energy-Aware Data Movement In Non-Volatile Memory Hierarchies

    While technology scaling enables increased density for memory cells, the intrinsic high leakage power of conventional CMOS technology and the demand for reduced energy consumption inspire the use of emerging technology alternatives such as eDRAM and Non-Volatile Memory (NVM), including STT-MRAM, PCM, and RRAM. The utilization of emerging technology in Last Level Cache (LLC) designs, which occupy a significant fraction of total die area in Chip Multi Processors (CMPs), introduces new dimensions of vulnerability, energy consumption, and performance delivery. To be specific, a part of this research focuses on the eDRAM Bit Upset Vulnerability Factor (BUVF) to assess the vulnerable portion of the eDRAM refresh cycle, where the critical charge varies depending on the write voltage, storage and bit-line capacitance. This dissertation broadens the study on vulnerability assessment of LLC by investigating the impact of Process Variations (PV) on narrow resistive sensing margins in high-density NVM arrays, including on-chip cache and primary memory. Large-latency and power-hungry Sense Amplifiers (SAs) have been adapted to combat PV in the past. Herein, a novel approach is proposed to leverage the PV in NVM arrays using a Self-Organized Sub-bank (SOS) design. SOS engages the preferred SA alternative based on the intrinsic as-built behavior of the resistive sensing timing margin to reduce the latency and power consumption while maintaining acceptable access time. On the other hand, this dissertation investigates a novel technique to prioritize the service to 1) Extensive Read Reused Accessed (ERRA) blocks of the LLC that are silently dropped from higher levels of cache, and 2) the portion of the working set that may exhibit a distant re-reference interval in L2. In particular, we develop a lightweight Multi-level Access History Profiler to efficiently identify ERRA blocks by aggregating the LLC block addresses tagged with identical Most Significant Bits into a single entry. Experimental results indicate that the proposed technique can reduce the L2 read miss ratio by 51.7% on average across PARSEC and SPEC2006 workloads. In addition, this dissertation will broaden and apply advancements in theories of subspace recovery to pioneer computationally-aware in-situ operand reconstruction via the novel Logic In Interconnect (LI2) scheme. LI2 will be developed, validated, and refined both theoretically and experimentally to realize a radically different approach to post-Moore's Law computing by leveraging low-rank matrix features, offering data reconstruction instead of fetching data from main memory to reduce the energy/latency cost per data movement. We propose an LI2 enhancement to attain high performance delivery in the post-Moore's Law era by equipping the contemporary micro-architecture design with a customized memory controller, which orchestrates the memory requests for fetching low-rank matrices to a customized Fine Grain Reconfigurable Accelerator (FGRA) for reconstruction while the other memory requests are serviced as before. The goal of LI2 is to conquer the high latency/energy required to traverse main memory arrays in the case of an LLC miss by using in-situ construction of the requested data when dealing with low-rank matrices. Thus, LI2 exchanges a high volume of data transfers for a novel lightweight reconstruction method under specific conditions using a cross-layer hardware/algorithm approach.
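
The address aggregation mentioned in the abstract above (block addresses with identical most significant bits sharing one profiler entry) can be illustrated with a small sketch. This is only a toy model under stated assumptions, not the dissertation's Multi-level Access History Profiler: the class name, the number of dropped low-order bits, and the reuse threshold are all hypothetical.

```python
from collections import defaultdict


class MsbAggregatingProfiler:
    """Toy access-history profiler (hypothetical): block addresses that share
    the same most significant bits are counted in a single entry."""

    def __init__(self, low_bits: int, reuse_threshold: int):
        self.low_bits = low_bits                # low-order bits dropped (assumed value)
        self.reuse_threshold = reuse_threshold  # reads needed to flag a group (assumed value)
        self.read_counts = defaultdict(int)

    def _entry(self, block_addr: int) -> int:
        # Addresses with identical MSBs collapse to the same key.
        return block_addr >> self.low_bits

    def record_read(self, block_addr: int) -> None:
        self.read_counts[self._entry(block_addr)] += 1

    def is_read_reused(self, block_addr: int) -> bool:
        # .get avoids creating an entry for a group that was never read.
        return self.read_counts.get(self._entry(block_addr), 0) >= self.reuse_threshold


profiler = MsbAggregatingProfiler(low_bits=4, reuse_threshold=3)
for addr in (0x100, 0x104, 0x10F):  # three blocks sharing the upper address bits
    profiler.record_read(addr)
```

Here `profiler.is_read_reused(0x108)` is true even though `0x108` itself was never read, because it falls into the same aggregated entry; that trade of precision for table size is the point of the aggregation.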

    Measuring the Energy Consumption of Software written in C on x86-64 Processors

    In 2016 German data centers consumed 12.4 terawatt-hours of electrical energy, which accounts for about 2% of Germany's total energy consumption in that year. In 2020 this rose to 16 terawatt-hours, or 2.9% of Germany's total energy consumption in that year. The ever-increasing energy consumption of computers consequently leads to considerations of how to reduce it in order to save energy and money and to protect the environment. This thesis aims to answer fundamental questions about the energy consumption of software, e.g., how and how precisely a measurement can be taken, or whether CPU load and energy consumption are correlated. An overview of measurement methods and the related software tooling was created. The most promising approach, using software called 'Scaphandre', was chosen as the main basis and further developed. Different sorting algorithms were benchmarked to study their behavior regarding energy consumption. The resulting dataset was also used to answer the fundamental questions stated in the beginning. A replication and reproduction package was provided to enable the reproducibility of the results.
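
One common way to check whether CPU load and energy consumption are correlated is the Pearson coefficient. The sketch below is a minimal illustration, not the thesis's analysis: the function name is hypothetical, and the sample values are made up (deliberately exactly linear) and are not measurements from the thesis dataset.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up, exactly linear samples for illustration only:
# CPU load in percent, energy per interval in joules.
cpu_load = [10, 25, 40, 60, 85]
energy_j = [12.0, 19.5, 27.0, 37.0, 49.5]
r = pearson_r(cpu_load, energy_j)
```

A value of `r` near 1 would indicate a strong positive linear relationship; real measurements would be noisier than these synthetic points.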

    Programmable built-in self-testing of embedded RAM clusters in system-on-chip architectures

    Multiport memories are widely used as embedded cores in all communication system-on-chip devices. Due to their high complexity and very low accessibility, built-in self-test (BIST) is the most common solution implemented to test the different memories embedded in the system. This article presents a programmable BIST architecture based on a single microprogrammable BIST processor and a set of memory wrappers designed to simplify the test of a system containing a large number of distributed multiport memories of different sizes (number of bits, number of words), access protocols (asynchronous, synchronous), and timing

    Uniform resistive switching memory using localized charge trapping

    Doctoral dissertation (Ph.D.), Seoul National University Graduate School, College of Engineering, Department of Materials Science and Engineering (Hybrid Materials), August 2020. Cheol Seong Hwang. The memristor was first introduced by Professor Chua in 1971 and has been studied by many groups, including Hewlett-Packard (HP), since 2008. Resistive switching memory (ReRAM) has a simple metal-insulator-metal structure and has potential uses in current research topics such as neuromorphic computing, synaptic devices, and logic. Owing to its simple structure, it can be fabricated at low cost, and in a crossbar array it offers a unit cell size of 4F², where F is the minimum feature size. In contrast, DRAM, NAND flash, and NOR flash have unit cell sizes of 6F², 5F², and 10F², respectively. Since the memristor has the smallest unit cell among these memories, it has significant potential to replace NAND flash memory in highly integrated systems. Although vertical NAND flash technology has recently increased integration density, it has limitations. The first is the fabrication difficulty beyond 100 layers. The second is that the high operation voltage of flash memory requires thick insulating layers, so integration can no longer increase once the vertical stack reaches the maximum chip height. ReRAM, on the other hand, has many advantages over flash memory, such as low operation voltage, high integration density, and potential compatibility with vertical devices. Despite these advantages, it suffers from low reproducibility because multiple conductive paths form simultaneously, and the repeated formation and rupture of these paths causes variation in the operation voltage. To address this problem, research is needed not only on high integration but, most importantly, on uniformity of operation across the array. In the first part, Au nanodots were inserted into a Pt/Ta2O5/HfO2/TiN device to improve cell-to-cell and cycle-to-cycle variation. Switching in this device arises from electrons being trapped in and detrapped from shallow trap sites in the HfO2 layer: the device shows the low resistance state when electrons are trapped and the high resistance state when they are detrapped. In addition, when Ta2O5 is deposited on the HfO2, the plasma creates deep trap states, which act as a conducting path. Inserting Au nanodots in this layer assists the conducting path and improves memory switching through the electric field concentration effect. The device without Au nanodots endured around 200 cycles, whereas the device with Au nanodots showed stable switching for more than 1000 cycles. The device with Au nanodots was also capable of multi-level operation, with 8 stable current levels obtained by controlling the number of trapped electrons through the compliance current. In the second part, the electrical switching behavior was examined as a function of the location of the inserted Au nanodots, and the electric field concentration was studied with COMSOL simulation. Nanodots were inserted at two different locations: within the atomic-layer-deposited HfO2 film and within the Ta2O5 film. Ta2O5 is known not to take part in the resistive switching; instead, it forms a Schottky barrier with the high-work-function Pt top electrode, which gives diode-like self-rectifying behavior. Inserting Au nanodots into the Ta2O5 film was therefore not expected to affect switching; however, endurance improved in this case as well. COMSOL simulation showed that when the nanodots are located sufficiently far from the interface, the field concentration effect weakens and the endurance improvement fades with it. This result experimentally confirms that switching occurs at the interface. In the third part, Au nanodots were formed in localized areas using electron-beam (e-beam) lithography. There are many methods of forming nanodots, such as those using AAO or spherical nanostructures, but they cannot control the size or distribution of the nanodots. The distribution of the nanodots is an important factor because it can cause cell-to-cell variation. To control both factors, e-beam lithography was used. Au nanodots were fabricated by e-beam exposure, deposition of an Au thin film, and a subsequent lift-off process. To achieve small Au nanodots, reducing the stress on the Au thin film and finely controlling the e-beam dose were important. Stress was reduced by controlling the sidewall slope of the photoresist (PR) in the exposure process: two layers of PMMA with different molecular weights were deposited to create an undercut PR profile. The e-beam dose, which determines the number of electrons delivered to the PR layer, was also important: too low a dose did not react enough of the resist to form the pattern, whereas too high a dose broadened the pattern. Fine control of the dose was therefore necessary. As a result, Au nanodots with a minimum size of 50 nm could be fabricated. After insertion of the Au nanodots, atomic force microscopy (AFM) was used to locate the conductive paths on the surface. The conductive paths appeared at the nanodots, which confirmed that the electric field concentration was successfully induced. This field concentration around the nanodots therefore improved the switching properties. Contents: 1. Introduction; 1.1. Resistive switching Random Access Memory; 1.2. Critical factor for a high-density array; 1.3. Research scope and objective; 2. Improvement of resistive switching uniformity by embedding Au nanodots in the Pt/Ta2O5/HfO2/TiN structure; 2.1. Introduction; 2.2. Experimental; 2.3. Results and Discussions; 2.4. Summary; 3. Effect of electric field concentration depending on the location of Au nanodots in the device; 3.1. Introduction; 3.2. Experimental; 3.3. Results and Discussions; 3.4. Summary; 4. Quantification of Au nanodots in the nanoscale devices; 4.1. Introduction; 4.2. Experimental; 4.3. Results and Discussion; 4.4. Summary; Conclusion; Bibliography; List of publications; Abstract (in Korean).

    Energy Measurements of High Performance Computing Systems: From Instrumentation to Analysis

    Energy efficiency is a major criterion for computing in general and High Performance Computing in particular. When optimizing for energy efficiency, it is essential to measure the underlying metric: energy consumption. To fully leverage energy measurements, their quality needs to be well-understood. To that end, this thesis provides a rigorous evaluation of various energy measurement techniques. I demonstrate how the deliberate selection of instrumentation points, sensors, and analog processing schemes can enhance the temporal and spatial resolution while preserving a well-known accuracy. Further, I evaluate a scalable energy measurement solution for production HPC systems and address its shortcomings. Such high-resolution and large-scale measurements present challenges regarding the management of large volumes of generated metric data. I address these challenges with a scalable infrastructure for collecting, storing, and analyzing metric data. With this infrastructure, I also introduce a novel persistent storage scheme for metric time series data, which allows efficient queries for aggregate timelines. To ensure that it satisfies the demanding requirements for scalable power measurements, I conduct an extensive performance evaluation and describe a productive deployment of the infrastructure. Finally, I describe different approaches and practical examples of analyses based on energy measurement data. In particular, I focus on the combination of energy measurements and application performance traces. However, interweaving fine-grained power recordings and application events requires accurately synchronized timestamps on both sides. To overcome this obstacle, I develop a resilient and automated technique for time synchronization, which utilizes crosscorrelation of a specifically influenced power measurement signal. 
Ultimately, this careful combination of sophisticated energy measurements and application performance traces yields a detailed insight into application and system energy efficiency at full-scale HPC systems and down to millisecond-range regions. Contents: 1 Introduction; 2 Background and Related Work; 2.1 Basic Concepts of Energy Measurements; 2.1.1 Basics of Metrology; 2.1.2 Measuring Voltage, Current, and Power; 2.1.3 Measurement Signal Conditioning and Analog-to-Digital Conversion; 2.2 Power Measurements for Computing Systems; 2.2.1 Measuring Compute Nodes using External Power Meters; 2.2.2 Custom Solutions for Measuring Compute Node Power; 2.2.3 Measurement Solutions of System Integrators; 2.2.4 CPU Energy Counters; 2.2.5 Using Models to Determine Energy Consumption; 2.3 Processing of Power Measurement Data; 2.3.1 Time Series Databases; 2.3.2 Data Center Monitoring Systems; 2.4 Influences on the Energy Consumption of Computing Systems; 2.4.1 Processor Power Consumption Breakdown; 2.4.2 Energy-Efficient Hardware Configuration; 2.5 HPC Performance and Energy Analysis; 2.5.1 Performance Analysis Techniques; 2.5.2 HPC Performance Analysis Tools; 2.5.3 Combining Application and Power Measurements; 2.6 Conclusion; 3 Evaluating and Improving Energy Measurements; 3.1 Description of the Systems Under Test; 3.2 Instrumentation Points and Measurement Sensors; 3.2.1 Analog Measurement at Voltage Regulators; 3.2.2 Instrumentation with Hall Effect Transducers; 3.2.3 Modular Instrumentation of DC Consumers; 3.2.4 Optimal Wiring for Shunt-Based Measurements; 3.2.5 Node-Level Instrumentation for HPC Systems; 3.3 Analog Signal Conditioning and Analog-to-Digital Conversion; 3.3.1 Signal Amplification; 3.3.2 Analog Filtering and Analog-To-Digital Conversion; 3.3.3 Integrated Solutions for High-Resolution Measurement; 3.4 Accuracy Evaluation and Calibration; 3.4.1 Synthetic Workloads for Evaluating Power Measurements; 3.4.2 Improving and Evaluating the Accuracy of a Single-Node Measuring System; 3.4.3 Absolute Accuracy Evaluation of a Many-Node Measuring System; 3.5 Evaluating Temporal Granularity and Energy Correctness; 3.5.1 Measurement Signal Bandwidth at Different Instrumentation Points; 3.5.2 Retaining Energy Correctness During Digital Processing; 3.6 Evaluating CPU Energy Counters; 3.6.1 Energy Readouts with RAPL; 3.6.2 Methodology; 3.6.3 RAPL on Intel Sandy Bridge-EP; 3.6.4 RAPL on Intel Haswell-EP and Skylake-SP; 3.7 Conclusion; 4 A Scalable Infrastructure for Processing Power Measurement Data; 4.1 Requirements for Power Measurement Data Processing; 4.2 Concepts and Implementation of Measurement Data Management; 4.2.1 Message-Based Communication between Agents; 4.2.2 Protocols; 4.2.3 Application Programming Interfaces; 4.2.4 Efficient Metric Time Series Storage and Retrieval; 4.2.5 Hierarchical Timeline Aggregation; 4.3 Performance Evaluation; 4.3.1 Benchmark Hardware Specifications; 4.3.2 Throughput in Symmetric Configuration with Replication; 4.3.3 Throughput with Many Data Sources and Single Consumers; 4.3.4 Temporary Storage in Message Queues; 4.3.5 Persistent Metric Time Series Request Performance; 4.3.6 Performance Comparison with Contemporary Time Series Storage Solutions; 4.3.7 Practical Usage of MetricQ; 4.4 Conclusion; 5 Energy Efficiency Analysis; 5.1 General Energy Efficiency Analysis Scenarios; 5.1.1 Live Visualization of Power Measurements; 5.1.2 Visualization of Long-Term Measurements; 5.1.3 Integration in Application Performance Traces; 5.1.4 Graphical Analysis of Application Power Traces; 5.2 Correlating Power Measurements with Application Events; 5.2.1 Challenges for Time Synchronization of Power Measurements; 5.2.2 Reliable Automatic Time Synchronization with Correlation Sequences; 5.2.3 Creating a Correlation Signal on a Power Measurement Channel; 5.2.4 Processing the Correlation Signal and Measured Power Values; 5.2.5 Common Oversampling of the Correlation Signals at Different Rates; 5.2.6 Evaluation of Correlation and Time Synchronization; 5.3 Use Cases for Application Power Traces; 5.3.1 Analyzing Complex Power Anomalies; 5.3.2 Quantifying C-State Transitions; 5.3.3 Measuring the Dynamic Power Consumption of HPC Applications; 5.4 Conclusion; 6 Summary and Outloo
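
The time synchronization described in the abstract above relies on cross-correlating a deliberately influenced power signal with a known pattern. The following is a minimal sketch of that general idea, not the thesis's implementation: the function name, the on/off load pattern, and the idle and busy power levels are hypothetical values chosen for illustration, and the sketch assumes both signals share one sampling rate.

```python
def cross_correlation_lag(reference, measured):
    """Return the shift of `reference` inside `measured` that gives the
    highest mean-removed cross-correlation score."""
    n = len(reference)
    ref_mean = sum(reference) / n
    ref = [r - ref_mean for r in reference]
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(measured) - n + 1):
        window = measured[lag:lag + n]
        win_mean = sum(window) / n
        score = sum(r * (w - win_mean) for r, w in zip(ref, window))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical on/off load pattern that an application imprints on its power draw.
pattern = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
idle_watts, busy_watts = 50.0, 120.0  # assumed power levels
true_offset = 7                       # samples before the pattern starts
power_trace = (
    [idle_watts] * true_offset
    + [busy_watts if bit else idle_watts for bit in pattern]
    + [idle_watts] * 10
)
estimated_offset = cross_correlation_lag(pattern, power_trace)
```

The recovered lag is the offset, in samples, between the application's timeline and the power recording. A real setup would also have to handle noise, differing sampling rates, and clock drift, which this sketch ignores.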

    Design for Test and Hardware Security Utilizing Tester Authentication Techniques

    Design-for-Test (DFT) techniques have been developed to improve the testability of integrated circuits. Among the known DFT techniques, scan-based testing is considered an efficient solution for digital circuits. However, the scan architecture can be exploited to launch a side-channel attack. Scan chains can be used to access a cryptographic core inside a system-on-chip to extract critical information such as a private encryption key. For a scan-enabled chip, if an attacker is given unlimited access to apply all sorts of inputs to the Circuit-Under-Test (CUT) and observe the outputs, the probability of gaining access to critical information increases. In this thesis, solutions are presented to improve hardware security and protect chips against attacks that use the scan architecture. A solution based on tester authentication is presented in which the CUT requests the tester to provide a secret code for authentication. The tester authentication circuit limits access to the scan architecture to known testers. Moreover, in the proposed solution the number of attempts to apply test vectors and observe the results through the scan architecture is limited to make brute-force attacks practically impossible. A tester authentication scheme utilizing a Phase Locked Loop (PLL) to encrypt the operating frequency of both the DUT and the tester has also been presented. In this method, access to critical security circuits such as crypto-cores is not granted in the test mode. Instead, a built-in self-test method is used in the test mode to protect the circuit against scan-based attacks. Security for the new generation of three-dimensional (3D) integrated circuits has been investigated through 3D simulations in the COMSOL Multiphysics environment. It is shown that the process of wafer thinning for 3D stacked IC integration reduces the leakage current, which increases the chip security against side-channel attacks.
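
The tester-authentication idea in the abstract above (a secret code gates scan access, and the number of attempts is bounded) can be modeled in software as follows. This is a behavioral sketch under stated assumptions, not the thesis's circuit: the class and method names are hypothetical, the attempt limit is applied to authentication tries, and the scan operation is reduced to a placeholder.

```python
import hmac


class ScanAccessGuard:
    """Behavioral model (hypothetical) of a scan port that unlocks only for a
    tester presenting the secret code, with a bounded number of failed tries."""

    def __init__(self, secret_code: bytes, max_attempts: int = 3):
        self._secret = secret_code
        self._remaining = max_attempts  # assumed limit on failed attempts
        self.unlocked = False

    def authenticate(self, code: bytes) -> bool:
        if self._remaining <= 0:
            return False  # locked out: further codes are ignored
        if hmac.compare_digest(code, self._secret):
            self.unlocked = True
            return True
        self._remaining -= 1
        return False

    def scan_shift(self, bits):
        # Placeholder for shifting a test vector through the scan chain.
        if not self.unlocked:
            raise PermissionError("scan chain locked: tester not authenticated")
        return list(bits)


guard = ScanAccessGuard(b"k3y", max_attempts=2)
results = [guard.authenticate(b"bad"), guard.authenticate(b"bad2"), guard.authenticate(b"k3y")]
```

After two wrong codes the guard stays locked even when the correct code arrives, which is what bounds a brute-force search over the code space.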