8 research outputs found

    Voltage and Retention Storage Allocation Problems for Static RAM (SRAM) and Power Gated Circuits

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2021. Taewhan Kim.
Low-power operation of a chip is an important issue, and its importance grows as process technology advances. This dissertation addresses methodologies for low-power operation of both the SRAM and the logic that constitute a chip. 
First, we propose a methodology that uses measurements from a monitoring circuit to infer the minimum operating voltage at which no SRAM failure occurs in any SRAM block of a chip operating in the near-threshold voltage (NTV) regime. Operating a chip in the NTV regime is one of the most effective ways to increase energy efficiency, but SRAM failures make it difficult to lower the operating voltage of SRAM. However, because process variation differs from chip to chip, the minimum operating voltage also differs for each chip. If the minimum operating voltage of the SRAM blocks of each chip can be inferred through monitoring, energy efficiency can be increased by applying a different voltage to each chip. In this dissertation, we propose a new methodology for resolving this problem. Specifically, (1) we propose to infer the minimum operating voltage of SRAM in the design infrastructure development phase and to assign the voltage using SRAM monitor measurements in the silicon production phase; (2) we define an SRAM monitor, and the features to be monitored, that can capture process variation in SRAM blocks, including the SRAM bitcells and peripheral circuits; (3) we propose a new methodology for inferring the minimum operating voltage of the SRAM blocks in a chip at which read, write, and access failures do not occur under a target confidence level. Experiments with benchmark circuits confirm that applying to the SRAM blocks of each chip the voltage inferred by our methodology reduces the overall power consumption of the SRAM bitcell array compared with applying the same voltage to the SRAM blocks of all chips, while meeting the same yield target. Second, we propose a methodology that resolves a problem of conventional retention storage allocation methods and thereby further reduces the leakage power consumption of power-gated circuits. 
Conventional retention storage allocation methods cannot fully exploit the advantage of multi-bit retention storage, because they must always allocate retention storage to flip-flops with a mux-feedback loop. In this dissertation, we propose a new methodology that breaks this bottleneck in minimizing state retention storage. Specifically, (1) we find a condition under which a mux-feedback loop can be disregarded during retention storage allocation; (2) using this condition, we minimize the retention storage of circuits that contain many flip-flops with mux-feedback loops; (3) we find a condition under which some of the retention storage already allocated to flip-flops can be removed, and we use it to further reduce the retention storage. Experiments with benchmark circuits confirm that our methodology allocates less retention storage than the state-of-the-art methods, occupying less cell area and consuming less power.
1 Introduction
  1.1 Low Voltage SRAM Monitoring Methodology
  1.2 Retention Storage Allocation on Power Gated Circuit
  1.3 Contributions of this Dissertation
2 SRAM On-Chip Monitoring Methodology for High Yield and Energy Efficient Memory Operation at Near Threshold Voltage
  2.1 SRAM Failures
    2.1.1 Read Failure
    2.1.2 Write Failure
    2.1.3 Access Failure
    2.1.4 Hold Failure
  2.2 SRAM On-chip Monitoring Methodology: Bitcell Variation
    2.2.1 Overall Flow
    2.2.2 SRAM Monitor and Monitoring Target
    2.2.3 Vfail to Vddmin Inference
  2.3 SRAM On-chip Monitoring Methodology: Peripheral Circuit IR Drop and Variation
    2.3.1 Consideration of IR Drop
    2.3.2 Consideration of Peripheral Circuit Variation
    2.3.3 Vddmin Prediction including Access Failure Prohibition
  2.4 Experimental Results
    2.4.1 Vddmin Considering Read and Write Failures
    2.4.2 Vddmin Considering Read/Write and Access Failures
    2.4.3 Observation for Practical Use
3 Allocation of Always-On State Retention Storage for Power Gated Circuits - Steady State Driven Approach
  3.1 Motivations and Analysis
    3.1.1 Impact of Self-loop on Power Gating
    3.1.2 Circuit Behavior Before Sleeping
    3.1.3 Wakeup Latency vs. Retention Storage
  3.2 Steady State Driven Retention Storage Allocation
    3.2.1 Extracting Steady State Self-loop FFs
    3.2.2 Allocating State Retention Storage
    3.2.3 Designing and Optimizing Steady State Monitoring Logic
    3.2.4 Analysis of the Impact of Steady State Monitoring Time on the Standby Power
  3.3 Retention Storage Refinement Utilizing Steadiness
    3.3.1 Extracting Flip-flops for Retention Storage Refinement
    3.3.2 Designing State Monitoring Logic and Control Signals
  3.4 Experimental Results
    3.4.1 Comparison of State Retention Storage
    3.4.2 Comparison of Power Consumption
    3.4.3 Impact on Circuit Performance
    3.4.4 Support for Immediate Power Gating
4 Conclusions
  4.1 Chapter 2
  4.2 Chapter 3
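The benefit claimed in the abstract above (per-chip SRAM voltages instead of one shared voltage) can be illustrated with a rough sketch. This is not the dissertation's inference method: the function name `array_power_ratio`, the example voltages, and the assumption that array power scales with the square of the supply voltage are all hypothetical simplifications chosen for illustration.

```python
def array_power_ratio(vddmin_per_chip, guard_band=0.0):
    """Return power with per-chip voltages relative to one shared voltage.

    Assumption (illustrative only): power scales with the square of the
    supply voltage. `vddmin_per_chip` holds an already-inferred minimum
    operating voltage for each chip; how it is inferred is out of scope.
    """
    if not vddmin_per_chip:
        raise ValueError("need at least one chip")
    per_chip = [v + guard_band for v in vddmin_per_chip]
    # A single shared voltage must be safe for the worst chip.
    uniform = max(per_chip)
    adaptive_power = sum(v ** 2 for v in per_chip)
    uniform_power = len(per_chip) * uniform ** 2
    return adaptive_power / uniform_power


# Hypothetical example: two chips whose minimum voltages differ.
ratio = array_power_ratio([0.5, 1.0])
```

A ratio below 1.0 means the per-chip assignment uses less power than the shared worst-case voltage under this simplified model; when all chips share the same minimum voltage the ratio is exactly 1.0.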

    Optimization of Cell-Aware Test


    Optimization of Cell-Aware Test


    Research on High Reliability and Low Power Consumption of Embedded SRAM in Advanced Process Technologies

    13301 Kō No. 4843, Doctor of Engineering, Kanazawa University doctoral dissertation, main text, Ful

    Hardware / Software Architectural and Technological Exploration for Energy-Efficient and Reliable Biomedical Devices

    Nowadays, the ubiquity of smart appliances in our everyday lives is increasingly strengthening the links between humans and machines. Beyond making our lives easier and more convenient, smart devices are now playing an important role in personalized healthcare delivery. This technological breakthrough is particularly relevant in a world where population aging and unhealthy habits have made non-communicable diseases the leading cause of death worldwide, according to international public health organizations. In this context, smart health monitoring systems, termed Wireless Body Sensor Nodes (WBSNs), represent a paradigm shift in the healthcare landscape by greatly lowering the cost of long-term monitoring of chronic diseases, as well as improving patients' lifestyles. WBSNs are able to autonomously acquire biological signals and embed on-node Digital Signal Processing (DSP) capabilities to deliver clinically accurate health diagnoses in real time, even outside of a hospital environment. Energy efficiency and reliability are fundamental requirements for WBSNs, since they must operate for extended periods of time while relying on compact batteries. These constraints, in turn, require carefully designed hardware and software architectures for hosting the execution of complex biomedical applications. In this thesis, I develop and explore novel solutions at the architectural and technological level of the integrated circuit design domain, to enhance the energy efficiency and reliability of current WBSNs. Firstly, following a top-down approach driven by the characteristics of biomedical algorithms, I perform an architectural exploration of a heterogeneous and reconfigurable computing platform devoted to bio-signal analysis. By interfacing a shared Coarse-Grained Reconfigurable Array (CGRA) accelerator, this domain-specific platform can achieve higher performance and energy savings, beyond the capabilities offered by a baseline multi-processor system. 
More precisely, I propose three CGRA architectures, each contributing differently to maximizing application parallelism. The proposed Single, Multi and Interleaved-Datapath CGRA designs allow the developed platform to achieve substantial energy savings of up to 37% when executing complex biomedical applications, with respect to a multi-core-only platform. Secondly, I investigate how the modeling of technology reliability issues in logic and memory components can be exploited to adequately adjust the frequency and supply voltage of a circuit, with the aim of optimizing its computing performance and energy efficiency. To this end, I propose a novel framework for analyzing the workload-dependent impact of Bias Temperature Instability (BTI) on the quality of biomedical application results. Remarkably, the framework is able to determine the range of safe circuit operating frequencies without introducing worst-case guard bands. Experiments highlight the possibility of safely raising the frequency up to 101% above the maximum obtained with classical static timing analysis. Finally, through the study of several well-known biomedical algorithms, I propose an approach that saves energy by dynamically and unequally protecting an under-powered data memory, in contrast to regular error protection schemes. This solution relies on the Dynamic eRror compEnsation And Masking (DREAM) technique, which reduces the energy consumed by traditional error correction codes by approximately 21%.

    Harnessing noise to enhance robustness vs. efficiency trade-off in machine learning

    While deep nets have achieved human-comparable accuracy in various classification tasks, they fall significantly short in terms of robustness and cost metrics. For example, tiny engineered corruptions in deep net inputs can reduce their accuracy to zero. Furthermore, deep nets also require millions of trainable parameters, resulting in significant training and inference costs. These robustness and cost challenges are well recognized today. In response, there has been a plethora of works focusing on improving either the accuracy vs. robustness trade-off, or the accuracy vs. cost trade-off. However, simultaneous consideration of accuracy, robustness, and cost metrics is largely absent today, in part because far fewer works have explored the robustness vs. cost trade-off. This dissertation aims to fill this gap by focusing explicitly on the robustness vs. cost trade-off in the presence of data noise, as well as hardware noise. Specifically, we explore how to harness the noise in order to enhance this trade-off. We characterize and improve robustness vs. cost trade-offs across diverse problem settings, ranging from beyond-CMOS hardware implementations of machine learning (ML) classifiers to efficient training of deep nets that are robust to multiple types of corruptions in their inputs. This dissertation can be roughly divided into two parts, one focusing on hardware noise and the other on data noise. In the first part, we start by focusing on harnessing noise in spintronic hardware implementations, where the logic gates become error prone when operated at lower switching energy/delay. We propose techniques to shape the resulting hardware noise distribution and to efficiently compensate for it at the system-level output. As a result, we observe a 1000x improvement in tolerance to gate-level switching error rates, while keeping the area/energy overhead of compensation circuits to as low as 15%. 
These robustness enhancements further enable a 3× reduction in iso-throughput energy consumption of a binary ML classifier employed for EEG-based seizure detection. Building on this work, we propose spintronic channel networks, which use the exponential decay of spin current to efficiently realize multi-bit dot product computation. We employ error-prone nanomagnets as efficient stochastic slicers biased by spin currents proportional to the likelihood of the classification decision. We achieve 112x-to-22.5x and 14x-to-2.5x higher energy efficiency over conventional spin-based and 20 nm CMOS designs, respectively, when realizing 10-to-100-dimensional binary classifiers. Furthermore, we also consider the impact of hardware noise originating from process variations and readout circuits in in-memory computing implementations employing non-volatile resistive crossbar arrays. Based on our analysis, we identify design configurations achieving the highest signal-to-noise ratio (SNR), and further estimate how such robustness trades off with the array energy consumption. In the second part, we switch gears to improve the robustness vs. cost trade-off for deep nets in the presence of data noise. Specifically, we focus on the impact of adversarial perturbations in deep net inputs. We propose and validate hypotheses about the orientations of dominant subspaces of adversarial perturbations. We demonstrate how changes in the curvature of the decision boundary of deep nets affect the orientations of the adversarial perturbations. Based on these insights, we demonstrate how shaped noise can be introduced as a feature to enhance the robustness vs. cost trade-off in deep nets. Specifically, we propose shaped noise augmented processing (SNAP), a method to efficiently train deep nets that are robust to multiple types of adversarial perturbations simultaneously. 
SNAP prepends a deep net with a shaped noise augmentation layer whose distribution is learned along with the network parameters using any established robust training framework. Based on extensive comparisons with nine state-of-the-art (SOTA) robust training frameworks, we show that SNAP achieves the best robustness vs. training cost trade-off. In particular, it enables a 4x reduction in training cost compared to the SOTA approach published just this last year. Furthermore, thanks to the computational simplicity of SNAP, it is the first technique of its kind that is scalable to large datasets, such as ImageNet.

    A Holistic Solution for Reliability of 3D Parallel Systems

    As device scaling slows down, emerging technologies such as 3D integration and carbon nanotube field-effect transistors are among the most promising solutions to increase device density and performance. These emerging technologies offer shorter interconnects, higher performance, and lower power. However, higher operating temperatures and current densities lead to significantly higher projected failure rates. Moreover, because the manufacturing process is immature, with high variation and defect densities, chip designers are not encouraged to consider these emerging technologies as a stand-alone replacement for silicon-based transistors. The goal of this dissertation is to introduce new architectural and circuit techniques that can work around high fault rates in emerging 3D technologies, achieving performance and reliability comparable to silicon. We propose a new holistic approach to the reliability problem that addresses the necessary aspects of an effective solution, namely detection, diagnosis, repair, and prevention, synergistically. By leveraging 3D fabric layouts, we propose an underlying architecture that efficiently repairs the system in the presence of faults. This thesis presents a fault detection scheme that re-executes instructions on idle identical units, distinguishing between transient and permanent faults while localizing them to the granularity of a pipeline stage. Furthermore, using a dynamic and adaptive reconfiguration policy based on activity factors and temperature variation, we propose a framework that delivers a significant improvement in lifetime management to prevent faults due to aging. Finally, we present a design framework that can be used for large-scale chip production while mitigating yield and variation failures to bring up carbon nanotube-based technology. 
The proposed framework is capable of efficiently supporting high-variation technologies by providing protection against manufacturing defects at different granularities: module and pipeline-stage levels.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/168118/1/javadb_1.pd
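The re-execution idea mentioned in the abstract above (running the same work on an idle identical unit to tell transient from permanent faults) can be sketched roughly as follows. This is a software analogy under stated assumptions, not the thesis's hardware scheme: `classify_fault`, its `retries` parameter, and the use of plain callables as stand-ins for hardware units are hypothetical, the spare unit is assumed to be fault-free, and localization to a pipeline stage is not modeled.

```python
def classify_fault(op, primary, spare, retries=2):
    """Classify a fault on `primary` by re-executing `op` on an idle `spare`.

    `primary` and `spare` are callables standing in for identical units.
    Returns "none", "transient", or "permanent". Assumes `spare` is
    fault-free (a simplification made for this sketch).
    """
    reference = spare(op)
    if primary(op) == reference:
        return "none"
    # Mismatch seen: retry on the primary to see whether the error persists.
    for _ in range(retries):
        if primary(op) == reference:
            return "transient"
    return "permanent"


# Hypothetical units: a healthy one, one stuck at zero, one that glitches once.
def healthy(op):
    return op * 2


def stuck(op):
    return 0


glitch_calls = {"n": 0}


def glitchy(op):
    glitch_calls["n"] += 1
    return -1 if glitch_calls["n"] == 1 else op * 2
```

Retrying on the same unit is one simple way to separate a one-off upset from a persistent defect; a real design would also have to bound how many retries are acceptable before the unit is taken out of service.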

    Design for prognostics and security in field programmable gate arrays (FPGAs).

    There is an evolutionary progression of Field Programmable Gate Arrays (FPGAs) toward more complex and high power density architectures such as Systems-on-Chip (SoC) and Adaptive Compute Acceleration Platforms (ACAP). Primarily, this is attributable to continual transistor miniaturisation and more innovative and efficient IC manufacturing processes. Concurrently, the degradation mechanism of Bias Temperature Instability (BTI) has become more pronounced with respect to its ageing impact. It could weaken the reliability of VLSI devices, FPGAs in particular due to their run-time reconfigurability. At the same time, the vulnerability of FPGAs to device-level attacks in the increasing cyber and hardware threat environment is also quadrupling, as the susceptible reliability realm opens the door for rogue elements to intervene. Insertion of highly stealthy and malicious circuitry, called hardware Trojans, in FPGAs is one such malicious intervention. While such attacks adversely affect the security of these devices, they also substantially undermine their reliability. Hitherto, security and reliability have been treated as two separate entities impacting FPGA health. This has resulted in fragmented solutions that do not reflect the true state of FPGA operational and functional readiness, thereby making FPGAs even more prone to hardware attacks. The recent episodes of the Spectre and Meltdown vulnerabilities are some of the key examples. This research addresses these concerns by adopting an integrated approach and investigating FPGA security and reliability as two inter-dependent entities with an additional dimension of health estimation/prognostics. 
The design and implementation of a small footprint frequency and threshold voltage-shift detection sensor, a novel hardware Trojan, and an online transistor dynamic scaling circuitry present a viable FPGA security scheme that helps build a strong microarchitectural level defence against unscrupulous hardware attacks. Augmented with an efficient Kernel-based learning technique for FPGA health estimation/prognostics, the optimal integrated solution proves to be more dependable and trustworthy than the prevalent disjointed approach.
Samie, Mohammad (Associate). PhD in Transport System