31 research outputs found

    A Novel Methodology for Error-Resilient Circuits in Near-Threshold Computing

    Department of Electrical Engineering. The main goal of VLSI system design is high performance with low energy consumption. Human-centered technologies such as the Internet of Things (IoT) and wearable devices in particular require efficient power management. Near-threshold computing (NTC) is one of the best-known techniques for trading off energy consumption against performance. With this technique, the operating point is chosen to give the lowest energy at the highest achievable performance. However, NTC suffers significant performance degradation and is therefore prone to timing errors. Conventional integrated circuit (IC) design makes the circuit operate correctly even under worst-case conditions, but guaranteeing this can incur considerable area and power overheads. As an alternative, the better-than-worst-case (BTWC) design paradigm has been proposed. A main element of BTWC design is error-resilient circuits, which detect and correct timing errors, although they too cause area and power overheads. In this thesis, we propose design methodologies that provide an optimal implementation of error-resilient circuits. A slack-based methodology, a sensitivity-based methodology, and a modified Quine-McCluskey (Q-M) algorithm are used to obtain the minimum set of error-resilient circuits without any loss of detection ability. For the sensitivity-based methodology, benchmark results show that the optimal design reduces monitoring area by up to 46% without compromising the error detection ability of the initial error-resilient design. For the Q-M algorithm, benchmark results show that the optimal design reduces the number of flip-flops that must be changed to error-resilient circuits by up to 72% without compromising error detection ability. Further power and area reduction is possible when a reasonable underestimation of error detection ability is accepted. Monte Carlo analysis validates that the proposed method is tolerant to process variation.
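    One way to read the goal of a "minimum set of error-resilient circuits without any loss of detection ability" is as a covering problem, similar in spirit to the prime-implicant chart step of Quine-McCluskey. The sketch below is not the thesis's modified Q-M algorithm. It is a brute-force illustration under the assumption that each candidate flip-flop detects errors on a known set of critical paths; the `coverage` format, the flip-flop names, and the path names are all hypothetical.

```python
from itertools import combinations


def minimum_monitor_set(coverage):
    """Return a smallest tuple of flip-flops whose monitored paths cover every path.

    `coverage` maps a flip-flop name to the set of timing-critical paths whose
    errors it would detect if converted to an error-resilient cell
    (hypothetical input format, assumed for illustration).
    """
    all_paths = set().union(*coverage.values())
    names = sorted(coverage)
    # Try subsets in order of increasing size, so the first hit is a minimum cover.
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            covered = set().union(*(coverage[ff] for ff in subset))
            if covered == all_paths:
                return subset
    return tuple(names)


# Made-up example: five candidate flip-flops and four critical paths.
example = {
    "ff0": {"p0", "p1"},
    "ff1": {"p1", "p2"},
    "ff2": {"p2", "p3"},
    "ff3": {"p0", "p3"},
    "ff4": {"p1"},
}
chosen = minimum_monitor_set(example)
print(chosen)
```

    Exhaustive search is exponential in the number of candidates, so it only suits small illustrations; it is used here because it makes the "minimum set, same detection ability" criterion explicit.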

    Low-Power and Error-Resilient VLSI Circuits and Systems.

    Efficient low-power operation is critically important for the success of next-generation signal processing applications. Device dimensions and supply voltage have been continuously scaled to meet a more constrained power envelope, but scaling has created resiliency challenges, including increasing timing faults and soft errors. Our research aims at designing low-power and robust circuits and systems for signal processing by drawing on circuit, architecture, and algorithm approaches. To gain insight into the system faults caused by supply voltage reduction, we researched the two primary effects that determine the minimum supply voltage (VMIN) in Intel's tri-gate CMOS technology, namely process variations and gate-dielectric soft breakdown. We determined that voltage scaling widens the timing window in which sequential circuits are vulnerable. We therefore proposed a new hold-time violation metric to define hold-time VMIN, which has been adopted as a new design standard. Device scaling increases soft errors, which affect circuit reliability. Through extensive soft error characterization using two 65nm CMOS test chips, we studied the soft error mechanisms and their dependence on supply voltage and clock frequency. This study laid the foundation of the first 65nm DSP chip design for a NASA spaceflight project. To mitigate such random errors, we proposed a new confidence-driven architecture that effectively enhances the error resiliency of deeply scaled CMOS and post-CMOS circuits. Designing low-power resilient systems can effectively leverage application-specific algorithmic approaches. To explore design opportunities in the algorithmic domain, we demonstrate an application-specific detection and decoding processor for multiple-input multiple-output (MIMO) wireless communication. To improve the receive error rate for robust wireless communication, we designed a joint detection and decoding technique that encloses detection and decoding in an iterative loop to enhance both interference cancellation and error reduction. A proof-of-concept chip design was fabricated for next-generation 4x4 256QAM MIMO systems. Through algorithm-architecture optimizations and low-power circuit techniques, our design achieves significant improvements in throughput, energy efficiency and error rate, paving the way for future developments in this area.
    PhD, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/110323/1/uchchen_1.pd

    Timing Analysis and Optimization in Logic and Physical Synthesis

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2020. Taewhan Kim.
    Timing analysis is a necessary step in the development of a semiconductor circuit, and it is increasingly important in advanced process technologies due to various factors, including increased process-voltage-temperature variation. This dissertation addresses three problems related to timing analysis and optimization in logic and physical synthesis. First, most static timing analysis today is based on conventional fixed flip-flop timing models, in which every flip-flop is assumed to have a fixed clock-to-Q delay. In reality, however, setup and hold skews affect the clock-to-Q delay. In this dissertation, I propose a mathematical formulation of this dependence and apply it, with a scalable speedup technique, to clock skew scheduling as well as to the analysis of a given circuit. Second, near-threshold computing is a promising concept for energy-efficient operation of VLSI systems, but wide performance variation and nonlinearity with respect to process variations hinder its adoption. To cope with this, I propose a holistic hardware performance monitoring methodology for accurate timing prediction in the near-threshold voltage regime and advanced process technology. Lastly, asynchronous circuits are an alternative to the conventional synchronous style, and asynchronous pipeline circuits are especially attractive because of their small design effort. This dissertation addresses the synthesis problem of lightening two-phase bundled-data asynchronous pipeline controllers, in which delay buffers are essential for guaranteeing correct handshaking operation but incur a considerable area increase.
    Contents:
    1 Introduction: 1.1 Flexible Flip-Flop Timing Model; 1.2 Hardware Performance Monitoring Methodology; 1.3 Asynchronous Pipeline Controller; 1.4 Contributions of this Dissertation
    2 Analysis and Optimization Considering Flexible Flip-Flop Timing Model: 2.1 Preliminaries (2.1.1 Terminologies; 2.1.2 Timing Analysis; 2.1.3 Clock-to-Q Delay Surface Modeling); 2.2 Clock-to-Q Delay Interval Analysis (2.2.1 Derivation; 2.2.2 Additional Constraints; 2.2.3 Analysis: Finding Minimum Clock Period; 2.2.4 Optimization: Clock Skew Scheduling; 2.2.5 Scalable Speedup Technique); 2.3 Experimental Results (2.3.1 Application to Minimum Clock Period Finding; 2.3.2 Application to Clock Skew Scheduling; 2.3.3 Efficacy of Scalable Speedup Technique); 2.4 Summary
    3 Hardware Performance Monitoring Methodology at NTC and Advanced Technology Node: 3.1 Overall Flow of Proposed HPM Methodology; 3.2 Prerequisites to HPM Methodology (3.2.1 BEOL Process Variation Modeling; 3.2.2 Surrogate Model Preparation); 3.3 HPM Methodology: Design Phase (3.3.1 HPM2PV Model Construction; 3.3.2 Optimization of Monitoring Circuits Configuration; 3.3.3 PV2CPT Model Construction); 3.4 HPM Methodology: Post-Silicon Phase (3.4.1 Transfer Learning in Silicon Characterization Step; 3.4.2 Procedures in Volume Production Phase); 3.5 Experimental Results (3.5.1 Experimental Setup; 3.5.2 Exploration of Monitoring Circuits Configuration; 3.5.3 Effectiveness of Monitoring Circuits Optimization; 3.5.4 Considering BEOL PVs and Uncertainty Learning; 3.5.5 Comparison among Different Prediction Flows; 3.5.6 Effectiveness of Prediction Model Calibration); 3.6 Summary
    4 Lightening Asynchronous Pipeline Controller: 4.1 Preliminaries and State-of-the-Art Work (4.1.1 Bundled-data vs. Dual-rail Asynchronous Circuits; 4.1.2 Two-phase vs. Four-phase Bundled-data Protocol; 4.1.3 Conventional State-of-the-Art Pipeline Controller Template); 4.2 Delay Path Sharing for Lightening Pipeline Controller Template (4.2.1 Synthesizing Sharable Delay Paths; 4.2.2 Validating Logical Correctness for Sharable Delay Paths; 4.2.3 Reformulating Timing Constraints of Controller Template; 4.2.4 Minimally Allocating Delay Buffers); 4.3 In-depth Pipeline Controller Template Synthesis with Delay Path Reusing (4.3.1 Synthesizing Delay Path Units; 4.3.2 Validating Logical Correctness of Delay Path Units; 4.3.3 Updating Timing Constraints for Delay Path Units; 4.3.4 In-depth Synthesis Flow Utilizing Delay Path Units); 4.4 Experimental Results (4.4.1 Environment Setup; 4.4.2 Piecewise Linear Modeling of Delay Path Unit Area; 4.4.3 Comparison of Power, Performance, and Area); 4.5 Summary
    5 Conclusion: 5.1 Chapter 2; 5.2 Chapter 3; 5.3 Chapter 4
    Abstract (In Korean)
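    The minimum-clock-period and clock-skew-scheduling problems named in this entry can be illustrated with the conventional fixed-delay formulation, which is the baseline the dissertation says it moves beyond. The sketch below is not the dissertation's flexible flip-flop model. It assumes fixed path delays, treats setup and hold requirements as difference constraints checked with Bellman-Ford, and binary-searches the period. The function names, the path tuple format, and the two-flip-flop example are hypothetical.

```python
def feasible(num_ffs, paths, period, t_setup=0.0, t_hold=0.0):
    """Check whether some clock skew assignment meets all setup/hold constraints.

    Each path is (src, dst, d_min, d_max). With s[i] the clock arrival time at
    flip-flop i, the constraints are
        setup: s[src] + d_max + t_setup <= s[dst] + period
        hold:  s[src] + d_min >= s[dst] + t_hold
    Both are difference constraints, so a solution exists exactly when the
    constraint graph has no negative cycle.
    """
    edges = []  # (u, v, w) encodes s[v] - s[u] <= w
    for src, dst, d_min, d_max in paths:
        edges.append((dst, src, period - d_max - t_setup))  # setup
        edges.append((src, dst, d_min - t_hold))            # hold
    dist = [0.0] * num_ffs  # implicit source with zero-weight edges to all nodes
    for _ in range(num_ffs):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v] - 1e-12:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return True
    return False  # still relaxing after num_ffs passes: negative cycle


def min_clock_period(num_ffs, paths, t_setup=0.0, t_hold=0.0, tol=1e-6):
    """Binary-search the smallest feasible period (feasibility is monotone in it)."""
    lo = 0.0
    hi = max(d_max for _, _, _, d_max in paths) + t_setup  # zero-skew period
    if not feasible(num_ffs, paths, hi, t_setup, t_hold):
        raise ValueError("infeasible at the zero-skew period; "
                         "this sketch does not search beyond it")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if feasible(num_ffs, paths, mid, t_setup, t_hold):
            hi = mid
        else:
            lo = mid
    return hi


# Made-up example: two flip-flops in a loop, delays in arbitrary time units.
loop_paths = [(0, 1, 2.0, 10.0), (1, 0, 1.0, 4.0)]
best_period = min_clock_period(2, loop_paths)
print(round(best_period, 3))
```

    In this example the zero-skew period is 10 time units, while skew scheduling is limited by the setup and hold constraints on the slower path, which together require a period of at least 8.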

    Synthesis of Clock Trees with Useful Skew based on Sparse-Graph Algorithms

    Computer-aided design (CAD) for very large scale integration (VLSI) involve

    Design and test for timing uncertainty in VLSI circuits.

    With technology scaling, integrated circuits (ICs) suffer from increasing process, voltage, and temperature (PVT) variations and aging effects. In most cases, these reliability threats manifest themselves as timing errors on speed-paths (i.e., critical or near-critical paths) of the circuit. Embedding a large design guard band to prevent timing errors is not an attractive solution, since this conservative design methodology diminishes the benefit of technology scaling. This creates several challenges in building reliable systems. The key problems are (i) how to optimize a circuit's timing performance within a limited power budget when the number of potential speed-paths is increasing explosively; (ii) how to generate high-quality delay test patterns that capture an IC's accurate worst-case delay; and (iii) how to achieve online timing error resilience, given that a better power-performance tradeoff requires accepting some infrequent timing errors during the circuit's usage phase. To address these issues, we first develop a novel technique to identify so-called false paths, which finds many more false paths than conventional methods. By integrating the identified false paths into a static timing analysis tool, we obtain more accurate timing information and also save the cost otherwise spent optimizing false paths. Next, existing delay automated test pattern generation (ATPG) methods may generate test patterns that are functionally unreachable, and such patterns may incur excessive (or limited) power supply noise (PSN) on sensitized paths in test mode, leading to over-testing or under-testing of the circuits. We therefore propose a novel pseudo-functional ATPG tool. By taking both circuit layout information and functional constraints into account, we use an ATPG-like algorithm to justify transitions that maximize functional PSN effects on sensitized critical paths. Finally, we propose a novel in-situ correction technique to mask timing errors, namely InTimeFix, by introducing into the design a redundant approximation circuit with more timing slack for speed-paths. The synthesis of the approximation circuit relies on simple structural analysis of the original circuit, which scales easily to large IC designs.
    Yuan, Feng. Thesis (Ph.D.)--Chinese University of Hong Kong, 2012. Includes bibliographical references (leaves 88-100). Abstract also in Chinese. Detailed summary in vernacular field only.
    Contents:
    Abstract; Acknowledgement
    Chapter 1 Introduction: 1.1 Challenges to Solve Timing Uncertainty Problem; 1.2 Contributions and Thesis Outline
    Chapter 2 Background: 2.1 Sources of Timing Uncertainty (2.1.1 Process Variation; 2.1.2 Runtime Environment Fluctuation; 2.1.3 Aging Effect); 2.2 Technical Flow to Solve Timing Uncertainty Problem; 2.3 False Path (2.3.1 Path Sensitization Criteria; 2.3.2 False Path Aware Timing Analysis); 2.4 Manufacturing Testing (2.4.1 Functional Testing vs. Structural Testing; 2.4.2 Scan-Based DfT; 2.4.3 Pseudo-Functional Testing); 2.5 Timing Error Tolerance (2.5.1 Timing Error Detection; 2.5.2 Timing Error Recover)
    Chapter 3 Timing-Independent False Path Identification: 3.1 Introduction; 3.2 Preliminaries and Motivation (3.2.1 Motivation); 3.3 False Path Examination Considering Illegal States (3.3.1 Path Sensitization Criterion; 3.3.2 Path-Aware Illegal State Identification; 3.3.3 Proposed Examination Procedure); 3.4 False Path Identification (3.4.1 Overall Flow; 3.4.2 Static Implication Learning; 3.4.3 Suspicious Node Extraction; 3.4.4 S-Frontier Propagation); 3.5 Experimental Results; 3.6 Conclusion and Future Work
    Chapter 4 PSN Aware Pseudo-Functional Delay Testing: 4.1 Introduction; 4.2 Preliminaries and Motivation (4.2.1 Motivation); 4.3 Proposed Methodology; 4.4 Maximizing PSN Effects under Functional Constraints (4.4.1 Pseudo-Functional Relevant Transitions Generation); 4.5 Experimental Results (4.5.1 Experimental Setup; 4.5.2 Results and Discussion); 4.6 Conclusion
    Chapter 5 In-Situ Timing Error Masking in Logic Circuits: 5.1 Introduction; 5.2 Prior Work and Motivation; 5.3 In-Situ Timing Error Masking with Approximate Logic (5.3.1 Equivalent Circuit Construction with Approximate Logic; 5.3.2 Timing Error Masking with Approximate Logic); 5.4 Cost-Efficient Synthesis for InTimeFix (5.4.1 Overall Flow; 5.4.2 Prime Critical Segment Extraction; 5.4.3 Prime Critical Segment Merging); 5.5 Experimental Results (5.5.1 Experimental Setup; 5.5.2 Results and Discussion); 5.6 Conclusion
    Chapter 6 Conclusion and Future Work
    Bibliography

    Approximate logic circuits: Theory and applications

    CMOS technology scaling, the process of shrinking transistor dimensions based on Moore's law, has been the thrust behind increasingly powerful integrated circuits for over half a century. As dimensions are scaled to a few tens of nanometers, process and environmental variations can significantly alter transistor characteristics, thus degrading reliability and reducing the performance gains that CMOS designs obtain from technology scaling. Although design solutions proposed in recent years to improve the reliability of CMOS designs are power-efficient, the performance penalty associated with these solutions further reduces performance gains with technology scaling, and hence these solutions are not well-suited for high-performance designs. This thesis proposes approximate logic circuits as a new logic synthesis paradigm for reliable, high-performance computing systems. Given a specification, an approximate logic circuit is functionally equivalent to the given specification for a "significant" portion of the input space, but has a smaller delay and power as compared to a circuit implementation of the original specification. The contributions of this thesis include (i) a general theory of approximation and efficient algorithms for automated synthesis of approximations for unrestricted random logic circuits, (ii) logic design solutions based on approximate circuits to improve reliability of designs with negligible performance penalty, and (iii) efficient decomposition algorithms based on approximate circuits to improve performance of designs during logic synthesis. This thesis concludes with other potential applications of approximate circuits and identifies open problems in logic decomposition and approximate circuit synthesis.
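    The idea of a circuit that matches its specification on a "significant" portion of the input space can be made concrete by exhaustive truth-table comparison. The sketch below is only an illustration of that definition, not the thesis's synthesis algorithm. The specification `spec`, the approximation `approx` (obtained here by simply dropping one product term), and the helper `compare` are all hypothetical.

```python
from itertools import product


def spec(a, b, c, d):
    """Hypothetical specification: f = (a AND b) OR (c AND d)."""
    return (a and b) or (c and d)


def approx(a, b, c, d):
    """Hypothetical approximation: keep only the first product term."""
    return a and b


def compare(f, g, num_inputs):
    """Return (fraction of inputs where f and g agree, whether g implies f)."""
    agree = 0
    total = 0
    implies = True
    for bits in product([False, True], repeat=num_inputs):
        fv = bool(f(*bits))
        gv = bool(g(*bits))
        agree += fv == gv
        implies = implies and (not gv or fv)
        total += 1
    return agree / total, implies


agreement, g_implies_f = compare(spec, approx, 4)
print(agreement, g_implies_f)
```

    Here the approximation agrees with the specification on 13 of the 16 input combinations, and whenever it outputs 1 the specification does too, so its 1-outputs can be trusted even though it uses less logic.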

    Algorithmic techniques for nanometer VLSI design and manufacturing closure

    As Very Large Scale Integration (VLSI) technology moves to the nanoscale regime, design and manufacturing closure becomes very difficult to achieve due to increasing chip and power density. Imperfections due to process, voltage and temperature variations aggravate the problem. Uncertainty in the electrical characteristics of individual devices and wires may cause significant performance deviations or even functional failures. These impose tremendous challenges to the continuation of Moore's law as well as the growth of the semiconductor industry. Efforts are needed in both the deterministic design stage and the variation-aware design stage. This research proposes various innovative algorithms to address both stages for obtaining a design with high frequency, low power and high robustness. For deterministic optimizations, new buffer insertion and gate sizing techniques are proposed. For variation-aware optimizations, new lithography-driven and post-silicon tuning-driven design techniques are proposed. For buffer insertion, a new slew buffering formulation is presented and is proved to be NP-hard. Despite this, a highly efficient algorithm which runs > 90x faster than the best alternatives is proposed. The algorithm is also extended to handle continuous buffer locations and blockages. For gate sizing, a new algorithm is proposed to handle a discrete gate library, in contrast to the unrealistic continuous gate library assumed by most existing algorithms. Our approach is a continuous-solution-guided dynamic programming approach, which integrates the high solution quality of dynamic programming with the short runtime of rounding a continuous solution. For lithography-driven optimization, the problem of cell placement considering manufacturability is studied. Three algorithms are proposed to handle cell flipping and relocation. They are based on dynamic programming and graph theoretic approaches, and can provide different tradeoffs between variation reduction and wirelength increase. For post-silicon tuning-driven optimization, the problem of unified adaptivity optimization on logical and clock signal tuning is studied, which enables us to significantly save resources. The new algorithm is based on a novel linear programming formulation which is solved by an advanced robust linear programming technique. The continuous solution is then discretized using binary search accelerated dynamic programming, batch based optimization, and Latin Hypercube sampling based fast simulation.

    VLSI Implementation of a Spiking Neural Network

    In this thesis, concepts and dedicated hardware were developed that allow large-scale spiking neural networks to be realized in hardware. The work is based on an analog VLSI model of a spiking neural network that includes synaptic plasticity (STDP) in every individual synapse. The model operates in analog with a speed-up of up to 10^5 compared with biological real time. Action potentials are transmitted as digital events. This thesis focuses mainly on the digital hardware and the transmission of these events. The analog VLSI model, together with digital logic for processing neural events and for configuration, was integrated into a mixed-signal ASIC, for which an automated design flow was developed. In addition, a corresponding control unit was implemented in programmable logic, and a hardware platform for operating several neural network chips in parallel is presented. To extend the VLSI model across multiple neural network chips, a routing algorithm was developed that enables the transmission of events between neurons and synapses on different chips. This algorithm ensures the temporally correct transmission of events, which is a mandatory condition for plasticity mechanisms to function. The functionality of the algorithm is verified by simulation. Furthermore, the correct implementation of the mixed-signal ASIC together with its associated hardware system is demonstrated, and the feasibility of biologically realistic experiments is shown. Owing to its fast and parallel operation, the presented large-scale physical model of a neural network will be usable for experimental purposes in neuroscience. As a complement to numerical simulations, it offers above all the possibility of an intuitive and extensive search for suitable model parameters.

    Radiation Tolerant Electronics, Volume II

    Research on radiation tolerant electronics has increased rapidly over the last few years, resulting in many interesting approaches to model radiation effects and design radiation hardened integrated circuits and embedded systems. This research is strongly driven by the growing need for radiation hardened electronics for space applications, high-energy physics experiments such as those on the Large Hadron Collider at CERN, and many terrestrial nuclear applications, including nuclear energy and safety management. With the progressive scaling of integrated circuit technologies and the growing complexity of electronic systems, their ionizing radiation susceptibility has raised many exciting challenges, which are expected to drive research in the coming decade. After the success of the first Special Issue on Radiation Tolerant Electronics, the current Special Issue features thirteen articles highlighting recent breakthroughs in radiation tolerant integrated circuit design, fault tolerance in FPGAs, radiation effects in semiconductor materials and advanced IC technologies, and modelling of radiation effects.

    Low energy digital circuit design using sub-threshold operation

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, February 2006. Includes bibliographical references (p. 189-202). By Benton H. Calhoun.
    Scaling of process technologies to deep sub-micron dimensions has made power management a significant concern for circuit designers. For emerging low power applications such as distributed micro-sensor networks or medical applications, low energy operation is the primary concern instead of speed, with the eventual goal of harvesting energy from the environment. Sub-threshold operation offers a promising solution for ultra-low-energy applications because it often achieves the minimum energy per operation. While initial explorations into sub-threshold circuits demonstrate their promise, sub-threshold circuit design remains in its infancy. This thesis makes several contributions that make sub-threshold design more accessible to circuit designers. First, a model for energy consumption in sub-threshold operation provides an analytical solution for the optimum VDD to minimize energy. Fitting this model to a generic circuit allows easy estimation of the impact of processing and environmental parameters on the minimum energy point. Second, analysis of device sizing for sub-threshold circuits shows the trade-offs between sizing for minimum energy and for minimum voltage operation. A programmable FIR filter test chip fabricated in 0.18 µm bulk CMOS provides measurements to confirm the model and the sizing analysis. Third, a low-overhead method for integrating sub-threshold operation with high performance applications extends dynamic voltage scaling across orders of magnitude of frequency and provides energy scalability down to the minimum energy point. A 90nm bulk CMOS test chip confirms the range of operation for ultra-dynamic voltage scaling. Finally, sub-threshold operation is extended to memories. Analysis of traditional SRAM bitcells and architectures leads to development of a new bitcell for robust sub-threshold SRAM operation. The sub-threshold SRAM is analyzed experimentally in a 65nm bulk CMOS test chip.
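    The existence of a minimum energy point can be illustrated with a simplified textbook-style model, sketched below. It is not the fitted model from the thesis. It assumes dynamic energy proportional to VDD squared and a leakage energy term that grows as delay stretches exponentially at low VDD. The parameter values (`n`, `phi_t`, `leak_factor`, `c_eff`) and the sweep range are illustrative assumptions, and the model ignores effects such as DIBL and above-threshold behavior.

```python
import math


def energy_per_op(vdd, n=1.5, phi_t=0.026, leak_factor=1000.0, c_eff=1.0):
    """Normalized energy per operation in a simplified sub-threshold model.

    Dynamic energy scales as c_eff * vdd**2. Leakage energy is leakage power
    integrated over one operation; with sub-threshold delay proportional to
    vdd * exp(-vdd / (n * phi_t)), it becomes
    c_eff * vdd**2 * leak_factor * exp(-vdd / (n * phi_t)).
    All parameter values are illustrative assumptions.
    """
    dynamic = c_eff * vdd ** 2
    leakage = dynamic * leak_factor * math.exp(-vdd / (n * phi_t))
    return dynamic + leakage


def minimum_energy_point(v_min=0.10, v_max=1.00, step=0.001):
    """Sweep VDD on a grid and return the voltage with the lowest energy."""
    steps = int(round((v_max - v_min) / step))
    candidates = [v_min + i * step for i in range(steps + 1)]
    return min(candidates, key=energy_per_op)


v_opt = minimum_energy_point()
print(round(v_opt, 3), round(energy_per_op(v_opt), 4))
```

    With these assumed parameters the minimum falls inside the sweep range, because lowering VDD reduces dynamic energy while the exponentially longer delay makes leakage energy dominate at very low voltages.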