664 research outputs found

    EffiTest: Efficient Delay Test and Statistical Prediction for Configuring Post-silicon Tunable Buffers

    Full text link
    At nanometer manufacturing technology nodes, process variations significantly affect circuit performance. To combat them, post- silicon clock tuning buffers can be deployed to balance timing bud- gets of critical paths for each individual chip after manufacturing. The challenge of this method is that path delays should be mea- sured for each chip to configure the tuning buffers properly. Current methods for this delay measurement rely on path-wise frequency stepping. This strategy, however, requires too much time from ex- pensive testers. In this paper, we propose an efficient delay test framework (EffiTest) to solve the post-silicon testing problem by aligning path delays using the already-existing tuning buffers in the circuit. In addition, we only test representative paths and the delays of other paths are estimated by statistical delay prediction. Exper- imental results demonstrate that the proposed method can reduce the number of frequency stepping iterations by more than 94% with only a slight yield loss.Comment: ACM/IEEE Design Automation Conference (DAC), June 201

    Algorithmic techniques for nanometer VLSI design and manufacturing closure

    Get PDF
    As Very Large Scale Integration (VLSI) technology moves to the nanoscale regime, design and manufacturing closure becomes very difficult to achieve due to increasing chip and power density. Imperfections due to process, voltage and temperature variations aggravate the problem. Uncertainty in electrical characteristic of individual device and wire may cause significant performance deviations or even functional failures. These impose tremendous challenges to the continuation of Moore's law as well as the growth of semiconductor industry. Efforts are needed in both deterministic design stage and variation-aware design stage. This research proposes various innovative algorithms to address both stages for obtaining a design with high frequency, low power and high robustness. For deterministic optimizations, new buffer insertion and gate sizing techniques are proposed. For variation-aware optimizations, new lithography-driven and post-silicon tuning-driven design techniques are proposed. For buffer insertion, a new slew buffering formulation is presented and is proved to be NP-hard. Despite this, a highly efficient algorithm which runs > 90x faster than the best alternatives is proposed. The algorithm is also extended to handle continuous buffer locations and blockages. For gate sizing, a new algorithm is proposed to handle discrete gate library in contrast to unrealistic continuous gate library assumed by most existing algorithms. Our approach is a continuous solution guided dynamic programming approach, which integrates the high solution quality of dynamic programming with the short runtime of rounding continuous solution. For lithography-driven optimization, the problem of cell placement considering manufacturability is studied. Three algorithms are proposed to handle cell flipping and relocation. They are based on dynamic programming and graph theoretic approaches, and can provide different tradeoff between variation reduction and wire- length increase. For post-silicon tuning-driven optimization, the problem of unified adaptivity optimization on logical and clock signal tuning is studied, which enables us to significantly save resources. The new algorithm is based on a novel linear programming formulation which is solved by an advanced robust linear programming technique. The continuous solution is then discretized using binary search accelerated dynamic programming, batch based optimization, and Latin Hypercube sampling based fast simulation

    ๋กœ์ง ๋ฐ ํ”ผ์ง€์ปฌ ํ•ฉ์„ฑ์—์„œ์˜ ํƒ€์ด๋ฐ ๋ถ„์„๊ณผ ์ตœ์ ํ™”

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ (๋ฐ•์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ „๊ธฐยท์ •๋ณด๊ณตํ•™๋ถ€, 2020. 8. ๊น€ํƒœํ™˜.Timing analysis is one of the necessary steps in the development of a semiconductor circuit. In addition, it is increasingly important in the advanced process technologies due to various factors, including the increase of processโ€“voltageโ€“temperature variation. This dissertation addresses three problems related to timing analysis and optimization in logic and physical synthesis. Firstly, most static timing analysis today are based on conventional fixed flip-flop timing models, in which every flip-flop is assumed to have a fixed clock-to-Q delay. However, setup and hold skews affect the clock-to-Q delay in reality. In this dissertation, I propose a mathematical formulation to solve the problem and apply it to the clock skew scheduling problems as well as to the analysis of a given circuit, with a scalable speedup technique. Secondly, near-threshold computing is one of the promising concepts for energy-efficient operation of VLSI systems, but wide performance variation and nonlinearity to process variations block the proliferation. To cope with this, I propose a holistic hardware performance monitoring methodology for accurate timing prediction in a near-threshold voltage regime and advanced process technology. Lastly, an asynchronous circuit is one of the alternatives to the conventional synchronous style, and asynchronous pipeline circuit especially attractive because of its small design effort. This dissertation addresses the synthesis problem of lightening two-phase bundled-data asynchronous pipeline controllers, in which delay buffers are essential for guaranteeing the correct handshaking operation but incurs considerable area increase.ํƒ€์ด๋ฐ ๋ถ„์„์€ ๋ฐ˜๋„์ฒด ํšŒ๋กœ ๊ฐœ๋ฐœ ํ•„์ˆ˜ ๊ณผ์ • ์ค‘ ํ•˜๋‚˜๋กœ, ์ตœ์‹  ๊ณต์ •์ผ์ˆ˜๋ก ๊ณต์ •-์ „์••-์˜จ๋„ ๋ณ€์ด ์ฆ๊ฐ€๋ฅผ ํฌํ•จํ•œ ๋‹ค์–‘ํ•œ ์š”์ธ์œผ๋กœ ํ•˜์—ฌ๊ธˆ ๊ทธ ์ค‘์š”์„ฑ์ด ์ปค์ง€๊ณ  ์žˆ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ๋กœ์ง ๋ฐ ํ”ผ์ง€์ปฌ ํ•ฉ์„ฑ๊ณผ ๊ด€๋ จํ•˜์—ฌ ์„ธ ๊ฐ€์ง€ ํƒ€์ด๋ฐ ๋ถ„์„ ๋ฐ ์ตœ์ ํ™” ๋ฌธ์ œ์— ๋Œ€ํ•ด ๋‹ค๋ฃฌ๋‹ค. ์ฒซ์งธ๋กœ, ์˜ค๋Š˜๋‚  ๋Œ€๋ถ€๋ถ„์˜ ์ •์  ํƒ€์ด๋ฐ ๋ถ„์„์€ ๋ชจ๋“  ํ”Œ๋ฆฝ-ํ”Œ๋กญ์˜ ํด๋Ÿญ-์ถœ๋ ฅ ๋”œ๋ ˆ์ด๊ฐ€ ๊ณ ์ •๋œ ๊ฐ’์ด๋ผ๋Š” ๊ฐ€์ •์„ ๋ฐ”ํƒ•์œผ๋กœ ์ด๋ฃจ์–ด์กŒ๋‹ค. ํ•˜์ง€๋งŒ ์‹ค์ œ ํด๋Ÿญ-์ถœ๋ ฅ ๋”œ๋ ˆ์ด๋Š” ํ•ด๋‹น ํ”Œ๋ฆฝ-ํ”Œ๋กญ์˜ ์…‹์—… ๋ฐ ํ™€๋“œ ์Šคํ์— ์˜ํ–ฅ์„ ๋ฐ›๋Š”๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ์ด๋Ÿฌํ•œ ํŠน์„ฑ์„ ์ˆ˜ํ•™์ ์œผ๋กœ ์ •๋ฆฌํ•˜์˜€์œผ๋ฉฐ, ์ด๋ฅผ ํ™•์žฅ ๊ฐ€๋Šฅํ•œ ์†๋„ ํ–ฅ์ƒ ๊ธฐ๋ฒ•๊ณผ ๋”๋ถˆ์–ด ์ฃผ์–ด์ง„ ํšŒ๋กœ์˜ ํƒ€์ด๋ฐ ๋ถ„์„ ๋ฐ ํด๋Ÿญ ์Šคํ ์Šค์ผ€์ฅด๋ง ๋ฌธ์ œ์— ์ ์šฉํ•˜์˜€๋‹ค. ๋‘˜์งธ๋กœ, ์œ ์‚ฌ ๋ฌธํ„ฑ ์—ฐ์‚ฐ์€ ์ดˆ๊ณ ์ง‘์  ํšŒ๋กœ ๋™์ž‘์˜ ์—๋„ˆ์ง€ ํšจ์œจ์„ ๋Œ์–ด ์˜ฌ๋ฆด ์ˆ˜ ์žˆ๋‹ค๋Š” ์ ์—์„œ ๊ฐ๊ด‘๋ฐ›์ง€๋งŒ, ํฐ ํญ์˜ ์„ฑ๋Šฅ ๋ณ€์ด ๋ฐ ๋น„์„ ํ˜•์„ฑ ๋•Œ๋ฌธ์— ๋„๋ฆฌ ํ™œ์šฉ๋˜๊ณ  ์žˆ์ง€ ์•Š๋‹ค. ์ด๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด ์œ ์‚ฌ ๋ฌธํ„ฑ ์ „์•• ์˜์—ญ ๋ฐ ์ตœ์‹  ๊ณต์ • ๋…ธ๋“œ์—์„œ ๋ณด๋‹ค ์ •ํ™•ํ•œ ํƒ€์ด๋ฐ ์˜ˆ์ธก์„ ์œ„ํ•œ ํ•˜๋“œ์›จ์–ด ์„ฑ๋Šฅ ๋ชจ๋‹ˆํ„ฐ๋ง ๋ฐฉ๋ฒ•๋ก  ์ „๋ฐ˜์„ ์ œ์•ˆํ•˜์˜€๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ, ๋น„๋™๊ธฐ ํšŒ๋กœ๋Š” ๊ธฐ์กด ๋™๊ธฐ ํšŒ๋กœ์˜ ๋Œ€์•ˆ ์ค‘ ํ•˜๋‚˜๋กœ, ๊ทธ ์ค‘์—์„œ๋„ ๋น„๋™๊ธฐ ํŒŒ์ดํ”„๋ผ์ธ ํšŒ๋กœ๋Š” ๋น„๊ต์  ์ ์€ ์„ค๊ณ„ ๋…ธ๋ ฅ๋งŒ์œผ๋กœ๋„ ๊ตฌํ˜„ ๊ฐ€๋Šฅํ•˜๋‹ค๋Š” ์žฅ์ ์ด ์žˆ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” 2์œ„์ƒ ๋ฌถ์Œ ๋ฐ์ดํ„ฐ ํ”„๋กœํ† ์ฝœ ๊ธฐ๋ฐ˜ ๋น„๋™๊ธฐ ํŒŒ์ดํ”„๋ผ์ธ ์ปจํŠธ๋กค๋Ÿฌ ์ƒ์—์„œ, ์ •ํ™•ํ•œ ํ•ธ๋“œ์…ฐ์ดํ‚น ํ†ต์‹ ์„ ์œ„ํ•ด ์‚ฝ์ž…๋œ ๋”œ๋ ˆ์ด ๋ฒ„ํผ์— ์˜ํ•œ ๋ฉด์  ์ฆ๊ฐ€๋ฅผ ์™„ํ™”ํ•  ์ˆ˜ ์žˆ๋Š” ํ•ฉ์„ฑ ๊ธฐ๋ฒ•์„ ์ œ์‹œํ•˜์˜€๋‹ค.1 INTRODUCTION 1 1.1 Flexible Flip-Flop Timing Model 1 1.2 Hardware Performance Monitoring Methodology 4 1.3 Asynchronous Pipeline Controller 10 1.4 Contributions of this Dissertation 15 2 ANALYSIS AND OPTIMIZATION CONSIDERING FLEXIBLE FLIP-FLOP TIMING MODEL 17 2.1 Preliminaries 17 2.1.1 Terminologies 17 2.1.2 Timing Analysis 20 2.1.3 Clock-to-Q Delay Surface Modeling 21 2.2 Clock-to-Q Delay Interval Analysis 22 2.2.1 Derivation 23 2.2.2 Additional Constraints 26 2.2.3 Analysis: Finding Minimum Clock Period 28 2.2.4 Optimization: Clock Skew Scheduling 30 2.2.5 Scalable Speedup Technique 33 2.3 Experimental Results 37 2.3.1 Application to Minimum Clock Period Finding 37 2.3.2 Application to Clock Skew Scheduling 39 2.3.3 Efficacy of Scalable Speedup Technique 43 2.4 Summary 44 3 HARDWARE PERFORMANCE MONITORING METHODOLOGY AT NTC AND ADVANCED TECHNOLOGY NODE 45 3.1 Overall Flow of Proposed HPM Methodology 45 3.2 Prerequisites to HPM Methodology 47 3.2.1 BEOL Process Variation Modeling 47 3.2.2 Surrogate Model Preparation 49 3.3 HPM Methodology: Design Phase 52 3.3.1 HPM2PV Model Construction 52 3.3.2 Optimization of Monitoring Circuits Configuration 54 3.3.3 PV2CPT Model Construction 58 3.4 HPM Methodology: Post-Silicon Phase 60 3.4.1 Transfer Learning in Silicon Characterization Step 60 3.4.2 Procedures in Volume Production Phase 61 3.5 Experimental Results 62 3.5.1 Experimental Setup 62 3.5.2 Exploration of Monitoring Circuits Configuration 64 3.5.3 Effectiveness of Monitoring Circuits Optimization 66 3.5.4 Considering BEOL PVs and Uncertainty Learning 68 3.5.5 Comparison among Different Prediction Flows 69 3.5.6 Effectiveness of Prediction Model Calibration 71 3.6 Summary 73 4 LIGHTENING ASYNCHRONOUS PIPELINE CONTROLLER 75 4.1 Preliminaries and State-of-the-Art Work 75 4.1.1 Bundled-data vs. Dual-rail Asynchronous Circuits 75 4.1.2 Two-phase vs. Four-phase Bundled-data Protocol 76 4.1.3 Conventional State-of-the-Art Pipeline Controller Template 77 4.2 Delay Path Sharing for Lightening Pipeline Controller Template 78 4.2.1 Synthesizing Sharable Delay Paths 78 4.2.2 Validating Logical Correctness for Sharable Delay Paths 80 4.2.3 Reformulating Timing Constraints of Controller Template 81 4.2.4 Minimally Allocating Delay Buffers 87 4.3 In-depth Pipeline Controller Template Synthesis with Delay Path Reusing 88 4.3.1 Synthesizing Delay Path Units 88 4.3.2 Validating Logical Correctness of Delay Path Units 89 4.3.3 Updating Timing Constraints for Delay Path Units 91 4.3.4 In-depth Synthesis Flow Utilizing Delay Path Units 95 4.4 Experimental Results 99 4.4.1 Environment Setup 99 4.4.2 Piecewise Linear Modeling of Delay Path Unit Area 99 4.4.3 Comparison of Power, Performance, and Area 102 4.5 Summary 107 5 CONCLUSION 109 5.1 Chapter 2 109 5.2 Chapter 3 110 5.3 Chapter 4 110 Abstract (In Korean) 127Docto

    ์ €์ „๋ ฅ ๊ณ ์„ฑ๋Šฅ ๋””์ง€ํ„ธ ์‹œ์Šคํ…œ์„ ์œ„ํ•œ ๊ณ ์‹ ๋ขฐ๋„์˜ ํด๋Ÿญ ๋„คํŠธ์›Œํฌ ์„ค๊ณ„ ๋ฐฉ๋ฒ•๋ก 

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ (๋ฐ•์‚ฌ)-- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ์ „๊ธฐยท์ปดํ“จํ„ฐ๊ณตํ•™๋ถ€, 2015. 8. ๊น€ํƒœํ™˜.์˜ค๋Š˜๋‚ ์˜ ํšŒ๋กœ ์„ค๊ณ„์—์„œ ๊ณต์ •๋ณ€์ด๊ฐ€ ํšŒ๋กœ ํด๋Ÿญ์˜ ํƒ€์ด๋ฐ์˜ ๋ณ€์ด์— ๋ฏธ์น˜๋Š” ์˜ํ–ฅ์€ ๋งค์šฐ ์ปค์ง์— ๋”ฐ๋ผ, ์ „ํ†ต์ ์œผ๋กœ ์‚ฌ์šฉ๋˜๋˜ ํด๋Ÿญ ํŠธ๋ฆฌ ๊ตฌ์กฐ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•œ ํด๋Ÿญ ๋„คํŠธ์›Œํฌ๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ์€ ํ•œ๊ณ„์— ๋ถ€๋”ชํžˆ๊ฒŒ ๋˜์—ˆ๊ณ , ์ด๋ฅผ ๊ทน๋ณตํ•˜๊ธฐ ์œ„ํ•œ ์—ฌ๋Ÿฌ๊ฐ€์ง€ ๊ธฐ์ˆ ๋“ค์ด ์ œ์•ˆ๋˜์—ˆ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ๋ณ€์ด์— ๊ฐ•ํ•œ ํด๋Ÿญ ๋„คํŠธ์›Œํฌ๋ฅผ ์„ค๊ณ„ํ•˜๊ธฐ ์œ„ํ•ด, ์—ฐ๊ตฌ ๋ฐ ์‚ฌ์šฉ๋˜๊ณ  ์žˆ๋Š” ์„ธ ๊ฐ€์ง€ ๊ธฐ์ˆ ์— ๋Œ€ํ•ด ์†Œ๊ฐœํ•˜๊ณ , ์ด๋“ค์„ ๊ฐœ์„ ํ•œ ์—ฐ๊ตฌ๋“ค์„ ์ œ์•ˆํ•œ๋‹ค. ์ฒซ์งธ๋กœ, ์ด ๋…ผ๋ฌธ์—์„œ๋Š” ํด๋Ÿญ์˜ ํƒ€์ด๋ฐ ๋ฌธ์ œ๋ฅผ ํšŒ๋กœ ์ œ์ž‘ ์ดํ›„ ๋‹จ๊ณ„์—์„œ ์กฐ์ •ํ•  ์ˆ˜ ์žˆ๋Š” ํฌ์ŠคํŠธ ์‹ค๋ฆฌ์ฝ˜ ์กฐ์ • ํด๋Ÿญ ๋ฒ„ํผ๋ฅผ ๋ฐฐ์น˜ํ•˜๋Š” ๋ฌธ์ œ์— ๋Œ€ํ•ด ์„œ์ˆ ํ•œ๋‹ค. ํฌ์ŠคํŠธ ์‹ค๋ฆฌ์ฝ˜ ์กฐ์ • ๋ฒ„ํผ๋Š” ํด๋Ÿญ์˜ ์ง€์—ฐ์‹œ๊ฐ„์„ ํšŒ๋กœ๊ฐ€ ์ œ์ž‘๋œ ์ดํ›„์˜ ๋‹จ๊ณ„์—์„œ ์กฐ์ •ํ•˜ ์—ฌ ํด๋Ÿญ์˜ ํƒ€์ด๋ฐ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•  ์ˆ˜ ์žˆ์ง€๋งŒ, ๋ฒ„ํผ ์ž์ฒด์˜ ํฌ๊ธฐ ๋•Œ๋ฌธ์— ์ตœ์†Œํ•œ์˜ ๊ฐœ์ˆ˜๋งŒ ๊ฐ€์žฅ ํšจ์œจ์ ์ธ ์œ„์น˜์— ๋ฐฐ์น˜ํ•ด์•ผ ํ•˜๋Š” ๋ฌธ์ œ๊ฐ€ ์žˆ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ์ด์ „์˜ ์—ฐ๊ตฌ๊ฐ€ ํšŒ๋กœ์˜ ์ˆ˜์œจ์„ ๊ณ„์‚ฐํ•  ๋•Œ ์‹œ๊ฐ„์ด ๋งŽ์ด ๊ฑธ๋ฆฌ๋Š” ๋ชฌํ…Œ-์นด๋ฅผ๋กœ ์‹œ๋ฎฌ๋ ˆ์ด์…˜์„ ์‚ฌ์šฉํ•˜๊ธฐ ๋•Œ๋ฌธ์— ํƒ์ƒ‰ ๊ฐ€๋Šฅํ•œ ํฌ์ŠคํŠธ ์‹ค๋ฆฌ์ฝ˜ ์กฐ์ • ๋ฒ„ํผ์˜ ๋ฐฐ์น˜๊ฐ€ ์ œํ•œ๋˜๋Š” ๋ฌธ์ œ๊ฐ€ ์žˆ์Œ์„ ์ง€์ ํ•œ ํ›„, ๊ธฐ์กด์— ์ œ์•ˆ๋˜์—ˆ๋˜ ๊ทธ๋ž˜ํ”„ ๊ธฐ๋ฐ˜ ํšŒ๋กœ ์ˆ˜์œจ ๊ณ„์‚ฐ ๊ธฐ๋ฒ•์„ ์‚ฌ์šฉํ•˜์—ฌ ํšจ์œจ์ ์ธ ํฌ์ŠคํŠธ ์‹ค๋ฆฌ์ฝ˜ ์กฐ์ • ๋ฒ„ํผ ๋ฐฐ์น˜๋ฅผ ์ฐพ์„ ์ˆ˜ ์žˆ๋Š” ์ ์ง„์ ์ด๊ณ  ์ฒด๊ณ„์ ์ธ ๋ฐฉ๋ฒ•์„ ์ œ์‹œํ•œ๋‹ค. ๋‹ค์Œ์€ ํด๋Ÿญ ์‹œ์ฐจ ์Šค์ผ€์ฅด๋ง ๋ฐฉ๋ฒ•์— ๋Œ€ํ•œ ์—ฐ๊ตฌ๋ฅผ ์„œ์ˆ ํ•œ๋‹ค. ์ตœ๊ทผ์˜ ์—ฐ๊ตฌ์—์„œ ์ œ์•ˆ๋˜์—ˆ๋˜, ํ”Œ๋ฆฝ-ํ”Œ๋กญ์˜ ํด๋Ÿญ์—์„œ ์ถœ๋ ฅ๊นŒ์ง€์˜ ๋”œ๋ ˆ์ด๊ฐ€ ํด๋Ÿญ์˜ ์ค€๋น„์‹œ๊ฐ„๊ณผ ์œ ์ง€์‹œ๊ฐ„์— ์˜์กดํ•œ๋‹ค๋Š” ์œ ์—ฐํ•œ ํ”Œ๋ฆฝ-ํ”Œ๋กญ ํƒ€์ด๋ฐ ๋ชจ๋ธ ์—ฐ๊ตฌ๋Š” ๊ธฐ์กด์˜ ํ”Œ๋ฆฝ-ํ”Œ๋กญ์˜ ํƒ€์ด๋ฐ ํŠน์„ฑ๋“ค์ด ๊ณ ์ •๋œ ๊ฐ’์ด๋ผ๋Š” ๊ฐ€์ •์— ๊ธฐ๋ฐ˜ํ•œ ์ •์  ํƒ€์ด๋ฐ ๋ถ„์„์˜ ์ •ํ™•์„ฑ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•  ์ˆ˜ ์žˆ๋Š” ์ค‘์š”ํ•œ ์—ฐ๊ตฌ์ด๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ์ƒˆ๋กœ์šด ๋ชจ๋ธ์„ ๊ณ ๋ คํ•˜์—ฌ, ์ด์ „์— ๊ณ ์ „์ ์ธ ํ”Œ๋ฆฝ-ํ”Œ๋กญ ํƒ€์ด๋ฐ ํŠน์„ฑ ๋ชจ๋ธ์„ ๊ธฐ๋ฐ˜์œผ๋กœ ์ง„ํ–‰๋˜์—ˆ๋˜ ํด๋Ÿญ ์‹œ์ฐจ ์Šค์ผ€์ฅด๋ง์˜ ์ตœ์ ํ™” ๋ฌธ์ œ๋ฅผ ์œ ์—ฐํ•œ ํ”Œ๋ฆฝ-ํ”Œ๋กญ ํƒ€์ด๋ฐ ๋ชจ๋ธ์„ ๊ณ ๋ คํ•˜์—ฌ ํ•ด๊ฒฐํ•˜์˜€๋‹ค. ๋ณธ ์—ฐ๊ตฌ์—์„œ๋Š” ์ฃผ์–ด์ง„ ํšŒ๋กœ์˜ ์ค€๋น„์‹œ๊ฐ„๊ณผ ์œ ์ง€์‹œ๊ฐ„์˜ ์—ฌ์œ ์‹œ๊ฐ„์„ ๋ฐ˜๋ณต์ ์ด๊ณ  ์ฒด๊ณ„์ ์œผ๋กœ ์ตœ๋Œ€ํ™”ํ•˜์—ฌ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜์˜€๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ ํด๋Ÿญ ์ŠคํŒŒ์ธ ๋„คํŠธ์›Œํฌ์˜ ํ•ฉ์„ฑ์„ ์ž๋™ํ™”ํ•˜๋Š” ๋ฌธ์ œ์— ๋Œ€ํ•ด ์„œ์ˆ ํ•œ๋‹ค. ์ „ํ†ต์ ์ธ ํด๋Ÿญ ํŠธ๋ฆฌ ๊ตฌ์กฐ๊ฐ€ ๊ณต์ •๋ณ€์ด ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜์ง€ ๋ชปํ–ˆ๊ธฐ ๋•Œ๋ฌธ์— ํด๋Ÿญ ๋ฉ”์‰ฌ๋ฅผ ํฌํ•จํ•˜๋Š” ๋‹ค์–‘ํ•œ ๋Œ€์•ˆ์  ๊ตฌ์กฐ๊ฐ€ ์ œ์•ˆ๋˜์—ˆ๋‹ค. ํด๋Ÿญ ๋ฉ”์‰ฌ์˜ ๊ฒฝ์šฐ ๊ณต์ •๋ณ€์ด์— ์˜ํ•œ ํด๋Ÿญ ์‹œ์ฐจ๋ฅผ ์ค„์ผ ์ˆ˜ ์žˆ์—ˆ์ง€๋งŒ ์ด๋ฅผ ์œ„ํ•ด ์™€์ด์–ด๋‚˜ ๋ฒ„ํผ ๋“ฑ์˜ ์ž์›์„ ๋งŽ์ด ์†Œ๋ชจํ•˜๋Š” ๋ฌธ์ œ๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ๋‹ค. ๋‘ ๊ตฌ์กฐ์˜ ์ค‘๊ฐ„์  ๊ตฌ์กฐ์—๋Š” ํด๋Ÿญ ํŠธ๋ฆฌ์˜ ๋…ธ๋“œ๋ฅผ ์—ฐ๊ฒฐํ•˜๋Š” ํฌ๋กœ์Šค ๋งํฌ๋ฅผ ์‚ฝ์ž…ํ•˜๋Š” ๊ตฌ์กฐ์™€ ํด๋Ÿญ ์ŠคํŒŒ์ธ ๊ตฌ์กฐ๊ฐ€ ์žˆ๋‹ค. ํด๋Ÿญ ํŠธ๋ฆฌ์— ์ ์ง„์ ์ธ ์ˆ˜์ •์„ ๊ฐ€ํ•˜์—ฌ ๋งŒ๋“œ๋Š” ํฌ๋กœ์Šค ๋งํฌ์™€ ๋‹ฌ๋ฆฌ, ํด๋Ÿญ ์ŠคํŒŒ์ธ ๊ตฌ์กฐ๋Š” ํŠธ๋ฆฌ๋‚˜ ์ดํ›„์— ์ œ์•ˆ๋œ ๋ฉ”์‰ฌ์™€๋Š” ์™„์ „ํžˆ ๋ณ„๊ฐœ์˜ ๊ตฌ์กฐ๋กœ, ์ด๋ฅผ ํ•ฉ์„ฑํ•˜๋Š” ๋ฐฉ๋ฒ•๋„ ๋งค์šฐ ๋‹ค๋ฅด๋‹ค. ๊ทธ๋ ‡๊ธฐ ๋•Œ๋ฌธ์— ํด๋Ÿญ ์ŠคํŒŒ์ธ์„ ํ•ฉ์„ฑํ•˜๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ ํ•„์ˆ˜์ ์ด๋ผ๊ณ  ํ•  ์ˆ˜ ์žˆ์œผ๋‚˜, ํ•ฉ์„ฑ ๋ฐฉ๋ฒ•๋ก ์ด๋‚˜ ์ด๋ฅผ ์ž๋™ํ™”ํ•˜๋Š” ๋ฐฉ๋ฒ•์— ๊ด€ํ•œ ์—ฐ๊ตฌ๋Š” ์•„์ง ์—†๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ์šฐ์„ , ํด๋Ÿญ-๊ฒŒ์ดํŒ…์„ ์ง€์›ํ•˜๋Š” ํด๋Ÿญ ์ŠคํŒŒ์ธ์„ ์ฃผ์–ด์ง„ ํด๋Ÿญ ์‹œ์ฐจ ๋ฐ ํด๋Ÿญ ์Šฌ๋ฃจ ์กฐ๊ฑด์„ ๋งŒ์กฑํ•˜๋ฉด์„œ ์ž์› ๋ฐ ์ „๋ ฅ ์†Œ๋ชจ๋Ÿ‰์„ ์ตœ์†Œํ™”ํ•˜๋Š” ๋ฌธ์ œ์— ๋Œ€ํ•ด ์„œ์ˆ ํ•œ๋‹ค. ๊ทธ๋ฆฌ๊ณ , ํšŒ๋กœ์—์„œ ์ฃผ์–ด์ง„ ํ”Œ๋ฆฝ-ํ”Œ๋กญ๋“ค์„ ํด๋Ÿญ-๊ฒŒ์ดํŒ… ์กฐ๊ฑด์—์„œ์˜ ์—ฐ๊ด€์„ฑ์„ ๊ณ ๋ คํ•˜๊ณ  ์กฐ์งํ™”ํ•˜์—ฌ ํด๋Ÿญ ์ŠคํŒŒ์ธ์„ ์‚ฝ์ž…ํ•œ ํ›„, ํด๋Ÿญ ์‹œ์ฐจ ๋ฐ ์Šฌ๋ฃจ ์กฐ๊ฑด์„ ๊ณ ๋ คํ•˜์—ฌ ๋ฒ„ํผ๋ฅผ ์‚ฝ์ž…ํ•˜๋Š” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ œ์•ˆํ•œ๋‹ค. ์š”์•ฝํ•˜๋ฉด, ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ํด๋Ÿญ์˜ ํƒ€์ด๋ฐ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด ํฌ์ŠคํŠธ-์‹ค๋ฆฌ์ฝ˜ ์กฐ์ • ํด๋Ÿญ ๋ฒ„ํผ๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ํ…Œํฌ๋‹‰๊ณผ ํด๋Ÿญ ์‹œ์ฐจ ์Šค์ผ€์ฅด๋ง์„ ์œ ์—ฐํ•œ ํ”Œ๋ฆฝ-ํ”Œ๋กญ ํƒ€์ด๋ฐ ๋ชจ๋ธ์—์„œ ์ ์šฉํ•˜๋Š” ํ…Œํฌ๋‹‰์„ ์ œ์‹œํ•˜๊ณ , ํด๋Ÿญ์˜ ํƒ€์ด๋ฐ ๋ฌธ์ œ์™€ ์ „๋ ฅ ์†Œ๋ชจ ๋ฌธ์ œ๋ฅผ ํ•œ๋ฒˆ์— ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•œ ์ƒˆ๋กœ์šด ํด๋Ÿญ ์ŠคํŒŒ์ธ ๋„คํŠธ์›Œํฌ๋ฅผ ํ•ฉ์„ฑํ•˜๋Š” ์ž๋™ํ™” ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์ œ์‹œํ•œ๋‹ค.As the process variation is dominating to cause the clock timing variation among chips to be much large, conventional clock tree based clock network is not able to guarantee the timing constraint of a digital system. To overcome the limitations of traditional clock design techniques, various techniques have been studied. This dissertation addresses three techniques that have been widely used for designing robust clock network and proposes developed methods. First, it is widely accepted that post-silicon tunable (PST) clock buffers can effectively resolve the clock timing violation. Since PST buffers, which can reset the clock delay to flip-flops after the chip is manufactured, impose a non-trivial implementation area and control circuitry, it is very important to minimally allocate PST buffers while satisfying the chip yield constraint. In this dissertation, we (1) develop a graph-based chip yield computation technique which can update yields very efficiently and accurately for incremental PST buffer allocation, based on which we (2) propose a systematic (bottom-up and top-down with refinement) PST buffer allocation algorithm that is able to fully explore the design space of PST buffer allocation. Second, clock skew scheduling is one of the essential steps that must be carefully performed during the design process. This dissertation addresses the clock skew optimization problem integrated with the consideration of the interdependent relation between the setup and hold skews, and clk-to-Q delay of flip-flops, so that the time margin is more accurately and reliably set aside over that of the previous methods, which have never taken the integrated problem into account. Precisely, based on an accurate flexible model of setup skew, hold skew, and clk-to-Q delay, we propose a stepwise clock skew scheduling technique in which at each iteration, the worst slack of setup and hold skews is systematically and incrementally relaxed to maximally extend the time margin. Lastly, clock tree with cross links and clock spine have an intermediate characteristics for skew tolerance and power consumption, compared to clock tree and clock mesh which are two extreme structures of clock network. Unlike the clock tree with links between clock nodes, which is a sort of an incremental modification of the structure of clock tree, clock spine network is a completely separated structure from the structures of tree and mesh. Consequently, it is necessary and essential to develop a synthesis algorithm for clock spines, which will be compatible to the existing synthesis algorithms of clock trees and clock meshes. To this end, this dissertation first addresses the problem of automating the synthesis of clock-gated clock spines with the objective of minimizing total clock power while meeting the clock skew and slew constraints. The key idea of our proposed synthesis algorithm is to identify and group the flip-flops with tight correlation of clock-gating operations together to form a spine while accurately predicting and maintaining clock skew and slew variations through the buffer insertion and stub allocation. In summary, this dissertation presents clock tuning techniques with consideration of post-silicon tuning, flexible flip-flop timing model, and clock-gated clock spine synthesis algorithm.Abstract i Chapter 1 INTRODUCTION 1 1.1 Clock Distribution Network . . . . . . . . . . . . . . . . . . . . . 1 1.2 Process Variation . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Flexible Flip-flop Timing Model . . . . . . . . . . . . . . . . . . . 3 1.4 Clock Spine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.5 Contributions of This Dissertation . . . . . . . . . . . . . . . . . 6 Chapter 2 POST-SILICON TUNABLE CLOCK BUFFER ALLOCATION BASED ON FAST CHIP YIELD COMPUTATION 8 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.2 Systematic Exploration of PST Buffer Allocation . . . . . . . . . 10 2.2.1 Observations . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.2.2 Problem Definition . . . . . . . . . . . . . . . . . . . . . . 15 2.2.3 Allocation Algorithm . . . . . . . . . . . . . . . . . . . . . 16 2.3 Fast Timing Yield Computation . . . . . . . . . . . . . . . . . . 17 2.3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.3.2 Incremental Yield Computation . . . . . . . . . . . . . . . 22 2.4 Experimental Result . . . . . . . . . . . . . . . . . . . . . . . . . 24 2.5 PST Buffer Configuration Techniques . . . . . . . . . . . . . . . 31 2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 Chapter 3 POST-SILICON TUNING BASED ON FLEXIBLE FLIP-FLOP TIMING 34 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 3.2 Preliminary and Definitions . . . . . . . . . . . . . . . . . . . . . 40 3.2.1 Flexible Flip-Flop Timing Model . . . . . . . . . . . . . . 40 3.2.2 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 40 3.3 Motivational Examples . . . . . . . . . . . . . . . . . . . . . . . . 42 3.4 Clock Skew Scheduling for Slack Relaxation Based on Flexible Flip-Flop Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 3.4.1 Overall Flow . . . . . . . . . . . . . . . . . . . . . . . . . 46 3.4.2 Finding Local Clock Skew Schedule . . . . . . . . . . . . 48 3.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . 51 3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 Chapter 4 SYNTHESIS FOR POWER-AWARE CLOCK SPINES 61 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 4.2 Preliminaries and Motivation . . . . . . . . . . . . . . . . . . . . 64 4.2.1 Clock Spine . . . . . . . . . . . . . . . . . . . . . . . . . . 64 4.2.2 Activity Patterns . . . . . . . . . . . . . . . . . . . . . . . 67 4.2.3 Power Computation . . . . . . . . . . . . . . . . . . . . . 67 4.3 Algorithm for Clock Spine Synthesis . . . . . . . . . . . . . . . . 68 4.3.1 Problem Definition . . . . . . . . . . . . . . . . . . . . . . 68 4.3.2 Power-Aware Sink Clustering . . . . . . . . . . . . . . . . 70 4.3.3 Spine Relaxation . . . . . . . . . . . . . . . . . . . . . . . 77 4.3.4 Spine Buffer Allocation . . . . . . . . . . . . . . . . . . . 80 4.3.5 Top-Level Tree Construction . . . . . . . . . . . . . . . . 86 4.4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . 86 4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 Chapter 5 CONCLUSION 95 5.1 Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 5.2 Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 5.3 Chapter 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 Bibliography 97 ์ดˆ๋ก 106Docto

    Variability-Aware VLSI Design Automation For Nanoscale Technologies

    Get PDF
    As technology scaling enters the nanometer regime, design of large scale ICs gets more challenging due to shrinking feature sizes and increasing design complexity. Aggressive scaling causes significant degradation in reliability, increased susceptibility to fabrication and environmental randomness and increased dynamic and leakage power dissipation. In this work, we investigate these scaling issues in large scale integrated systems. This dissertation proposes to develop variability-aware design methodologies by proposing design analysis, design-time optimization, post-silicon tunability and runtime-adaptivity based optimization techniques for handling variability. We discuss our research in the area of variability-aware analysis, specifically focusing on the problem of statistical timing analysis. The first technique presents the concept of error budgeting that achieves significant runtime speedups during statistical timing analysis. The second work presents a general framework for non-linear non-Gaussian statistical timing analysis considering correlations. Further, we present our work on design-time optimization schemes that are applicable during physical synthesis. Firstly, we present a buffer insertion technique that considers wire-length uncertainty and proposes algorithms to perform probabilistic buffer insertion. Secondly, we present a stochastic optimization framework based on Monte-Carlo technique considering fabrication variability. This optimization framework can be applied to problems that can be modeled as linear programs without without imposing any assumptions on the nature of the variability. Subsequently, we present our work on post-silicon tunability based design optimization. This work presents a design management framework that can be used to balance the effort spent on pre-silicon (through gate sizing) and post-silicon optimization (through tunable clock-tree buffers) while maximizing the yield gains. Lastly, we present our work on variability-aware runtime optimization techniques. We look at the problem of runtime supply voltage scaling for dynamic power optimization, and propose a framework to consider the impact of variability on the reliability of such designs. We propose a probabilistic design synthesis technique where reliability of the design is a primary optimization metric

    An Analog Multiphase Self-Calibrating DLL to Minimize the Effects of Process, Supply Voltage, and Temperature Variations

    Get PDF
    Delay locked loops have been found to be useful tools in such applications as computing, TDCs, and communications. These system can be found in space exploration vehicles and satellites, which operate in extreme environments. Unfortunately, in these environments supply voltage and temperature will not be constant, therefore they must be under consideration when designing a DLL. Furthermore, solar radiation in conjunction with the varying environmental aspects, could cause the delay locked loop to lose it locked state. Delay locked loops are inherently good at tracking these environmental aspects, but in order to do so, the voltage controlled delay line must exhibit a very large gain, which translates to a large capture range. Assuming charged particles hit a key node in the DLL (e.g. the control voltage), the DLL would lose lock and would have to recapture it. Depending on the severity of the uctuation, this relocking process could easily take on the order of many microseconds assuming the bandwidth was kept low to minimize jitter. To date, no delay locked loops have been published for extreme environment applications. In many other extreme environment circuits, calibration techniques have been applied to minimize the environmental effects. Whereas there have been multiple calibration methods published related to delay locked loops, none of them were intended for extreme environments. Furthermore, none of these methods are directly suitable for an analog multiphase delay locked loop. The self-calibrating DLL in this work includes an all digital calibration circuit, as well as a system transient monitor. The coarse calibration helps minimize global process, voltage, and temperature errors for an analog multiphase DLL. The system monitor is used to detect any transients that might cause the DLL to unlock, which could be used to allow the DLL to be recalibrated to the new environmental conditions. The presented measurement results will demonstrate that the DLL can be used in extreme environments such as space, or other extreme environment applications
    • โ€ฆ
    corecore