Search CORE

813 research outputs found

EffiTest: Efficient Delay Test and Statistical Prediction for Configuring Post-silicon Tunable Buffers

Author: Chen D. S.
Johnson R. A.
Jolliffe I. T.
Tsai J.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 14/05/2017
Field of study

At nanometer manufacturing technology nodes, process variations significantly affect circuit performance. To combat them, post- silicon clock tuning buffers can be deployed to balance timing bud- gets of critical paths for each individual chip after manufacturing. The challenge of this method is that path delays should be mea- sured for each chip to configure the tuning buffers properly. Current methods for this delay measurement rely on path-wise frequency stepping. This strategy, however, requires too much time from ex- pensive testers. In this paper, we propose an efficient delay test framework (EffiTest) to solve the post-silicon testing problem by aligning path delays using the already-existing tuning buffers in the circuit. In addition, we only test representative paths and the delays of other paths are estimated by statistical delay prediction. Exper- imental results demonstrate that the proposed method can reduce the number of frequency stepping iterations by more than 94% with only a slight yield loss.Comment: ACM/IEEE Design Automation Conference (DAC), June 201

arXiv.org e-Print Archive

Crossref

Sampling-based Buffer Insertion for Post-Silicon Yield Improvement under Process Variability

Author: Li Bing
Schlichtmann Ulf
Zhang Grace Li
Publication venue: 'Research Publishing Services'
Publication date: 01/01/2016
Field of study

At submicron manufacturing technology nodes process variations affect circuit performance significantly. This trend leads to a large timing margin and thus overdesign to maintain yield. To combat this pessimism, post-silicon clock tuning buffers can be inserted into circuits to balance timing budgets of critical paths with their neighbors. After manufacturing, these clock buffers can be configured for each chip individually so that chips with timing failures may be rescued to improve yield. In this paper, we propose a sampling-based method to determine the proper locations of these buffers. The goal of this buffer insertion is to reduce the number of buffers and their ranges, while still maintaining a good yield improvement. Experimental results demonstrate that our algorithm can achieve a significant yield improvement (up to 35%) with only a small number of buffers.Comment: Design, Automation and Test in Europe (DATE), 201

arXiv.org e-Print Archive

Crossref

저전력 고성능 디지털 시스템을 위한 고신뢰도의 클럭 네트워크 설계 방법론

Author: Hyungjung Seo
Publication venue: 서울대학교 대학원
Publication date: 01/08/2015
Field of study

학위논문 (박사)-- 서울대학교 대학원 : 전기·컴퓨터공학부, 2015. 8. 김태환.오늘날의 회로 설계에서 공정변이가 회로 클럭의 타이밍의 변이에 미치는 영향은 매우 커짐에 따라, 전통적으로 사용되던 클럭 트리 구조를 기반으로 한 클럭 네트워크를 사용하는 것은 한계에 부딪히게 되었고, 이를 극복하기 위한 여러가지 기술들이 제안되었다. 본 논문에서는 변이에 강한 클럭 네트워크를 설계하기 위해, 연구 및 사용되고 있는 세 가지 기술에 대해 소개하고, 이들을 개선한 연구들을 제안한다. 첫째로, 이 논문에서는 클럭의 타이밍 문제를 회로 제작 이후 단계에서 조정할 수 있는 포스트 실리콘 조정 클럭 버퍼를 배치하는 문제에 대해 서술한다. 포스트 실리콘 조정 버퍼는 클럭의 지연시간을 회로가 제작된 이후의 단계에서 조정하 여 클럭의 타이밍 문제를 해결할 수 있지만, 버퍼 자체의 크기 때문에 최소한의 개수만 가장 효율적인 위치에 배치해야 하는 문제가 있다. 본 논문에서는 이전의 연구가 회로의 수율을 계산할 때 시간이 많이 걸리는 몬테-카를로 시뮬레이션을 사용하기 때문에 탐색 가능한 포스트 실리콘 조정 버퍼의 배치가 제한되는 문제가 있음을 지적한 후, 기존에 제안되었던 그래프 기반 회로 수율 계산 기법을 사용하여 효율적인 포스트 실리콘 조정 버퍼 배치를 찾을 수 있는 점진적이고 체계적인 방법을 제시한다. 다음은 클럭 시차 스케쥴링 방법에 대한 연구를 서술한다. 최근의 연구에서 제안되었던, 플립-플롭의 클럭에서 출력까지의 딜레이가 클럭의 준비시간과 유지시간에 의존한다는 유연한 플립-플롭 타이밍 모델 연구는 기존의 플립-플롭의 타이밍 특성들이 고정된 값이라는 가정에 기반한 정적 타이밍 분석의 정확성 문제를 해결할 수 있는 중요한 연구이다. 본 논문에서는 새로운 모델을 고려하여, 이전에 고전적인 플립-플롭 타이밍 특성 모델을 기반으로 진행되었던 클럭 시차 스케쥴링의 최적화 문제를 유연한 플립-플롭 타이밍 모델을 고려하여 해결하였다. 본 연구에서는 주어진 회로의 준비시간과 유지시간의 여유시간을 반복적이고 체계적으로 최대화하여 문제를 해결하였다. 마지막으로 클럭 스파인 네트워크의 합성을 자동화하는 문제에 대해 서술한다. 전통적인 클럭 트리 구조가 공정변이 문제를 해결하지 못했기 때문에 클럭 메쉬를 포함하는 다양한 대안적 구조가 제안되었다. 클럭 메쉬의 경우 공정변이에 의한 클럭 시차를 줄일 수 있었지만 이를 위해 와이어나 버퍼 등의 자원을 많이 소모하는 문제를 가지고 있다. 두 구조의 중간적 구조에는 클럭 트리의 노드를 연결하는 크로스 링크를 삽입하는 구조와 클럭 스파인 구조가 있다. 클럭 트리에 점진적인 수정을 가하여 만드는 크로스 링크와 달리, 클럭 스파인 구조는 트리나 이후에 제안된 메쉬와는 완전히 별개의 구조로, 이를 합성하는 방법도 매우 다르다. 그렇기 때문에 클럭 스파인을 합성하는 알고리즘은 필수적이라고 할 수 있으나, 합성 방법론이나 이를 자동화하는 방법에 관한 연구는 아직 없다. 본 논문에서는 우선, 클럭-게이팅을 지원하는 클럭 스파인을 주어진 클럭 시차 및 클럭 슬루 조건을 만족하면서 자원 및 전력 소모량을 최소화하는 문제에 대해 서술한다. 그리고, 회로에서 주어진 플립-플롭들을 클럭-게이팅 조건에서의 연관성을 고려하고 조직화하여 클럭 스파인을 삽입한 후, 클럭 시차 및 슬루 조건을 고려하여 버퍼를 삽입하는 알고리즘을 제안한다. 요약하면, 본 논문에서는 클럭의 타이밍 문제를 해결하기 위해 포스트-실리콘 조정 클럭 버퍼를 사용하는 테크닉과 클럭 시차 스케쥴링을 유연한 플립-플롭 타이밍 모델에서 적용하는 테크닉을 제시하고, 클럭의 타이밍 문제와 전력 소모 문제를 한번에 해결하기 위한 새로운 클럭 스파인 네트워크를 합성하는 자동화 알고리즘을 제시한다.As the process variation is dominating to cause the clock timing variation among chips to be much large, conventional clock tree based clock network is not able to guarantee the timing constraint of a digital system. To overcome the limitations of traditional clock design techniques, various techniques have been studied. This dissertation addresses three techniques that have been widely used for designing robust clock network and proposes developed methods. First, it is widely accepted that post-silicon tunable (PST) clock buffers can effectively resolve the clock timing violation. Since PST buffers, which can reset the clock delay to flip-flops after the chip is manufactured, impose a non-trivial implementation area and control circuitry, it is very important to minimally allocate PST buffers while satisfying the chip yield constraint. In this dissertation, we (1) develop a graph-based chip yield computation technique which can update yields very efficiently and accurately for incremental PST buffer allocation, based on which we (2) propose a systematic (bottom-up and top-down with refinement) PST buffer allocation algorithm that is able to fully explore the design space of PST buffer allocation. Second, clock skew scheduling is one of the essential steps that must be carefully performed during the design process. This dissertation addresses the clock skew optimization problem integrated with the consideration of the interdependent relation between the setup and hold skews, and clk-to-Q delay of flip-flops, so that the time margin is more accurately and reliably set aside over that of the previous methods, which have never taken the integrated problem into account. Precisely, based on an accurate flexible model of setup skew, hold skew, and clk-to-Q delay, we propose a stepwise clock skew scheduling technique in which at each iteration, the worst slack of setup and hold skews is systematically and incrementally relaxed to maximally extend the time margin. Lastly, clock tree with cross links and clock spine have an intermediate characteristics for skew tolerance and power consumption, compared to clock tree and clock mesh which are two extreme structures of clock network. Unlike the clock tree with links between clock nodes, which is a sort of an incremental modification of the structure of clock tree, clock spine network is a completely separated structure from the structures of tree and mesh. Consequently, it is necessary and essential to develop a synthesis algorithm for clock spines, which will be compatible to the existing synthesis algorithms of clock trees and clock meshes. To this end, this dissertation first addresses the problem of automating the synthesis of clock-gated clock spines with the objective of minimizing total clock power while meeting the clock skew and slew constraints. The key idea of our proposed synthesis algorithm is to identify and group the flip-flops with tight correlation of clock-gating operations together to form a spine while accurately predicting and maintaining clock skew and slew variations through the buffer insertion and stub allocation. In summary, this dissertation presents clock tuning techniques with consideration of post-silicon tuning, flexible flip-flop timing model, and clock-gated clock spine synthesis algorithm.Abstract i Chapter 1 INTRODUCTION 1 1.1 Clock Distribution Network . . . . . . . . . . . . . . . . . . . . . 1 1.2 Process Variation . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Flexible Flip-flop Timing Model . . . . . . . . . . . . . . . . . . . 3 1.4 Clock Spine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.5 Contributions of This Dissertation . . . . . . . . . . . . . . . . . 6 Chapter 2 POST-SILICON TUNABLE CLOCK BUFFER ALLOCATION BASED ON FAST CHIP YIELD COMPUTATION 8 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.2 Systematic Exploration of PST Buffer Allocation . . . . . . . . . 10 2.2.1 Observations . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.2.2 Problem Definition . . . . . . . . . . . . . . . . . . . . . . 15 2.2.3 Allocation Algorithm . . . . . . . . . . . . . . . . . . . . . 16 2.3 Fast Timing Yield Computation . . . . . . . . . . . . . . . . . . 17 2.3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.3.2 Incremental Yield Computation . . . . . . . . . . . . . . . 22 2.4 Experimental Result . . . . . . . . . . . . . . . . . . . . . . . . . 24 2.5 PST Buffer Configuration Techniques . . . . . . . . . . . . . . . 31 2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 Chapter 3 POST-SILICON TUNING BASED ON FLEXIBLE FLIP-FLOP TIMING 34 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 3.2 Preliminary and Definitions . . . . . . . . . . . . . . . . . . . . . 40 3.2.1 Flexible Flip-Flop Timing Model . . . . . . . . . . . . . . 40 3.2.2 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 40 3.3 Motivational Examples . . . . . . . . . . . . . . . . . . . . . . . . 42 3.4 Clock Skew Scheduling for Slack Relaxation Based on Flexible Flip-Flop Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 3.4.1 Overall Flow . . . . . . . . . . . . . . . . . . . . . . . . . 46 3.4.2 Finding Local Clock Skew Schedule . . . . . . . . . . . . 48 3.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . 51 3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 Chapter 4 SYNTHESIS FOR POWER-AWARE CLOCK SPINES 61 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 4.2 Preliminaries and Motivation . . . . . . . . . . . . . . . . . . . . 64 4.2.1 Clock Spine . . . . . . . . . . . . . . . . . . . . . . . . . . 64 4.2.2 Activity Patterns . . . . . . . . . . . . . . . . . . . . . . . 67 4.2.3 Power Computation . . . . . . . . . . . . . . . . . . . . . 67 4.3 Algorithm for Clock Spine Synthesis . . . . . . . . . . . . . . . . 68 4.3.1 Problem Definition . . . . . . . . . . . . . . . . . . . . . . 68 4.3.2 Power-Aware Sink Clustering . . . . . . . . . . . . . . . . 70 4.3.3 Spine Relaxation . . . . . . . . . . . . . . . . . . . . . . . 77 4.3.4 Spine Buffer Allocation . . . . . . . . . . . . . . . . . . . 80 4.3.5 Top-Level Tree Construction . . . . . . . . . . . . . . . . 86 4.4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . 86 4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 Chapter 5 CONCLUSION 95 5.1 Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 5.2 Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 5.3 Chapter 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 Bibliography 97 초록 106Docto

SNU Open Repository and Archive

Variability-Aware VLSI Design Automation For Nanoscale Technologies

Author: Khandelwal Vishal
Publication venue
Publication date: 16/05/2007
Field of study

As technology scaling enters the nanometer regime, design of large scale ICs gets more challenging due to shrinking feature sizes and increasing design complexity. Aggressive scaling causes significant degradation in reliability, increased susceptibility to fabrication and environmental randomness and increased dynamic and leakage power dissipation. In this work, we investigate these scaling issues in large scale integrated systems. This dissertation proposes to develop variability-aware design methodologies by proposing design analysis, design-time optimization, post-silicon tunability and runtime-adaptivity based optimization techniques for handling variability. We discuss our research in the area of variability-aware analysis, specifically focusing on the problem of statistical timing analysis. The first technique presents the concept of error budgeting that achieves significant runtime speedups during statistical timing analysis. The second work presents a general framework for non-linear non-Gaussian statistical timing analysis considering correlations. Further, we present our work on design-time optimization schemes that are applicable during physical synthesis. Firstly, we present a buffer insertion technique that considers wire-length uncertainty and proposes algorithms to perform probabilistic buffer insertion. Secondly, we present a stochastic optimization framework based on Monte-Carlo technique considering fabrication variability. This optimization framework can be applied to problems that can be modeled as linear programs without without imposing any assumptions on the nature of the variability. Subsequently, we present our work on post-silicon tunability based design optimization. This work presents a design management framework that can be used to balance the effort spent on pre-silicon (through gate sizing) and post-silicon optimization (through tunable clock-tree buffers) while maximizing the yield gains. Lastly, we present our work on variability-aware runtime optimization techniques. We look at the problem of runtime supply voltage scaling for dynamic power optimization, and propose a framework to consider the impact of variability on the reliability of such designs. We propose a probabilistic design synthesis technique where reliability of the design is a primary optimization metric

Digital Repository at the University of Maryland

Algorithmic techniques for nanometer VLSI design and manufacturing closure

Author: Hu Shiyan
Publication venue: Texas A&M University
Publication date: 10/10/2008
Field of study

As Very Large Scale Integration (VLSI) technology moves to the nanoscale regime, design and manufacturing closure becomes very difficult to achieve due to increasing chip and power density. Imperfections due to process, voltage and temperature variations aggravate the problem. Uncertainty in electrical characteristic of individual device and wire may cause significant performance deviations or even functional failures. These impose tremendous challenges to the continuation of Moore's law as well as the growth of semiconductor industry. Efforts are needed in both deterministic design stage and variation-aware design stage. This research proposes various innovative algorithms to address both stages for obtaining a design with high frequency, low power and high robustness. For deterministic optimizations, new buffer insertion and gate sizing techniques are proposed. For variation-aware optimizations, new lithography-driven and post-silicon tuning-driven design techniques are proposed. For buffer insertion, a new slew buffering formulation is presented and is proved to be NP-hard. Despite this, a highly efficient algorithm which runs > 90x faster than the best alternatives is proposed. The algorithm is also extended to handle continuous buffer locations and blockages. For gate sizing, a new algorithm is proposed to handle discrete gate library in contrast to unrealistic continuous gate library assumed by most existing algorithms. Our approach is a continuous solution guided dynamic programming approach, which integrates the high solution quality of dynamic programming with the short runtime of rounding continuous solution. For lithography-driven optimization, the problem of cell placement considering manufacturability is studied. Three algorithms are proposed to handle cell flipping and relocation. They are based on dynamic programming and graph theoretic approaches, and can provide different tradeoff between variation reduction and wire- length increase. For post-silicon tuning-driven optimization, the problem of unified adaptivity optimization on logical and clock signal tuning is studied, which enables us to significantly save resources. The new algorithm is based on a novel linear programming formulation which is solved by an advanced robust linear programming technique. The continuous solution is then discretized using binary search accelerated dynamic programming, batch based optimization, and Latin Hypercube sampling based fast simulation

Texas A&M Repository

Ultra-low Voltage Digital Circuits and Extreme Temperature Electronics Design

Author: Arthurs Aaron J
Publication venue: ScholarWorks@UARK
Publication date: 01/08/2012
Field of study

Certain applications require digital electronics to operate under extreme conditions e.g., large swings in ambient temperature, very low supply voltage, high radiation. Such applications include sensor networks, wearable electronics, unmanned aerial vehicles, spacecraft, and energyharvesting systems. This dissertation splits into two projects that study digital electronics supplied by ultra-low voltages and build an electronic system for extreme temperatures. The first project introduces techniques that improve circuit reliability at deep subthreshold voltages as well as determine the minimum required supply voltage. These techniques address digital electronic design at several levels: the physical process, gate design, and system architecture. This dissertation analyzes a silicon-on-insulator process, Schmitt-trigger gate design, and asynchronous logic at supply voltages lower than 100 millivolts. The second project describes construction of a sensor digital controller for the lunar environment. Parts of the digital controller are an asynchronous 8031 microprocessor that is compatible with synchronous logic, memory with error detection and correction, and a robust network interface. The digitial sensor ASIC is fabricated on a silicon-germanium process and built with cells optimized for extreme temperatures

ScholarWorks@UARK

UARK (University of Arkansas )