Search CORE

100 research outputs found

Pre-bond testable low-power clock tree design for 3D stacked ICs

Author
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2009
Field of study

High-performance and Low-power Clock Network Synthesis in the Presence of Variation.

Author: Lee Dong Jin
Publication venue
Publication date: 01/01/2011
Field of study

Semiconductor technology scaling requires continuous evolution of all aspects of physical design of integrated circuits. Among the major design steps, clock-network synthesis has been greatly affected by technology scaling, rendering existing methodologies inadequate. Clock routing was previously sufficient for smaller ICs, but design difficulty and structural complexity have greatly increased as interconnect delay and clock frequency increased in the 1990s. Since a clock network directly influences IC performance and often consumes a substantial portion of total power, both academia and industry developed synthesis methodologies to achieve low skew, low power and robustness from PVT variations. Nevertheless, clock network synthesis under tight constraints is currently the least automated step in physical design and requires significant manual intervention, undermining turn-around-time. The need for multi-objective optimization over a large parameter space and the increasing impact of process variation make clock network synthesis particularly challenging. Our work identifies new objectives, constraints and concerns in the clock-network synthesis for systems-on-chips and microprocessors. To address them, we generate novel clock-network structures and propose changes in traditional physical-design flows. We develop new modeling techniques and algorithms for clock power optimization subject to tight skew constraints in the presence of process variations. In particular, we offer SPICE-accurate optimizations of clock networks, coordinated to reduce nominal skew below 5 ps, satisfy slew constraints and trade-off skew, insertion delay and power, while tolerating variations. To broaden the scope of clock-network-synthesis optimizations, we propose new techniques and a methodology to reduce dynamic power consumption by 6.8%-11.6% for large IC designs with macro blocks by integrating clock network synthesis within global placement. We also present a novel non-tree topology that is 2.3x more power-efficient than mesh structures. We fuse several clock trees to create large-scale redundancy in a clock network to bridge the gap between tree-like and mesh-like topologies. Integrated optimization techniques for high-quality clock networks described in this dissertation strong empirical results in experiments with recent industry-released benchmarks in the presence of process variation. Our software implementations were recognized with the first-place awards at the ISPD 2009 and ISPD 2010 Clock-Network Synthesis Contests organized by IBM Research and Intel Research.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/89711/1/ejdjsy_1.pd

CiteSeerX

Deep Blue Documents at the University of Michigan

Timing Closure in Chip Design

Author: Held Stephan
Publication venue: Universitäts- und Landesbibliothek Bonn
Publication date
Field of study

Achieving timing closure is a major challenge to the physical design of a computer chip. Its task is to find a physical realization fulfilling the speed specifications. In this thesis, we propose new algorithms for the key tasks of performance optimization, namely repeater tree construction; circuit sizing; clock skew scheduling; threshold voltage optimization and plane assignment. Furthermore, a new program flow for timing closure is developed that integrates these algorithms with placement and clocktree construction. For repeater tree construction a new algorithm for computing topologies, which are later filled with repeaters, is presented. To this end, we propose a new delay model for topologies that not only accounts for the path lengths, as existing approaches do, but also for the number of bifurcations on a path, which introduce extra capacitance and thereby delay. In the extreme cases of pure power optimization and pure delay optimization the optimum topologies regarding our delay model are minimum Steiner trees and alphabetic code trees with the shortest possible path lengths. We presented a new, extremely fast algorithm that scales seamlessly between the two opposite objectives. For special cases, we prove the optimality of our algorithm. The efficiency and effectiveness in practice is demonstrated by comprehensive experimental results. The task of circuit sizing is to assign millions of small elementary logic circuits to elements from a discrete set of logically equivalent, predefined physical layouts such that power consumption is minimized and all signal paths are sufficiently fast. In this thesis we develop a fast heuristic approach for global circuit sizing, followed by a local search into a local optimum. Our algorithms use, in contrast to existing approaches, the available discrete layout choices and accurate delay models with slew propagation. The global approach iteratively assigns slew targets to all source pins of the chip and chooses a discrete layout of minimum size preserving the slew targets. In comprehensive experiments on real instances, we demonstrate that the worst path delay is within 7% of its lower bound on average after a few iterations. The subsequent local search reduces this gap to 2% on average. Combining global and local sizing we are able to size more than 5.7 million circuits within 3 hours. For the clock skew scheduling problem we develop the first algorithm with a strongly polynomial running time for the cycle time minimization in the presence of different cycle times and multi-cycle paths. In practice, an iterative local search method is much more efficient. We prove that this iterative method maximizes the worst slack, even when restricting the feasible schedule to certain time intervals. Furthermore, we enhance the iterative local approach to determine a lexicographically optimum slack distribution. The clock skew scheduling problem is then generalized to allow for simultaneous data path optimization. In fact, this is a time-cost tradeoff problem. We developed the first combinatorial algorithm for computing time-cost tradeoff curves in graphs that may contain cycles. Starting from the lowest-cost solution, the algorithm iteratively computes a descent direction by a minimum cost flow computation. The maximum feasible step length is then determined by a minimum ratio cycle computation. This approach can be used in chip design for several optimization tasks, e.g. threshold voltage optimization or plane assignment. Finally, the optimization routines are combined into a timing closure flow. Here, the global placement is alternated with global performance optimization. Netweights are used to penalize the length of critical nets during placement. After the global phase, the performance is improved further by applying more comprehensive optimization routines on the most critical paths. In the end, the clock schedule is optimized and clocktrees are inserted. Computational results of the design flow are obtained on real-world computer chips

bonndoc – Der Publikationsserver der Universität Bonn

Design and automation of voltage-scaled clock networks

Author: Sitik Ahmet Can
Publication venue: Drexel University
Publication date: 01/12/2015
Field of study

In this dissertation, a vital step of VLSI physical design flow, synthesis of clock distribution networks, is investigated. Clock network synthesis (CNS) involves large and complex optimization problems to achieve high performance and low power demands of current integrated circuits (ICs). Ineffectiveness of existing methodologies to provide high performance at lower voltage nodes is the main driver for this dissertation research. A design and automation flow for voltage-scaled clock networks is proposed to satisfy tight timing constraints at high frequency (for high performance) and low voltage (for low power) operation. One implementation of voltage-scaled clock networks is low (voltage) swing clocking, which is a known technique, yet its applicability remains limited to designs with low performance demands. In this dissertation, novel methodologies are introduced to i) apply low swing clocking to legacy designs as a power saving methodology, ii) develop a complete CNS flow for low swing clocking of high performance ICs. These methodologies include slew-driven approaches that are better suited to future transistor and interconnect technologies. Second implementation of voltage-scaled clock networks is multi-voltage clocking, which is another known technique, yet its applicability remains limited to clock tree topology. In this dissertation, multi-voltage clocking with a clock mesh topology is investigated in order to address a missing aspect in the current IC design flows. Practical considerations of the current IC design flows are also investigated in this dissertation to expand the applicability of the proposed CNS flow. A novel methodology is introduced to facilitate clock gating within low swing clocking. The applicability of low swing clocking to FinFET technology, which is currently the industry norm, is shown to be effective.Ph.D., Electrical Engineering -- Drexel University, 201

Drexel Libraries E-Repository and Archives

Clock Tree and Flip-flop Co-optimization for Reducing Power Consumption and Power/Ground Noise of Integrated Circuits and Systems

Author: Joohan Kim
Publication venue: 서울대학교 대학원
Publication date: 01/08/2017
Field of study

학위논문 (박사)-- 서울대학교 대학원 공과대학 전기·컴퓨터공학부, 2017. 8. 김태환.For very-large-scale integration (VLSI) circuits, the activation of all flip-flops that are used to store data is synchronized by clock signals delivered through clock networks. Due to very high frequency of clock signal switches, the dynamic power consumed on clock networks takes a considerable portion of the total power consumption of the circuits. In addition, the largest amount of power consumption in the clock networks comes from the flip-flops and the buffers that drive the flip-flops at the clock network boundary. In addition, the requirement of simultaneously activating all flip-flops for synchronous circuits induces a high peak power/ground noise (i.e., voltage drop) at the clock boundary. In this regards, this thesis addresses two new problems: the problem of reducing the clock power consumption at the clock network boundary, and the problem of reducing the peak current at the clock network boundary. Unlike the prior works which have considered the optimization of flip-flops and clock buffers separately, our approach takes into account the co-optimization of flip-flops and clock buffers. Precisely, we propose four different types of hardware component that can implement a set of flip-flops and their driving buffer as a single unit. The key idea for the derivation of the four types of clock boundary component is that one of the inverters in the driving buffer and one of the inverters in each flip-flop can be combined and removed without changing the functionality of the flip-flops. Consequently, we have a more freedom to select (i.e., allocate) clock boundary components that is able to reduce the power consumption or peak current under timing constraint. We have implemented our approach of clock boundary optimization under bounded clock skew constraint and tested it with ISCAS 89 benchmark circuits. The experimental results confirm that our approach is able to reduce the clock power consumption by 7.9∼10.2% and power/ground noise by 27.7%∼30.9% on average.Chapter 1 Introduction 1 1.1 Clock Signal 1 1.2 Metrics of Clock Design 2 1.3 Clock Network Topologies 4 1.4 Multibit Flip-flop 5 1.5 Simultaneous Switching Noise 6 1.6 Contributions of This Dissertation 6 Chapter 2 Clock Tree and Flip-flop Co-optimization for Reducing Power Consumption 8 2.1 Introduction 8 2.2 Types of Boundary Optimization 9 2.3 Analysis of Four Types of Flip-flop 12 2.3.1 Internal Power Comparison 12 2.3.2 Characterization of Power Consumption 14 2.4 Problem Formulation 15 2.5 The Proposed Algorithm 17 2.5.1 Independence Assumption 17 2.5.2 BoundaryMin Algorithm 17 2.6 Experimental Results 29 2.6.1 Experimental Setup 29 2.6.2 Clock Tree Boundary Optimization Results 33 2.6.3 Capacitance Analysis on Flip-flops 38 2.6.4 Slew and Skew Analysis 39 2.6.5 Window Width Analysis 39 2.7 Conclusions 41 Chapter 3 Clock Tree and Flip-flop Co-optimization for Reducing Power/Ground Noise 42 3.1 Introduction 42 3.2 Current Characteristic of Four Types of Flip-flop 45 3.3 Motivational Example 47 3.4 Problem Formulation 52 3.5 Proposed Algorithm 54 3.5.1 An Overview 54 3.5.2 Superposition of Current Flows 55 3.5.3 Formulation to Instance of MOSP Problem 57 3.5.4 Selecting Target Power Grid Points 59 3.5.5 Consideration of Reducing Power Consumption 62 3.6 Experimental Results 62 3.7 Summary 65 Chapter 4 Conclusion 68 4.1 Clock Buffer and Flip-flop Co-optimization for Reducing Power Consumption 68 4.2 Clock Buffer and Flip-flop Co-optimization for Reducing Power/Ground Noise 69 초록 78Docto

SNU Open Repository and Archive

Clock tree synthesis for prescribed skew specifications

Author: Chaturvedi Rishi
Publication venue: Texas A&M University
Publication date: 29/08/2005
Field of study

In ultra-deep submicron VLSI designs, clock network layout plays an increasingly important role in determining circuit performance including timing, power consumption, cost, power supply noise and tolerance to process variations. It is required that a clock layout algorithm can achieve any prescribed skews with the minimum wire length and acceptable slew rate. Traditional zero-skew clock routing methods are not adequate to address this demand, since they tend to yield excessive wire length for prescribed skew targets. The interactions among skew targets, sink location proximities and capacitive load balance are analyzed. Based on this analysis, a maximum delay-target ordering merging scheme is suggested to minimize wire and buﬀer area, which results in lesser cost, power consumption and vulnerability to process variations. During the clock routing, buﬀers are inserted simultaneously to facilitate a proper slew rate level and reduce wire snaking. The proposed algorithm is simple and fast for practical applications. Experimental results on benchmark circuits show that the algorithm can reduce the total wire and buﬀer capacitance by 60% over an extension of the existing zero-skew routing method

Texas A&M Repository

Power Reductions with Energy Recovery Using Resonant Topologies

Author: Bezzam Ignatius S.A.
Publication venue: Scholar Commons
Publication date: 05/05/2015
Field of study

The problem of power densities in system-on-chips (SoCs) and processors has become more exacerbated recently, resulting in high cooling costs and reliability issues. One of the largest components of power consumption is the low skew clock distribution network (CDN), driving large load capacitance. This can consume as much as 70% of the total dynamic power that is lost as heat, needing elaborate sensing and cooling mechanisms. To mitigate this, resonant clocking has been utilized in several applications over the past decade. An improved energy recovering reconfigurable generalized series resonance (GSR) solution with all the critical support circuitry is developed in this work. This LC resonant clock driver is shown to save about 50% driver power (\u3e40% overall), on a 22nm process node and has 50% less skew than a non-resonant driver at 2GHz. It can operate down to 0.2GHz to support other energy savings techniques like dynamic voltage and frequency scaling (DVFS). As an example, GSR can be configured for the simpler pulse series resonance (PSR) operation to enable further power saving for double data rate (DDR) applications, by using de-skewing latches instead of flip-flop banks. A PSR based subsystem for 40% savings in clocking power with 40% driver active area reduction xii is demonstrated. This new resonant driver generates tracking pulses at each transition of clock for dual edge operation across DVFS. PSR clocking is designed to drive explicit-pulsed latches with negative setup time. Simulations using 45nm IBM/PTM device and interconnect technology models, clocking 1024 flip-flops show the reductions, compared to non-resonant clocking. DVFS range from 2GHz/1.3V to 200MHz/0.5V is obtained. The PSR frequency is set \u3e3× the clock rate, needing only 1/10th the inductance of prior-art LC resonance schemes. The skew reductions are achieved without needing to increase the interconnect widths owing to negative set-up times. Applications in data circuits are shown as well with a 90nm example. Parallel resonant and split-driver non-resonant configurations as well are derived from GSR. Tradeoffs in timing performance versus power, based on theoretical analysis, are compared for the first time and verified. This enables synthesis of an optimal topology for a given application from the GSR

Scholar Commons - Santa Clara University

저전력 고성능 디지털 시스템을 위한 고신뢰도의 클럭 네트워크 설계 방법론

Author: Hyungjung Seo
Publication venue: 서울대학교 대학원
Publication date: 01/08/2015
Field of study

학위논문 (박사)-- 서울대학교 대학원 : 전기·컴퓨터공학부, 2015. 8. 김태환.오늘날의 회로 설계에서 공정변이가 회로 클럭의 타이밍의 변이에 미치는 영향은 매우 커짐에 따라, 전통적으로 사용되던 클럭 트리 구조를 기반으로 한 클럭 네트워크를 사용하는 것은 한계에 부딪히게 되었고, 이를 극복하기 위한 여러가지 기술들이 제안되었다. 본 논문에서는 변이에 강한 클럭 네트워크를 설계하기 위해, 연구 및 사용되고 있는 세 가지 기술에 대해 소개하고, 이들을 개선한 연구들을 제안한다. 첫째로, 이 논문에서는 클럭의 타이밍 문제를 회로 제작 이후 단계에서 조정할 수 있는 포스트 실리콘 조정 클럭 버퍼를 배치하는 문제에 대해 서술한다. 포스트 실리콘 조정 버퍼는 클럭의 지연시간을 회로가 제작된 이후의 단계에서 조정하 여 클럭의 타이밍 문제를 해결할 수 있지만, 버퍼 자체의 크기 때문에 최소한의 개수만 가장 효율적인 위치에 배치해야 하는 문제가 있다. 본 논문에서는 이전의 연구가 회로의 수율을 계산할 때 시간이 많이 걸리는 몬테-카를로 시뮬레이션을 사용하기 때문에 탐색 가능한 포스트 실리콘 조정 버퍼의 배치가 제한되는 문제가 있음을 지적한 후, 기존에 제안되었던 그래프 기반 회로 수율 계산 기법을 사용하여 효율적인 포스트 실리콘 조정 버퍼 배치를 찾을 수 있는 점진적이고 체계적인 방법을 제시한다. 다음은 클럭 시차 스케쥴링 방법에 대한 연구를 서술한다. 최근의 연구에서 제안되었던, 플립-플롭의 클럭에서 출력까지의 딜레이가 클럭의 준비시간과 유지시간에 의존한다는 유연한 플립-플롭 타이밍 모델 연구는 기존의 플립-플롭의 타이밍 특성들이 고정된 값이라는 가정에 기반한 정적 타이밍 분석의 정확성 문제를 해결할 수 있는 중요한 연구이다. 본 논문에서는 새로운 모델을 고려하여, 이전에 고전적인 플립-플롭 타이밍 특성 모델을 기반으로 진행되었던 클럭 시차 스케쥴링의 최적화 문제를 유연한 플립-플롭 타이밍 모델을 고려하여 해결하였다. 본 연구에서는 주어진 회로의 준비시간과 유지시간의 여유시간을 반복적이고 체계적으로 최대화하여 문제를 해결하였다. 마지막으로 클럭 스파인 네트워크의 합성을 자동화하는 문제에 대해 서술한다. 전통적인 클럭 트리 구조가 공정변이 문제를 해결하지 못했기 때문에 클럭 메쉬를 포함하는 다양한 대안적 구조가 제안되었다. 클럭 메쉬의 경우 공정변이에 의한 클럭 시차를 줄일 수 있었지만 이를 위해 와이어나 버퍼 등의 자원을 많이 소모하는 문제를 가지고 있다. 두 구조의 중간적 구조에는 클럭 트리의 노드를 연결하는 크로스 링크를 삽입하는 구조와 클럭 스파인 구조가 있다. 클럭 트리에 점진적인 수정을 가하여 만드는 크로스 링크와 달리, 클럭 스파인 구조는 트리나 이후에 제안된 메쉬와는 완전히 별개의 구조로, 이를 합성하는 방법도 매우 다르다. 그렇기 때문에 클럭 스파인을 합성하는 알고리즘은 필수적이라고 할 수 있으나, 합성 방법론이나 이를 자동화하는 방법에 관한 연구는 아직 없다. 본 논문에서는 우선, 클럭-게이팅을 지원하는 클럭 스파인을 주어진 클럭 시차 및 클럭 슬루 조건을 만족하면서 자원 및 전력 소모량을 최소화하는 문제에 대해 서술한다. 그리고, 회로에서 주어진 플립-플롭들을 클럭-게이팅 조건에서의 연관성을 고려하고 조직화하여 클럭 스파인을 삽입한 후, 클럭 시차 및 슬루 조건을 고려하여 버퍼를 삽입하는 알고리즘을 제안한다. 요약하면, 본 논문에서는 클럭의 타이밍 문제를 해결하기 위해 포스트-실리콘 조정 클럭 버퍼를 사용하는 테크닉과 클럭 시차 스케쥴링을 유연한 플립-플롭 타이밍 모델에서 적용하는 테크닉을 제시하고, 클럭의 타이밍 문제와 전력 소모 문제를 한번에 해결하기 위한 새로운 클럭 스파인 네트워크를 합성하는 자동화 알고리즘을 제시한다.As the process variation is dominating to cause the clock timing variation among chips to be much large, conventional clock tree based clock network is not able to guarantee the timing constraint of a digital system. To overcome the limitations of traditional clock design techniques, various techniques have been studied. This dissertation addresses three techniques that have been widely used for designing robust clock network and proposes developed methods. First, it is widely accepted that post-silicon tunable (PST) clock buffers can effectively resolve the clock timing violation. Since PST buffers, which can reset the clock delay to flip-flops after the chip is manufactured, impose a non-trivial implementation area and control circuitry, it is very important to minimally allocate PST buffers while satisfying the chip yield constraint. In this dissertation, we (1) develop a graph-based chip yield computation technique which can update yields very efficiently and accurately for incremental PST buffer allocation, based on which we (2) propose a systematic (bottom-up and top-down with refinement) PST buffer allocation algorithm that is able to fully explore the design space of PST buffer allocation. Second, clock skew scheduling is one of the essential steps that must be carefully performed during the design process. This dissertation addresses the clock skew optimization problem integrated with the consideration of the interdependent relation between the setup and hold skews, and clk-to-Q delay of flip-flops, so that the time margin is more accurately and reliably set aside over that of the previous methods, which have never taken the integrated problem into account. Precisely, based on an accurate flexible model of setup skew, hold skew, and clk-to-Q delay, we propose a stepwise clock skew scheduling technique in which at each iteration, the worst slack of setup and hold skews is systematically and incrementally relaxed to maximally extend the time margin. Lastly, clock tree with cross links and clock spine have an intermediate characteristics for skew tolerance and power consumption, compared to clock tree and clock mesh which are two extreme structures of clock network. Unlike the clock tree with links between clock nodes, which is a sort of an incremental modification of the structure of clock tree, clock spine network is a completely separated structure from the structures of tree and mesh. Consequently, it is necessary and essential to develop a synthesis algorithm for clock spines, which will be compatible to the existing synthesis algorithms of clock trees and clock meshes. To this end, this dissertation first addresses the problem of automating the synthesis of clock-gated clock spines with the objective of minimizing total clock power while meeting the clock skew and slew constraints. The key idea of our proposed synthesis algorithm is to identify and group the flip-flops with tight correlation of clock-gating operations together to form a spine while accurately predicting and maintaining clock skew and slew variations through the buffer insertion and stub allocation. In summary, this dissertation presents clock tuning techniques with consideration of post-silicon tuning, flexible flip-flop timing model, and clock-gated clock spine synthesis algorithm.Abstract i Chapter 1 INTRODUCTION 1 1.1 Clock Distribution Network . . . . . . . . . . . . . . . . . . . . . 1 1.2 Process Variation . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Flexible Flip-flop Timing Model . . . . . . . . . . . . . . . . . . . 3 1.4 Clock Spine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.5 Contributions of This Dissertation . . . . . . . . . . . . . . . . . 6 Chapter 2 POST-SILICON TUNABLE CLOCK BUFFER ALLOCATION BASED ON FAST CHIP YIELD COMPUTATION 8 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2.2 Systematic Exploration of PST Buffer Allocation . . . . . . . . . 10 2.2.1 Observations . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.2.2 Problem Definition . . . . . . . . . . . . . . . . . . . . . . 15 2.2.3 Allocation Algorithm . . . . . . . . . . . . . . . . . . . . . 16 2.3 Fast Timing Yield Computation . . . . . . . . . . . . . . . . . . 17 2.3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.3.2 Incremental Yield Computation . . . . . . . . . . . . . . . 22 2.4 Experimental Result . . . . . . . . . . . . . . . . . . . . . . . . . 24 2.5 PST Buffer Configuration Techniques . . . . . . . . . . . . . . . 31 2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 Chapter 3 POST-SILICON TUNING BASED ON FLEXIBLE FLIP-FLOP TIMING 34 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 3.2 Preliminary and Definitions . . . . . . . . . . . . . . . . . . . . . 40 3.2.1 Flexible Flip-Flop Timing Model . . . . . . . . . . . . . . 40 3.2.2 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 40 3.3 Motivational Examples . . . . . . . . . . . . . . . . . . . . . . . . 42 3.4 Clock Skew Scheduling for Slack Relaxation Based on Flexible Flip-Flop Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 3.4.1 Overall Flow . . . . . . . . . . . . . . . . . . . . . . . . . 46 3.4.2 Finding Local Clock Skew Schedule . . . . . . . . . . . . 48 3.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . 51 3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 Chapter 4 SYNTHESIS FOR POWER-AWARE CLOCK SPINES 61 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 4.2 Preliminaries and Motivation . . . . . . . . . . . . . . . . . . . . 64 4.2.1 Clock Spine . . . . . . . . . . . . . . . . . . . . . . . . . . 64 4.2.2 Activity Patterns . . . . . . . . . . . . . . . . . . . . . . . 67 4.2.3 Power Computation . . . . . . . . . . . . . . . . . . . . . 67 4.3 Algorithm for Clock Spine Synthesis . . . . . . . . . . . . . . . . 68 4.3.1 Problem Definition . . . . . . . . . . . . . . . . . . . . . . 68 4.3.2 Power-Aware Sink Clustering . . . . . . . . . . . . . . . . 70 4.3.3 Spine Relaxation . . . . . . . . . . . . . . . . . . . . . . . 77 4.3.4 Spine Buffer Allocation . . . . . . . . . . . . . . . . . . . 80 4.3.5 Top-Level Tree Construction . . . . . . . . . . . . . . . . 86 4.4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . 86 4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 Chapter 5 CONCLUSION 95 5.1 Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 5.2 Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 5.3 Chapter 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 Bibliography 97 초록 106Docto

SNU Open Repository and Archive