10 research outputs found

    Improved algorithms for link-based non-tree clock networks for skew variability reduction

    Full text link
    In the nanometer VLSI technology, the variation effects like manufacturing variation, power supply noise, temperature etc. become very significant. As one of the most vital nets in any synchronous VLSI chip, the Clock Distribution Network (CDN) is especially sensitive to these variations. Recently proposed link-based non-tree [1] addresses this problem by constructing a non-tree that is significantly more tolerant to variations when compared to a clock tree. Although the two algorithms proposed in [1] are effective in reducing the skew variability, they have a few drawbacks including high com-plexity, lengthy links and uneven link distribution across the clock network. In this paper, we propose two new algorithms that can overcome these disadvantages. The effectiveness of the proposed algorithms has been validated using HSPICE based Monte Carlo simulations. Experimental results show that the new algorithms are able to achieve the same or better skew reduction with an average of 5 % wire length increase when compared to the 15 % wire length increase of the existing algorithms in [1]. Moreover, the new algorithms scale extremely well to big clock networks, i.e., the bigger the clock network, the less overall link cost (less than 2 % for the biggest benchmark we have)

    Case Studies on Clock Gating and Local Routign for VLSI Clock Mesh

    Get PDF
    The clock is the important synchronizing element in all synchronous digital systems. The difference in the clock arrival time between sink points is called the clock skew. This uncertainty in arrival times will limit operating frequency and might cause functional errors. Various clock routing techniques can be broadly categorized into 'balanced tree' and 'fixed mesh' methods. The skew and delay using the balanced tree method is higher compared to the fixed mesh method. Although fixed mesh inherently uses more wire length, the redundancy created by loops in a mesh structure reduces undesired delay variations. The fixed mesh method uses a single mesh over the entire chip but it is hard to introduce clock gating in a single clock mesh. This thesis deals with the introduction of 'reconfigurability' by using control structures like transmission gates between sub-clock meshes, thus enabling clock gating in clock mesh. By using the optimum value of size for PMOS and NMOS of transmission gate (SZF) and optimum number of transmission gates between sub-clock meshes (NTG) for 4x4 reconfigurable mesh, the average of the maximum skew for all benchmarks is reduced by 18.12 percent compared to clock mesh structure when no transmission gates are used between the sub-clock meshes (reconfigurable mesh with NTG =0). Further, the research deals with a ‘modified zero skew method' to connect synchronous flip-flops or sink points in the circuit to the clock grids of clock mesh. The wire length reduction algorithms can be applied to reduce the wire length used for a local clock distribution network. The modified version of ‘zero skew method’ of local clock routing which is based on Elmore delay balancing aims at minimizing wire length for the given bounded skew of CDN using clock mesh and H-tree. The results of ‘modified zero skew method' (HC_MZSK) show average local wire length reduction of 17.75 percent for all ISPD benchmarks compared to direct connection method. The maximum skew is small for HC_MZSK in most of the test cases compared to other methods of connections like direct connections and modified AHHK. Thus, HC_MZSK for local routing reduces the wire length and maximum skew

    Reducing Clock Skew Variability via Cross Links

    No full text
    Increasingly significant variational e#ects present a great challenge for delivering desired clock skew reliably. Non-tree clock network has been recognized as a promising approach to overcome the variation problem. Existing non-tree clock routing methods are restricted to a few simple or regular structures, and often consume excessive amount of wires. In this paper, we suggest to construct a low cost non-tree clock network by inserting cross links in a given clock tree. The e#ect of the link insertion on clock skew variability is analyzed. Based on the analysis, two link insertion schemes are proposed. These methods can quickly convert a clock tree to a non-tree with significantly lower skew variability and very small amount of extra wires. Further, they can be applied to the recently popular non-zero skew routing easily. Experimental results on benchmark circuits show that this approach can achieve significant skew variability reduction with less than 2% increase of wirelength

    Design methodologies for variation-aware integrated circuits

    Get PDF
    The scaling of VLSI technology has spurred a rapid growth in the semiconductor industry. With the CMOS device dimension scaling to and beyond 90nm technology, it is possible to achieve higher performance and to pack more complex functionalities on a single chip. However, the scaling trend has introduced drastic variation of process and design parameters, leading to severe variability of chip performance in nanometer regime. Also, the manufacturing community projects CMOS will scale for three to four more generations. Since the uncertainties due to variations are expected to increase in each generation, it will significantly impact the performance of design and consequently the yield. Another challenging issue in the nanometer IC design is the high power consumption due to the greater packing density, higher frequency of operation and excessive leakage power. Moreover, the circuits are usually over-designed to compensate for uncertainties due to variations. The over-designed circuits not only make timing closure difficult but also cause excessive power consumption. For portable electronics, excessive power consumption may reduce battery life; for non-portable systems it may impose great difficulties in cooling and packaging. The objective of my research has been to develop design methodologies to address variations and power dissipation for reliable circuit operation. The proposed work has been divided into three parts: the first part addresses the issues related with power/ground noise induced by clock distribution network and proposes techniques to reduce power/ground noise considering the effects of process variations. The second part proposes an elastic pipeline scheme for random circuits with feedback loops. The proposed scheme provides a low-power solution that has the same variation tolerance as the conventional approaches. The third section deals with discrete buffer and wire sizing for link-based non-tree clock network, which is an energy efficient structure for skew tolerance to variations. For the power/ground noise problem, our approach could reduce the peak current and the delay variations by 50% and 51% respectively. Compared to conventional approach, the elastic timing scheme reduces power dissipation by 20% − 27%. The sizing method achieves clock skew reduction of 45% with a small increase in power dissipation

    Reducing clock skew variability via cross links

    No full text
    Abstract — Increasingly significant variational effects present a great challenge for delivering desired clock skew reliably. Non-tree clock network has been recognized as a promising approach to overcome the variation problem. Existing non-tree clock routing methods are restricted to a few simple or regular structures, and often consume excessive amount of wirelength. In this paper, we suggest to construct a low cost non-tree clock network by inserting cross links in a given clock tree. The effects of the link insertion on clock skew variability are analyzed. Based on the analysis, we propose two link insertion schemes that can quickly convert a clock tree to a non-tree with significantly lower skew variability and very limited wirelength increase. In these schemes, the complicated non-tree delay computation is circumvented. Further, they can be applied to the recently popular non-zero skew routing easily. The effectiveness of the proposed techniques has been validated through SPICE based Monte Carlo simulations. I

    Layout optimization in ultra deep submicron VLSI design

    Get PDF
    As fabrication technology keeps advancing, many deep submicron (DSM) effects have become increasingly evident and can no longer be ignored in Very Large Scale Integration (VLSI) design. In this dissertation, we study several deep submicron problems (eg. coupling capacitance, antenna effect and delay variation) and propose optimization techniques to mitigate these DSM effects in the place-and-route stage of VLSI physical design. The place-and-route stage of physical design can be further divided into several steps: (1) Placement, (2) Global routing, (3) Layer assignment, (4) Track assignment, and (5) Detailed routing. Among them, layer/track assignment assigns major trunks of wire segments to specific layers/tracks in order to guide the underlying detailed router. In this dissertation, we have proposed techniques to handle coupling capacitance at the layer/track assignment stage, antenna effect at the layer assignment, and delay variation at the ECO (Engineering Change Order) placement stage, respectively. More specifically, at layer assignment, we have proposed an improved probabilistic model to quickly estimate the amount of coupling capacitance for timing optimization. Antenna effects are also handled at layer assignment through a linear-time tree partitioning algorithm. At the track assignment stage, timing is further optimized using a graph based technique. In addition, we have proposed a novel gate splitting methodology to reduce delay variation in the ECO placement considering spatial correlations. Experimental results on benchmark circuits showed the effectiveness of our approaches

    Fast interconnect optimization

    Get PDF
    As the continuous trend of Very Large Scale Integration (VLSI) circuits technology scaling and frequency increases, delay optimization techniques for interconnect are increasingly important for achieving timing closure of high performance designs. For the gigahertz microprocessor and multi-million gate ASIC designs it is crucial to have fast algorithms in the design automation tools for many classical problems in the field to shorten time to market of the VLSI chip. This research presents algorithmic techniques and constructive models for two such problems: (1) Fast buffer insertion for delay optimization, (2) Wire sizing for delay optimization and variation minimization on non-tree networks. For the buffer insertion problem, this dissertation proposes several innovative speedup techniques for different problem formulations and the realistic requirement. For the basic buffer insertion problem, an O(n log2 n) optimal algorithm that runs much faster than the previous classical van GinnekenÂs O(n2) algorithm is proposed, where n is the number of buffer positions. For modern design libraries that contain hundreds of buffers, this research also proposes an optimal algorithm in O(bn2) time for b buffer types, a significant improvement over the previous O(b2n2) algorithm by Lillis, Cheng and Lin. For nets with small numbers of sinks and large numbers of buffer positions, a simple O(mn) optimal algorithm is proposed, where m is the number of sinks. For the buffer insertion with minimum cost problem, the problem is first proved to be NP-complete. Then several optimal and approximation techniques are proposed to further speed up the buffer insertion algorithm with resource control for big industrial designs. For the wire sizing problem, we propose a systematic method to size the wires of general non-tree RC networks. The new method can be used for delay optimization and variation reduction

    High-performance and Low-power Clock Network Synthesis in the Presence of Variation.

    Full text link
    Semiconductor technology scaling requires continuous evolution of all aspects of physical design of integrated circuits. Among the major design steps, clock-network synthesis has been greatly affected by technology scaling, rendering existing methodologies inadequate. Clock routing was previously sufficient for smaller ICs, but design difficulty and structural complexity have greatly increased as interconnect delay and clock frequency increased in the 1990s. Since a clock network directly influences IC performance and often consumes a substantial portion of total power, both academia and industry developed synthesis methodologies to achieve low skew, low power and robustness from PVT variations. Nevertheless, clock network synthesis under tight constraints is currently the least automated step in physical design and requires significant manual intervention, undermining turn-around-time. The need for multi-objective optimization over a large parameter space and the increasing impact of process variation make clock network synthesis particularly challenging. Our work identifies new objectives, constraints and concerns in the clock-network synthesis for systems-on-chips and microprocessors. To address them, we generate novel clock-network structures and propose changes in traditional physical-design flows. We develop new modeling techniques and algorithms for clock power optimization subject to tight skew constraints in the presence of process variations. In particular, we offer SPICE-accurate optimizations of clock networks, coordinated to reduce nominal skew below 5 ps, satisfy slew constraints and trade-off skew, insertion delay and power, while tolerating variations. To broaden the scope of clock-network-synthesis optimizations, we propose new techniques and a methodology to reduce dynamic power consumption by 6.8%-11.6% for large IC designs with macro blocks by integrating clock network synthesis within global placement. We also present a novel non-tree topology that is 2.3x more power-efficient than mesh structures. We fuse several clock trees to create large-scale redundancy in a clock network to bridge the gap between tree-like and mesh-like topologies. Integrated optimization techniques for high-quality clock networks described in this dissertation strong empirical results in experiments with recent industry-released benchmarks in the presence of process variation. Our software implementations were recognized with the first-place awards at the ISPD 2009 and ISPD 2010 Clock-Network Synthesis Contests organized by IBM Research and Intel Research.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/89711/1/ejdjsy_1.pd

    Cross link insertion for variation driven clock network construction.

    Get PDF
    Clock skew caused by variation is one of the most important problems in clock network synthesis today. Even if a clock network is designed to have zero skew, variation such as capacitive load and power supply will cause differences in arrival time of a clock signal. Non-tree clock network is considered to be an effective way to address the skew variation problem. Due to its inherent redundancy, clock mesh is very tolerant to variation. However, it costs much excessive amount of power compared to a clock tree. Link based non-tree clock network is an economic way to reduce clock skew caused by variation. Instead of using a dense mesh, only a number of links are inserted into a tree, so the power increase is small. Several existing works focus on the effect of cross link as well as the construction of such cross link structure. However, it is still not very clear where cross links should be inserted to achieve the most clock skew reduction with small wire resources. In this thesis, we propose a new method using linear program to solve this problem. In our approach, clock skew in a non-tree clock network is computed using an idea of load redistribution and non-tree decomposition. The delay information obtained is then used to select the node pairs for cross link insertion. Our methodology tries to insert cross links where skew can be reduced most effectively. Our method also considers tradeoff between cross link length and skew reduction effect. We compare our result with the most similar work on this problem [1] and a recent work [4] which inserts links between internal nodes of a tree. Experiments show that our method can reduce skew under variation effectively. We achieve 28% clock skew reduction with only 40% link resources.Qian, Fuqiang.Thesis (M.Phil.)--Chinese University of Hong Kong, 2012.Includes bibliographical references (leaves 51-55).Abstract --- p.iAcknowledgement --- p.iiiChapter 1 --- Introduction --- p.1Chapter 1.1 --- Clock Distribution Network --- p.1Chapter 1.2 --- Our Contributions --- p.6Chapter 1.3 --- Organization of the Thesis --- p.8Chapter 2 --- Literature Review --- p.9Chapter 2.1 --- Exact Zero Skew --- p.9Chapter 2.2 --- DME Algorithm --- p.11Chapter 2.3 --- Combinatorial Algorithms for Fast Clock Mesh Optimization --- p.12Chapter 2.4 --- MeshWorks: An Efficient Framework for Planning, Synthesis and Optimization of Clock Mesh Networks --- p.14Chapter 2.5 --- Reducing Clock Skew variability via Cross Links --- p.16Chapter 2.6 --- Statistical Based Link Insertion for Robust Clock Network Design --- p.18Chapter 2.7 --- Variation Tolerant Buffered Clock Network Synthesis with Cross Links --- p.20Chapter 2.8 --- Cross Link Insertion for Improving Tolerance to Variations in Clock Network Synthesis --- p.22Chapter 3 --- Clock Network Construction with Cross Links --- p.24Chapter 3.1 --- Signal Delay and Clock Skew in Non-tree Clock Network --- p.24Chapter 3.1.1 --- Computing Delay in Non-tree Network --- p.25Chapter 3.1.2 --- Effect of a Cross Link on Clock Skew --- p.27Chapter 3.2 --- Link Insertion for Non-tree Clock Network --- p.28Chapter 3.2.1 --- Motivation of Computing Delay for Link Insertion --- p.29Chapter 3.2.2 --- Overall Flow for Cross Link Insertion --- p.30Chapter 3.2.3 --- Linear Program for Selecting Node Pairs --- p.31Chapter 3.2.4 --- Reducing the Number of Optimizations --- p.35Chapter 3.2.5 --- Experimental Results --- p.37Chapter 4 --- Buffered Clock Network with Cross Links --- p.41Chapter 4.1 --- Link Insertion in Buffered Clock Network --- p.41Chapter 4.1.1 --- Delay Calculation in Buffered Clock Network --- p.42Chapter 4.1.2 --- Linear Program Formulation for Buffered Clock Network --- p.43Chapter 4.2 --- Experimental Results and Comparison --- p.44Chapter 4.3 --- Possible Extensions --- p.46Chapter 4.3.1 --- Link Insertion at Internal Nodes --- p.46Chapter 4.3.2 --- Modeling Clock Buffer Delay Variation --- p.47Chapter 5 --- Conclusion --- p.49Bibliography --- p.5
    corecore