8 research outputs found

    Steiner network construction for signal net routing with double-sided timing constraints

    Compared with conventional Steiner tree routing of signal nets, a non-tree topology is often superior in several respects, including timing performance and tolerance to open faults and variations. In nano-scale VLSI designs, interconnect delay is a performance bottleneck and variation effects are increasingly problematic, so the advantages of non-tree topologies are particularly appealing for routing timing-critical nets. We propose Steiner network construction heuristics that can generate either a tree or a non-tree topology for a signal net with different slack-wirelength tradeoffs, and that handle both long-path and short-path constraints. Extensive experiments in different scenarios show that our heuristics usually improve timing slack by hundreds of picoseconds compared with traditional tree approaches while increasing wirelength only slightly. These results show that our algorithm is a very promising approach for routing timing-critical nets.

    Improved algorithms for link-based non-tree clock networks for skew variability reduction

    In nanometer VLSI technology, variation effects such as manufacturing variation, power supply noise, and temperature become very significant. As one of the most vital nets in any synchronous VLSI chip, the Clock Distribution Network (CDN) is especially sensitive to these variations. The recently proposed link-based non-tree approach [1] addresses this problem by constructing a non-tree network that is significantly more tolerant to variations than a clock tree. Although the two algorithms proposed in [1] are effective in reducing skew variability, they have a few drawbacks, including high complexity, lengthy links, and uneven link distribution across the clock network. In this paper, we propose two new algorithms that overcome these disadvantages. The effectiveness of the proposed algorithms has been validated using HSPICE-based Monte Carlo simulations. Experimental results show that the new algorithms achieve the same or better skew reduction with an average wire length increase of 5%, compared with the 15% wire length increase of the existing algorithms in [1]. Moreover, the new algorithms scale extremely well to big clock networks: the bigger the clock network, the lower the overall link cost (less than 2% for the biggest benchmark we have).

    Buffer insertion in large circuits using look-ahead and back-off techniques

    Buffer insertion is an essential technique for reducing interconnect delay in submicron circuits. Though it is a well-researched area, there is a need for robust and effective algorithms that perform buffer insertion at the circuit level. This thesis proposes a new buffer insertion algorithm for large circuits. The algorithm finds a buffering solution for the entire circuit such that buffer cost is minimized and the timing requirements of the circuit are satisfied. It inserts buffers iteratively, improving the circuit delay step by step. At the core of the algorithm are very simple but extremely effective techniques that constructively guide the search for a good buffering solution. Its strengths are the flexibility to adapt to the user's requirements and the ability to reduce the number of buffers. Experimental results on ISCAS85 benchmark circuits show that the proposed algorithm, on average, yields a 36% reduction in the number of buffers and runs three times faster than one of the best-known previously published algorithms.

    Analysis and optimization of VLSI Clock Distribution Networks for skew variability reduction

    As VLSI technology moves into the Ultra-Deep Sub-Micron (UDSM) era, manufacturing variations, power supply noise and temperature variations greatly affect the performance and yield of VLSI circuits. The Clock Distribution Network (CDN), which is one of the biggest and most important nets in any synchronous VLSI chip, is especially sensitive to these variations. To address this problem, variability-aware analysis and optimization techniques for VLSI circuits are needed. In the first part of this thesis, an analytical bound for the unwanted skew due to interconnect variation is established. Experimental results show that this bound is safer, tighter and computationally faster than existing approaches. This bound could be used in variation-aware clock tree synthesis. The second part of the thesis deals with optimizing a given clock tree to minimize the unwanted skew variations. Non-tree CDNs have been recognized as a promising approach to overcome the variation problem. We propose a novel non-tree CDN obtained by adding cross links to an existing clock tree. We analyze the effect of link insertion on clock skew variability and propose link insertion schemes. The non-tree CDNs so obtained are shown to be highly tolerant to skew variability with very little increase in total wire length. This can be used in applications such as ASIC design, where a significant increase in total wire length is unacceptable.

    Clock routing for high performance microprocessor designs.

    Tian, Haitong.
    Thesis (M.Phil.)--Chinese University of Hong Kong, 2011.
    Includes bibliographical references (p. 65-74).
    Abstracts in English and Chinese. Chinese abstract is on unnumbered page.

    Contents:
    Abstract
    Acknowledgement
    Chapter 1. Introduction
        1.1 Motivations
        1.2 Our Contributions
        1.3 Organization of the Thesis
    Chapter 2. Background Study
        2.1 Traditional Clock Routing Problem
        2.2 Tree-Based Clock Routing Algorithms
            2.2.1 Clock Routing Using H-tree
            2.2.2 Method of Means and Medians (MMM)
            2.2.3 Geometric Matching Algorithm (GMA)
            2.2.4 Exact Zero-Skew Algorithm
            2.2.5 Deferred Merge Embedding (DME)
            2.2.6 Boundary Merging and Embedding (BME) Algorithm
            2.2.7 Planar Clock Routing Algorithm
            2.2.8 Useful-skew Tree Algorithm
        2.3 Non-Tree Clock Distribution Networks
            2.3.1 Grid (Mesh) Structure
            2.3.2 Spine Structure
            2.3.3 Hybrid Structure
        2.4 Post-grid Clock Routing Problem
        2.5 Limitations of the Previous Work
    Chapter 3. Post-Grid Clock Routing Problem
        3.1 Introduction
        3.2 Problem Definition
        3.3 Our Approach
            3.3.1 Delay-driven Path Expansion Algorithm
            3.3.2 Pre-processing to Connect Critical Ports
            3.3.3 Post-processing to Reduce Capacitance
        3.4 Experimental Results
            3.4.1 Experiment Setup
            3.4.2 Validations of the Delay and Slew Estimation
            3.4.3 Comparisons with the Tree Grow (TG) Approach
            3.4.4 Lowest Achievable Delays
            3.4.5 Simulation Results
    Chapter 4. Non-tree Based Post-Grid Clock Routing Problem
        4.1 Introduction
        4.2 Handling Ports with Large Load Capacitances
            4.2.1 Problem Ports Identification
            4.2.2 Non-Tree Construction
            4.2.3 Wire Link Selection
        4.3 Path Expansion in Non-tree Algorithm
        4.4 Limitations of the Non-tree Algorithm
        4.5 Experimental Results
            4.5.1 Experiment Setup
            4.5.2 Validations of the Delay and Slew Estimation
            4.5.3 Lowest Achievable Delays
            4.5.4 Results on New Benchmarks
            4.5.5 Simulation Results
    Chapter 5. Efficient Partitioning-based Extension
        5.1 Introduction
        5.2 Partition-based Extension
        5.3 Experimental Results
            5.3.1 Experiment Setup
            5.3.2 Running Time Improvement with Partitioning Technique
    Chapter 6. Conclusion
    Bibliography

    Layout optimization in ultra deep submicron VLSI design

    As fabrication technology keeps advancing, many deep submicron (DSM) effects have become increasingly evident and can no longer be ignored in Very Large Scale Integration (VLSI) design. In this dissertation, we study several deep submicron problems (e.g., coupling capacitance, antenna effect and delay variation) and propose optimization techniques to mitigate these DSM effects in the place-and-route stage of VLSI physical design. The place-and-route stage of physical design can be further divided into several steps: (1) placement, (2) global routing, (3) layer assignment, (4) track assignment, and (5) detailed routing. Among them, layer/track assignment assigns major trunks of wire segments to specific layers/tracks in order to guide the underlying detailed router. We propose techniques to handle coupling capacitance at the layer/track assignment stage, antenna effect at the layer assignment stage, and delay variation at the ECO (Engineering Change Order) placement stage. More specifically, at layer assignment, we propose an improved probabilistic model to quickly estimate the amount of coupling capacitance for timing optimization. Antenna effects are also handled at layer assignment through a linear-time tree partitioning algorithm. At the track assignment stage, timing is further optimized using a graph-based technique. In addition, we propose a novel gate splitting methodology to reduce delay variation in ECO placement, taking spatial correlations into account. Experimental results on benchmark circuits show the effectiveness of our approaches.

    Fast interconnect optimization

    As Very Large Scale Integration (VLSI) circuit technology continues to scale and operating frequencies increase, delay optimization techniques for interconnect are increasingly important for achieving timing closure of high-performance designs. For gigahertz microprocessor and multi-million-gate ASIC designs, it is crucial that design automation tools have fast algorithms for many classical problems in the field, in order to shorten the time to market of the VLSI chip. This research presents algorithmic techniques and constructive models for two such problems: (1) fast buffer insertion for delay optimization, and (2) wire sizing for delay optimization and variation minimization on non-tree networks. For the buffer insertion problem, this dissertation proposes several innovative speedup techniques for different problem formulations and realistic requirements. For the basic buffer insertion problem, an optimal O(n log^2 n) algorithm is proposed that runs much faster than van Ginneken's classical O(n^2) algorithm, where n is the number of buffer positions. For modern design libraries that contain hundreds of buffers, this research also proposes an optimal algorithm that runs in O(bn^2) time for b buffer types, a significant improvement over the previous O(b^2 n^2) algorithm by Lillis, Cheng and Lin. For nets with small numbers of sinks and large numbers of buffer positions, a simple optimal O(mn) algorithm is proposed, where m is the number of sinks. The buffer insertion with minimum cost problem is first proved to be NP-complete. Several optimal and approximation techniques are then proposed to further speed up buffer insertion with resource control for big industrial designs. For the wire sizing problem, we propose a systematic method to size the wires of general non-tree RC networks. The new method can be used for delay optimization and variation reduction.
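
To make the buffer insertion formulation above concrete, here is a minimal sketch of the classical candidate-pruning dynamic program in the style of van Ginneken, restricted to a single two-pin net. It is not taken from the dissertation and does not attempt any of the faster bounds quoted above. The Elmore delay model, the single buffer type, and every name and parameter (`max_source_slack`, `prune`, `buf_cap`, `buf_res`, `buf_delay`) are assumptions made for illustration.

```python
from typing import List, Tuple

Segment = Tuple[float, float]    # (wire resistance, wire capacitance), hypothetical units
Candidate = Tuple[float, float]  # (downstream capacitance, required arrival time)


def prune(cands: List[Candidate]) -> List[Candidate]:
    """Drop dominated candidates: more capacitance without a better required time."""
    kept: List[Candidate] = []
    best_q = float("-inf")
    # Sort by capacitance ascending; on ties, larger required time first.
    for cap, q in sorted(cands, key=lambda cq: (cq[0], -cq[1])):
        if q > best_q:
            kept.append((cap, q))
            best_q = q
    return kept


def max_source_slack(
    segments: List[Segment],  # wire segments listed from the sink toward the source
    sink_cap: float,
    sink_rat: float,          # required arrival time at the sink
    driver_res: float,
    buf_cap: float,           # buffer input capacitance (assumed single buffer type)
    buf_res: float,           # buffer output resistance
    buf_delay: float,         # buffer intrinsic delay
) -> float:
    """Best required time at the source; buffers are allowed between segments."""
    cands: List[Candidate] = [(sink_cap, sink_rat)]
    last = len(segments) - 1
    for i, (r, c) in enumerate(segments):
        # Propagate every candidate across the wire (Elmore delay, assumed model).
        cands = [(cap + c, q - r * (c / 2.0 + cap)) for cap, q in cands]
        if i < last:
            # Optionally insert a buffer here: only the best buffered option matters,
            # because all buffered candidates present the same input capacitance.
            q_buf = max(q - buf_delay - buf_res * cap for cap, q in cands)
            cands.append((buf_cap, q_buf))
        cands = prune(cands)
    # Account for the driver at the source and pick the best candidate.
    return max(q - driver_res * cap for cap, q in cands)


if __name__ == "__main__":
    wires = [(1.0, 1.0), (2.0, 0.5), (1.5, 2.0), (0.5, 1.0)]
    print(max_source_slack(wires, 1.0, 100.0, 2.0, 0.2, 0.5, 1.0))
```

The key design choice is the pruning step: a candidate with larger downstream capacitance and no better required time can never win later, so only the non-dominated frontier is carried toward the source.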