Steiner network construction for signal net routing with double-sided timing constraints
Compared to conventional Steiner tree signal net routing, non-tree topology is often superior in many aspects, including timing performance and tolerance to open faults and variations. In nano-scale VLSI designs, interconnect delay is a performance bottleneck and variation effects are increasingly problematic. The advantages of non-tree topology are therefore particularly appealing for timing-critical net routing in nano-scale VLSI designs. We propose Steiner network construction heuristics that can generate either a tree or a non-tree topology for a signal net with different slack-wirelength tradeoffs, and that handle both long-path and short-path constraints. Extensive experiments in different scenarios show that our heuristics usually improve timing slack by hundreds of picoseconds compared to traditional tree approaches while increasing wirelength only slightly. These results show that our algorithm is a very promising approach for timing-critical net routing.
Improved algorithms for link-based non-tree clock networks for skew variability reduction
In nanometer VLSI technology, variation effects such as manufacturing variation, power supply noise and temperature become very significant. As one of the most vital nets in any synchronous VLSI chip, the Clock Distribution Network (CDN) is especially sensitive to these variations. The recently proposed link-based non-tree [1] addresses this problem by constructing a non-tree that is significantly more tolerant to variations than a clock tree. Although the two algorithms proposed in [1] are effective in reducing skew variability, they have a few drawbacks, including high complexity, lengthy links and uneven link distribution across the clock network. In this paper, we propose two new algorithms that overcome these disadvantages. The effectiveness of the proposed algorithms has been validated using HSPICE-based Monte Carlo simulations. Experimental results show that the new algorithms achieve the same or better skew reduction with an average 5% wire length increase, compared to the 15% wire length increase of the existing algorithms in [1]. Moreover, the new algorithms scale extremely well to big clock networks: the bigger the clock network, the lower the overall link cost (less than 2% for the biggest benchmark we have).
Buffer insertion in large circuits using look-ahead and back-off techniques
Buffer insertion is an essential technique for reducing interconnect delay in submicron
circuits. Though it is a well researched area, there is a need for robust and
effective algorithms to perform buffer insertion at the circuit level. This thesis proposes
a new buffer insertion algorithm for large circuits. The algorithm finds a buffering
solution for the entire circuit such that buffer cost is minimized and the timing
requirements of the circuit are satisfied. The algorithm iteratively inserts buffers in
the circuit improving the circuit delay step by step. At the core of this algorithm are
very simple but extremely effective techniques that constructively guide the search
for a good buffering solution. The flexibility to adapt to the user's requirements and the
ability to reduce the number of buffers are the strengths of this algorithm. Experimental
results on ISCAS85 benchmark circuits show that the proposed algorithm, on
average, yields 36% reduction in the number of buffers, and runs three times faster
than one of the best known previously researched algorithms.
Analysis and optimization of VLSI Clock Distribution Networks for skew variability reduction
As VLSI technology moves into the Ultra-Deep Sub-Micron (UDSM) era, manufacturing variations, power supply noise and temperature variations greatly affect the performance and yield of VLSI circuits. The Clock Distribution Network (CDN), which is one of the biggest and most important nets in any synchronous VLSI chip, is especially sensitive to these variations. To address this problem, variability-aware analysis and optimization techniques for VLSI circuits are needed. In the first part of this thesis, an analytical bound for the unwanted skew due to interconnect variation is established. Experimental results show that this bound is safer, tighter and computationally faster than existing approaches. This bound could be used in variation-aware clock tree synthesis. The second part of the thesis deals with optimizing a given clock tree to minimize the unwanted skew variations. Non-tree CDNs have been recognized as a promising approach to overcome the variation problem. We propose a novel non-tree CDN obtained by adding cross links to an existing clock tree. We analyze the effect of link insertion on clock skew variability and propose link insertion schemes. The non-tree CDNs so obtained are shown to be highly tolerant to skew variability with very little increase in total wire-length. This can be used in applications such as ASIC design, where a significant increase in the total wire-length is unacceptable.
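The effect of a cross link on skew can be illustrated with a toy calculation. The sketch below is not the thesis's method: it assumes the Elmore (first-moment) delay model, under which the node delays d of an RC network driven by an ideal source satisfy G d = c, where G is the nodal conductance matrix and c the vector of node capacitances. The three-node network, all resistor and capacitor values, and the function names `solve` and `elmore_delays` are hypothetical and chosen only for illustration.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def elmore_delays(n_nodes, resistors, caps):
    """Elmore delay at every node of an RC network (tree or non-tree).

    resistors: list of (u, v, R); node -1 denotes the ideal clock source.
    caps: capacitance lumped at each node 0..n_nodes-1.
    """
    g = [[0.0] * n_nodes for _ in range(n_nodes)]
    for u, v, r in resistors:
        cond = 1.0 / r
        for a, b in ((u, v), (v, u)):
            if a >= 0:
                g[a][a] += cond
                if b >= 0:
                    g[a][b] -= cond
    return solve(g, caps)


# Hypothetical network: source -> node 0 -> sinks 1 and 2, unequal sink loads.
tree = [(-1, 0, 1.0), (0, 1, 1.0), (0, 2, 1.0)]
caps = [0.0, 1.0, 2.0]

d_tree = elmore_delays(3, tree, caps)
d_link = elmore_delays(3, tree + [(1, 2, 1.0)], caps)  # add a cross link

skew_tree = abs(d_tree[1] - d_tree[2])
skew_link = abs(d_link[1] - d_link[2])
```

With these made-up values the load imbalance produces a skew of 1 delay unit in the tree and about 1/3 once the link between the two sinks is added, at the cost of one extra wire.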
Clock routing for high performance microprocessor designs.
Tian, Haitong.
Chinese abstract is on unnumbered page.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2011.
Includes bibliographical references (p. 65-74).
Abstracts in English and Chinese.
Abstract
Acknowledgement
Chapter 1 --- Introduction
Chapter 1.1 --- Motivations
Chapter 1.2 --- Our Contributions
Chapter 1.3 --- Organization of the Thesis
Chapter 2 --- Background Study
Chapter 2.1 --- Traditional Clock Routing Problem
Chapter 2.2 --- Tree-Based Clock Routing Algorithms
Chapter 2.2.1 --- Clock Routing Using H-tree
Chapter 2.2.2 --- Method of Means and Medians (MMM)
Chapter 2.2.3 --- Geometric Matching Algorithm (GMA)
Chapter 2.2.4 --- Exact Zero-Skew Algorithm
Chapter 2.2.5 --- Deferred Merge Embedding (DME)
Chapter 2.2.6 --- Boundary Merging and Embedding (BME) Algorithm
Chapter 2.2.7 --- Planar Clock Routing Algorithm
Chapter 2.2.8 --- Useful-skew Tree Algorithm
Chapter 2.3 --- Non-Tree Clock Distribution Networks
Chapter 2.3.1 --- Grid (Mesh) Structure
Chapter 2.3.2 --- Spine Structure
Chapter 2.3.3 --- Hybrid Structure
Chapter 2.4 --- Post-grid Clock Routing Problem
Chapter 2.5 --- Limitations of the Previous Work
Chapter 3 --- Post-Grid Clock Routing Problem
Chapter 3.1 --- Introduction
Chapter 3.2 --- Problem Definition
Chapter 3.3 --- Our Approach
Chapter 3.3.1 --- Delay-driven Path Expansion Algorithm
Chapter 3.3.2 --- Pre-processing to Connect Critical ports
Chapter 3.3.3 --- Post-processing to Reduce Capacitance
Chapter 3.4 --- Experimental Results
Chapter 3.4.1 --- Experiment Setup
Chapter 3.4.2 --- Validations of the Delay and Slew Estimation
Chapter 3.4.3 --- Comparisons with the Tree Grow (TG) Approach
Chapter 3.4.4 --- Lowest Achievable Delays
Chapter 3.4.5 --- Simulation Results
Chapter 4 --- Non-tree Based Post-Grid Clock Routing Problem
Chapter 4.1 --- Introduction
Chapter 4.2 --- Handling Ports with Large Load Capacitances
Chapter 4.2.1 --- Problem Ports Identification
Chapter 4.2.2 --- Non-Tree Construction
Chapter 4.2.3 --- Wire Link Selection
Chapter 4.3 --- Path Expansion in Non-tree Algorithm
Chapter 4.4 --- Limitations of the Non-tree Algorithm
Chapter 4.5 --- Experimental Results
Chapter 4.5.1 --- Experiment Setup
Chapter 4.5.2 --- Validations of the Delay and Slew Estimation
Chapter 4.5.3 --- Lowest Achievable Delays
Chapter 4.5.4 --- Results on New Benchmarks
Chapter 4.5.5 --- Simulation Results
Chapter 5 --- Efficient Partitioning-based Extension
Chapter 5.1 --- Introduction
Chapter 5.2 --- Partition-based Extension
Chapter 5.3 --- Experimental Results
Chapter 5.3.1 --- Experiment Setup
Chapter 5.3.2 --- Running Time Improvement with Partitioning Technique
Chapter 6 --- Conclusion
Bibliography
Layout optimization in ultra deep submicron VLSI design
As fabrication technology keeps advancing, many deep submicron (DSM) effects have become
increasingly evident and can no longer be ignored in Very Large Scale Integration
(VLSI) design. In this dissertation, we study several deep submicron problems (e.g., coupling
capacitance, antenna effect and delay variation) and propose optimization techniques
to mitigate these DSM effects in the place-and-route stage of VLSI physical design.
The place-and-route stage of physical design can be further divided into several steps:
(1) Placement, (2) Global routing, (3) Layer assignment, (4) Track assignment, and (5) Detailed
routing. Among them, layer/track assignment assigns major trunks of wire segments
to specific layers/tracks in order to guide the underlying detailed router. In this dissertation,
we have proposed techniques to handle coupling capacitance at the layer/track assignment
stage, antenna effect at the layer assignment stage, and delay variation at the ECO (Engineering
Change Order) placement stage, respectively. More specifically, at layer assignment, we
have proposed an improved probabilistic model to quickly estimate the amount of coupling
capacitance for timing optimization. Antenna effects are also handled at layer assignment
through a linear-time tree partitioning algorithm. At the track assignment stage, timing is
further optimized using a graph based technique. In addition, we have proposed a novel
gate splitting methodology to reduce delay variation in the ECO placement considering
spatial correlations. Experimental results on benchmark circuits showed the effectiveness
of our approaches.
Fast interconnect optimization
As Very Large Scale Integration (VLSI) circuit technology continues to scale
and operating frequencies increase, delay optimization techniques for interconnect
are increasingly important for achieving timing closure of high performance designs.
For the gigahertz microprocessor and multi-million gate ASIC designs it is crucial to
have fast algorithms in the design automation tools for many classical problems in
the field to shorten time to market of the VLSI chip. This research presents algorithmic
techniques and constructive models for two such problems: (1) Fast buffer
insertion for delay optimization, (2) Wire sizing for delay optimization and variation
minimization on non-tree networks.
For the buffer insertion problem, this dissertation proposes several innovative
speedup techniques for different problem formulations and realistic requirements.
For the basic buffer insertion problem, an O(n log^2 n) optimal algorithm that runs
much faster than van Ginneken's classical O(n^2) algorithm is proposed,
where n is the number of buffer positions. For modern design libraries that contain
hundreds of buffers, this research also proposes an optimal algorithm in O(bn^2) time
for b buffer types, a significant improvement over the previous O(b^2 n^2) algorithm
by Lillis, Cheng and Lin. For nets with small numbers of sinks and large numbers
of buffer positions, a simple O(mn) optimal algorithm is proposed, where m is the
number of sinks. For the minimum-cost buffer insertion problem, the problem is first proved to be NP-complete. Then several optimal and approximation techniques
are proposed to further speed up the buffer insertion algorithm with resource control
for big industrial designs.
For the wire sizing problem, we propose a systematic method to size the wires of
general non-tree RC networks. The new method can be used for delay optimization
and variation reduction.
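For readers unfamiliar with the baseline named in this abstract, the following is a minimal sketch of the classical van Ginneken-style dynamic program, restricted to a two-pin net with one buffer type. It is not the faster algorithm proposed in the dissertation. It assumes the Elmore delay model for wires and a linear buffer model (intrinsic delay plus output resistance times load); the function names, parameter names and all numeric values are hypothetical.

```python
from typing import List, Tuple

# A candidate is (downstream capacitance C, required arrival time Q).
Candidate = Tuple[float, float]


def prune(cands: List[Candidate]) -> List[Candidate]:
    """Keep only non-dominated candidates (smaller C and larger Q are better)."""
    kept: List[Candidate] = []
    best_q = float("-inf")
    for c, q in sorted(cands, key=lambda cq: (cq[0], -cq[1])):
        if q > best_q:  # strictly better Q than every candidate with smaller C
            kept.append((c, q))
            best_q = q
    return kept


def add_wire(cands: List[Candidate], r: float, c: float) -> List[Candidate]:
    """Propagate candidates upstream through a wire segment (Elmore delay)."""
    return [(cap + c, q - r * (c / 2.0 + cap)) for cap, q in cands]


def add_buffer(cands: List[Candidate], rb: float, cb: float, db: float) -> List[Candidate]:
    """Offer the option of inserting one buffer at the current position."""
    best_q = max(q - db - rb * cap for cap, q in cands)
    return prune(cands + [(cb, best_q)])


def best_slack(segments, sink_cap, sink_rat, driver_r, rb, cb, db) -> float:
    """Best required arrival time at the driver input.

    segments: (R, C) of each wire segment, ordered from sink to source.
    A buffer position is assumed between every pair of adjacent segments.
    """
    cands = [(sink_cap, sink_rat)]
    for i, (r, c) in enumerate(segments):
        cands = add_wire(cands, r, c)
        if i < len(segments) - 1:
            cands = add_buffer(cands, rb, cb, db)
    return max(q - driver_r * cap for cap, q in cands)


# Hypothetical two-segment wire with one candidate buffer position.
slack = best_slack([(1.0, 2.0), (1.0, 2.0)], sink_cap=1.0, sink_rat=10.0,
                   driver_r=1.0, rb=0.5, cb=0.5, db=0.5)
```

The pruning step is what keeps the candidate list small: a candidate with both larger load and smaller required arrival time than another can never lead to a better solution upstream, so it is discarded.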