Buffer insertion is a popular technique for reducing interconnect delay. The classic buffer insertion algorithm of van Ginneken has time complexity O(n^2), where n is the number of buffer positions.
Introduction
Delay optimization techniques for interconnect are increasingly important for achieving timing closure of high-performance designs. One popular technique for reducing interconnect delay is buffer insertion. A recent study by Saxena et al. projects that 35% of all cells will be intra-block repeaters at the 45 nm node. Consequently, algorithms that can insert buffers efficiently are essential for design automation tools.
In 1990, van Ginneken proposed an optimal buffer insertion algorithm for one buffer type. His algorithm has time complexity O(n^2), where n is the number of candidate buffer positions. Lillis, Cheng and Lin [7] extended van Ginneken's algorithm to allow b buffer types in time O(b^2 n^2).
Weiping Shi Dept. of Electrical Engineering
Texas A&M University, College Station, Texas 77843, USA.
wshi@ee.tamu.edu
Recently, Shi and Li presented a new algorithm with time complexity O(n log n) for 2-pin nets, and O(n log^2 n) for multi-pin nets, for one buffer type. Several works have built upon van Ginneken's algorithm and its extension for multiple buffer types to include wire sizing, simultaneous tree construction [8, 6, 5, 9], noise constraints [2] and resource minimization [7].
Modern design libraries may contain hundreds of different buffers with different input capacitances, driving resistances, intrinsic delays, power levels, etc. If every buffer available for the given technology is supplied, it is stated in [3] that the current algorithms could take days or even weeks for large designs, since all these algorithms are quadratic in terms of b. Alpert et al. [3] studied how to reduce the size of the buffer library with a clustering algorithm. Although the buffer library size is reduced, the solution quality is often degraded accordingly.
In this paper, we propose a new algorithm that performs optimal buffer insertion with b buffer types in O(bn^2) time. Our speedup comes from the observation that the candidates that generate new buffered candidates must lie on the convex hull of the (Q, C) points. Experimental results show that our algorithm is significantly faster than the previous best algorithms.
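The hull observation can be motivated by a short calculation. The sketch below assumes the usual linear buffer delay model, which this excerpt does not state explicitly; the symbols K_i, R_i and C_i (intrinsic delay, driving resistance and input capacitance of buffer type i) are illustrative names rather than the paper's notation.

```latex
% Inserting a buffer of type i in front of a candidate (Q, C)
% (assumed linear buffer delay model):
Q' = Q - K_i - R_i\,C, \qquad C' = C_i .

% For a fixed type i, K_i and C_i are constants, so the best buffered
% candidate comes from the candidate that maximizes a linear function:
\max_{(Q,\,C)} \; \bigl( Q - R_i\,C \bigr), \qquad R_i > 0 .

% A linear function over a finite point set attains its maximum at a
% vertex of the convex hull of that set, so only hull candidates can
% produce a best buffered candidate for some buffer type.
```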
Section 2 formulates the problem. Section 3 describes the new algorithm. Simulation results are given in Section 4 and conclusions are given in Section 5.
Preliminary
A net is given as a routing tree T = (V, E), where 
New Algorithm
The previous best algorithm for multiple buffer types by Lillis, Cheng and Lin consists of three major operations: 1) adding buffers at a buffer position in O(b^2 n) time, 2) adding a wire in O(bn) time, and 3) merging two branches in O(bn_1 + bn_2) time, where n_1 and n_2 are the numbers of buffer positions in the two branches. As a result, their algorithm has time complexity O(b^2 n^2).
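The three operations can be sketched in Python as follows. This is a minimal illustration, not the implementation of Lillis et al.: it assumes Elmore delay for wires and a linear buffer delay model, and the names `Buffer`, `prune`, `add_wire`, `add_buffers_naive` and `merge_branches` are hypothetical. For brevity `prune` sorts its input, so the wire step here does not meet the linear bound quoted above.

```python
from typing import List, NamedTuple, Tuple

Candidate = Tuple[float, float]  # (Q, C): required arrival time, downstream capacitance


class Buffer(NamedTuple):
    r: float  # driving resistance (assumed model parameter)
    c: float  # input capacitance
    k: float  # intrinsic delay


def prune(cands: List[Candidate]) -> List[Candidate]:
    """Keep nonredundant candidates, sorted by increasing C (and increasing Q)."""
    kept: List[Candidate] = []
    for q, c in sorted(cands, key=lambda qc: (qc[1], -qc[0])):
        if not kept or q > kept[-1][0]:
            kept.append((q, c))
    return kept


def add_wire(cands: List[Candidate], r: float, c: float) -> List[Candidate]:
    """Propagate candidates through a wire with resistance r and capacitance c (Elmore delay)."""
    return prune([(q - r * (c / 2 + cap), cap + c) for q, cap in cands])


def add_buffers_naive(cands: List[Candidate], library: List[Buffer]) -> List[Candidate]:
    """For every buffer type, scan all candidates: b scans over a list of up to O(bn) entries."""
    new = [(max(q - buf.k - buf.r * cap for q, cap in cands), buf.c) for buf in library]
    return prune(list(cands) + new)


def merge_branches(left: List[Candidate], right: List[Candidate]) -> List[Candidate]:
    """Merge two nonredundant lists (sorted by increasing Q and C) in linear time."""
    i = j = 0
    out: List[Candidate] = []
    while i < len(left) and j < len(right):
        ql, cl = left[i]
        qr, cr = right[j]
        out.append((min(ql, qr), cl + cr))
        # Advance the branch that limits Q; advance both on a tie.
        if ql <= qr:
            i += 1
        if qr <= ql:
            j += 1
    return out
```

The nested scan in `add_buffers_naive` is the step that the next section aims to speed up.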
In this section, we show that the time complexity of the first operation, adding buffers at a buffer position, can be reduced to O(bn), and thus our algorithm achieves total time complexity O(bn^2). Assume we have computed the set of nonredundant candidates for the subtree below a buffer position v, and now reach v (see Fig. 1). The wire connecting v to that subtree has zero resistance and capacitance.
... the index in ... time.
Proof: Since the new buffered candidates are in nondecreasing order of capacitance and the given set of nonredundant candidates is in nondecreasing order of C, it takes O(bn + b) = O(bn) time to merge the two sorted lists.
Since the operation of adding a buffer is reduced to O(bn) time by Theorems 1 and 2, it is easy to see that buffer insertion with b buffer types can be done in worst-case O(bn^2) time with our new algorithm.
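The reduced buffer-adding step can be sketched as follows. This is an illustrative reading of the convex hull idea, not the authors' C implementation. It assumes a linear buffer delay model, a nonempty candidate list sorted by strictly increasing C and Q, and hypothetical names (`Buffer`, `upper_hull`, `buffered_candidates`, `add_buffers`). The buffers are sorted inside the function for clarity; a real implementation would presumably sort the library once in advance.

```python
import heapq
from typing import List, NamedTuple, Tuple

Candidate = Tuple[float, float]  # (Q, C)


class Buffer(NamedTuple):
    r: float  # driving resistance (assumed model parameter)
    c: float  # input capacitance
    k: float  # intrinsic delay


def upper_hull(cands: List[Candidate]) -> List[Candidate]:
    """Upper convex hull in the (C, Q) plane; cands sorted by strictly increasing C."""
    hull: List[Candidate] = []
    for q, c in cands:
        while len(hull) >= 2:
            (q1, c1), (q2, c2) = hull[-2], hull[-1]
            # Drop hull[-1] if it lies on or below the segment hull[-2] -> (q, c).
            if (q2 - q1) * (c - c1) <= (q - q1) * (c2 - c1):
                hull.pop()
            else:
                break
        hull.append((q, c))
    return hull


def buffered_candidates(cands: List[Candidate], library: List[Buffer]) -> List[Candidate]:
    """One new candidate per buffer type, found with a single left-to-right hull walk."""
    hull = upper_hull(cands)
    idx = 0
    new: List[Candidate] = []
    # As R decreases, the hull vertex maximizing Q - R*C can only move right.
    for buf in sorted(library, key=lambda b: -b.r):
        while (idx + 1 < len(hull)
               and hull[idx + 1][0] - buf.r * hull[idx + 1][1]
               > hull[idx][0] - buf.r * hull[idx][1]):
            idx += 1
        q, c = hull[idx]
        new.append((q - buf.k - buf.r * c, buf.c))
    return new


def add_buffers(cands: List[Candidate], library: List[Buffer]) -> List[Candidate]:
    """Merge the new candidates into the sorted list and drop redundant ones."""
    new = sorted(buffered_candidates(cands, library), key=lambda qc: qc[1])
    kept: List[Candidate] = []
    for q, c in heapq.merge(cands, new, key=lambda qc: qc[1]):
        if kept and q <= kept[-1][0]:
            continue  # dominated: no larger Q, no smaller C
        if kept and c == kept[-1][1]:
            kept.pop()  # same C but strictly larger Q replaces the old entry
        kept.append((q, c))
    return kept
```

Apart from the sort of the b new candidates, every step is a single pass over the candidate list or the library, in the spirit of the O(bn + b) merge bound in the proof above.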
Simulation
Both the algorithm of Lillis et al. [7] and the new algorithm are implemented in C and run on a Sun SPARC workstation with a 400 MHz processor and 2 GB of memory. The device and interconnect parameters are based on TSMC 180 nm technology. We have 4 different buffer libraries, of size 8, 16, 32 and 64 respectively. The driving resistance is chosen from 180 Ω to 7000 Ω, the input capacitance from 0.7 fF to 23 fF, and the intrinsic delay from 29 ps to 36.4 ps. The sink capacitances range from 2 fF to 41 fF. The wire resistance is 0.076 Ω/μm and the wire capacitance is 0.118 fF/μm. The table shows that, for large industrial circuits, the new algorithm is up to 11 times faster than Lillis' algorithm.
The memory usage is not shown in the table, but the doubly linked list used by the new algorithm adds only about 2% memory overhead. When b is small, the new algorithm has a small time overhead compared to Lillis' algorithm. The figure shows the running time of the two algorithms for the net with 1944 sinks as a function of the number of buffer positions n. The buffer library size is 32. In the figure, the y-axis is normalized to the running time of the case with 1943 buffer positions. While both Lillis' algorithm and ours behave quadratically in n, our algorithm grows much more slowly, since the operation of adding buffers becomes more dominant among the three major operations as n increases.
Conclusion
We presented a new algorithm for optimal buffer insertion with b buffer types in worst-case O(bn^2) time. This is an improvement over the previous best algorithm, which takes O(b^2 n^2) time. Simulation results show that our new algorithm is significantly faster than the previous best algorithm for large industrial circuits with large buffer libraries. Our algorithm can also be applied to reduce buffer cost. We leave the details to the journal version.
References
C. Alpert and A. Devgan. Wire segmenting for improved buffer insertion. In DAC, pages 588-593, 1997. 
