The repeater tree construction problem by Bartoschek, Christoph et al.
 
 
 
 
 
 
 
The Repeater Tree Construction 
Problem 
 
Preprint No. M 09/23 
 
Bartoschek, Christoph; Held, Stephan; 
Maßberg, Jens; Rautenbach, Dieter; Vygen, 
Jens  
 
 
 
 
 
 
 
 
 
 2009 
Impressum: 
Hrsg.: Leiter des Instituts für Mathematik 
Weimarer Straße 25 
98693 Ilmenau 
Tel.: +49 3677 69 3621 
Fax: +49 3677 69 3270 
http://www.tu-ilmenau.de/ifm/ 
Technische Universität Ilmenau 
Institut für Mathematik 
ISSN xxxx-xxxx 
The Repeater Tree Construction Problem
C. Bartoschek¹, S. Held¹, J. Maßberg¹, D. Rautenbach², and J. Vygen¹
¹ Forschungsinstitut für Diskrete Mathematik,
Universität Bonn, Lennéstr. 2, D-53113 Bonn, Germany,
emails: {bartosch,held,massberg,vygen}@or.uni-bonn.de
² Institut für Mathematik,
TU Ilmenau, Postfach 100565, D-98684 Ilmenau, Germany
email: {dieter.rautenbach}@tu-ilmenau.de
Abstract
A tree-like substructure on a computer chip whose task it is to carry a signal from a
source circuit to possibly many sink circuits and which consists only of wires and so-called
repeater circuits is called a repeater tree. We present a mathematical formulation of the
optimization problems related to the construction of such repeater trees. Furthermore,
we prove theoretical properties of a simple iterative procedure for these problems which
was successfully applied in practice.
Keywords: VLSI design; repeater tree; Steiner tree; minimum spanning tree
AMS subject classification: 05C05, 05C85, 68W25, 68W35
1 Introduction
During every computation cycle of a modern highly complex computer chip millions of signals
have to travel between circuits at different locations on the chip area. While for most of
these signals the distances are relatively small and can be bridged by a pure metal connection
between the circuits, there are still many signals which have to travel a relatively long distance.
Elementary physical considerations [5] imply that the delay of an electrical signal propagating
along a metal connection approximately grows quadratically with the traversed distance.
Traditionally, the circuit delay dominated the wire delay and this quadratic growth did not
represent a problem. Nowadays though, due to the continuous shrinking of feature sizes [4, 10],
an ever growing part of the total delay is caused by wires, and long metal connections have to
be split into several parts by inserting so-called repeaters. These repeaters just evaluate the
boolean identity function and serve no logical purpose within the computation of the chip.
Their task is only to linearize the delay as a function of the distance. It is estimated [11]
that for the upcoming 45nm and 32nm technologies up to 35% and 70%, respectively, of all
circuits on a chip might have to be repeaters.
A tree-like substructure on a chip whose task it is to carry a signal from a source circuit to
possibly many sink circuits and which consists only of wires and repeaters is called a repeater
tree. In [2, 3] we proposed algorithms for the construction of repeater tree topologies and
for the actual insertion of repeater circuits into these topologies. During this research we
conceived a simple yet relatively accurate delay model which allows a concise mathematical
formulation of the repeater tree problem. The purpose of the present paper is to present this
formulation, to explain the main optimization goals, and to prove some theoretical properties
of the algorithms in [2, 3].
Figure 1: Quality of the delay model (horizontal axis: estimated delay (ns); vertical axis: exact delay after buffering and sizing (ns); both axes range from 0 to 2)
2 The Repeater Tree Problem
An instance of the repeater tree (topology) problem consists of
• a source $r \in \mathbb{R}^2$,
• a finite non-empty set $S \subseteq \mathbb{R}^2$ of sinks,
• a required arrival time $a_s \in \mathbb{R}$ for every sink $s \in S$, and
• two numbers $c, d \in \mathbb{R}_{>0}$.
A feasible solution of such an instance is
• a rooted tree $T = (V(T), E(T))$ with vertex set $\{r\} \cup S \cup I$ where $I \subseteq \mathbb{R}^2$ is a set of
$|S| - 1$ points such that r is the root of T and has exactly one child, the elements of I
are the internal vertices of T and have exactly two children each, and the elements of
S are the leaves of T.
In [2, 3] such a feasible solution was called a repeater tree topology, because the number, types,
and positions of the actual repeaters are not yet determined.
The optimization goals for a repeater tree are related to the wiring, to the number of
repeater circuits, and to the timing. We assume that every edge e = (u, v) ∈ E(T ) of T is
realized along a path between the two points u and v in the plane which is shortest with
respect to some norm $\|\cdot\|$ on $\mathbb{R}^2$. Furthermore, we assume that repeaters are inserted in a
relatively uniform way into all wires in order to linearize the delay within the repeater tree.
Hence the wiring and also the number of repeater circuits needed for the physical realization
of the edge e are proportional to $\|u - v\|$. For the entire repeater tree topology, this results in
a total cost of
\[ l(T) := \sum_{(u,v) \in E(T)} \|u - v\|. \]
The delay of the signal starting at the root and travelling through T to the sinks has two
components. Let E[r, s] denote the set of edges on the path P in T between the root r and
some sink s ∈ S. The linearized delay along the edges of P is modelled by
\[ \sum_{(u,v) \in E[r,s]} d\,\|u - v\|. \]
Furthermore, every internal vertex on P corresponds to a bifurcation which causes an additive
delay of c along P . For the entire path P , these additional delays sum up to
c(|E[r, s]| − 1).
In practice there is sometimes a certain degree of freedom how to distribute the additional
delay caused by a bifurcation to the two branches [9].
Altogether, we estimate the delay of the signal along P by the sum of these two components.
Assuming that the signal starts at time 0 at the root, the slack at some sink s ∈ S in T
is estimated by
\[ \sigma(T, s) := a_s - \sum_{(u,v) \in E[r,s]} d\,\|u - v\| - c\,(|E[r,s]| - 1) \]
and the worst slack equals
\[ \sigma(T) := \min\{\sigma(T, s) \mid s \in S\}. \]
The restrictions on the number of children of the root and the internal vertices of T imply that
the number of sinks contributes logarithmically to the delay, which corresponds to physical
experience. The accuracy of our simple delay estimation is shown in Figure 1, which compares
our estimation with the real physical delay once the repeater tree has been realized and
optimized. The parameters c and d are technology-dependent. For the 65nm technology their
values are about c = 20ps and d = 220ps/mm.
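The cost and slack definitions above can be evaluated directly on a small topology. The following is a minimal sketch, not the implementation from [2, 3]: the child-to-parent dictionary, the helper names, the sample coordinates and arrival times, and the use of the l1-norm are all assumptions made for illustration; only c = 20 ps and d = 220 ps/mm are taken from the text.

```python
from typing import Callable, Dict, Hashable, List, Tuple

Point = Tuple[float, float]
Norm = Callable[[Point, Point], float]


def l1(p: Point, q: Point) -> float:
    """l1 distance, used here as one possible choice of the norm."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def path_edges(parent: Dict[Hashable, Hashable], s: Hashable) -> List[tuple]:
    """Edges E[r, s] on the path between the root and s (listed from s upwards)."""
    edges = []
    v = s
    while v in parent:          # the root is the only vertex without a parent
        edges.append((parent[v], v))
        v = parent[v]
    return edges


def total_length(parent, pos, norm: Norm = l1) -> float:
    """l(T): sum of ||u - v|| over all edges (u, v)."""
    return sum(norm(pos[u], pos[v]) for v, u in parent.items())


def slack(parent, pos, arrival, s, c, d, norm: Norm = l1) -> float:
    """sigma(T, s) = a_s - d * (path length) - c * (|E[r, s]| - 1)."""
    edges = path_edges(parent, s)
    wire = sum(norm(pos[u], pos[v]) for u, v in edges)
    return arrival[s] - d * wire - c * (len(edges) - 1)


def worst_slack(parent, pos, arrival, c, d, norm: Norm = l1) -> float:
    """sigma(T): minimum slack over all sinks."""
    return min(slack(parent, pos, arrival, s, c, d, norm) for s in arrival)


# Hypothetical instance: coordinates in mm, times in ps.
pos = {"r": (0, 0), "x": (1, 0), "s1": (2, 0), "s2": (1, 1)}
parent = {"x": "r", "s1": "x", "s2": "x"}       # child -> parent
arrival = {"s1": 500, "s2": 400}                # required arrival times a_s
C, D = 20, 220                                  # c = 20 ps, d = 220 ps/mm

print(total_length(parent, pos))                # 3
print(slack(parent, pos, arrival, "s1", C, D))  # 40
print(worst_slack(parent, pos, arrival, C, D))  # -60
```

Both sinks have a path of length 2 mm and pass one bifurcation, so each sees a delay of 2 · 220 + 20 = 460 ps; the sink with the earlier required arrival time determines the worst slack.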
In principle, a repeater tree topology is acceptable with respect to timing if σ(T) is
non-negative, i.e. the signal arrives at every sink s ∈ S not later than $a_s$. Nevertheless, in order to
account for inaccurate estimations and manufacturing variation, the worst slack σ(T) should
have at least some reasonable positive value $\sigma_{\min}$ or should even be maximized.
We can formulate three main optimization scenarios: Determine T such that
(O1) σ(T ) is maximized, or
(O2) l(T ) is minimized, or
(O3) for suitable constants $\alpha, \beta, \sigma_{\min} > 0$, the expression
\[ \alpha \min\{\sigma(T), \sigma_{\min}\} - \beta\, l(T) \]
is maximized.
While scenario (O1) is reasonable for instances which are very timing critical, scenario (O2) is
reasonable for very timing uncritical instances. Scenario (O3) is probably the practically most
relevant one. In the next section, we will show that (O1) can be solved exactly in polynomial
time. In contrast to that, (O2) is hard even for restricted choices of the norm such as the
l1-norm, since it is essentially the Steiner tree problem [6].
3 A Simple Procedure and its Properties
In [2, 3] we considered the following very simple procedure for the construction of repeater
tree topologies.
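Choose a sink $s_1 \in S$;
$V(T_1) \leftarrow \{r, s_1\}$;
$E(T_1) \leftarrow \{(r, s_1)\}$;
$T_1 \leftarrow (V(T_1), E(T_1))$;
$n \leftarrow |S|$;
for i = 2 to n do
    Choose a sink $s_i \in S \setminus \{s_1, s_2, \dots, s_{i-1}\}$, an edge $e_i = (u, v) \in E(T_{i-1})$, and an
    internal vertex $x_i \in \mathbb{R}^2$;
    $V(T_i) \leftarrow V(T_{i-1}) \mathbin{\dot\cup} \{x_i\} \mathbin{\dot\cup} \{s_i\}$;
    $E(T_i) \leftarrow (E(T_{i-1}) \setminus \{(u, v)\}) \cup \{(u, x_i), (x_i, v), (x_i, s_i)\}$;
    $T_i \leftarrow (V(T_i), E(T_i))$;
end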
The procedure inserts the sinks one by one according to some order s1, s2, . . . , sn starting
with a tree containing only the root r and the first sink s1. The sinks si for i ≥ 2 are
inserted by subdividing an edge ei with a new internal vertex xi and connecting xi to si. The
behaviour of the procedure clearly depends on the choice of the order, the choice of the edge
$e_i$, and the choice of the point $x_i \in \mathbb{R}^2$.
In view of the large number of instances which have to be solved in an acceptable time [2, 3],
the simplicity of the above procedure is an important advantage for its practical application.
Furthermore, implementing suitable rules for the choice of $s_i$, $e_i$, and $x_i$ makes it possible to pursue and
balance various practical optimization goals.
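The generic procedure can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the code of [2, 3]: the vertex labels `"r"`, `("s", i)`, `("x", i)`, the function names, and in particular the placeholder rule `midpoint_rule` (subdivide the first edge in the list at its midpoint) are hypothetical and only serve to show the edge-subdivision step.

```python
def build_topology(r, sinks, order, choose):
    """Insert the sinks one by one in the given order.

    choose(edges, pos, s) must return an edge (u, v) of the current tree and
    the position of the new internal vertex.
    """
    pos = {"r": r}
    first = order[0]
    pos[("s", first)] = sinks[first]
    edges = [("r", ("s", first))]                # T_1: root and first sink
    for i in order[1:]:
        s = ("s", i)
        pos[s] = sinks[i]
        (u, v), x_point = choose(edges, pos, s)
        x = ("x", i)
        pos[x] = x_point
        edges.remove((u, v))                     # subdivide e_i = (u, v) by x_i
        edges += [(u, x), (x, v), (x, s)]        # and attach s_i to x_i
    return edges, pos


def midpoint_rule(edges, pos, s):
    """Placeholder rule: subdivide the first edge at its midpoint."""
    u, v = edges[0]
    (ux, uy), (vx, vy) = pos[u], pos[v]
    return (u, v), ((ux + vx) / 2, (uy + vy) / 2)


edges, pos = build_topology((0, 0), [(4, 0), (0, 4), (4, 4)], [0, 1, 2],
                            midpoint_rule)
print(len(edges))                                # 5, i.e. 2n - 1 edges for n = 3
```

Whatever rule is plugged in, the root keeps exactly one child and every internal vertex has exactly two children, because each step replaces one edge by three.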
We present two variants (P1) and (P2) of the procedure corresponding to the above
optimization scenarios (O1) and (O2), respectively.
(P1) The sinks are inserted in an order of non-increasing criticality, where the criticality of
a sink s ∈ S is quantified by
\[ -(a_s - d\,\|r - s\|). \]
(Note that this is the negative of the estimated worst slack of a repeater tree topology containing only
the one sink s. Since a sink s can be critical because its required arrival time $a_s$ is small
and/or because its distance $\|r - s\|$ to the root is large, this is a reasonable measure for
its criticality.)
During the i-th execution of the for-loop, the new internal vertex $x_i$ is always chosen
at the same position as r (formally this turns $V(T_i)$ into a multiset), and the edge
$e_i$ is chosen such that $\sigma(T_i)$ is maximized.
(P2) $s_1$ is chosen such that $\|r - s_1\| = \min\{\|r - s\| \mid s \in S\}$, and during the i-th execution
of the for-loop, $s_i$, $e_i = (u, v)$, and $x_i$ are chosen such that
\[ l(T_i) = l(T_{i-1}) + \|u - x_i\| + \|x_i - v\| + \|x_i - s_i\| - \|u - v\| \]
is minimized.
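Variant (P1) can be sketched as follows. Since every internal vertex is placed at r, the wire delay to a sink s is always d||r − s||, so the slack of s depends only on $a_s - d\|r - s\|$ and on its depth |E[r, s]|. The sketch below is an assumption-laden illustration: it evaluates every edge by brute force, breaks ties by taking the first best edge, uses the l1-norm by default, and all function and label names are hypothetical.

```python
def l1(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def depth(parent, v):
    """Number of edges |E[r, v]| on the path from the root "r" to v."""
    k = 0
    while v != "r":
        v = parent[v]
        k += 1
    return k


def worst_slack(parent, a_prime, c):
    # All internal vertices sit at r, hence
    # sigma(T, s) = a'_s - c * (|E[r, s]| - 1) with a'_s = a_s - d * ||r - s||.
    return min(a_prime[v[1]] - c * (depth(parent, v) - 1)
               for v in parent if v[0] == "s")


def run_p1(r, sinks, arrival, c, d, norm=l1):
    """Return (child -> parent map, worst slack) of the tree built by (P1)."""
    a_prime = [a - d * norm(r, s) for s, a in zip(sinks, arrival)]
    # Non-increasing criticality -a'_s means non-decreasing a'_s.
    order = sorted(range(len(sinks)), key=lambda i: a_prime[i])
    parent = {("s", order[0]): "r"}
    for i in order[1:]:
        best_slack, best_tree = None, None
        for v in list(parent):                  # candidate edge (parent[v], v)
            trial = dict(parent)
            x = ("x", i)
            trial[x] = trial[v]                 # x_i takes the place of v
            trial[v] = x                        # v becomes a child of x_i
            trial[("s", i)] = x                 # s_i becomes the other child
            value = worst_slack(trial, a_prime, c)
            if best_slack is None or value > best_slack:
                best_slack, best_tree = value, trial
        parent = best_tree
    return parent, worst_slack(parent, a_prime, c)


# Hypothetical instance with c = 1, d = 0 and required arrival times -3 and 0.
tree, sigma = run_p1((0, 0), [(1, 0), (0, 1)], [-3, 0], c=1, d=0)
print(sigma)                                    # -4
```

Copying the tree for every candidate edge is wasteful and is done here only to keep the sketch short.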
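Theorem 1 The largest achievable worst slack $\sigma_{\mathrm{opt}}$ equals
\[ \sigma^*(S) := \max\left\{ \sigma \in \mathbb{R} \;\middle|\; \sum_{s \in S} 2^{-\left\lfloor \frac{1}{c}\left(a_s - d\|r - s\| - \sigma\right) \right\rfloor} \le 1 \right\}, \]
and (P1) generates a repeater tree topology $T_{(P1)}$ with $\sigma(T_{(P1)}) = \sigma_{\mathrm{opt}}$.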
Proof: Let $a'_s = a_s - d\|r - s\|$ for s ∈ S. Let T be an arbitrary repeater tree topology. By
the definition of σ(T) and the triangle inequality for $\|\cdot\|$, we obtain
\[ |E[r,s]| - 1 \le \left\lfloor \frac{1}{c}\Big( a_s - \sum_{(u,v) \in E[r,s]} d\,\|u - v\| - \sigma(T) \Big) \right\rfloor \le \left\lfloor \frac{1}{c}\big( a'_s - \sigma(T) \big) \right\rfloor \]
for every s ∈ S. Since the unique child of the root r is itself the root of a binary subtree of
T in which each sink s ∈ S has depth exactly |E[r, s]| − 1, Kraft's inequality [8] implies
\[ \sum_{s \in S} 2^{-\left\lfloor \frac{1}{c}\left(a'_s - \sigma(T)\right) \right\rfloor} \le \sum_{s \in S} 2^{-|E[r,s]| + 1} \le 1. \]
By the definition of $\sigma^*(S)$, this implies $\sigma(T) \le \sigma^*(S)$. Since T was arbitrary, we obtain
$\sigma_{\mathrm{opt}} \le \sigma^*(S)$.
It remains to prove that $\sigma(T_{(P1)}) = \sigma_{\mathrm{opt}} = \sigma^*(S)$, which we will do by induction on
n = |S|. For n = 1, the statement is trivial. Now let n ≥ 2. Let $s_n$ be the last sink inserted
by (P1), i.e. $a'_{s_n} = \max\{a'_s \mid s \in S\}$. Let $S' = S \setminus \{s_n\}$.
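Claim
\[ \operatorname{frac}\left( \frac{\sigma^*(S)}{c} \right) \in \left\{ \operatorname{frac}\left( \frac{a'_s}{c} \right) \;\middle|\; s \in S' \right\} \tag{1} \]
where $\operatorname{frac}(x) := x - \lfloor x \rfloor$ denotes the fractional part of $x \in \mathbb{R}$.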
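Proof of the claim: Note that the definition of $\sigma^*(S)$ implies that $\frac{1}{c}(a'_s - \sigma^*(S))$ is an integer
for at least one s ∈ S. If the claim is false, then $\frac{1}{c}(a'_{s_n} - \sigma^*(S)) \in \mathbb{Z}$ and $\frac{1}{c}(a'_s - \sigma^*(S)) \notin \mathbb{Z}$
for every $s \in S'$. Since $a'_{s_n} - \sigma^*(S) \ge a'_s - \sigma^*(S)$ for every $s \in S'$, this implies
\[ \left\lfloor \frac{1}{c}\big( a'_{s_n} - \sigma^*(S) \big) \right\rfloor > \max\left\{ \left\lfloor \frac{1}{c}\big( a'_s - \sigma^*(S) \big) \right\rfloor \;\middle|\; s \in S' \right\} \]
and hence
\[ \sum_{s \in S} 2^{-\left\lfloor \frac{1}{c}\left(a'_s - \sigma^*(S)\right) \right\rfloor} \le 1 - 2^{-\left\lfloor \frac{1}{c}\left(a'_{s_n} - \sigma^*(S)\right) \right\rfloor}. \]
Now, for some sufficiently small $\epsilon > 0$, we obtain
\[ \sum_{s \in S} 2^{-\left\lfloor \frac{1}{c}\left(a'_s - (\sigma^*(S) + \epsilon)\right) \right\rfloor} = 2^{-\left\lfloor \frac{1}{c}\left(a'_{s_n} - \sigma^*(S)\right) \right\rfloor + 1} + \sum_{s \in S'} 2^{-\left\lfloor \frac{1}{c}\left(a'_s - \sigma^*(S)\right) \right\rfloor} \le 1 \]
which contradicts the definition of $\sigma^*(S)$ and completes the proof of the claim. □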
Let $T'_{(P1)}$ denote the tree produced by (P1) just before the insertion of the last sink $s_n$.
By induction, $\sigma(T'_{(P1)}) = \sigma^*(S')$.
First, we assume that there is some sink $s' \in S'$ such that within $T'_{(P1)}$
\[ |E[r, s']| - 1 < \left\lfloor \frac{1}{c}\big( a'_{s'} - \sigma^*(S') \big) \right\rfloor. \]
Choosing $e_n$ as the edge of $T'_{(P1)}$ leading to $s'$ results in a tree T such that
\[ \sigma^*(S) \ge \sigma_{\mathrm{opt}} \ge \sigma(T_{(P1)}) \ge \sigma(T) = \sigma^*(S') \ge \sigma^*(S), \]
which implies $\sigma(T_{(P1)}) = \sigma_{\mathrm{opt}} = \sigma^*(S)$.
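Next, we assume that within $T'_{(P1)}$
\[ |E[r, s]| - 1 = \left\lfloor \frac{1}{c}\big( a'_s - \sigma^*(S') \big) \right\rfloor \]
for every $s \in S'$. This implies
\[ \sum_{s \in S} 2^{-\left\lfloor \frac{1}{c}\left(a'_s - \sigma^*(S')\right) \right\rfloor} > \sum_{s \in S'} 2^{-\left\lfloor \frac{1}{c}\left(a'_s - \sigma^*(S')\right) \right\rfloor} = 1 \]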
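and hence $\sigma^*(S) < \sigma^*(S')$. By (1), we obtain
\begin{align*}
\sigma^*(S) &\le \max\left\{ \sigma \;\middle|\; \sigma < \sigma^*(S'),\ \operatorname{frac}\left( \frac{\sigma}{c} \right) \in \left\{ \operatorname{frac}\left( \frac{a'_s}{c} \right) \;\middle|\; s \in S' \right\} \right\} \\
&= \max\left\{ \sigma \;\middle|\; \sigma < \sigma^*(S'),\ \operatorname{frac}\left( \frac{\sigma - \sigma^*(S')}{c} \right) \in \left\{ \operatorname{frac}\left( \frac{a'_s - \sigma^*(S')}{c} \right) \;\middle|\; s \in S' \right\} \right\} \\
&= c \max\left\{ x \;\middle|\; x < \frac{\sigma^*(S')}{c},\ \operatorname{frac}\left( x - \frac{\sigma^*(S')}{c} \right) \in \left\{ \operatorname{frac}\left( \frac{a'_s - \sigma^*(S')}{c} \right) \;\middle|\; s \in S' \right\} \right\} \\
&= c \left( \frac{\sigma^*(S')}{c} - 1 + \max\left\{ \operatorname{frac}\left( \frac{a'_s - \sigma^*(S')}{c} \right) \;\middle|\; s \in S' \right\} \right) \\
&= \sigma^*(S') - c(1 - \delta)
\end{align*}
for
\[ \delta = \max\left\{ \operatorname{frac}\left( \frac{a'_s - \sigma^*(S')}{c} \right) \;\middle|\; s \in S' \right\}. \]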
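If $s' \in S'$ is such that
\[ \delta = \operatorname{frac}\left( \frac{a'_{s'} - \sigma^*(S')}{c} \right), \]
then choosing $e_n$ as the edge of $T'_{(P1)}$ leading to $s'$ results in a tree T such that
\[ \sigma^*(S) \ge \sigma_{\mathrm{opt}} \ge \sigma(T_{(P1)}) \ge \sigma(T) = \sigma^*(S') - c(1 - \delta) \ge \sigma^*(S), \]
which implies $\sigma(T_{(P1)}) = \sigma_{\mathrm{opt}} = \sigma^*(S)$ and completes the proof. □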
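The value σ*(S) from Theorem 1 can be computed by enumerating candidates. The sketch below relies on two observations: as noted in the proof of the claim, $(a'_s - \sigma^*(S))/c$ is an integer for at least one s, so σ*(S) has the form $a'_s - kc$; and σ*(S) lies between $\min_s a'_s - c\lceil \log_2 |S| \rceil$ and $\min_s a'_s$, since at the lower end every exponent is at least $\lceil \log_2 |S| \rceil$ and the sum is therefore at most 1. The function names are hypothetical, and exact rational arithmetic via `fractions.Fraction` is an implementation choice made here to avoid rounding problems inside the floor.

```python
from fractions import Fraction
from math import ceil, floor


def kraft_sum(a_prime, c, sigma):
    """Sum over s of 2^(-floor((a'_s - sigma) / c)), evaluated exactly."""
    return sum(Fraction(2) ** -floor((a - sigma) / c) for a in a_prime)


def sigma_star(a_prime, c):
    """Largest sigma with kraft_sum(a_prime, c, sigma) <= 1.

    a_prime holds the values a'_s = a_s - d * ||r - s||.
    """
    a_prime = [Fraction(a) for a in a_prime]
    c = Fraction(c)
    hi = min(a_prime)                               # sigma* <= min a'_s
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1.
    lo = hi - c * (len(a_prime) - 1).bit_length()   # always feasible
    best = lo
    for a in a_prime:
        # Candidates a - k*c that fall into the interval [lo, hi].
        for k in range(ceil((a - hi) / c), floor((a - lo) / c) + 1):
            sigma = a - k * c
            if sigma > best and kraft_sum(a_prime, c, sigma) <= 1:
                best = sigma
    return best


print(sigma_star([-3, 0], 1))        # -4
print(sigma_star([0, 0, 0, 0], 1))   # -2
```

Floating-point inputs are converted to the exact rational value of their binary representation, so passing integers or `Fraction` objects is the safer choice when the data are given as decimals.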
Theorem 2 (P2) generates a repeater tree topology T for which l(T ) is at most the total
length of a minimum spanning tree on {r} ∪ S with respect to || · ||.
Proof: Let n = |S| and for i = 0, 1, ..., n, let $T^i$ denote the forest which is the union of the
tree produced by (P2) after the insertion of the first i sinks and the remaining n − i sinks as
isolated vertices. Note that $T^0$ has vertex set $\{r\} \cup S$ and no edge, while for 1 ≤ i ≤ n, $T^i$
has vertex set $\{r\} \cup S \cup \{x_j \mid 2 \le j \le i\}$ and 2i − 1 edges.
Let $F_0 = (V(F_0), E(F_0))$ be a spanning tree on $V(F_0) = \{r\} \cup S$ such that
\[ l(F_0) = \sum_{uv \in E(F_0)} \|u - v\| \]
is minimum. For i = 1, 2, ..., n, let $F_i = (V(F_i), E(F_i))$ arise from
\[ \left( V(T^i),\ E(F_{i-1}) \cup E(T^i) \right) \]
by deleting an edge $e \in E(F_{i-1}) \cap E(F_0)$ which has exactly one endvertex in $V(T_{i-1})$ such
that $F_i$ is a tree. (Note that this uniquely determines $F_i$.)
Since (P2) has the freedom to use the edges of $F_0$, the specification of the insertion order
and the locations of the internal vertices in (P2) imply that
\[ l(F_0) \ge l(F_1) \ge l(F_2) \ge \dots \ge l(F_n). \]
Since $F_n = T_n$, the proof is complete. □
For the $\ell_1$-norm, the well-known result of Hwang [7] together with Theorem 2 implies that
(P2) is an approximation algorithm for the $\ell_1$-minimum Steiner tree on the set {r} ∪ S with
approximation guarantee 3/2.
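For the l1-norm, variant (P2) and the bound of Theorem 2 can be tried out on small instances. The sketch below is illustrative only. It assumes the l1-norm, for which a point minimizing the sum of the distances to u, v, and s is their coordinate-wise median (a standard property of the l1-norm that is not stated in the text), it searches all sink and edge pairs by brute force, and it computes the minimum spanning tree length with a simple version of Prim's algorithm. All names and the sample instance are hypothetical.

```python
def l1(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def median3(a, b, c):
    return sorted((a, b, c))[1]


def run_p2(r, sinks):
    """Greedy insertion as in (P2); returns the edge list of the final tree."""
    remaining = set(range(len(sinks)))
    first = min(sorted(remaining), key=lambda i: l1(r, sinks[i]))
    remaining.remove(first)
    edges = [(r, sinks[first])]
    while remaining:
        best = None
        for i in sorted(remaining):
            s = sinks[i]
            for k, (u, v) in enumerate(edges):
                # Coordinate-wise median: a best position for x_i in the l1-norm.
                x = (median3(u[0], v[0], s[0]), median3(u[1], v[1], s[1]))
                increase = l1(u, x) + l1(x, v) + l1(x, s) - l1(u, v)
                if best is None or increase < best[0]:
                    best = (increase, i, k, x)
        _, i, k, x = best
        u, v = edges.pop(k)
        edges += [(u, x), (x, v), (x, sinks[i])]
        remaining.remove(i)
    return edges


def tree_length(edges):
    return sum(l1(u, v) for u, v in edges)


def mst_length(points):
    """Length of a minimum spanning tree (Prim's algorithm, quadratic time)."""
    in_tree = {0}
    dist = [l1(points[0], p) for p in points]
    total = 0
    while len(in_tree) < len(points):
        j = min((k for k in range(len(points)) if k not in in_tree),
                key=lambda k: dist[k])
        total += dist[j]
        in_tree.add(j)
        for k in range(len(points)):
            dist[k] = min(dist[k], l1(points[j], points[k]))
    return total


r = (0, 0)
sinks = [(2, 0), (0, 2), (2, 2), (1, 3)]
edges = run_p2(r, sinks)
print(tree_length(edges), mst_length([r] + sinks))
```

Edges of length zero can occur when the median coincides with an endpoint of the subdivided edge; they do not affect the total length.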
We have seen in Theorems 1 and 2 that different insertion orders are favourable for different
optimization scenarios such as (O1) and (O2).
Alon and Azar [1] gave an example showing that for the online rectilinear Steiner tree
problem the best approximation ratio we can achieve is Θ(log n/ log log n), where n is the
number of terminals. Hence inserting the sinks in an order disregarding the locations, like in
(P1), can lead to long Steiner trees, no matter how we decide where to insert the sinks.
The next example shows that inserting the sinks in an order different from the one
considered in (P1) but still choosing the edge $e_i$ as in (P1) results in a repeater tree topology
whose worst slack can be much smaller than the largest achievable worst slack.
Example 3 Let c = 1, d = 0 and $a \in \mathbb{N}$. We consider the following sequences of −a's and
0's:
\begin{align*}
A(1) &= (-a, 0), \\
A(2) &= (A(1), -a, 0), \\
A(3) &= (A(2), -a, \underbrace{0, \dots, 0}_{1 + (2^1 - 1)(a+2)}), \\
A(4) &= (A(3), -a, \underbrace{0, \dots, 0}_{1 + (2^2 - 1)(a+2)}), \ \dots,
\end{align*}
i.e. for l ≥ 2, the sequence A(l) is the concatenation of A(l − 1), one −a, and a sequence of
0's of length $1 + (2^{l-2} - 1)(a + 2)$.
If the entries of A(l) are considered as the required arrival times of an instance of the
repeater tree topology problem, then Theorem 1 together with the choice of c and d implies
that the largest achievable worst slack for this instance equals
\[ \left\lfloor -\log_2\left( l\,2^a + \Big( 1 + \sum_{i=2}^{l} \big( 1 + (2^{i-2} - 1)(a + 2) \big) \Big) 2^0 \right) \right\rfloor. \]
For l = a + 1 this is at least $-2 - a - \log_2(a + 2)$.
If we insert the sinks in the order as specified by the sequences A(l), and always choose
the edge into which we insert the next internal vertex such that the worst slack is maximized,
then the following sequence of topologies can arise: T(1) is the topology with exactly two
sinks at depth 2. The worst slack of T(1) is −(a+2). For l ≥ 2, T(l) arises from T(l−1) by
(a) subdividing the edge of T(l−1) incident with the root with a new vertex x, (b) appending
an edge (x, y) to x, (c) attaching to y a complete binary tree B of depth l−2, (d) attaching
to one leaf of B two new leaves corresponding to sinks with required arrival times −a and
0, and (e) attaching to each of the remaining $2^{l-2} - 1$ leaves of B a binary tree ∆
which has a + 2 leaves, all corresponding to sinks with required arrival time 0, whose depths in ∆ are
1, 2, 3, ..., a − 1, a, a + 1, a + 1. Note that this uniquely determines T(l).
Clearly, the worst slack in T (l) equals −a− (l + 1). Hence for l = a+ 1, the worst slack
equals −2a−2, which differs approximately by a factor of 2 from the largest achievable worst
slack as calculated above.
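The numbers in Example 3 can be reproduced for concrete values of a and l. The sketch below builds the sequence A(l), evaluates the closed-form expression for the largest achievable worst slack using integer arithmetic only, and compares it with the worst slack −a − (l + 1) stated above for T(l). It does not construct T(l) itself; the function names are hypothetical.

```python
from math import log2


def sequence_A(l, a):
    """Required arrival times A(l) of Example 3 as a list."""
    seq = [-a, 0]                                   # A(1)
    for i in range(2, l + 1):
        seq += [-a] + [0] * (1 + (2 ** (i - 2) - 1) * (a + 2))
    return seq


def optimal_worst_slack(arrival_times):
    """floor(-log2(sum of 2^(-a_s))) for c = 1, d = 0 and integers a_s <= 0."""
    total = sum(2 ** (-t) for t in arrival_times)   # a positive integer
    # -ceil(log2(total)), using ceil(log2(N)) == (N - 1).bit_length() for N >= 1
    return -((total - 1).bit_length())


def stated_online_worst_slack(l, a):
    """Worst slack of T(l) as stated in the example: -a - (l + 1)."""
    return -a - (l + 1)


a = 3
l = a + 1
A = sequence_A(l, a)
print(len(A))                                       # 28
print(optimal_worst_slack(A))                       # -6
print(stated_online_worst_slack(l, a))              # -8
print(-2 - a - log2(a + 2))                         # lower bound for l = a + 1
```

For a = 3 and l = 4 the sequence contains four entries −3 and 24 entries 0, so the sum inside the logarithm is 4 · 8 + 24 = 56.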
This example, however, does not show that there is no online algorithm for approximately
maximizing the worst slack, say up to an additive constant of c. It is an open question to
find a bicriteria approximation algorithm, or an algorithm for (O3).
References
[1] N. Alon and Y. Azar, On-line Steiner trees in the Euclidean plane, Discrete and
Computational Geometry 10 (1993), 113–121.
[2] C. Bartoschek, S. Held, D. Rautenbach, and J. Vygen, Efficient generation of short and
fast repeater tree topologies, in: Proceedings of the International Symposium on Physical
Design (2006), 120–127.
[3] C. Bartoschek, S. Held, D. Rautenbach, and J. Vygen, Fast buffering for optimizing worst
slack and resource consumption in repeater trees, in: Proceedings of the International
Symposium on Physical Design (2009), 43–50.
[4] J. Cong, An interconnect-centric design flow for nanometer technologies, in: Proceedings
of the IEEE 89 (2001), 505–528.
[5] W.C. Elmore, The transient response of damped linear networks with particular regard
to wideband amplifiers, Journal of Applied Physics 19 (1948), 55–63.
[6] M.R. Garey and D.S. Johnson, The rectilinear Steiner tree problem is NP-complete,
SIAM Journal on Applied Mathematics 32 (1977), 826–834.
[7] F.K. Hwang, On Steiner minimal trees with rectilinear distance, SIAM Journal on Applied
Mathematics 30 (1976), 104–114.
[8] L.G. Kraft, A device for quantizing grouping and coding amplitude modulated pulses,
Master thesis, EE Dept., MIT, Cambridge 1949.
[9] J. Maßberg and D. Rautenbach, Binary trees with choosable edge lengths, to appear in
Information Processing Letters.
[10] G.E. Moore, Cramming more components onto integrated circuits, Electronics 38 (1965),
114–117.
[11] P. Saxena, N. Menezes, P. Cocchini, and D. Kirkpatrick, The scaling challenge: can
correct-by-construction design help?, in: Proceedings of the International Symposium
on Physical Design (2003), 51–58.
