Abstract: Clock distribution topologies in a three-tier 3-D integrated circuit are explored. Models of three different clock topologies are applied to determine the root-to-leaf delay. The models incorporate the impedance of the 3-D via between planes based on closed-form expressions of the resistance, inductance, and capacitance of a through silicon via (TSV). The resulting modeled delays are compared to experimental data. Good agreement between simulation and experimental data is achieved.
I. INTRODUCTION
The era of rapid technology scaling has brought revolutionary advancements in system-level integration. A potential technology that continues the evolution towards gigascale complexity is three-dimensional integration.
Three-dimensional integration is a novel technology of growing importance with the potential to offer significant performance and functional benefits as compared to conventional 2-D ICs [1]. 3-D integration provides enhanced interconnectivity, a high device integration density, a reduction in the number and length of the long global wires, and the potential to combine disparate heterogeneous technologies [2]. The primary technological innovation required to exploit the benefits of 3-D integration is the through silicon 3-D via (TSV). Models characterizing the electrical behavior of both single [3] and bundled [4] TSVs have been developed.
Circuit-level design of 3-D integrated systems is a topic of great urgency, and circuit synchronization is one critical component of this design. The complexity of delivering the clock signal is further exacerbated in 3-D ICs as sequential elements synchronized by the same clock signal can be located on multiple planes. In addition, since the clock network dissipates a significant portion of the total power consumed by a synchronous circuit [5], the design of a 3-D clock distribution network is further constrained due to greater power density and related thermal concerns.
Symmetric interconnect structures, such as H- and X-trees, are often utilized to distribute the clock signal across a circuit [6]. The symmetry of these structures permits the clock signal to arrive at the leaves of the tree at approximately the same time, resulting in reduced clock skew between leaves. Maintaining this symmetry within a 3-D circuit is a significantly more complex task that requires additional design resources.
In this paper, models of three different clock distribution topologies are presented. The root-to-leaf delay on each plane of a three-plane 3-D integrated circuit is also determined. A brief description of the test circuit and fabrication technology is provided in Section II. An overview of the closed-form expressions that characterize the electrical properties of the TSVs used to propagate the clock signal between planes is presented in Section III. The clock distribution models are discussed in Section IV. The resulting root-to-leaf delay (for the leaves on each of the three planes) for the three different clock topologies is presented in Section V. A comparison of the modeled results with experimental data from a 3-D test circuit manufactured by MIT Lincoln Labs (MITLL) is also described in Section V. Concluding remarks are provided in Section VI.
II. OVERVIEW OF 3-D TEST CIRCUIT
The test circuit consists of three blocks. Each block includes the same logic circuit but implements a different clock distribution architecture. The total area of the test circuit is 3 mm [...]. The manufacturing process developed by MITLL for fully depleted silicon-on-insulator (FDSOI) 3-D circuits is summarized here [7], [8]. The MITLL process is a wafer level 3-D integration technology with up to three FDSOI wafers bonded to form a 3-D circuit. The diameter of the wafers is 150 mm. The minimum feature size of the devices is 180 nm, with one polysilicon layer and three metal layers interconnecting the devices on each wafer. A backside metal layer also exists on the upper two planes, providing the starting and landing pads for the TSVs, and the I/O, power supply, and ground pads for the entire 3-D circuit. An attractive feature of this process is the high density TSVs. The dimensions of these vias are 1.75 µm × 1.75 µm.
III. CLOSED-FORM EXPRESSIONS OF TSV ELECTRICAL PARAMETERS
Closed-form expressions of the TSV resistance, inductance, and capacitance are presented in this section. The TSV diameter, length, and plane-to-plane distance are based on the 3-D vias manufactured by MITLL. The resulting resistance, inductance, and capacitance are compared to numerical simulation, and listed in Table I.
The resistance of a 3-D via is described by (1)-(3) [3], where $\mathcal{R}$ and $\mathcal{L}$ are the TSV radius and length, respectively, and $\sigma_W$ is the conductivity of tungsten. The skin depth $\delta$ reduces the cross-sectional area of the 3-D via. An empirical constant $\alpha$ is used to fit the 1 GHz resistance to the simulation data, and is based on the physical parameters $\mathcal{L}$ and diameter $D$. Both $\delta$ and $\alpha$ are provided, respectively, in (4), and (5) and (6).

$$\delta = \frac{1}{\sqrt{\pi f \mu_0 \sigma_W}} \tag{4}$$

For frequencies other than DC and 1 GHz, the resistance described by (1)-(3) is adjusted by (7),

$$R_{f_{new}} = \left(R_{1\,\mathrm{GHz}} - R_{DC}\right)\sqrt{\frac{f_{new}}{f_{1\,\mathrm{GHz}}}} + R_{DC}. \tag{7}$$
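The skin depth and the frequency adjustment of the resistance can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the tungsten conductivity is an assumed bulk value, the square-root frequency scaling between DC and 1 GHz is an assumed form of the adjustment, and the resistance values passed in the example are placeholders rather than entries from Table I. The function names are hypothetical.

```python
import math

MU_0 = 4e-7 * math.pi   # permeability of free space, H/m
SIGMA_W = 1.79e7        # ASSUMED bulk conductivity of tungsten, S/m


def skin_depth(freq_hz, sigma=SIGMA_W):
    """Skin depth in meters: delta = 1 / sqrt(pi * f * mu_0 * sigma)."""
    return 1.0 / math.sqrt(math.pi * freq_hz * MU_0 * sigma)


def resistance_at(freq_hz, r_dc, r_1ghz):
    """Resistance at freq_hz, interpolated between the DC and 1 GHz values.

    ASSUMPTION: the AC part of the resistance scales as sqrt(f / 1 GHz).
    """
    return (r_1ghz - r_dc) * math.sqrt(freq_hz / 1e9) + r_dc


if __name__ == "__main__":
    print(f"skin depth at 1 GHz: {skin_depth(1e9) * 1e6:.2f} um")
    # Placeholder resistances in ohms, chosen only to exercise the function.
    print(f"R at 500 MHz: {resistance_at(500e6, r_dc=0.20, r_1ghz=0.25):.4f} ohm")
```

With the assumed conductivity, the skin depth at 1 GHz is several micrometers, which is larger than the 1.75 µm via dimension quoted in Section II, so the skin effect would be expected to have only a modest influence on a single via at this frequency.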
The inductance of a 3-D via is described by (8)-(11) [3]. These four equations express the self-inductance ($L_{11}$) and mutual inductance ($L_{21}$) of a via at both DC and high frequency. The expressions for the high frequency inductance describe the asymptotic value of the inductance. The range of frequencies for which the closed-form inductance expressions are valid is depicted in Figure 1. The DC and $f_{asym}$ self-inductance of a TSV are described, respectively, by (8) and (10), while the DC and $f_{asym}$ mutual inductance of a TSV are described, respectively, by (9) and (11).
The inductance expressions are dependent on the length $\mathcal{L}$ and radius $\mathcal{R}$ of the TSV. The radius is replaced by the pitch $P$ for the expressions characterizing the mutual inductance $L_{21}$ between two TSVs. $\alpha$ is used to adjust the partial self-inductance, and approaches unity at DC and 0.94 at high frequencies with increasing aspect ratio $\mathcal{L}/D$. The $\beta$ parameter, used to adjust the partial mutual inductance, is unity at DC and ranges between 0.49 and 0.93 at high frequencies with increasing aspect ratio [3]. Both $\alpha$ and $\beta$ are described, respectively, by (12) and (13), and (14)
IV. CLOCK DISTRIBUTION MODELS
Several clock network topologies for 3-D ICs are described and modeled in this section. These architectures combine different topologies common in 2-D circuits, such as H-trees, rings, and meshes [6]. Each of the three blocks includes a different clock distribution structure, schematically illustrated in Figure 3. The dashed lines depict vertical interconnects implemented by groups of through silicon vias. Multiple TSVs at the connection points between the clock networks are used to lower the resistance of the vertical path while enhancing reliability.
As shown in Figure 3, these topologies range from purely symmetric to highly asymmetric networks. The effect that these topological choices have on the clock delay is modeled and experimentally verified. Since the clock signal is distributed in three dimensions, achieving equidistant signal propagation in a 3-D system is not straightforward. This task is further complicated by the different impedance characteristics of the vertical and horizontal interconnects. The symmetry of an H-tree topology and the load balancing characteristics of rings and meshes are thereby exploited.
In each of the circuit blocks, the clock driver for the entire clock network is located on the second plane. The location of the clock driver is chosen to ensure that the clock signal propagates through identical vertical interconnect paths to the first and third planes, ideally resulting in the same delay. The clock driver is implemented with a traditional chain of tapered buffers [9], [10]. Additionally, buffers are inserted at the leaves of each H-tree in all three topologies. The width of the branches within the H-tree is halved at each branch point [11], with an initial width of 8 µm.
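As a rough illustration of why a group of TSVs lowers the resistance of the vertical path, consider $n$ identical vias connected in parallel, each with resistance $R_{TSV}$. The expressions below are a sketch under simplifying assumptions that are not stated in the text: contact resistance and current crowding are neglected, and the inductance case is limited to two identical vias carrying current in the same direction.

```latex
% n identical TSVs in parallel (contact resistance neglected):
R_{group} = \frac{R_{TSV}}{n}

% Two identical, mutually coupled TSVs in parallel, with self-inductance
% L_{11} and mutual inductance L_{21}:
L_{group} = \frac{L_{11}^{2} - L_{21}^{2}}{2\,(L_{11} - L_{21})}
          = \frac{L_{11} + L_{21}}{2}
```

Under these assumptions the resistance falls in proportion to the number of vias, while the mutual coupling keeps the inductance of a pair above half the self-inductance of a single via.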
The architectures employed in the blocks are:

Block A: All of the planes contain a four-level H-tree (equivalent to 16 leaves) with identical interconnect characteristics. All of the H-trees are connected through a group of TSVs at the output of the clock driver. Note that in Figure 3(a) the H-tree on the second plane is rotated by 90° with respect to the H-trees on the other two planes. This rotation eliminates inductive coupling between the H-trees. All of the H-trees are shielded with two parallel lines connected to ground.

Block B: A four-level H-tree is included in the second plane. All of the leaves of this H-tree are connected by four TSVs to small local rings on the first and third planes, as illustrated in Figure 3(b). As in Block A, the H-tree is shielded with two parallel lines connected to ground. Additional interconnect resources form local rings.

Block C: The clock distribution network for the second plane is a shielded four-level H-tree. Two global rings are utilized for the other two planes, as shown in Figure 3(c). Buffers are inserted to drive each ring, which are connected by TSVs to the four branch points on the second level of the H-tree. The rings on planes A and C are connected to the second level of the H-tree to avoid an unnecessarily long ring that would result in a significant capacitive load, and to maintain a ring with sides of equal length. Connecting the ring to the leaves at the perimeter of the H-tree would instead result in a considerable difference in the load among the sinks of the tree, since only the outer leaves would be connected to the ring.
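The geometry of the four-level H-tree used in all three blocks can be tabulated with a short sketch. It assumes a binary split at every level (consistent with 16 leaves after four levels) and assumes that the first-level branches carry the initial 8 µm width, with the width halved at each subsequent level; the exact mapping of widths to levels is not given in the text, and the function name is hypothetical.

```python
def h_tree_levels(levels=4, initial_width_um=8.0):
    """Return a list of (level, branch_count, width_um) tuples.

    ASSUMPTIONS: each level splits every branch in two, and level 1 uses
    the initial width, which is halved at every later level.
    """
    rows = []
    width = initial_width_um
    for level in range(1, levels + 1):
        rows.append((level, 2 ** level, width))
        width /= 2.0
    return rows


if __name__ == "__main__":
    for level, branches, width in h_tree_levels():
        print(f"level {level}: {branches:2d} branches, {width:.1f} um wide")
```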
The electrical characteristics of the clock distribution network on each plane are determined through numerical simulation. Trend lines for the capacitance, DC resistance, 1 GHz resistance, DC self- and mutual inductance, and the asymptotic self- and mutual inductance approximate the electrical parameters of different length interconnect segments within the clock network. These simulations include two ground return paths spaced 2 µm from either side of the clock line. These return paths behave as ground for the electric field lines emanating from the clock line, resulting in a more accurate estimate of the capacitance.
The electrical path of the clock signals propagating from the root to the leaves of each plane for the H-tree clock topology (see Figure 3(a)) is depicted in Figure 4. The size of the source follower NMOS transistor and the dimensions of the clock buffers at the root, leaves, and output circuitry are listed in Table II. The clock network on each plane is composed of 50 µm segments, where a π-model represents the electrical properties of each segment. These 50 µm segments model the distributed electrical properties of the interconnect. Similarly, when either meshes (Figure 3(b)) or rings (Figure 3(c)) are used on planes A and C (see Figure 4), each 50 µm segment is replaced with an equivalent π-model to more accurately represent the single mesh and ring structure within the test circuit. Note that for the mesh structures, the clock signal is distributed to planes A and C from the leaves of the H-tree in plane B. For the rings topology, the clock signal distributed to planes A and C is driven by buffers at the second level of the H-tree. Both the global rings and local meshes are square topologies with lengths of 500 and 200 µm, respectively. These lengths are composed of 50 µm long segments. Each segment is replaced with an equivalent π-model for a line width of 4 µm. The source follower NMOS transistor located in the output circuitry has a length of 180 nm and a width of 12 µm. The interconnect connecting the output pads and circuitry to the leaves on each of the three device planes varies in length from 0 to 150 µm depending upon the clock topology (with a line width of 2 µm), and is also modeled by an equivalent π-network. The delay from the root to the leaves of each plane is determined from these models.
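The segmentation of a line into 50 µm π-sections can be illustrated with a first-order delay estimate. The sketch below uses the Elmore delay of an RC ladder, which is a substitute technique chosen for brevity: it neglects the inductance included in the models above and treats a single straight line rather than a closed ring or mesh. The per-segment resistance and capacitance, the driver resistance, and the load capacitance are hypothetical placeholder values, and the function names are not from the paper.

```python
import math


def num_segments(length_um, segment_um=50.0):
    """Number of fixed-length segments needed to cover a line."""
    return math.ceil(length_um / segment_um)


def elmore_delay_pi_ladder(n_segments, r_seg, c_seg, r_driver=0.0, c_load=0.0):
    """Elmore delay (seconds) at the far end of n cascaded RC pi-segments.

    Each pi-segment places c_seg / 2 at both of its end nodes with r_seg in
    between, so interior nodes see c_seg and the two end nodes see c_seg / 2.
    """
    if n_segments < 1:
        raise ValueError("at least one segment is required")
    delay = 0.0
    for node in range(n_segments + 1):
        cap = c_seg if 0 < node < n_segments else c_seg / 2.0
        if node == n_segments:
            cap += c_load                        # sink capacitance at the far end
        r_upstream = r_driver + node * r_seg     # resistance back to the source
        delay += r_upstream * cap
    return delay


if __name__ == "__main__":
    n = num_segments(500.0)   # one 500 um side of a global ring
    # HYPOTHETICAL per-segment values: 1 ohm and 10 fF per 50 um segment.
    d = elmore_delay_pi_ladder(n, r_seg=1.0, c_seg=10e-15,
                               r_driver=50.0, c_load=20e-15)
    print(f"{n} segments, Elmore delay = {d * 1e12:.2f} ps")
```

For an undriven, unloaded line the ladder reproduces the familiar distributed result of one half the product of the total resistance and total capacitance, which is a quick sanity check on the segmentation.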
V. COMPARISON OF MODEL WITH EXPERIMENTAL DATA
The clock delay of the three different 3-D clock distribution topologies is reviewed in this section. A list of the delay from the root to the leaves of each plane and the resulting percent error as compared to experimental data for each topology is provided in Table III. Good agreement between the model and experimental data is demonstrated. A maximum error of less than 10% is achieved for all clock paths within any specific topology. There is one significant discrepancy between the model and experimental data, which occurs in the bottom device plane (tier A) for the local ring topology. This larger error is due to an inaccurate estimate of the capacitive load on the clock drivers at the leaf of this plane. The capacitance is not easily extracted due to a non-systematic approach to placing the decoupling capacitors, and related interplane capacitive coupling.
VI. CONCLUSIONS
The design of a clock distribution network for application to 3-D circuits is considerably more complex than the design of a 2-D clock distribution network. Three topologies to globally distribute a clock signal within a 3-D circuit have been evaluated. Clock delay simulations incorporating both numerical simulation and analytic expressions produce comparable results to experimentally extracted clock delay measurements from a fabricated 3-D test circuit, exhibiting less than 10% error.
