Micro-channel based liquid cooling has significant capability of removing high density heat in 3D-ICs. The conventional micro-channel structures investigated for cooling 3D-ICs use straight channels. However, the presence of TSVs which form obstacles to the micro-channels prevents distribution of straight micro-channels. In this paper, we investigate the methodology of designing TSV-constrained micro-channel infrastructure. Specifically, we decide the locations and geometry of micro-channels with bended structure so that the cooling effectiveness is maximized. Our micro-channel structure could achieve up to 87% pumping power savings compared with the structure using straight micro-channels.
INTRODUCTION AND MOTIVATION
The three-dimensional integrated circuit (3D-IC) consists of two or more layers of active electronic components which are stacked vertically. Despite its significant performance improvement over 2D circuits such as fast on-chip communications, 3D-IC also exhibits thermal issues due to its high power density caused by the stacked architecture.
While conventional air cooling might not be enough for stacked 3D-ICs, micro-channel based liquid cooling provides a better option to address this problem. In the 3D-IC, active (silicon) layers consist of functional units such as cores and memories which dissipate power and are stacked vertically. Micro-channel heat sinks are embedded below each silicon layer, and the coolant fluid is pumped through the micro-channels to take away the heat generated in the silicon layer [4]. Micro-channels have significant capability of cooling high heat density (as much as 700 W/cm² [13]) and are therefore very appropriate for cooling 3D-ICs.
Many works have investigated the thermal modeling of 3D-ICs with micro-channel heat sinks [4] [12]. Some other works try to find the best dimensional parameters such as channel width and height so as to improve the overall cooling effectiveness of the micro-channel system [5] [13].
* This work is partly supported by NSF grants CCF 0937865 and CCF 0917057. ISPD '12, March 25-28, 2012, Napa, California, USA.
The conventional micro-channel structures investigated for cooling 3D-ICs use straight channels that spread on the whole chip or in areas that demand high cooling capacity. If the spatial distribution of micro-channels is unconstrained then such an approach results in the best cooling efficiency with the minimum cooling energy (power dissipated to pump the fluid). However 3D-ICs impose significant constraints on how and where the micro-channels could be located due to the presence of TSVs, which allow different layers to communicate. A 3D-IC usually contains thousands of TSVs which are incorporated with clustered or distributed topologies [7] . These TSVs form obstacles to the micro-channels since the channels cannot be placed at the locations of TSVs. Therefore the presence of TSVs prevents distribution of straight micro-channels. This results in the following problems.
1. As illustrated in figure 1(a), micro-channels would fail to reach thermally critical areas, thereby resulting in thermal violations and hotspots.
2. To fix the thermal hotspots in areas where micro-channels cannot reach, we need to increase the fluid flow rate, resulting in a significant increase in cooling energy.
To address this problem, we investigate micro-channels with bends as illustrated in figure 1(b). With bended structure, the micro-channels can reach those TSV-blocked hotspot regions which straight micro-channels cannot reach. This results in better coverage of hotspots and therefore better cooling efficiency and reduced cooling energy. While micro-channels with bends (or serpentine organization of micro-channels) have been investigated in the past [9], our work is the first one to investigate this structure in the context of 3D-ICs and, more specifically, to address the constraint imposed by TSVs on the spreading of straight micro-channels.
In this paper, we investigate the methodology of designing TSV-constrained micro-channel infrastructure. Specifically, we decide the locations and geometry of micro-channels with bended structure so that their cooling effectiveness is maximized. Our micro-channel structure could achieve up to 87% pumping power savings compared with the micro-channel structure using straight channels.
The organization of this paper is as follows. In section 2, we introduce the thermal and power model of 3D-IC with micro-channel cooling. We investigate the TSV-constrained micro-channel infrastructure design methodology in section 3. The experimental result is given in section 4.
THERMAL AND POWER MODEL FOR 3D-IC WITH MICRO-CHANNELS
Thermal modeling
The thermal behavior of a 3D-IC with micro-channels could be modeled as an RC network as illustrated in [12] . In the RC network, the resistance corresponds to thermal conduction and capacitance corresponds to heat capacity. The power profile represents current sources in this RC network. In several cases, we are mostly interested in the steady state thermal behavior of the 3D-IC, hence, enabling us to capture the thermal behavior as a pure resistive network [4] . In this case, for a given 3D-IC power profile, the thermal profile could be estimated by solving a system of linear equations of the form GT = Q where G is the thermal conductivity matrix and Q is the power profile. The thermal conductivity matrix G depends on many factors including the material properties, location of channels and TSVs, fluid flow rate etc. Interested readers are referred to [4] [11] [13] for details.
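As a minimal sketch of the steady-state step, the following solves GT = Q for a toy three-node resistive chain. The conductance values, the power vector and the helper name `solve_linear` are illustrative assumptions and are not taken from the paper; temperatures are expressed as rises above the coolant temperature.

```python
def solve_linear(G, Q):
    """Solve G T = Q by Gaussian elimination with partial pivoting."""
    n = len(Q)
    # Augmented matrix [G | Q], copied so the inputs are not modified.
    A = [[float(x) for x in G[r]] + [float(Q[r])] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("singular conductance matrix")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    T = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][c] * T[c] for c in range(r + 1, n))
        T[r] = (A[r][n] - s) / A[r][r]
    return T


# Assumed toy network: nodes 0-1-2 in a chain, 2 W/K between neighbors,
# node 2 tied to the coolant through 1 W/K, 1 W injected at node 0.
G = [[2.0, -2.0, 0.0],
     [-2.0, 4.0, -2.0],
     [0.0, -2.0, 3.0]]
Q = [1.0, 0.0, 0.0]
T_rise = solve_linear(G, Q)
```

In a full model G would be assembled from material properties, channel and TSV locations and flow rates as described above, and a sparse solver would replace the dense elimination used here.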
Micro-channel power consumption
The power used by micro-channels for performing chip cooling comes from the work done by the fluid pump to push the coolant fluid into micro-channels. The pumping power $Q_{pump}$ is decided by the pressure drop and volumetric flow rate of micro-channels: $Q_{pump} = \sum_{n=1}^{N} f_n \Delta P_n$, where $N$ is the total number of channels, and $\Delta P_n$ and $f_n$ are the pressure drop and fluid flow rate of the n-th micro-channel. In this paper, we use single-phase laminar liquid flow as the working fluid. Pressure drop and fluid flow rate are interdependent and also related to other micro-channel parameters such as length and width. The pressure drop in a straight micro-channel is decided by:
Here $L$ is the length of the micro-channel, $D_h$ is the hydraulic diameter, $v$ is the fluid velocity, $\mu$ is the fluid viscosity and $\gamma$ is determined by the micro-channel dimension (given in [5]). Usually fluid pumps are designed to work such that all the micro-channels experience the same pressure drop $\Delta P$. For a given $\Delta P$ that the pump delivers across all the channels, the fluid velocity $v$ could be estimated by equation 1. The fluid flow rate $f = v w_a w_b$ could be estimated as well ($w_a$, $w_b$ are the micro-channel width and height). Also, the flow rate could be controlled by changing the pressure drop. A higher pressure drop results in a higher flow rate and better cooling.
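The relation between pressure drop, velocity, flow rate and pumping power can be sketched as follows. The sketch assumes the common laminar form ΔP = 2γμLv/D_h² for the straight-channel pressure drop and the standard hydraulic diameter of a rectangular duct; both are assumptions made here for illustration, as are the numeric values of μ and γ. Only the sum Q_pump = Σ f_n ΔP_n and f = v·w_a·w_b come directly from the text.

```python
def hydraulic_diameter(w_a, w_b):
    # Standard definition 4 * area / perimeter for a rectangular duct.
    return 2.0 * w_a * w_b / (w_a + w_b)


def straight_channel_velocity(delta_p, length, w_a, w_b, mu, gamma):
    # ASSUMPTION: laminar pressure drop delta_p = 2 * gamma * mu * L * v / D_h**2,
    # solved here for the velocity v.
    d_h = hydraulic_diameter(w_a, w_b)
    return delta_p * d_h ** 2 / (2.0 * gamma * mu * length)


def pumping_power(delta_p, flow_rates):
    # Q_pump = sum over channels of f_n * delta_P_n, with one shared pressure drop.
    return sum(f * delta_p for f in flow_rates)


# Illustrative values: channel dimensions follow the experimental setup,
# mu and gamma are assumed (water-like viscosity, hypothetical gamma).
w_a, w_b, length = 100e-6, 400e-6, 1.2e-2
mu, gamma = 1.0e-3, 18.0

def channel_power(delta_p, n_channels=20):
    v = straight_channel_velocity(delta_p, length, w_a, w_b, mu, gamma)
    f = v * w_a * w_b                      # volumetric flow rate per channel
    return pumping_power(delta_p, [f] * n_channels)
```

Under this assumed model the velocity is linear in ΔP, so the pumping power of straight channels grows with the square of the pressure drop.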
Modeling Micro-channels with bends
Consider the channel structure shown in figure 2. The existence of a bend causes a change in the flow properties, which impacts the cooling effectiveness and pressure drop. An otherwise fully developed laminar flow in the straight part of the channel, when it comes across a 90° bend, becomes turbulent/developing around the corner and settles down into fully developed laminar flow again after traveling some distance downstream (see figure 2). So a channel with bends has three distinct regions: 1) the fully developed laminar flow region, 2) the bend corner, and 3) the developing/turbulent region after the bend [3] [9]. The length of the flow developing region is [10]:
where $\beta = w_b/w_a$ is the channel aspect ratio and Re is the Reynolds number defined by $Re = \rho v D_h/\mu$, where $\rho$, $\mu$ and $v$ are the fluid density, viscosity and velocity. The rectangular bend impacts the pressure drop. Due to the presence of bends, the pressure drop in the channel is greater than in an equivalent straight channel with exactly the same dimensions. The total pressure drop in a channel with bends is the sum of the pressure drops in the three regions described above (which finally depend on how many bends the channel has). Assume $L$ is the total channel length and $m$ is the bend count. Then $m \cdot L_d$ is the total length that has developing/turbulent flow and $m \cdot w_a$ is the total length attributed to corners (see figure 2). Hence the effective channel length attributed to fully developed laminar flow is $L - m \cdot L_d - m \cdot w_a$. The total pressure drop in the fully developed laminar region is [5]:
where $L - m \cdot L_d - m \cdot w_a$ is the total length of the fully developed laminar region explained above, and the other parameters are the same as in equation 1.
Pressure drop in flow developing region: The pressure drop in each flow developing region is:
where the remaining coefficient is a constant associated with the aspect ratio β. Please refer to [3] [6] for details.
Pressure drop in corner region: The total pressure drop at all the 90° bends in a micro-channel is decided by:
where $m$ is the number of corners in the channel, $\Delta p_{90^\circ}$ is the pressure drop at each bend corner and $K_{90}$ is the pressure loss coefficient for a 90° bend, whose value can be found in [3].
Total pumping power: The total pressure drop in a micro-channel with bends is the sum of the pressure drops in the three regions discussed above:
From equation 6, the total pressure drop of a micro-channel is a quadratic function of the fluid velocity v. For a given pressure difference applied on a micro-channel, we can calculate the associated fluid velocity by solving equation 6. With the fluid velocity, we can then estimate the fluid flow rate f, and thus estimate the thermal resistance and pumping power for this channel. Hence the pumping power as well as the cooling effectiveness of micro-channels with bends is a function of 1) the number of bends, 2) the location of channels, and 3) the pressure drop across the channel.
Comparing equations 1 and 6, due to the presence of bends, if the same pressure drop is applied on a straight and a bended micro-channel of the same length, the bended channel will have lower fluid velocity, which leads to a lower cooling capability. Therefore, to provide sufficient cooling, we will need to increase the overall pressure drop that the pump delivers, which results in increase of pumping power. But bends allow for better coverage in the presence of TSVs.
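Since the total pressure drop is quadratic in v, the velocity for a given ΔP follows from the positive root of a quadratic. The sketch below lumps the model into ΔP = a·v + b·v², where `a` collects the terms linear in v and `b` the terms quadratic in v. The coefficient names and the lumping are assumptions made for illustration, not the paper's notation.

```python
import math


def velocity_from_pressure_drop(delta_p, a, b):
    """Return the non-negative v satisfying delta_p = a * v + b * v**2.

    a: lumped coefficient of the terms linear in v (hypothetical name)
    b: lumped coefficient of the terms quadratic in v (hypothetical name)
    """
    if a <= 0 or b < 0 or delta_p < 0:
        raise ValueError("expected a > 0, b >= 0 and delta_p >= 0")
    if b == 0:
        return delta_p / a  # no quadratic loss: purely linear relation
    # Positive root of b*v^2 + a*v - delta_p = 0.
    return (-a + math.sqrt(a * a + 4.0 * b * delta_p)) / (2.0 * b)


# With the same linear coefficient, adding a quadratic bend loss lowers v.
v_linear_only = velocity_from_pressure_drop(6.0, a=1.0, b=0.0)
v_with_bend_loss = velocity_from_pressure_drop(6.0, a=1.0, b=1.0)
```

The comparison at the end mirrors the observation above: for the same applied pressure drop, any extra quadratic loss reduces the achievable velocity.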
MICRO-CHANNEL INFRASTRUCTURE DESIGN: ALGORITHMS
Designing 3D-IC micro-channel infrastructure is a very complex problem. For example, there are exponentially many ways to incorporate micro-channels with bends, and evaluating their impact on the silicon temperature requires us to solve a complex system of thermal equations. The specific problem formulation is as follows. Figure 3 represents the problem formulation graphically. Given a set of stacked silicon layers, some of the intermediate layers between silicon layers would have micro-channels (as shown in figure 3(a), two intermediate layers contain micro-channels). The locations of input and output orifices for the micro-channels are assumed known. We would like to find micro-channel routes from one side to the other such that the routes do not intersect, avoid TSVs and provide sufficient cooling at minimum pumping energy.
We impose a graph on each micro-channel layer as indicated in figure 3(b). In the graph, each grid is represented by a node, and the edges define the immediate neighbors of a node. The micro-channel routing would be performed on this graph. If there is a TSV located on a grid, then its corresponding neighborhood edges are removed since micro-channels cannot be routed through TSVs. Let $g^l_{i,j} = 1$ represent the fact that there is a channel connecting grids i and j in the l-th micro-channel layer of the 3D-IC (so i and j must be neighboring nodes in the grid graph). Neither i nor j should have a TSV (because TSVs will not allow channels to go through them). In the first constraint, {CI, CO} represents the set of input and output orifice nodes, and I(i) represents the set of i's neighboring nodes. So the first constraint imposes that the input and output orifice nodes must have a neighboring grid they are connected to so that their incoming/outgoing fluid can be pushed into/out of the micro-channel layer. The next constraint imposes that either a channel goes through a grid or it does not.
In the third constraint, {TSV} represents the set of grids containing TSVs, so micro-channels cannot be routed through these nodes. The following constraint imposes that the temperature is within acceptable limits, and the objective tries to minimize the pumping power.
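The grid graph with TSV obstacles can be sketched as below. The function name `build_grid_graph` and the (row, column) cell representation are assumptions made for illustration; the only rule taken from the text is that a grid containing a TSV loses all of its neighborhood edges.

```python
def build_grid_graph(rows, cols, tsv_cells):
    """Return an adjacency dict over TSV-free grid cells.

    Cells are (row, col) tuples. A cell holding a TSV is left out entirely,
    so no edge can enter or leave it.
    """
    tsv = set(tsv_cells)
    adjacency = {}
    for r in range(rows):
        for c in range(cols):
            if (r, c) in tsv:
                continue
            neighbors = []
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if inside and (rr, cc) not in tsv:
                    neighbors.append((rr, cc))
            adjacency[(r, c)] = neighbors
    return adjacency


# 3 x 3 layer with a single TSV in the center cell.
graph = build_grid_graph(3, 3, [(1, 1)])
```

In this small example the center cell disappears from the graph, and the cell (0, 1) keeps only its left and right neighbors.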
Overall micro-channel design flow
This is a very complex problem since 1) the variables need to be discrete, and 2) the thermal and pumping power models are highly nonlinear. In this paper we investigate such a methodology as illustrated in figure 4 . Our methodology follows a sequence of logical steps. First the severity of the thermal problem and the need for having micro-channels is evaluated by performing a full scale thermal analysis. Based on the severity of the thermal problem (location, intensity of hotspots) an initial micro-channel design is developed. This design is further improved for reducing the cooling power footprint and improving the thermal effectiveness using iterative methods. Now we go into the details of these individual steps.
Mincost flow based micro-channel design
The full scale 3D thermal analysis would identify locations of hotspots in different layers which cannot be removed by conventional package/air cooling based approaches. These are the areas which require sufficient proximity to the microchannels. Since solving the formulation in equation 7 is intractable, we use simple models to come up with a sufficiently good initial micro-channel infrastructure which is iteratively improved subsequently. In order to develop this initial solution we use the minimum cost flow formulation.
Initialization of the minimum cost flow network: Consider the 3D-IC and the corresponding grid graph of each micro-channel layer as illustrated in figure 3(a)(b) . For each micro-channel layer, we instantiate a minimum cost flow problem as follows (see figure 3(c) for illustration) . The nodes corresponding to the input/output orifices for the given micro-channel layer are assigned a supply/demand of one flow unit. All nodes in the grid graph have a capacity one. The edges have unlimited capacity and are bi-directional (can take fluid flow in either direction). As indicated earlier the edges between two neighboring nodes exist only if neither of the nodes has a TSV. This enforces the routing constraint imposed by TSVs. Figure 3(c) indicates the flow network for the two micro-channel layers.
Each node has a cost whose assignment would be discussed subsequently. We would like to send flow from inlet nodes to outlet nodes such that the capacity constraints are not violated and the cost is minimum. Assigning the node capacity to be 1 would ensure that all the flow from inlet to outlet follows simple paths (non-intersecting and non-cyclic). A minimum cost flow formulation with a well defined node capacity could be solved using very similar methods as a formulation with edge capacity alone [8] . It is noteworthy that because there is an edge between each pair of neighboring nodes, the flow path could take several bends if necessary.
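A node capacity of one can be reduced to an edge capacity by splitting every grid node into an "in" node and an "out" node joined by a unit-capacity edge that carries the node cost. The sketch below applies this to a tiny layer using a plain successive-shortest-path solver. The class and function names, the positive illustrative costs and the 3 x 4 example are all assumptions; a production flow would use a dedicated min-cost flow library.

```python
from collections import deque


class MinCostFlow:
    """Successive shortest paths with unit augmentations."""

    def __init__(self, n):
        self.n = n
        self.graph = [[] for _ in range(n)]  # edge = [to, capacity, cost, reverse_index]

    def add_edge(self, u, v, cap, cost):
        self.graph[u].append([v, cap, cost, len(self.graph[v])])
        self.graph[v].append([u, 0, -cost, len(self.graph[u]) - 1])

    def solve(self, s, t, want):
        flow, total = 0, 0.0
        while flow < want:
            dist = [float("inf")] * self.n
            prev = [None] * self.n
            queued = [False] * self.n
            dist[s] = 0.0
            queue = deque([s])
            while queue:  # queue-based Bellman-Ford, tolerates negative residual costs
                u = queue.popleft()
                queued[u] = False
                for i, (v, cap, cost, _) in enumerate(self.graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, i)
                        if not queued[v]:
                            queued[v] = True
                            queue.append(v)
            if prev[t] is None:
                break  # no augmenting path left
            v = t
            while v != s:  # push one unit of flow along the shortest path
                u, i = prev[v]
                edge = self.graph[u][i]
                edge[1] -= 1
                self.graph[v][edge[3]][1] += 1
                v = u
            flow += 1
            total += dist[t]
        return flow, total


def route_channels(rows, cols, tsv_cells, cost, inlets, outlets):
    tsv = set(tsv_cells)
    cells = [(r, c) for r in range(rows) for c in range(cols) if (r, c) not in tsv]
    idx = {cell: k for k, cell in enumerate(cells)}
    s, t = 2 * len(cells), 2 * len(cells) + 1
    mcf = MinCostFlow(2 * len(cells) + 2)
    for (r, c), k in idx.items():
        mcf.add_edge(2 * k, 2 * k + 1, 1, cost[(r, c)])  # node capacity one, node cost
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nb in idx:  # edges exist only between TSV-free neighbors
                mcf.add_edge(2 * k + 1, 2 * idx[nb], len(cells), 0.0)
    for cell in inlets:
        mcf.add_edge(s, 2 * idx[cell], 1, 0.0)
    for cell in outlets:
        mcf.add_edge(2 * idx[cell] + 1, t, 1, 0.0)
    flow, _ = mcf.solve(s, t, len(inlets))
    if flow < len(inlets):
        raise ValueError("not every inlet could be routed")
    paths = []
    for cell in inlets:  # follow the edges that carry flow from each inlet
        k, path = idx[cell], [cell]
        while True:
            nxt = next((v for v, _, _, rev in mcf.graph[2 * k + 1]
                        if v != 2 * k and mcf.graph[v][rev][1] > 0), None)
            if nxt is None or nxt == t:
                break
            k = nxt // 2
            path.append(cells[k])
        paths.append(path)
    return paths


# Assumed example: the TSV at (1, 2) blocks the straight route along row 1,
# and the cheaper cell (0, 2) stands in for a hotspot that attracts the channel.
cost = {(r, c): 1.0 for r in range(3) for c in range(4)}
cost[(0, 2)] = 0.1
paths = route_channels(3, 4, [(1, 2)], cost, [(1, 0)], [(1, 3)])
```

The example keeps all costs positive so that the residual graph has no negative cycles at the start; the returned path bends around the TSV and passes through the low-cost cell.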
Cost assignment: The cost assignment should be such that the minimum cost flow formulation develops an initial infrastructure that distributes the micro-channels with higher density in areas that demand more cooling. The chip scale thermal analysis would identify locations of grids in the silicon layers that are in dire need of cooling (see figure 3(a) ). A silicon layer would be cooled by the micro-channels both above and below (unless the silicon layer is at the very top or very bottom of the stack). For example, the middle silicon layer in figure 3 (a) could be cooled by two micro-channel layers unlike the top and bottom silicon layers.
As illustrated in figure 3(b), each micro-channel layer is represented as a grid graph. The amount of cooling required at a certain node in this graph is a function of how hot the top and bottom grids in the silicon layers are. It also depends on how we choose to distribute the cooling demand at a certain location in the silicon layer between the micro-channel layers just above and just below. Let us suppose a certain location in the silicon layer has temperature T ≥ Tmax and requires cooling (estimated by full scale thermal analysis). Let uT (with 0 ≤ u ≤ 1) represent the fraction of this cooling demand assigned to the micro-channel grid right above and (1 − u)T represent the cooling demand assigned to the micro-channel grid just below. If u is set too low then most of the cooling will be done by the channel layer below, and vice versa for large u. Let $u^l_i$ be the heat load partitioning factor of grid i in silicon layer l. It is assigned as follows.
Case 1: If l is the topmost (bottommost) layer, then $u^l_i = 0$ ($u^l_i = 1$) so that all the cooling demand goes to the micro-channel layer right below (above) l, which is layer l − 1 (l + 1).
Case 2: If l is neither the top nor the bottom layer, then $0 \le u^l_i \le 1$, implying that the heat generated in grid i of silicon layer l needs to be distributed between the two micro-channel layers right above and below. If the channel layers above and below (layers l + 1 and l − 1) have the same number of TSVs then $u^l_i = 1/2$, else it is scaled linearly such that more cooling demand is assigned to the micro-channel layer with fewer TSVs.
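The two cases can be sketched as a small function. The exact linear scaling is not spelled out in the text, so the ratio used below is one plausible choice and is an assumption, as are the function and parameter names.

```python
def partition_factor(layer, top_layer, bottom_layer, tsv_above, tsv_below):
    """Fraction u of a silicon grid's cooling demand sent to the channel layer above.

    tsv_above / tsv_below: TSV counts of the channel layers above and below.
    """
    if layer == top_layer:
        return 0.0  # Case 1: only a channel layer below exists
    if layer == bottom_layer:
        return 1.0  # Case 1: only a channel layer above exists
    total = tsv_above + tsv_below
    if total == 0:
        return 0.5  # equal (zero) TSV counts on both sides
    # ASSUMED linear scaling: the layer with fewer TSVs receives the larger share.
    return tsv_below / total
```

For example, `partition_factor(2, 4, 0, tsv_above=10, tsv_below=30)` gives 0.75 under this assumed rule, so three quarters of the demand go to the less obstructed channel layer above.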
Given the partitioning factor $u^l_i$, the cost is assigned as follows. (See figure 5 for an illustration.) Let cost(i, l) denote the cost for node i in micro-channel layer l (hence layers l − 1 and l + 1 correspond to the silicon layers just below and above micro-channel layer l). Three cases are considered depending on whether there is a hotspot below and above this node in the silicon layers l − 1 and l + 1.
Case 1: Hotspot in both sides. When the silicon grids i on both sides (l − 1 and l + 1) are in hotspot regions (temperature > Tmax), the micro-channel should provide cooling to both sides (above and below), so the cost is:
Here the first component inside the square bracket indicates the cooling demand from the silicon grid above and the second component corresponds to the cooling demand from the silicon grid just below. Higher demand leads to lower cost since we would like micro-channels to pass through high cooling demand regions. See figure 5 for an illustration.
Figure 5: Cost assignment
Case 2: Hotspot in one side. When the silicon grid i on only one side (l − 1 or l + 1) is in hotspot region (but not both), the cost is assigned as
Case 3: No hotspot in either side. When there is no hotspot in either side, then the node cost is assigned to a small positive value cost(i, l) = ϵ > 0. The minimum cost flow formulation would therefore route flows such that maximum number of high cooling demand grids are touched by the channels. The non-hotspot regions are assigned a small positive cost. This would enable the minimum cost flow formulation to avoid areas that do not demand high cooling.
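The three cases can be sketched as follows. The mapping from cooling demand to cost is an assumption: the sketch simply negates the demand so that higher demand gives lower cost and non-hotspot nodes keep the small positive cost ε. Function and parameter names are hypothetical.

```python
def node_cost(t_above, t_below, u_above, u_below, t_max, eps=1e-3):
    """Cost of a node in a micro-channel layer.

    t_above, t_below: temperatures of the silicon grids just above and below
    u_above, u_below: partitioning factors of those two silicon grids
    """
    demand = 0.0
    if t_above > t_max:
        # The silicon grid above sends the share (1 - u) of its demand downwards.
        demand += (1.0 - u_above) * t_above
    if t_below > t_max:
        # The silicon grid below sends the share u of its demand upwards.
        demand += u_below * t_below
    if demand == 0.0:
        return eps  # Case 3: no hotspot on either side
    # ASSUMPTION: cost is the negated demand, so hotter neighbors mean lower cost.
    return -demand
```

With both neighbors hot the two demand terms add up (Case 1); with one hot neighbor only its term remains (Case 2).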
Micro-channel refinement
The primary objective of the minimum cost flow formulation is to come up with an initial micro-channel design that carries cooling in sufficient proximity of hot areas. This is not enough to guarantee effective cooling. For example, some channels have several bends and/or may be routed over disproportionately large number of hotspots. Both of these situations cause a degradation in the overall cooling quality. In this section we present approaches for iteratively refining the design for improved cooling effectiveness. The micro-channel infrastructure refinement process works as illustrated in figure 4.
Temperature and Pumping Power Analysis
The impact of micro-channels on the 3D-IC thermal profile is a function of how the micro-channels are routed and also how much fluid flow they carry. The initial design generated using minimum cost flow technique does not prescribe the pressure drop and the fluid flow rate that the channels need to work at. Hence given the micro-channel design, we then need to estimate the smallest pressure drop that the pump needs to work at such that thermal constraints are satisfied. Given the micro-channel design, the smallest pressure drop value results in the smallest pumping energy. As indicated earlier, we assume that all channels are subjected to the same pressure drop by the pump, hence the minimum pressure drop can be determined by linearly increasing ∆P and calculating the thermal profile for each value until the thermal constraints are met. For a given pressure drop across the pump and a given micro-channel design, equation 6 could be used to determine the velocity (fluid flow rate) in each channel. Note that because each channel has different number of bends and total length, the flow rate would be different too. Based on the flow rate information which is computed for a given pressure drop, the associated thermal conductance matrix G could be computed. This information could be used to estimate the thermal profile of the 3D-IC for a given pressure drop. After finding the minimum required ∆P , we could calculate the required pumping power. This technique is highlighted in Algorithm 1.
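The pressure drop search can be sketched as a simple linear sweep. The callback `peak_temperature` stands in for the full thermal analysis (velocity from equation 6, conductance matrix G, solution of GT = Q); it and the toy model below are assumptions made so the sketch is runnable.

```python
def min_pressure_drop(peak_temperature, t_max, dp_step, dp_max):
    """Return the smallest swept pressure drop meeting t_max, or None on violation.

    peak_temperature: callable mapping a pressure drop to the resulting peak
                      chip temperature (hypothetical stand-in for the thermal model)
    """
    steps = int(dp_max // dp_step)
    for n in range(1, steps + 1):
        dp = n * dp_step  # linearly increasing pressure drop
        if peak_temperature(dp) <= t_max:
            return dp
    return None  # even the maximum pressure drop violates the constraint


def toy_peak_temperature(dp_kpa):
    # Assumed toy model: peak temperature falls as the pressure drop rises.
    return 40.0 + 2000.0 / dp_kpa


dp_min = min_pressure_drop(toy_peak_temperature, t_max=85.0, dp_step=5, dp_max=500)
```

Once the minimum pressure drop is known, the pumping power follows from the per-channel flow rates at that pressure drop. A `None` result corresponds to a case where the thermal constraint cannot be met within the available pressure drop.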
Iterative micro-channel optimization
The objective of the minimum cost flow formulation did not capture cooling energy and/or number of bends in the channels. Figure 6 illustrates typical situations that can occur. In figure 6, the two micro-channels have significantly different cooling demands (figure 6(a)) and number of bends (figure 6(b)). Such imbalance (in cooling demand and bend count) leads to an increase in the required pressure drop, thereby increasing the pumping energy. The basic idea is that all the channels should have similar levels of heat load, length and number of bends. Hence if a channel has too many bends or goes through many hotspots while others are shorter, then other channels could be made longer, thereby more uniformly distributing the heat load and also reducing the number of bends in the most critical micro-channel.
Figure 6: Examples of (a) unbalanced cooling demand, (b) different number of bends
Based on these considerations, we try to refine the initial design by 1) balancing the heat loads among micro-channels and 2) reducing unnecessary bends.
Micro-channel heat load balancing:
Starting from the initial design we identify the microchannels which have disproportionately high heat removal load and spread their heat load into neighboring channels.
Algorithm 2 highlights the iterative pairwise micro-channel cooling load balance process. In the first iteration of pairwise micro-channel cooling workload balance, we start from the channel with the highest cooling workload. Here the cooling workload is measured by the total heat absorbed by the micro-channel, which could be calculated using q = (Tout − Tin)/Rio. Here Tin is the fluid supply temperature at micro-channel inlet, and Tout is the fluid temperature at micro-channel outlet, Rio is the total thermal resistance between the fluid inlet and outlet of that specific channel. Given the pressure drop, power profile of the 3D-IC and the location and dimensions of the micro-channels, these parameters could be easily calculated (see discussion in section 2 and reference [11] ). Assuming i is the channel with the highest cooling workload, we then pick one of i's neighbors (either left or right) with lower cooling workload, say channel k, and balance the workload between channels i and k.
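The workload measure and the choice of the channel pair can be sketched as below. Channels are assumed to be indexed in their physical order across the layer, so that the neighbors of channel i are i − 1 and i + 1; the function names are hypothetical.

```python
def cooling_load(t_in, t_out, r_io):
    # Heat absorbed by one channel: q = (T_out - T_in) / R_io.
    return (t_out - t_in) / r_io


def pick_pair(loads):
    """Return (i, k): the most loaded channel and its less loaded neighbor."""
    if len(loads) < 2:
        raise ValueError("need at least two channels")
    i = max(range(len(loads)), key=loads.__getitem__)
    neighbors = [k for k in (i - 1, i + 1) if 0 <= k < len(loads)]
    k = min(neighbors, key=loads.__getitem__)
    return i, k


# Illustrative loads for four channels (assumed values, in watts).
loads = [3.0, 9.0, 5.0, 4.0]
i, k = pick_pair(loads)
```

Here channel 1 carries the highest load and its left neighbor, channel 0, is the less loaded of its two neighbors, so the pair (1, 0) is selected for balancing.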
To balance the workload of channels i and k, we first partition the hotspot regions covered by channels i and k. Basically, we would like the resultant two parts to have similar total amounts of heat load (cooling demand). As indicated earlier, the cost of a node i at the l-th micro-channel layer (defined in section 3.2) signifies the degree of cooling desired there. Therefore the total cooling needed in the region covered by channels i and k is simply the sum total of the cost in all the associated grids. We would like each channel to be assigned half of this total cooling load in that region. Hence we partition this region into two subregions with the same total cooling load (that is, the same total grid cost). To find the exact route of the micro-channels, we can remove the edges connecting the two subregions and solve the minimum cost flow formulation once again. This would ensure that channels i and k do not encroach on each other's regions.
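One simple way to obtain two subregions with nearly equal load is to cut the band of grid rows covered by the two channels at the row boundary that best halves the accumulated load. Splitting along a row boundary, as well as the function name, is an assumption made for this sketch; the text only requires the two subregions to have the same total cooling load.

```python
def split_rows_evenly(row_loads):
    """Return the cut index c so that rows [0, c) and [c, n) have balanced load.

    row_loads: total cooling load (summed grid cost magnitude) of each row
               in the region covered by the two channels
    """
    if len(row_loads) < 2:
        raise ValueError("need at least two rows to split")
    total = sum(row_loads)
    best_cut, best_gap, running = 1, float("inf"), 0.0
    for cut in range(1, len(row_loads)):
        running += row_loads[cut - 1]
        gap = abs(total - 2.0 * running)  # |load below the cut - load above the cut|
        if gap < best_gap:
            best_cut, best_gap = cut, gap
    return best_cut


cut = split_rows_evenly([1.0, 1.0, 4.0, 2.0, 2.0, 2.0])
```

The edges of the grid graph that cross the chosen row boundary would then be removed before the minimum cost flow is solved again.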
The min-cost flow gives a refined micro-channel design. We then redo the temperature analysis and find the required pumping power for the new design using algorithm 1.
In the next iteration of optimization, we find the currently highest workload micro-channel in the new design and balance the workload for this channel using the new graph updated in the previous iteration. We repeat this process iteratively until no further pumping power saving could be achieved.
Algorithm 2 Pairwise micro-channel cooling load balance
Repeat:
1. Pick the micro-channel i with the highest cooling load;
2. Pick a micro-channel k from i's neighbors with smaller cooling load, that is, $k = \arg\min_{k \in \{i-1, i+1\}} load(k)$;
3. Equally divide the hotspot region covered by channels i and k, and assign one of the regions to channel i and the other to channel k;
4. Remove some edges on the boundary between these two regions from the grid graph;
5. Re-solve the minimum cost flow based on the new graph;
6. Perform temperature analysis and calculate the minimum required pumping power using algorithm 1;
7. If no further pumping power saving could be achieved, stop.
Bend Elimination
As shown in section 2.3, the corners/bends in the micro-channel will introduce considerable pressure drop, which increases the pumping power. Bends in micro-channels allow us to reach areas which cannot be directly connected by straight channels due to the presence of TSV obstacles. But unnecessary bends which have been incorporated due to the heuristic nature of our algorithm provide little benefit while impacting the cooling quality. As a final refinement step, we identify all unnecessary bends and replace them with equivalent straight channels or patterns with fewer corners. Note that removing corners in the hotspot region might lead to a reduction in the micro-channel cooling performance since it reduces the level of coverage. Hence we only remove those corners in the non-hotspot regions, which can easily be identified by the thermal analysis.
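Identifying bends along a routed channel, and restricting attention to those outside hotspot regions, can be sketched as below. A channel is assumed to be given as an ordered list of (row, column) grid cells; the function names are hypothetical, and the sketch only flags candidate bends without rerouting them.

```python
def bend_cells(path):
    """Return the cells of a grid path at which the direction changes."""
    bends = []
    for a, b, c in zip(path, path[1:], path[2:]):
        step_in = (b[0] - a[0], b[1] - a[1])
        step_out = (c[0] - b[0], c[1] - b[1])
        if step_in != step_out:
            bends.append(b)
    return bends


def candidate_bends_for_removal(path, hotspot_cells):
    # Only bends lying outside hotspot regions are considered for elimination.
    hot = set(hotspot_cells)
    return [cell for cell in bend_cells(path) if cell not in hot]


path = [(1, 0), (1, 1), (0, 1), (0, 2), (0, 3), (1, 3)]
n_bends = len(bend_cells(path))
candidates = candidate_bends_for_removal(path, hotspot_cells=[(0, 1)])
```

The bend count m obtained this way is also the quantity that enters the pressure drop model for channels with bends.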
EXPERIMENTAL RESULTS
We test our method on a two-tier stacked 3D-IC, and each tier contains a micro-channel layer. For the sake of experiment, each tier contains a four-core CPU. We assume a typical floor plan for each tier. To obtain the power data for each core, we simulated a high performance out-of-order processor with SPEC 2000 CPU benchmarks [2] (using Wattch [1]). For each benchmark, we simulated a representative 250M instructions and sampled the chip power dissipation values using uniform time intervals. We simulated 20 such benchmarks and the resultant power data gave us the power profile for each core of the CPU on each tier. To generate the power profile of a four-core CPU, we randomly choose 4 of these profiles and arrange them according to the typical floor plan, so each of the resultant power profiles represents the power profile on one tier. In the experiment on this two-tier 3D chip, we choose 2 of these power profiles and each of them represents the power profile on one tier. That is, the benchmark we use in this experiment is a combination of two of these power profiles, where each power profile corresponds to one tier. The area of each chip stack is 1.2 × 1.2 cm², and the grid size is 200 × 200 µm² (so 60 × 60 grids in each layer). The channel dimensions are $w_a$ = 100 µm and $w_b$ = 400 µm. The maximum temperature constraint Tmax is 85 °C. The maximum available pressure drop is 500 kPa.
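The grid resolution and channel aspect ratio follow directly from the stated dimensions, as the short check below shows. The helper names are hypothetical, and lengths are kept in integer micrometers to avoid rounding issues.

```python
def grids_per_side(chip_side_um, grid_side_um):
    # Number of grid cells along one edge of the chip.
    return chip_side_um // grid_side_um


def aspect_ratio(w_a_um, w_b_um):
    # beta = w_b / w_a as defined in the channel model.
    return w_b_um / w_a_um


side = grids_per_side(12000, 200)   # 1.2 cm edge, 200 um grid
total_grids = side * side
beta = aspect_ratio(100, 400)
```

This gives 60 grids per side, 3600 grids per layer and an aspect ratio of 4 for the channels used in the experiments.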
Comparison with straight channels
We evaluate our method by comparing our micro-channel design with the micro-channel structure with straight channels. In our design, we use 20 micro-channels. For each benchmark, we tested different numbers of TSVs, and these TSVs are randomly distributed across each layer of the chip. We find the minimum pressure drop required to cool the 3D chip below the thermal constraint for our design and then calculate the associated pumping power Qpump. In the straight channel design, by contrast, we place as many channels as possible to maximize its cooling efficiency. Note that in the micro-channel system with only straight channels, the number and location of channels are constrained by the TSVs. We also find the minimum required pressure drop and associated pumping power for the straight channel design. The comparison is shown in table 1. Note that when the cooling demand is so large that even the maximum pressure drop leads to a temperature violation, we label the case as "violation".
In the table, $Q_{chip}$ is the total power dissipation for each benchmark, $Q_{pump}$ is the pumping power, and $T_{peak}$ is the maximum temperature achieved under the given pumping power. For our design, we show the results of both the initial design (after solving minimum cost flow) and the design after pairwise balancing and bend elimination refinement.
As we can see from the table, our algorithm could achieve higher cooling effectiveness compared with the straight channel design (with pumping power savings from 13% to 87%). This is because the presence of TSVs constrains the count and locations of straight channels. Especially when the number of TSVs increases, the available locations for straight micro-channels reduce dramatically. Therefore, to provide sufficient cooling to the hotspots blocked by TSVs, the flow rate should increase significantly. Note that, even though the percentage of grids containing TSVs is small (no more than around 2.1%), its impact on straight micro-channel design can be significant since the whole row will become unavailable even when there is only one TSV in this row. However, the presence of TSVs will have much less impact on our design. So we could use a smaller pressure drop (flow rate) to provide sufficient cooling. Moreover, as the number of TSVs increases, the improvement of our design over straight channels becomes more significant. Figure 7 shows the micro-channel infrastructures for benchmark 1 generated by our algorithm. TSVs, which are represented as black squares in the figure, are placed in 1.4% of the grids. The gray areas are hotspot regions. Figure 7(a) shows the initial design generated by minimum cost flow, and figures 7(b) and 7(c) are the refined designs after pairwise balancing and bend elimination.
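The row-blocking effect can be quantified with a small combinatorial estimate. The sketch assumes that the TSVs occupy distinct grids chosen uniformly at random and that straight channels run along grid rows; both are modeling assumptions for illustration, and the function name is hypothetical.

```python
from math import comb


def prob_row_free(rows, cols, n_tsv):
    """Probability that a given row contains no TSV.

    Assumes n_tsv distinct grids are drawn uniformly from the rows * cols grids.
    """
    cells = rows * cols
    # Choose all TSV grids from the cells outside the row, over all choices.
    return comb(cells - cols, n_tsv) / comb(cells, n_tsv)


n_tsv = round(0.021 * 60 * 60)          # about 2.1% of a 60 x 60 layer
p_free = prob_row_free(60, 60, n_tsv)
expected_free_rows = 60 * p_free
```

Under these assumptions the probability that a row stays usable is between 0.25 and 0.30, so most rows are blocked for straight channels even though only about 2% of the grids hold a TSV.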
CONCLUSION
In this paper, we investigated the methodology of designing TSV-constrained micro-channel infrastructure. We decide the locations and geometry of micro-channels with bends so that sufficient cooling could be provided using minimum pumping power. Our design could achieve up to 87% pumping power saving compared with the micro-channel structure using straight channels.
