Abstract-Three-dimensional ICs (3D ICs) exhibit various advantages over traditional two-dimensional ICs, including heterogeneous integration, reduced delay and power dissipation, and smaller chip area. Wafer-on-wafer stacking is attractive for 3D IC fabrication, but it suffers from low compound yield of the stacked chips. To improve the compound yield, a novel manipulation of sector symmetry and cut (SSC) is proposed. In this method, wafers with rotational symmetry are cut into identical sectors, which are then used to replenish the repositories. The SSC method is combined with a best-pair matching algorithm for compound yield evaluation. Simulation results show that: 1) For wafers with radially clustered defects, plain rotation of wafers offers trivial benefits in yield. 2) SSC shows significantly higher yield than existing methods under various conditions. The advantage becomes even more obvious with increased repository size, a larger number of stacked layers, and decreased wafer yield. 3) A cut number of 4 is always optimal in increasing the final production size of good 3D ICs.
I. INTRODUCTION
Despite many challenges, the three-dimensional integrated circuit (3D IC) is a hot topic in the semiconductor industry these days [2], [6], [8]. To achieve higher levels of integration, multiple layers of active electronic components are stacked vertically in a 3D IC. Connections between layers are provided by through silicon vias (TSVs) [1], [4], [7], [10]. TSVs are short and reduce the need for the long interconnects required on planar ICs, thus reducing delay and power consumption [2], [9]. 3D ICs also offer heterogeneous integration, which means that dies in the stack can be fabricated by different vendors and can be optimized according to their own technologies [2], [6], [8]. Moreover, a 3D IC offers a smaller device footprint, which is desirable in hand-held devices.
Currently, there are three types of layer stacking methods in 3D IC fabrication, namely, die-on-die, wafer-on-wafer and die-on-wafer. Among these, wafer-on-wafer stacking is the most attractive. It offers the highest production throughput since each stack bonding produces a large number of stacked ICs. Other advantages of wafer-on-wafer stacking include smaller die sizes, thinner wafers, and higher TSV densities [11], [14], [18]. For high-end systems with dense TSVs, wafer-on-wafer stacking may be the only way for fabrication. Although the other two stacking methods offer higher final yield, they are harder to handle, stack, and process, besides being expensive [3], [12].
On the other hand, a bottleneck in wafer-on-wafer stacking is its relatively low compound yield, especially for a large number of stacked layers and low wafer yields. In this paper, a novel manipulation of wafers is introduced. To be specific, wafers fabricated with rotational symmetry are cut into identical sectors called subwafers. In this work, we refer to two subwafers as identical if they have the same die distribution and orientation. These identical subwafers are then used to replenish the repository, which consists of subwafers corresponding to a specific layer of a 3D IC stack. Simulation results show that, compared to the work in [12], [13], the relative improvement of compound yield can be as high as 84%. This improvement is significantly larger than that of [15], [16], where the reported yield improvement was in turn larger than that of various other proposed methods [11], [14], [18].
The rest of this paper is organized as follows. Section II introduces the background and motivation for this work. Section III introduces a novel idea, i.e., sector symmetry and cut (SSC). Experimental results and comparison to related work [13] , [15] , [16] are presented in Section IV. Section V concludes the paper.
II. BACKGROUND AND MOTIVATION
To improve the yield of wafer-on-wafer stacking, or to predict the actual yield more accurately, two kinds of previous efforts are worth mentioning:
1) Matching algorithms have been proposed to select the best matching wafers to stack instead of stacking them randomly [11], [13], [14], [15], [16], [18].
2) More practical defect distribution models (such as wafer maps with radially clustered defects [13]) and specially designed wafers (such as wafers fabricated with rotational symmetry, so that two wafers can be matched in more ways than one [12], [20]) have been proposed.
Figure 1 summarizes the existing work and shows four different aspects of the wafer-on-wafer stacking procedures. These are defect distribution models, wafer manipulation, repository replenishment schemes, and matching algorithms. Notice that these aspects are generally independent of each other, and any alternative can generally be selected for one aspect without interfering with the choices for the others. In Figure 1, a top-to-bottom path shows the choices made by the referenced work. For example, the leftmost path means that reference [11] uses a uniform defect distribution model, takes no action for wafer manipulation, and uses greedy, IMH, and ILP matching algorithms based on a static repository replenishment scheme. Horizontally, Figure 1 organizes the existing works (left to right) in ascending order of their publication date.
The rightmost path in Figure 1 expresses the motivating factor for the present work. The motivation for a sector symmetry and cut strategy comes from the realization that, compared to die-on-die or die-on-wafer stacking, the low yield of wafer-on-wafer stacking is due to the restriction of keeping all dies on a wafer together. If the wafer can be cut into several smaller sectors, the restrictions among the dies on a wafer can be reduced. Moreover, if the wafer being cut is fabricated with rotational symmetry, then each sector (subwafer) will look identical. Due to such flexibility in wafer matching, a large improvement in stacking yield is expected.
III. SECTOR SYMMETRY AND CUT FOR YIELD IMPROVEMENT

A. Rotationally symmetric wafers
An example of a wafer with rotational symmetry is illustrated in Figure 2(a). If all wafers have this characteristic, then any pair of wafers drawn from two different repositories can be matched in any one of the four 90° rotational positions. This virtually enlarges the physical repository size four times. The wafer map introduced in [12] is only capable of fourfold rotation; here we also consider wafers capable of double rotation, as shown in Figure 2(b).
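The effect of rotational matching described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of [12]: a wafer map is assumed to be a square grid of 0/1 values (1 for a good die), both maps are assumed to be given in the same reference frame, and the names `rotate90`, `good_stacks` and `best_rotation` are hypothetical.

```python
from typing import List, Tuple

# Assumption: a wafer map is a square grid; 1 marks a good die, 0 a bad die
# or an empty site. Both maps are expressed in the same reference frame.
WaferMap = List[List[int]]


def rotate90(wafer: WaferMap) -> WaferMap:
    """Return the map rotated by 90 degrees clockwise."""
    return [list(row) for row in zip(*wafer[::-1])]


def good_stacks(lower: WaferMap, upper: WaferMap) -> int:
    """Count the die positions where both stacked dies are good."""
    return sum(a & b
               for row_l, row_u in zip(lower, upper)
               for a, b in zip(row_l, row_u))


def best_rotation(lower: WaferMap, upper: WaferMap,
                  rotations: int = 4) -> Tuple[int, int]:
    """Try `rotations` equally spaced positions of `upper` (1, 2 or 4).

    Returns (number of good stacks, rotation angle in degrees).
    """
    if rotations not in (1, 2, 4):
        raise ValueError("rotations must be 1, 2 or 4")
    quarter_turns = 4 // rotations  # 90-degree steps between positions
    best_score, best_angle = -1, 0
    candidate = upper
    for k in range(rotations):
        score = good_stacks(lower, candidate)
        if score > best_score:
            best_score, best_angle = score, k * quarter_turns * 90
        for _ in range(quarter_turns):
            candidate = rotate90(candidate)
    return best_score, best_angle
```

Calling `best_rotation` with `rotations=4` corresponds to fourfold rotation and `rotations=2` to double rotation; `rotations=1` reduces to stacking without any rotation.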
B. Wafers cut into sectors
Compared with designing wafers with rotational symmetry, a more flexible manipulation is to cut each individual wafer into several sectors (called subwafers). If all wafers are cut into subwafers, then a subwafer can be matched with subwafers from any wafer in another repository. Previously, all subwafers of a wafer were kept together (uncut) during wafer matching. Cutting the wafer removes this restriction, so that a sector from one wafer can be matched with a sector from an arbitrary wafer in another repository. Figure 3 shows four 90° sectors cut from a conventional wafer. Similarly, we can cut the wafer into halves (180° sectors) or any number of sectors.
Cutting a wafer into sectors offers an adaptive method between wafer-on-wafer stacking and die-on-die stacking. Compared with die-on-die stacking or die-on-wafer stacking, the throughput is increased, mainly because each stack now produces a sector of 3D ICs. Compared with wafer-on-wafer stacking, the yield should be improved because of the reduced restrictions between sectors on an individual wafer. Extreme cutting produces individual dies, and the method becomes die-on-die stacking. In that case, the stacking has the most flexibility and thus produces the highest yield. However, there is a downside: stacking and bonding individual dies requires a larger amount of effort. The other extreme is not to cut at all. In that case the method becomes wafer-on-wafer stacking, which has the lowest yield.
C. Sector symmetry and cut
After cutting wafers into subwafers, each subwafer can only be matched to another subwafer located at the same position within the wafer. For example, the top-left subwafer (second subwafer in Figure 3) from repository 1 can only be matched to the top-left subwafer from repository 2. If all subwafers look identical, the restrictions on subwafer locations are eliminated and the matching becomes more flexible. The idea for obtaining identical subwafers from an individual wafer is straightforward: if the subwafers are cut from a wafer fabricated with rotational symmetry, all subwafers look identical. Figure 4 illustrates the sector symmetry and cut manipulation applied to the wafer in Figure 2(a). Now, any subwafer from one repository can be matched to any subwafer from another repository. The sector symmetry and cut method thus provides more choices for subwafer stacking in matching algorithms.
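The manipulation described above can be sketched as follows for the fourfold case. The sketch assumes a wafer map stored as a 2n x 2n grid of 0/1 values with a fourfold symmetric die layout; the function names and the choice of a common orientation (wafer center at the bottom-right corner of every subwafer) are illustrative assumptions, not details given in the paper.

```python
from typing import List, Tuple

Grid = List[List[int]]  # 1 = good die, 0 = bad die (assumed encoding)


def rotate90(grid: Grid) -> Grid:
    """Return the grid rotated by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def cut_into_quadrants(wafer: Grid) -> List[Grid]:
    """Cut a 2n x 2n map into four subwafers and bring them into one
    common orientation (wafer center at the bottom-right corner)."""
    n = len(wafer) // 2
    top_left = [row[:n] for row in wafer[:n]]
    top_right = [row[n:] for row in wafer[:n]]
    bottom_right = [row[n:] for row in wafer[n:]]
    bottom_left = [row[:n] for row in wafer[n:]]
    subwafers = []
    # Number of clockwise quarter turns that moves the wafer-center corner
    # of each quadrant to the bottom-right corner.
    for quadrant, turns in ((top_left, 0), (bottom_left, 1),
                            (bottom_right, 2), (top_right, 3)):
        for _ in range(turns):
            quadrant = rotate90(quadrant)
        subwafers.append(quadrant)
    return subwafers


def good_stacks(a: Grid, b: Grid) -> int:
    """Count the die positions where both stacked dies are good."""
    return sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_pair(repo_a: List[Grid], repo_b: List[Grid]) -> Tuple[int, int, int]:
    """Return (good stacks, index in repo_a, index in repo_b) of the best
    subwafer pair; any subwafer may be paired with any other."""
    best = (-1, 0, 0)
    for i, a in enumerate(repo_a):
        for j, b in enumerate(repo_b):
            score = good_stacks(a, b)
            if score > best[0]:
                best = (score, i, j)
    return best
```

Because every subwafer is brought into the same orientation, `best_pair` can compare a sector that came from the top-left of one wafer with a sector that came from the top-right of another, which is exactly the freedom that position-bound matching lacks.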
D. Discussion on the number of cuts
It is quite natural to think about cutting wafers with rotational symmetry into more sectors than just 2 or 4. However, if a wafer is cut into either 3 or more than 4 sectors, new challenges appear. We make two observations. One, dies on the wafer cannot be arranged as compactly as is the case for 2 or 4 sectors. In other words, there will be space wasted at the edges of each sector due to the rectangular shape of the chip. Two, cutting a wafer into too many narrow sectors will generate a circular area of a certain radius inside which chips cannot be printed, i.e., the part of each sector within the circle will be too small to accommodate a complete die. Figure 5 illustrates this point for a wafer divided into 6 equal sectors. The dotted areas indicate where there is not enough space to accommodate a full die. As can be seen, these areas are either at the edge of the sector or near the center of the wafer. All dotted central areas form a small circle where no single die can be placed within a sector. In other words, cutting a wafer into either 3 or more than 4 sectors will waste space and reduce the number of dies per wafer (DPW). Correspondingly, the cost of producing a 3D IC will increase, which may not be compensated even if the stacking yield is increased by the sector symmetry and cut method.
To decide how many dies are actually lost due to sector symmetry, we need to do geometrical calculations considering the number of cuts (or sectors), the shape of the die, the area of the wafer, etc. To design rotationally symmetric wafers, there could be two different die placement methods, as illustrated in Figure 6. Different placement methods yield different DPW, and we derive mathematical formulas for DPW calculation for both methods. Relevant parameters are summarized in Table I. Note that the vertical and horizontal spacing between dies on the wafer can easily be included in the die height H and die width L. Figure 7 shows a sector from Figure 6(a).
Equations 1 through 5 show the step-by-step dies per sector (DPS) calculation process.
where $N_1$ is the number of rows of dies that can be placed below the dotted line in Figure 7. Note that the $\frac{L}{2\tan(\alpha/2)}$ part at the bottom cannot hold any die.
where $DPS_1$ is the number of dies that can be accommodated below the dotted line in Figure 7.
where $N_2$ is the number of rows of dies that can be placed above the dotted line.
where $DPS_2$ is the number of dies that can be accommodated above the dotted line of Figure 7.
where $DPS$ is the total number of dies per sector. Figure 8 shows a single sector from Figure 6(b). The total number of dies per sector is calculated using equations 6 and 7.
where $N$ represents the number of rows of dies that can be placed in this sector.
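The row-by-row counting idea behind the DPS calculation can be illustrated with a short numerical sketch. It does not reproduce Equations 1 through 7. It assumes one particular placement (rows of dies perpendicular to the bisector of the sector and centered on it), takes $\alpha = 2\pi / C$ as the sector angle for $C$ cuts, and uses a usable radius equal to the wafer radius minus the edge clearance. The function name and its parameters are hypothetical.

```python
import math


def dies_per_sector(radius: float, die_w: float, die_h: float,
                    cuts: int) -> int:
    """Count dies in one sector by stacking rows along the sector bisector.

    Assumed geometry (not taken from the paper's equations): the sector
    apex is at the wafer center, each row is centered on the bisector,
    and a row must fit inside both the wedge and the usable circle.
    """
    if cuts < 3:
        raise ValueError("this sketch only covers 3 or more sectors")
    t = math.tan(math.pi / cuts)   # tan(alpha / 2) with alpha = 2*pi/cuts
    y = die_w / (2.0 * t)          # below this height no die fits
    total = 0
    while y + die_h <= radius:
        wedge_half = y * t                                   # bottom edge
        circle_half = math.sqrt(radius ** 2 - (y + die_h) ** 2)  # top edge
        half_width = min(wedge_half, circle_half)
        # Small tolerance guards against floating-point round-off.
        total += math.floor(2.0 * half_width / die_w + 1e-9)
        y += die_h
    return total


def dies_per_wafer(radius: float, die_w: float, die_h: float,
                   cuts: int) -> int:
    """All sectors are identical, so DPW is the sector count times DPS."""
    return cuts * dies_per_sector(radius, die_w, die_h, cuts)
```

As a usage example, an 8-inch wafer (203.2 mm diameter) with 5 mm edge clearance gives a usable radius of about 96.6 mm, and a 31.8 mm² square die with 0.04 mm spacing gives a pitch of about 5.679 mm, so `dies_per_wafer(96.6, 5.679, 5.679, 6)` estimates the DPW for six sectors under this assumed placement. The starting height `die_w / (2 * t)` is the same $\frac{L}{2\tan(\alpha/2)}$ region near the apex that cannot hold any die.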
The DPW is calculated as follows:
To illustrate the DPW differences between these two die placement methods, we consider a specific example: an 8-inch wafer with an edge clearance of 5 mm and a 31.8 mm² square die with a die spacing of 0.04 mm. For the selected wafer size and die area, the DPW is 804 for a conventional wafer. Figure 9 compares the two placement methods for various numbers of cuts. As can be seen, the general trend is that as the number of cuts increases (larger capability of SSC), the DPW decreases. In Figure 9, placement method 2 always outperforms method 1 from the DPW point of view. In fact, through a large number of experiments considering different wafer sizes and different die sizes with different aspect ratios, we find that placement method 2 outperforms method 1 most of the time. That is why we consider placement method 2 in this work. Note that equations 1 through 8 are derived for calculating the DPW of rotationally symmetric wafers. However, like previous work on DPW calculation [5], [19], they also provide accurate DPW calculation for normal wafers.
Extensive experiments have been done based on different defect models, different wafer sizes, and different die sizes to analyze the effect of the cut number on the final production size of good 3D ICs. The results show that under most conditions four cuts yield the maximum number of good 3D ICs compared to any other number of cuts. Parts of our experimental results are shown in the appendix to illustrate this point. In this work, we introduce the significance of the wafer cut methodology, and only consider cutting wafers into 2 or 4 sectors (where no die loss occurs) in the experiment section.
Figure 10 shows the complete stacking procedure of the SSC method applied to an example with 3 stacking levels. In Figure 10, initially all repositories are filled with subwafers. The best-pair match between the first two repositories and the best-one match for the rest of the repositories are conducted afterwards [16].
Note that the matching is now performed with respect to subwafers instead of wafers. For each repository, there is a back-up wafer that is cut into identical sectors. As one subwafer leaves a repository, a new subwafer from the back-up wafer immediately replenishes the repository. Once the back-up wafer is used up, a new back-up wafer replaces it. Since the running repository based best-pair matching algorithm is used in Figure 10, the run time complexity is O(C × m × p × s) [15], [16], where C, m, p and s represent the number of cuts, repository size, production size and number of stacked layers, respectively.
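The replenishment and matching loop described above can be sketched as follows. This is a simplified illustration rather than the algorithm of [15], [16]: each subwafer is assumed to be encoded as an integer bit mask with one bit per die site (a set bit meaning a good die), the pair search is a plain exhaustive scan, and `make_wafer` is a hypothetical callable that returns the sectors of one freshly cut back-up wafer.

```python
from typing import Callable, Iterator, List

# Assumption: a subwafer is an integer bit mask, one bit per die site,
# with a set bit meaning "good die". All subwafers share one layout.


def popcount(mask: int) -> int:
    """Number of good dies encoded in the mask."""
    return bin(mask).count("1")


def subwafer_stream(make_wafer: Callable[[], List[int]]) -> Iterator[int]:
    """Yield subwafers one by one; a new back-up wafer is cut only when
    the previous one is used up."""
    while True:
        for sector in make_wafer():
            yield sector


def run_matching(streams: List[Iterator[int]], repo_size: int,
                 n_stacks: int) -> int:
    """Build `n_stacks` stacks, one layer per stream, and return the total
    number of good stacked dies."""
    repos = [[next(s) for _ in range(repo_size)] for s in streams]
    good = 0
    for _ in range(n_stacks):
        # Best-pair match between the first two repositories.
        best, bi, bj = -1, 0, 0
        for i, a in enumerate(repos[0]):
            for j, b in enumerate(repos[1]):
                score = popcount(a & b)
                if score > best:
                    best, bi, bj = score, i, j
        stack = repos[0][bi] & repos[1][bj]
        repos[0][bi] = next(streams[0])   # replenish immediately
        repos[1][bj] = next(streams[1])
        # Best-one match for each remaining repository.
        for layer in range(2, len(repos)):
            k = max(range(repo_size),
                    key=lambda idx: popcount(stack & repos[layer][idx]))
            stack &= repos[layer][k]
            repos[layer][k] = next(streams[layer])
        good += popcount(stack)
    return good
```

In a simulation, one `subwafer_stream` would be created per layer, with `make_wafer` drawing a wafer map from the chosen defect model and cutting it into C sectors. The compound yield is then the returned count divided by the total number of die sites stacked.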
Table II summarizes the five different wafer manipulations that will be explored in the experimental section, where the Rotation manipulation serves as a comparison to the proposed sector symmetry and cut manipulation.
A. Experimental setup
We consider an 8-inch wafer with edge clearance set to 5 mm. We consider three types of chips with different die sizes: 31.8 mm² as type 1 chips, 63.4 mm² as type 2 chips, and 131.6 mm² as type 3 chips. For the selected wafer size and die areas, the numbers of dies per wafer are 804, 436, and 184 for type 1, 2, and 3 chips, respectively. We utilized the radially clustered defect model [17] to generate wafer maps. The inner core wafer yield for each type of chip is set to 88%, 80%, and 70%, respectively. A production size of 100,000 3D ICs is targeted in all experiments. All the experiments are repeated 1,000 times with results averaged to remove noise.
The five manipulations of Table II are all used with running repository based best-pair matching algorithm in this section. The names of these manipulations refer to the complete stacking procedures depending on the context.
B. Comparing various stacking procedures on both defect models
In this section we examine the compound yields of different stacking procedures like the basic, Rotation2 and SSC2, under both uniform and radially clustered defect distribution models. First, we generate wafer maps with clustered defects based on [17] , and then wafer maps with exactly the same yield but with uniformly distributed defects are generated. The experimental results are shown in Figure 11 .
In Figure 11, we prepend Cluster or Uniform to the procedure names to indicate the defect model used. For example, Uniform-basic means the procedure uses the uniform defect model and the running repository based best-pair matching algorithm, without any manipulation of wafers. The curve indicated by Uniform-basic actually shows the results of the procedure from [15], [16].
We first analyze the effects of different defect models on the compound yield. Figure 11 shows a very notable trend: with the same stacking procedure, the clustered defect model always produces higher compound yield than the uniform defect model for all three types of chips. For example, Figure 11(b) shows that for the Rotation2 procedure, the yield difference between the two defect models can reach 6.7%. The significant difference between the two defect models arises because the radially clustered defect model increases the probability that a bad die near the edge will be stacked onto another bad die also near the edge, which in turn increases the number of matching good dies. Here, our results further extend the conclusion of [13] about the suitability of the advanced stacking procedures like rotation and SSC.
Now we compare the performance of different stacking procedures. Under the uniform defect model, SSC2 clearly outperforms the basic procedure [15], [16], while Rotation2 offers only a slight yield improvement. Our results indicate that under the uniform defect model, the rotation manipulation [12] of wafers does not work well with the running repository scheme. Also, regardless of which defect model is used, SSC2 consistently gives better yield than Rotation2. The superiority of SSC arises because in SSC the stacking restrictions among subwafers are eliminated, while in Rotation2 all subwafers are bonded together. In SSC2, two subwafers selected from the same repository are not necessarily from the same wafer. The differences between SSC2 and Rotation2 become more obvious as the repository size grows from 1 to 50.
Another interesting phenomenon is that the yield for all stacking procedures increases as repository size gets larger. The explanation is that larger repository size provides more candidates for matching algorithms thus increasing the compound yield.
Because of the notable differences between the two defect models, in the rest of this section, only wafers with radially clustered defects are considered.
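The direction of the gap between the two defect models in this subsection can be illustrated with a deliberately simplified two-layer calculation. It is a sketch under stated assumptions, not the radially clustered model of [17]: both wafers are assumed to have the same die yield $y$, they are stacked randomly without rotation or cut, and the clustered case is idealized so that bad dies occupy the same sites on both wafers. The value $y = 0.70$ is used only for illustration.

```latex
\begin{align*}
  % Uniform model: defects on the two wafers are independent.
  Y_{\text{uniform}}   &= y \cdot y = y^{2} \\
  % Idealized clustering: bad dies coincide on both wafers.
  Y_{\text{clustered}} &= y \\
  % Illustrative value.
  y = 0.70 \;&\Rightarrow\; Y_{\text{uniform}} = 0.49,
             \qquad Y_{\text{clustered}} = 0.70
\end{align*}
```

Real clustered wafer maps lie between these two extremes, which is consistent with the clustered curves lying above the uniform ones for the same procedure.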
C. Impact of cut number and rotation number on the compound yield
In this section, different cut numbers (2 or 4) and rotation numbers (2 or 4) are applied to the SSC and Rotation procedures, respectively. Initially, the yield of the basic procedure with repository size 1 (random stacking without pre-bond test) is calculated for the type 3 chip. Then the yield of all procedures is normalized to this calculated yield. The relative yield increase versus repository size is shown in Figure 12. Here, the basic procedure is actually a hybrid of the methods of [13] and [15], [16]. First, as can be seen, the yield of the SSC procedure is always much higher than that of the others [13], [15], [16], and again the advantage becomes more significant as the repository size increases. When the repository size equals 50, the relative yield difference between SSC4 and basic is 24.1%.
Next, we evaluate the impact of the cut number on the yield of the SSC procedure. It is obvious from Figure 12 that SSC4 consistently offers higher yield than SSC2. There is no die loss in either case, but SSC4 provides greater flexibility, which is similar to the reason why SSC2 produces higher yield than Rotation2. In SSC4, each wafer is cut into 4 pieces, which reduces the restrictions between subwafers and produces a virtual repository twice the size of the virtual repository of SSC2.
Finally, we evaluate the impact of the rotation number on the compound yield. Generally speaking, the yield of Rotation4 is better than that of Rotation2 and basic, but the improvement is only slight. Actually, the advantages of both Rotation4 and Rotation2 over basic are most obvious when the repository size equals 1, and the advantage diminishes as the repository size increases. Why does a larger rotation number not help the yield improvement? A possible explanation is that under the clustered defect model, bad dies are already clustered near the edge of the wafer, in which case rotating the wafer does little to align good dies.
D. Impact of total number of stacked layers and wafer yield on compound yield
In this section, the impact of the number of stacked layers and the wafer yield on the final compound yield is studied. The experimental results for type 3 chips are shown in Figures 13 and 14, where the y axis indicates the yield normalized with respect to the yield of the basic procedure under the same condition. In Figure 13, the number of stacked layers varies from 2 to 7. In Figure 14, the layer number is fixed at 2, and the inner core yield [17] varies from 30% to 90%. In both figures, the repository size is set to 50.
Though not shown in Figures 13 and 14, the compound yields of all procedures decrease for a larger number of stacked layers and lower wafer yield. However, as can be seen, a higher improvement over basic is gained by SSC4, SSC2, Rotation4, and Rotation2 under these conditions. SSC4 and SSC2 always outperform Rotation4, Rotation2 and basic, especially in situations where the compound yield becomes poorer. For example, in Figure 13, for 7-level stacks the relative yield increases from 1.00 for the basic [13], [15], [16] and 1.09 for Rotation4 to almost 1.84 for SSC4, which indicates relative yield increases of 84% and 69%, respectively. Applications of the above procedures to type 1 and type 2 chips show similar results.
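The two percentages quoted for 7-level stacks follow directly from the normalized yields read from Figure 13. The short calculation below only restates that arithmetic; since the text gives the SSC4 value as "almost 1.84", the results are approximate.

```latex
\begin{align*}
  \frac{Y_{\text{SSC4}}}{Y_{\text{basic}}} - 1
    &\approx \frac{1.84}{1.00} - 1 = 0.84 \quad (84\%) \\
  \frac{Y_{\text{SSC4}}}{Y_{\text{Rotation4}}} - 1
    &\approx \frac{1.84}{1.09} - 1 \approx 0.69 \quad (69\%)
\end{align*}
```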
V. CONCLUSION
This paper deals with the problem of low compound yield in wafer-on-wafer stacking for 3D IC fabrication. We propose a manipulation method involving sector symmetry and cut (SSC). In this manipulation method, each wafer is cut into identical sectors. Each sector is then used to replenish the repository for matching. By wafer cut, the matching restrictions for dies on a wafer are reduced and correspondingly the compound yield is improved. Extensive experiments are conducted to compare the compound yield of the proposed SSC procedure with existing works. Results show that the SSC procedure improves the compound yield significantly.
We also derive mathematical formulas for DPS and DPW calculation for rotationally symmetric wafers. We find that a larger number of cuts gives greater flexibility in wafer matching but also induces larger die loss, which in turn reduces the total number of final good 3D ICs. Based on experiments, we conclude that SSC4 should be the rule of thumb in practice to maximize the benefit of the proposed technique.
ACKNOWLEDGMENTS
This research is supported in part by the National Science Foundation Grant CCF-1116213. The authors would like to thank Jiao Yu for many useful discussions.
APPENDIX
Figure 15 shows the final production size of good 3D ICs considering different numbers of cuts. The same setup as in Section IV-A applies, except that a production size of 100 wafers is targeted here. As we can see, SSC4 consistently produces the largest number of good 3D ICs. More experiments have been done considering different wafer sizes, die sizes, defect models, etc. The results are similar and hence are not duplicated here.
