Abstract-We report the development of multistrata subsurface IR (1.342 µm) nanosecond pulsed laser die singulation [stealth dicing (SD)] on high backside reflectance (up to 82%) Si wafers. We study the microstructural properties and formation mechanisms of the subsurface Si dislocation belt layer with respect to laser scanning speed, pulse laser energies, and interstrata distances. We optimize and exploit the multistrata interactions between generated thermal shock waves and the preceding dislocation belt layers formed to initiate frontal crack fractures that separate out the individual dies from within the interior of the wafer. A new partial-SD before grinding (p-SDBG) integration scheme based upon the tandem use of three-strata SD for controlled crack fracture toward the frontside of the wafer followed by static loading from backgrinding to complete full kerf separation is demonstrated. The optimized three-strata SD process and p-SDBG integration scheme can be used to compensate for the high backside reflectance wafers to produce defect-free eight die stacks of 25-µm-thick mechanically functional and 46-µm-thick electrically functional 2-D NAND memory dies.
to create a variety of 3-D solutions such as systems-in-packages and systems-on-packages. However, to fully take advantage of these 3-D solutions in support of continued miniaturization vis-à-vis More Moore, and added functionality vis-à-vis More than Moore (e.g., enabling heterogeneous integration), thin dies are needed more than ever. Thin dies help not only to reduce the overall package form factor, but also to improve package functionality, heterogeneity, heat dissipation, and performance parametrics. Ultrathin die stacking can be applied to mobile, client, Internet of Things, and wearable technologies and applications as we move from packaging applications in the past with a computing focus to current packaging with a wireless mobility focus [1]. However, the fabrication of ultrathin dies is more challenging in the die preparation process. The die singulation process in particular tends to result in increased defect modes (e.g., chipping, die sidewall damage, and microcracks) and decreased quality characteristics (e.g., die strength, kerf geometry, and reliability). Advanced mechanical dicing, laser ablation dicing, or hybrid or sequential combinations based on dicing after grinding or dicing before grinding integration have been developed to help address the associated challenges [3]. Although they help, the fabrication of defect-free ultrathin dies cannot normally be performed because of surface perturbations at the frontside (FS) and/or backside of the wafer due to mechanical interactions or direct laser ablation with Si.
2156-3950 © 2015 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
The development of a Si-permeable IR nanosecond pulsed laser technology known as stealth dicing (SD) [4] [5] [6] [7] [8] offers a potential defect-free subsurface die singulation process. In its optimal condition and depending on wafer type and process integration, high-speed singulation with no chipping and cracks on both the FS and backside of the wafer is possible due to its noncontact approach. There is no damage at the surface layer and its edges because a laser wavelength permeable to Si is used. In addition, there is virtually no kerf after die singulation and, as a result, there is no debris contamination induced by SD directly. The completely dry process of SD can also eliminate some of the defects caused by wet processing. In principle, the single-strata SD method involves single SD layer scanning of laser-induced perforations within the bulk Si (i.e., a subsurface laser modified Si microstructure). This is followed by controlled fracture mechanics to physically cleave out the individual
dies from within the wafer interior. Experimental SD work has assessed processing quality based on photodiode characteristics [5], [6], [9], demonstrated the importance of focal plane depth [6], and reported SD-related defects [9], die strength [9], [10], and stress distribution analysis [9]. Attempts have also been made to apply SD to 100-μm-thick through-silicon-via wafers [11], to the subsurface machining of transparent materials [12], and to enable MEMS [13]. Most experiments have been performed to realize singulated 50-μm-thick Si die using the SD after backgrinding (SDAG) approach [5], [6], [9] but without documenting wafer backside reflection, which is almost inevitable for patterned wafers because of collateral thin-film deposition on the backside. Little work has been done on enabling SD on high backside reflectance wafers where the amount of SD energy coupling into the wafer is extremely limited. In this paper, we report the development of an optimized multistrata SD process and a proposed partial-SD before grinding (p-SDBG) integration scheme to produce production-worthy defect-free eight die stacks of 25-μm-thick mechanically functional and 46-μm-thick electrically functional 2-D NAND memory dies on high backside reflectance (up to 82%) Si wafers. This is achieved by exploiting the interactions between generated thermal shock waves and the preceding multistrata SD layers by optimizing the interstrata distances, the effective SD laser dose, and the number of SD layers to initiate partial crack fracture toward the FS of the wafer. Having a controlled, partial frontal crack followed by wafer backgrinding to complete full kerf separation is necessary to minimize overall cycle time and to achieve high volume manufacturing readiness.
II. EXPERIMENT
Two types of Si substrates are used for the work reported here: patterned 2-D NAND flash memory monitor wafers and functional wafers. Both types of wafers come in different technologies, depending on the advanced 2-D NAND memory process node, which ranges from 24-nm down to 15-nm patterning technology. These wafers are used for different stages of SD process characterization, optimization, and integration to enable ultrathin stacked memory die assembly. The wafers were not subjected to backgrinding beforehand and thus retain their original full thickness. As a result, the work reported is subject to the challenges associated with high backside reflectance from the wafers, varying from ∼13% to as high as ∼82% depending on the process technology. Depending on the process technology, the wafer backside consists of multistack layers of thin films such as SiO2, polysilicon, and Si3N4, measuring hundreds of nanometers in total stack thickness as a consequence of the wafer fabrication process. Key characterization methods used include visual inspection, backside reflectance measurements, atomic force microscopy, optical microscopy, scanning electron microscopy (SEM), and X-ray diffraction.
A. 2-D NAND Flash Memory Monitor Wafers
Boron-doped, Czochralski-grown, nonepitaxial-grade, 12-inch-diameter silicon wafers with (100) surface orientation were used to build the monitor wafers. These p-type silicon wafers are 775 ± 25 μm in thickness and have a resistivity in the range of 20-30 Ω·cm. These wafers undergo 2-D NAND flash memory full-process fabrication proprietary steps but have low or zero known-good-die (KGD) yield at die sort and test and are therefore used as monitor wafers mainly for mechanical test vehicle purposes. The monitor NAND wafers have a die size of 11.982 × 7.850 mm and can come in three different technology flavors, i.e., A, B, and C. The bulk of the optimization work here focuses on using wafer C technology because it has the highest backside reflectance at the SD laser's operating wavelength and thus can be used as a representative test vehicle to encompass the remaining two wafer technologies. Depending on the wafer technologies, the wafer backside consists of multistack layers of thin films such as SiO2, polysilicon, and Si3N4, ranging in the hundreds of nanometers in total stack thickness as a consequence of the wafer fabrication process. Prior to SD laser processing, the wafers are laminated on the wafer FS with a Lintec backgrinding tape. The wafer notch, cut during crystal grinding, has a standard ⟨110⟩ orientation.
Fig. 1. Measured spectral properties of the specularly reflecting wafer backside due to different thin-film stacks for different wafer technologies: A, B, and C.
B. 2-D NAND Flash Memory Functional Wafers
These wafers are similar to monitor wafers except that die sort test yielded baseline levels of KGDs, thus justifying subsequent memory and card assembly electrical tests, and reliability tests to observe device circuit characteristics. However, it is important to note that the NAND functional wafers used have a smaller die size of 6.276 × 10.922 mm.
C. Backside Reflectance Measurements
Fig. 1 shows the measured spectral properties of the specularly reflecting wafer backside due to different thin-film stacks for different wafer technologies, specifically A, B, and C. The backside surfaces of wafer technologies A, B, and C appear visibly purple, green, and light brown, respectively [see inset images of Fig. 1], as a result of the different multistack layers of thin films such as SiO2, polysilicon, and Si3N4 measuring between 300 and 1100 nm in total thickness. It can be observed from Fig. 1 that the absolute backside reflectances for A, B, and C wafers at the SD laser operating wavelength of 1342 nm are 13.4%, 65%, and 82.3%, respectively. This means that C wafers will attenuate the SD laser energy the most as laser light passes through the air/Si backside interface in the SDBG integration approach. In other words, ceteris paribus, applying SD processing to C wafers is highly inefficient from an energy coupling point of view and risks not creating enough SD-modified controlled damage to allow for the subsequent singulation of the wafer. This can potentially make the SD process nonmanufacturable. It also means that by characterizing and optimizing the SD process on C wafers, the solution space defined can be used to encompass the remaining two wafer technologies. The successful optimization of SD processing on C wafers provides flexibility against uncertainty in product mix and loading across the different wafer technologies. From the results of Fig. 1, it can also be seen that the absolute backside reflectances for A, B, and C wafers at the SD measurement laser operating wavelength of 830 nm are 53.3%, 26.6%, and 93.9%, respectively. Because all three reflectance values are greater than 10%, the SD measurement laser has no issues performing the backside surface profile mapping to compensate for inherent wafer warpage effects, owing to the high sensitivity of its detection capability.
D. Key Integration Flow-Stealth Dicing Before Grinding
Fig. 2 shows the SDBG process integration flow from FS taping of the wafer to the final framed diced, ultrathin 300-mm-diameter wafer staged inside a specialty carrier. This SDBG process, where the SD laser is incident on the wafer backside, forms the integration flow from which this paper is derived, including the development of the partial-SDBG (p-SDBG) integration for cycle-time optimization. The inputs to the wafer FS taping step (Lintec backgrinding tape either 125- or 165-μm thick) are the fabricated and sort tested monitor wafers or functional wafers, while the output of the SDBG process shown in Fig. 2 will be ready for the die attach process, where the individual dies will be picked up and subsequently attached on the substrate strips. The three key processing modules involved in the SDBG integration flow are the SD module, the integrated wafer backgrinding module, and the die separation (DDS) module.
E. Stealth Dicing Experimental Setup
The SD processing experiments were conducted in an integrated 300-mm DISCO DFL7360 SD tool with the SD Engine developed by Hamamatsu Photonics K.K. This particular tool configuration is capable of supporting the process development of wafer-level SD processing. The SD laser processing head consists of both the measurement laser and the SD laser, and an IR camera microscope (detection range is ∼800-1100 nm) with a halogen lamp light for wafer FS pattern alignment. The measurement laser operates at a near IR wavelength of 830 ± 20 nm and is primarily used to detect the backside surface of the wafer to account for undesired wafer warpage effects on the z-height focal point positioning of SD scanning. This is performed in situ to reduce cycle time by avoiding the need to conduct a wafer prescan for each dicing line. As for the SD laser light source, a 90 kHz, 1.342-μm near IR wavelength pulsed laser varying from 1.0 to 2.2 W in average power is used. Its proprietary operating pulsewidths and energies are in the order of tens to hundreds of nanoseconds/pulse and microjoules/pulse. Multistrata SD layers within the wafer are defined by translating the chuck table relative to the position of the SD laser with scanning speeds ranging from 50 to 900 mm/s and z-height focal points positioned from ∼25 to 200 μm, as measured from the wafer FS surface.
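As a quick consistency check on the laser parameters above, the per-pulse energy follows from the average power divided by the repetition rate, which reproduces the pulse laser energy (PLE) values quoted in the results (for example, 2.0 W at 90 kHz gives about 22.2 μJ). The sketch below also estimates the energy per pulse that passes the wafer backside, using the measured reflectances at 1342 nm (13.4%, 65%, and 82.3% for A, B, and C). Treating the coupled fraction as simply 1 − R, with no absorption or interference in the backside film stack, is a simplifying assumption of this example, and the function names are illustrative rather than taken from the paper.

```python
REP_RATE_HZ = 90e3  # SD laser repetition rate stated in the text

# Measured backside reflectance at 1342 nm for each wafer technology (Fig. 1)
BACKSIDE_REFLECTANCE = {"A": 0.134, "B": 0.65, "C": 0.823}


def pulse_energy_uj(avg_power_w, rep_rate_hz=REP_RATE_HZ):
    """Pulse laser energy (PLE) in microjoules: average power / repetition rate."""
    return avg_power_w / rep_rate_hz * 1e6


def coupled_energy_uj(avg_power_w, wafer):
    """Energy per pulse entering the Si, assuming the coupled fraction is 1 - R.

    Assumption: absorption and interference inside the backside film stack
    are ignored, so this is only a first-order estimate.
    """
    return pulse_energy_uj(avg_power_w) * (1.0 - BACKSIDE_REFLECTANCE[wafer])


if __name__ == "__main__":
    for power in (1.0, 1.7, 2.0, 2.2):
        print(f"{power:.1f} W -> PLE = {pulse_energy_uj(power):.1f} uJ")
    for wafer, power in (("A", 1.7), ("B", 1.7), ("C", 2.0)):
        energy = coupled_energy_uj(power, wafer)
        print(f"wafer {wafer} at {power:.1f} W -> ~{energy:.1f} uJ coupled")
```

Under this first-order assumption, a C wafer at 2.0 W couples less energy per pulse into the Si than an A or B wafer at 1.7 W, in line with the energy-coupling argument made for the high-reflectance wafers. Note that 1.7 W at 90 kHz evaluates to about 18.9 μJ, which the paper quotes as 18.8 μJ.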
F. Wafer Backgrinding Experimental Setup
Wafer backgrinding, die attach film (DAF)/dicing tape (DT) lamination, and backgrinding tape peel off were performed in a 300-mm linked DISCO DGP8761 grinder/polisher and DFM2800 wafer mounter integrated tool. This linking approach helps to optimize the handling layout to shorten the overall cycle time and minimize defects due to excessive wafer handling. After wafer backgrinding, the wafer mounter mounts the thinned wafers onto the integrated DAF (10-μm thick)/DT (100-μm thick) films on the backside of the wafer, followed by backgrinding tape peel off from the FS of the wafer. Before DAF/DT tape mounting, UV irradiation is performed to reduce the adhesion of the wafer to the existing backgrinding tape. After tape peeling, the frame-mounted DAF/DT laminated thinned wafers are stored in a cassette.
G. Die Separation Experimental Setup
There are two main functions of DDS processing. A cool expansion is used to cleave/dice the DAF film (which is attached on the backside of the wafer after SD processing and backgrinding) and to increase the distance between singulated, thinned dies for ease of die pick up later on (during die attach). Using the characteristics of DAF in which it becomes brittle at low temperatures, expansion is performed in a low-temperature environment to realize high-precision DAF separation. In the cool expansion stage, the thinned wafer cools down to 0°C and stays there for 120 s before ascending upward by 12 mm at a speed of 200 mm/s. It remains at the peak of the ascension for 10 s before returning to its home position. This is followed by the less critical heater shrink stage, which is primarily used to reduce the DT sag due to the previous expansions. In the SDBG approach, DDS eliminates the need for DAF laser grooving as used by DBG. For a process that separates thin wafers with DAF, there are issues such as DAF burrs forming on the cut surface when full-cut dicing is performed and pickup errors during die attach. Using DDS helps to improve the DAF cut quality.
III. RESULTS AND DISCUSSION
A. Single Strata and Multistrata Stealth Dicing Layer(s): Key Responses
In terms of the experimental responses, both qualitative (e.g., microstructural morphologies, defect modes, and crystal orientation) and quantitative (e.g., SD-layer dimensional changes, kerf dimensional changes and DAF separation performance) observations are made. Key SD-layer dimensional responses are identified in Fig. 3 , which shows a schematic of a representative multistrata SD processing scan that generates three SD layers (strata) with quantifiable parameters, SD layer focal z-height, Z SDi and SD layer height, T SDi (where i = 1-3). Both single strata and multistrata SD processing are performed as part of the characterization and optimization efforts reported here.
B. Mechanisms of SD Laser Processing
The SD laser beam consists of short nanosecond pulses at a near-IR wavelength permeable to Si, oscillating at a high repetition rate, and can be highly condensed up to the diffraction threshold level. This highly localized beam is formed at an extremely high peak power density, compressed both temporally and spatially in the vicinity of the laser focal point, without damaging the material surface. When the SD laser beam transmitted through the Si wafer exceeds a certain peak power density (typically more than 100 MW/cm² during the condensing process), a nonlinear absorption effect causes a phenomenon in which extremely high absorption occurs at the focal point [4], [6]. A localized temperature field of greater than 1000 K in a condensed volume of 10 μm³ within the vicinity of the focal spot is established within nanoseconds. As a result, at the focal point vicinity, a void 1-3 μm in size is formed due to the melting and vaporization of Si. There are also reports of partial recrystallization of Si and microcracking occurring near the focal spot during the cooling phase [3], [4]. Thereafter, a controlled high dislocation density is generated due to the thermal shock wave produced upward from the laser focal point vicinity. The heat-affected zones grow only toward the beam incident surface because the absorption coefficient increases nonlinearly with the increasing temperature [14] [15] [16] [17]. As the SD laser scans in the horizontal direction, the interaction between the subsequent thermal shock wave generated and the previously formed high dislocation density layer will initiate a crack fracture that separates out the individual dies. SD laser dicing avoids issues associated with the use of conventional laser dicing methods operating at wavelengths that are highly absorbed by the materials to be diced (i.e., ablation) that unavoidably produce heat and debris at the surfaces.
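To give a feel for the threshold quoted above, the following sketch estimates the peak power density at the focus from a pulse energy, a pulse width, and a focal spot radius, and compares it with the 100 MW/cm² level mentioned in the text. The energy, pulse width, and spot radius used in the example call are hypothetical placeholders (the paper describes its pulse parameters as proprietary), and a flat-top pulse in both time and space is assumed for simplicity.

```python
import math

THRESHOLD_W_PER_CM2 = 100e6  # ~100 MW/cm^2 level quoted in the text


def peak_power_density_w_per_cm2(pulse_energy_j, pulse_width_s, spot_radius_cm):
    """Peak power density assuming a flat-top pulse in both time and space."""
    peak_power_w = pulse_energy_j / pulse_width_s
    spot_area_cm2 = math.pi * spot_radius_cm ** 2
    return peak_power_w / spot_area_cm2


def exceeds_threshold(pulse_energy_j, pulse_width_s, spot_radius_cm):
    """True if the estimated peak power density is above the quoted level."""
    density = peak_power_density_w_per_cm2(
        pulse_energy_j, pulse_width_s, spot_radius_cm
    )
    return density > THRESHOLD_W_PER_CM2


if __name__ == "__main__":
    # Hypothetical inputs, NOT taken from the paper:
    energy_j = 4e-6      # energy reaching the focus
    width_s = 150e-9     # pulse width
    radius_cm = 1.5e-4   # focal spot radius (1.5 um)
    density = peak_power_density_w_per_cm2(energy_j, width_s, radius_cm)
    print(f"{density / 1e6:.0f} MW/cm^2, "
          f"above threshold: {density > THRESHOLD_W_PER_CM2}")
```

Because the density scales with the inverse square of the spot radius, tight focusing inside the wafer is what allows the threshold to be crossed near the focal point while the beam remains below it at the wafer surface.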
Optimization of SD processing is greatly dependent on wafer specifications (e.g., thickness, impurity elements and their concentration, and backside film thickness and materials) and the final device applications (e.g., final thickness, defect sensitivity of device, and die size/shape), balancing system-level throughput via different SD process integration flows. For the purpose of die singulation on a full thickness wafer followed by backgrinding down to the target thickness, a larger SD layer height is preferred. This is because a larger SD layer height will lead to an equivalent or higher likelihood of die separation from within the interior of the wafer due to the perforation created by the dislocation belt (i.e., the SD layer) in addition to the frontal crack propagation generated. Lower reflectance on the backside of the wafer more efficiently couples the SD laser's energy to create denser and dimensionally larger SD layers and this can help in lowering the SD processing time.
C. Effects of Laser Scanning Speed
Fig. 4 shows representative images of identified key SD layer microstructural transitions as observed by optical microscopy on the sidewalls of a full thickness C die. The observations suggest that there are two key microstructural transitions: from a dense to an optimal SD-layer structure and from an optimal to a fishbone structure, as the total deposited energy decreases by way of increasing laser scanning speed. The magnitude of the dislocation density and polycrystallization of the SD-modified layer changes (left to right of Fig. 4) from high (dense) to medium (optimal) to low (fishbone) as laser scan speed increases. Because the SD laser operates via rapid single-shot irradiations in the form of nanosecond (in the hundreds) pulsing with a repetition frequency of 90 kHz, lowering the scanning speed in effect increases the magnitude of the SD belt-layer exposure dose due to pulse overlapping [see upper right inset of Fig. 4]. In Fig. 4, the representative dense, optimal, and suboptimal fishbone SD layers are produced on C monitor wafers with a laser average power of 2.0 W [pulse laser energy (PLE) = 22.2 μJ] at a Z SD1 ∼ 70 μm with scanning speeds of 50, 500, and 900 mm/s, respectively.
Fig. 4. Representative SD layer microstructural transitions as observed by magnified optical microscopy on the sidewalls of a full thickness memory die from wafer C (82.3% backside reflection). Ceteris paribus, the magnitude of the dislocation density and polycrystallization of the SD-modified layer changes (left to right) from high (dense) to medium (optimal) to low (fishbone) as laser scan speed increases. The upper right inset shows a schematic illustration of the separation of individual irradiation pulses as scanning speed increases for a given PLE.
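The pulse-overlap argument can be made concrete with a short calculation: at a fixed 90 kHz repetition rate, the center-to-center spacing of consecutive pulses along the dicing street is the scanning speed divided by the repetition rate. The sketch below evaluates this for the three scanning speeds used in Fig. 4. The effective modified-spot diameter used to decide whether neighboring pulses overlap is a hypothetical input, since the paper does not state it, and the function names are illustrative.

```python
REP_RATE_HZ = 90e3  # repetition frequency stated in the text


def pulse_pitch_um(scan_speed_mm_s, rep_rate_hz=REP_RATE_HZ):
    """Center-to-center distance between consecutive pulses, in micrometers."""
    return scan_speed_mm_s * 1e3 / rep_rate_hz


def pulses_overlap(scan_speed_mm_s, spot_diameter_um):
    """True if neighboring pulses overlap for an assumed modified-spot diameter."""
    return pulse_pitch_um(scan_speed_mm_s) < spot_diameter_um


if __name__ == "__main__":
    ASSUMED_SPOT_UM = 6.0  # hypothetical value, for illustration only
    for speed in (50, 500, 900):
        pitch = pulse_pitch_um(speed)
        overlap = pulses_overlap(speed, ASSUMED_SPOT_UM)
        print(f"{speed} mm/s -> pitch {pitch:.2f} um, overlap: {overlap}")
```

With a 90 kHz source, the pitch grows from roughly 0.56 μm at 50 mm/s to 10 μm at 900 mm/s, which is consistent with the qualitative picture of irradiation pulses separating as the scanning speed increases.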
We define the optimal SD-layer structure by examining and identifying the type of SD layer microstructure, both qualitatively and quantitatively, given a set of processing conditions (including the type of wafer technology) such that the density of the dislocations and amount of polycrystallization induced by the thermal shock wave is just enough to generate a controlled crack fracture toward the FS of the wafer (instead of fracturing across the entire full thickness of the wafer as reported in [4] [5] [6] [7] [8] [9] ). Fig. 4 also illustrates various important characteristics, divided into different zones of the SD layer. Zone A consists of uniformly spaced voids measuring ∼2 μm in size along the z-focal plane of the SD laser. These voids form as a result of the heating phase induced by the extremely high absorption phenomenon that occurs when a certain peak power density is exceeded during the beam condensing process. Zone B is the larger microstructural region where the internally modified SD belt layer with high dislocation densities is generated as a consequence of the thermal shock wave. The systematic characterization produced in this paper shows that the density, the positioning, and the dimensional characteristics of zone B (along with interstrata distance for multistrata SD) play a dominant role in controlling the crack fracture mechanics for high-quality and low cycle-time SD processing. Zone C represents the region at the tail end of zone B where the thermal shock wave propagation is near complete and thermal diffusion starts to take over. It is important to point out that vertical microcracks are formed above and below zone B. The microcracks generated upward have a higher crack length and frequency as compared with the ones propagating downward. This is due to the directional nature of the high-velocity thermal shock wave going upward. 
It is believed that the microcracks propagating downward are actually an artifact from the manual breaking of the samples required to conduct the optical microscopy. As for the suboptimal SD microstructure, despite the lower effective dose of the PLE at high scanning speeds, the upper microcracks remain consistently large, thus producing a fishbone-like structure. Fig. 5(a)-(c) shows experimental evidence of SD layer microstructural and dimensional transitions observed by optical microscopy on full thickness die sidewalls as a function of laser scanning speed (50-900 mm/s) for different monitor wafer technologies (A, B, and C) with different backside reflectances (13.4%-82.3%). Fig. 5(d) plots the mean SD layer height T SD1 as a function of laser scanning speed for the same. From Fig. 5, as the scanning speed increases from 50 to 900 mm/s, the SD layer height T SD1 decreases nonlinearly across all wafer technologies. Qualitatively, it is found that at significantly lower effective energy doses (achieved with higher scanning speeds and higher backside reflectance wafer technologies), the SD layer becomes less dense with dislocation damage and vertical microcracks become more prominent. For example, at a laser average power of 2.0 W (PLE = 22.2 μJ), a fishbone SD layer microstructure arises at a laser scan speed v = 900 mm/s. When comparing across wafer technologies, C wafers have a generally lower T SD1 than A and B wafers because their significantly higher backside reflectance limits the effective dose entering the Si wafer from the backside surface. For the A and B wafer technologies, despite scanning at 900 mm/s with a lower laser average power of 1.7 W (PLE = 18.8 μJ), a clear transition to the fishbone microstructure is not immediately obvious.
As a result, for lower backside reflectance, the optimal scanning speed v can be set at higher values, i.e., 700 mm/s for A and B wafers instead of 500 mm/s for C, thereby improving the throughput of SD processing. These optimal scanning speeds for a set of given conditions are extracted not only qualitatively from microstructural observations [see Fig. 5(a)-(c)] but also from the nonlinear dependency plotted in Fig. 5(d), where T SD1 starts to plateau beyond a certain scan speed. The plateauing of T SD1 can be explained by the fact that, as irradiation pulses separate further from one another with increasing scan speed, a point is reached beyond which individual irradiation pulses no longer overlap. Beyond this point, T SD1 remains similar because the effective dose becomes constant [see upper right inset of Fig. 4]. One can expect the fishbone structure to emerge when the vertical microcracks remain similar in size while T SD1 decreases and plateaus. Results from Fig. 5 show that T SD1 can indeed be well controlled by scanning at different speeds. This dependency enables good control of the SD damage necessary to realize a reproducible fracture mechanism to begin singulating individual dies from the wafer.
D. Effects of Laser Average Power
Fig. 6(a)-(c) shows experimental evidence of SD layer microstructural and dimensional transitions observed by optical microscopy on full thickness die sidewalls as a function of laser average power (1.0-2.2 W, i.e., PLE from 11.1 to 24.4 μJ) for different monitor wafer technologies (A, B, and C) with different backside reflectances (13.4%-82.3%). Fig. 6(d) plots the mean SD layer height (first SD layer), T SD1, as a function of laser average power for the same. Here, it can be seen that the identified optimal average laser powers for wafer technologies A, B, and C are 1.7 W (18.8 μJ), 1.7 W (18.8 μJ), and 2.0 W (22.2 μJ), respectively, for a scan speed of 500 mm/s for C wafers and 700 mm/s for both A and B wafers. The results of the experiments in Fig. 6 show qualitative and quantitative evidence that as the laser average power increases from 1.0 W (PLE = 11.1 μJ) to 2.2 W (PLE = 24.4 μJ), the SD layer height, T SD1, increases nonlinearly, in a sigmoidal fashion [see Fig. 6(d)]. Qualitatively, analogous to the results shown in Fig. 5, it is found that at significantly lower effective energy doses (achieved with lower laser average power and higher backside reflectance wafer technologies), the SD layer becomes less dense with dislocation damage and vertical microcracks become more prominent. For example, at a laser average power of 1.0 W (PLE = 11.1 μJ), a fishbone SD layer microstructure arises at a laser scan speed v of 500 mm/s. Similar to the results in Fig. 5, when comparing across wafer technologies, it is found that C wafers have a generally lower T SD1 than A and B wafer technologies because of the significantly higher backside reflectance in C wafers that limits the effective dose entering the Si wafer from the backside surface. For the A and B wafer technologies, despite a low laser average power of 1.0 W (PLE = 11.1 μJ) with a higher scan speed at 700 mm/s, there is no obvious transition to the fishbone microstructure. Therefore, for lower backside reflecting surfaces, the optimal laser average power can be set at a lower value, i.e., 1.7 W for A and B wafers instead of 2.0 W for C, thereby improving laser lifetime (cost of ownership) for SD processing. In addition to qualitative observations, the optimal laser average power for a given SD condition can also be validated from the nonlinear plot shown in Fig. 6(d). From Fig. 6(d), it can be seen that T SD1 starts to increase as laser average power increases but begins to plateau beyond a certain point, thus resembling a sigmoidal curve (this is more apparent for A and B wafer technologies given the sweeping range for the average power). The decreasing sensitivity of T SD1 to laser average power as the latter increases reinforces the need to fully comprehend the minimum safe T SD1 (or T SDi if using multistrata) to initiate crack fracture without unnecessarily high laser power usage, to reduce the cost of ownership of the SD process. It is clear from Fig. 6 that T SD1 can be well controlled by using different PLEs in combination with different scanning speeds. This dependency provides another knob to control the magnitude of SD damage necessary to realize a production-worthy die singulation technology. Optimal laser average power is selected to complement high scanning speeds to reduce SD processing cycle time, balancing the need for power stability and laser head lifetime. For this reason, the optimal condition is usually in the vicinity of the rising edges on the plot shown in Fig. 6(d).
Fig. 6. (Caption partially recovered) ... as a function of laser average power for the same. * indicates the optimal laser average power, positioned before a microstructural transition toward a low-density fishbone structure for a set of SD processing conditions and wafer technologies. Note that for wafer C, two passes of SD processing were made within the wafer to facilitate manual separation for cross-sectional optical microscopy.
Fig. 7. Experimental evidence of a dual-pass 2.0 W, 500 mm/s SD process on wafer C to show the effects of multistrata SD processing in generating undesired cleavage plane {111} defects when interstrata distances are not optimized. Plots of measured (a) mean focal plane z-heights, Z, for SD1 and SD2 (left axis) and (b) interstrata distance, D SD1−SD2 (right axis) as a function of SD2 defocus position, DF SD2. Inset images show representative optical micrographs with and without interstrata cleavage plane {111} defect formation depending on the interstrata distance for the given SD processing conditions.
E. Effects of Multistrata SD
Multistrata SD layers are created by scanning the SD laser beam's focal spot at different z-heights within the Si wafer along a common dicing street. This technique is usually performed in cases where there is inadequate effective laser exposure dose when using single strata and/or a larger SD damage threshold is needed to initiate a well-controlled fracture crack. The need for having more than one SD layer is exacerbated by wafers with high backside reflectance at the SD laser operating wavelength, such as the C wafers. The tradeoff in having multistrata SD processing is an increase in cycle time, as multiple passes are needed for each dicing street. Optimal multistrata positioning, dimensions, and microstructure definition enable the minimum number of SD passes (and hence, minimum cycle time) required to initiate a high-quality, defect-free die singulation.
To understand the effects of multistrata SD, we experiment with controlled dual-pass SD processes on C monitor wafers. Fig. 7 shows experimental evidence of multistrata SD processing at 2.0 W and 500 mm/s in generating undesired cleavage plane {111} defects when interstrata distances are not optimized and go beyond a certain threshold, D th,SDi−SDi+1. Fig. 7(a) and (b) show the measured mean focal plane z-heights Z SD1 and Z SD2 (left axis) and interstrata distance D SD1−SD2 (right axis) as a function of SD2 defocus position, DF SD2. The DF parameter is a recipe-driven setting that facilitates the ability to form SD layers deep within the wafer using the deep trace function. The approximately linear correlation observed in Fig. 7(a) validates the deep trace linear operating regime, where a certain DF z-direction offset is applied to shift the distance between the measurement laser's focal spot on the wafer backside surface and the SD laser's focal spot to maintain the linearity of a previously calibrated displacement versus measured photodiode voltage curve (generated due to the reflection of the measurement laser beam from the wafer backside) deep within the wafer. Depending on the material [and on the beam parameter (BP) settings] to which the SD processing is applied, different Z SDi to DF position ratios are obtained, mainly driven by the refractive index of the material (in this case, Si) and the focused spot properties. The results in Fig. 7 show that for a BP setting of 15, the mean SD layer depth (measured from backside of wafer starting from a full thickness of 775 μm) to DF ratio is ∼3.72-3.73 (the refractive index of Si is ∼3.5 at a wavelength of ∼1.3 μm). Inset images in Fig. 7 show representative optical micrographs with (D SD1−SD2 = 40 μm) and without (D SD1−SD2 = 27 μm) interstrata cleavage plane {111} defects. Here, it can be seen that, for a given SD processing condition, as the interstrata distance increases, there is a higher likelihood for the cleavage plane {111} defects to arise. In the example shown in Fig. 7 where Z SD1 ∼ 70 μm, while Z SD2 varies from 121 to 155 μm, we establish D th,SD1−SD2 to be ∼40 μm, beyond which the cleavage plane {111} defect is generated between the SD layers as a horizontal fracture line.
Fig. 8. Cross-sectional optical micrographs of a dual-strata SD process on wafer C for interstrata distances of (a) 32, (b) 40, and (c) 61 μm. The SD layers are generated using a laser average power of 2.0 W (PLE = 22.2 μJ) at a scanning speed of 500 mm/s, focusing at Z SD1 ∼ 70 μm and varying Z SD2 from 122.5 to 149.5 μm. (d) Cleavage planes {110} of (100)-plane Si wafer with scribe lines aligned with the ⟨110⟩ directions; cleavage plane {111} intersects surface plane (100) along the ⟨110⟩ direction at an angle of 54.74°.
Fig. 8(a)-(c) shows cross-sectional optical micrographs of a dual-pass SD process on C wafers for interstrata distances of 32, 40, and 61 μm, respectively. Here, the SD layers are generated using a laser average power of 2.0 W (PLE = 22.2 μJ) at a scanning speed of 500 mm/s, focusing at Z SD1 ∼ 70 μm and varying Z SD2 from 122.5 to 149.5 μm to increase the D SD1−SD2. It can be observed from the results in Fig. 8 that the cleavage plane {111} defects are formed when D SD1−SD2 ≥ 40 μm. Meanwhile, Fig. 8(d) shows cleavage planes {110} of a (100)-plane Si wafer with scribe lines aligned with the ⟨110⟩ directions.
It can be seen that the cleavage plane {111} intersects surface plane (100) along the ⟨110⟩ direction at an angle of 54.74°. The cleavage plane {111} defect can be characterized by a horizontal V-shape fracture where two ⟨111⟩ fracture directions meet. When the interstrata distance between the SD layers (for multistrata SD processing) exceeds a certain threshold, the in-built interstrata tensile stress causes the nominal cleavage fracture on the {110} plane to deflect into the {111} planes. The far field stress generated as the distance between the SD layers becomes larger increases the likelihood of plane slip. The V-shape formation is not surprising, given that propagation along the ⟨111⟩ direction is difficult to achieve and never gives a flat fracture surface [18], [19]. These defects (along with the SD layers) can be subsequently removed by backgrinding in the SDBG process, whereas in the SDAG process they are left behind and may cause serious yield and reliability issues. Even so, the cleavage plane {111} defects are undesired because, once generated, they usually affect the integrity and control of the crack fracture initiated by the cumulative damage of the SD layers. This, in turn, increases the variability of the SD process, which is undesirable for the low Z SD1 values that are typically more effective for ultrathin die fabrication using the SDBG process.
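The quoted 54.74° can be checked directly from the angle between plane normals in a cubic lattice. As a short worked example (the normal vectors below are the standard Miller-index normals, written out here for illustration), the {111} plane makes the stated angle with the (100) surface, while a {110} cleavage plane such as (011) is perpendicular to it:

```latex
\cos\theta_{111} =
  \frac{(1,1,1)\cdot(1,0,0)}{\lVert(1,1,1)\rVert\,\lVert(1,0,0)\rVert}
  = \frac{1}{\sqrt{3}}
\quad\Longrightarrow\quad
\theta_{111} = \arccos\!\left(\frac{1}{\sqrt{3}}\right) \approx 54.74^{\circ},
\qquad
\cos\theta_{011} =
  \frac{(0,1,1)\cdot(1,0,0)}{\lVert(0,1,1)\rVert\,\lVert(1,0,0)\rVert}
  = 0
\quad\Longrightarrow\quad
\theta_{011} = 90^{\circ}.
```

This is why a fracture that stays on {110} gives a vertical die sidewall, whereas a deflection onto {111} produces the inclined, V-shaped facets described above.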
F. p-SDBG: Three-Strata SD With Backgrind-Assisted Controlled Fracture
For every new process developed for a particular objective, be it quality, output, or both, it is important to establish a solid understanding of its cycle time impact. While learning curves and economies of scale can eventually reduce the effective cycle time, it is useful (where possible) to ensure that the new cycle time falls within a reasonable distribution of the existing cycle time of the process it is intended to replace, while retaining the benefits it offers. This ensures a relatively smooth transition into the new process technology with less disruption to the existing capacity of processes upstream and downstream of the new process insertion. On this note, based on earlier results, we know that laser power influences the microstructural transition and thus has an indirect impact on cycle time. With the larger SD layer damage induced by higher SD power, a similar crack fracture initiation can be achieved at higher laser scanning speeds, which lowers the cycle time per wafer. Therefore, it is in our favor to maximize PLEs (subject to no known quality or precision impacts) based on the results shown in Fig. 6. Given the existing tool capabilities and laser lifetime impact, the laser average power is ultimately set at 2.0 W. The current SD laser has a maximum output power of ∼2.2-2.3 W and a maximum laser scanning speed of ∼900 mm/s based on empirical studies. The correct balance and buffering between laser average power, scanning speed, and number of SD passes determine both the cycle time of the SD process and its quality.
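The dependence of scan time on the number of SD passes and the scanning speed can be illustrated with a rough estimate. The following is a minimal sketch, not the tool's actual scheduling model: it assumes every dicing street is scanned as a full chord of a circular wafer, once per pass, and it lumps alignment and turnaround into a single `overhead_s` term. The function names and the die pitches used in the example are hypothetical and do not come from the study.

```python
import math


def street_scan_length_mm(wafer_diameter_mm: float, pitch_mm: float) -> float:
    """Total length of all parallel streets in one direction.

    Assumption: streets are spaced by `pitch_mm`, centred on the wafer,
    and each street is scanned over its full chord length.
    """
    radius = wafer_diameter_mm / 2.0
    n_side = int(radius // pitch_mm)  # streets on each side of the centre street
    total = 0.0
    for k in range(-n_side, n_side + 1):
        offset = k * pitch_mm
        total += 2.0 * math.sqrt(max(radius * radius - offset * offset, 0.0))
    return total


def sd_scan_time_min(passes: int, speed_mm_s: float,
                     pitch_x_mm: float, pitch_y_mm: float,
                     wafer_diameter_mm: float = 300.0,
                     overhead_s: float = 0.0) -> float:
    """Estimated SD time per wafer in minutes (pure scanning plus a lumped overhead)."""
    length = (street_scan_length_mm(wafer_diameter_mm, pitch_x_mm)
              + street_scan_length_mm(wafer_diameter_mm, pitch_y_mm))
    return (passes * length / speed_mm_s + overhead_s) / 60.0


# Hypothetical die pitches (10 mm x 12 mm), chosen only for illustration.
t_3pass_500 = sd_scan_time_min(3, 500.0, 10.0, 12.0)
t_3pass_700 = sd_scan_time_min(3, 700.0, 10.0, 12.0)
t_4pass_500 = sd_scan_time_min(4, 500.0, 10.0, 12.0)
print(f"3 passes @ 500 mm/s: {t_3pass_500:.2f} min")
print(f"3 passes @ 700 mm/s: {t_3pass_700:.2f} min")
print(f"4 passes @ 500 mm/s: {t_4pass_500:.2f} min")
```

Under these assumptions the pure scan time scales linearly with the number of passes and inversely with the scanning speed, so any fixed overhead reduces the relative benefit of a faster scan.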
For multistrata SD, the SD1 layer not only sets the starting point for the crack fracture propagation that singulates the wafer but also serves as the last damaged layer (first in, last out) to be removed by backgrinding. Therefore, it is important to position the SD1 layer optimally so that efficient fracturing occurs and no damaged layer is left behind. At the same time, correct Z SD1 positioning (i.e., larger than ∼50 μm for a given laser effective dose of 2.0 W at 500 mm/s) also helps to prevent FS surface ablation and Si interference defects [20]. This positioning ultimately determines the beam optical parameter settings that minimize the spherical aberration of the condensed beam. Results from Figs. 7 and 8 help to establish a maximum interstrata distance of ∼32-40 μm, beyond which cleavage plane {111} defects will arise. For this reason, interstrata distances were set at ∼25 μm for subsequent wafer-level SD uniformity optimization. While a larger interstrata distance can potentially produce a larger effective tensile stress to initiate a frontal crack fracture with a minimal number of SD passes (and thus reduce the overall cycle time), this direction is unfortunately limited by a stress-relieving mechanism via generated cleavage plane {111} cracks. Fig. 9 shows cross-sectional optical micrographs of the developed three-strata SD process before backgrinding for wafer C, which has the highest backside reflectance. The inset image in Fig. 9 shows a magnified optical image demonstrating well-controlled definition of the three SD layers, SD1-SD3, with no defects. At the same time, top view optical micrographs of the FS surface of adjacent dies on a full 300-mm C wafer demonstrate well-defined, high-quality SD kerfs initiated along the dicing streets regardless of the presence of complex test element group (TEG) structures [20]. At this stage, post-SD laser processing, there are no signs of kerf geometric defects such as kerf width, kerf loss, kerf perpendicularity, and kerf straightness issues when using the developed three-strata SD process [20]. The kerf measures about 2 μm wide on average, with near-zero kerf loss observed, as expected.
Fig. 9. Cross-sectional optical micrographs of the developed three-strata SD process before backgrinding for wafer C. The inset image shows a magnified optical image demonstrating well-controlled definition of the three SD layers, SD1-SD3, with no defects.
The new p-SDBG integration scheme [see Fig. 2] is based upon the use of SD to initiate controlled crack fracture toward the FS of the wafer only, and not toward both the backside and the FS. It then relies on the subsequent static loading from backgrinding to complete full kerf separation of the individual dies, improving the cycle time and solution space of SD processing. This is because the magnitude of damage required from SD processing for die separation (as a result of the frontal crack fracture initiation) becomes smaller, which saves processing time. This is especially critical for high backside reflectance wafers, such as C wafers, where the amount of SD energy coupling into the wafer is limited (unless an extra pregrinding step to remove the reflecting thin films is used). As a consequence of the nonlinearity of SD-induced damage in response to effective PLE dosing [see Figs. 5 and 6], the p-SDBG is a meaningful step toward developing a production-worthy integration scheme for high backside reflectance wafers that would otherwise take an unrealistic number of SD passes to create (if at all) SD kerfs across the entire thickness of the wafer.
Therefore, to generate a production-worthy p-SDBG integration flow, it is important to characterize die singulation quality from a wafer-level external appearance point of view. A good understanding of the process uniformity, variability, and reproducibility is critical to avoid potential process robustness issues during high volume ramping. Fig. 10 shows a comparison of FS within wafer (WIW) SD kerf coverage (%) against different SD multipass and scanning speed combinations before and after backgrinding down to 25 μm in thickness. In generating these results, the SD processes operate at a laser average power of 2.0 W and at focal z-heights of Z SD1 ∼ 70 μm, Z SD2 ∼ 115 μm, Z SD3 ∼ 158 μm, and Z SD4 ∼ 201 μm, where applicable (in other words, the multistrata build-up leverages the results shown in Fig. 9). These focal positions ensure that there are no FS ablation defects, interference defects, or interstrata cleavage plane {111} defects, while providing an adequate PLE without creating any adverse effects. The estimation of FS WIW kerf coverage is based on optical microscope inspection of the wafers both before and after the backgrinding process (the wafer is previously mounted on a Lintec backgrinding tape with an adhesive layer thickness of ∼125 μm). For example, Fig. 11(a)-(d) shows schematics illustrating FS SD kerf formation with approximately 20%, 50%, 70%, and 100% WIW coverage. Some generalizations can be made from the results of Fig. 10. The findings suggest the need to increase the number of SD passes to progressively increase the SD kerf coverage across the wafer up to 100% coverage before the backgrinding process. This can be explained by the fact that, with more SD layers stacked on top of each other, the effective tensile stress is higher and assists the propagation toward the FS of the wafer of the crack fracture previously generated by the interaction between the thermal shock wave and the preceding damaged SD layer.
This will result in a more uniform and consistent coverage of the SD across the entire wafer before backgrinding. Where 100% coverage is not achieved before backgrinding, it can be observed from Fig. 10 that backgrinding helps to provide a static load to complete the 100% kerf coverage on the wafer FS. While one may suggest that SD cycle time can be improved if only the response for full SD kerf coverage across the wafer postbackgrinding matters, the results from Fig. 12 suggest otherwise.
In the case of C memory monitor wafers where there are nonuniformly distributed TEG structures along the dicing street, the failure to realize a 100% SD kerf coverage across the wafer before backgrinding will result in kerf straightness and integrity issues postbackgrinding. This is a result of inadequate force to thrust the crack propagation normal to the TEG metallization layers, resulting in crack deviation. At the same time, the benefits of reduced cycle time (running at 700 versus 500 mm/s for a three-strata SD process) are small when compared with the need for a robust SD process to encompass the performance variability of interacting tools, such as backgrinding and wafer mounting, for high volume manufacturing. Fig. 12 shows the comparison of FS SD kerf straightness and integrity against different SD multistrata and scanning speed combinations similar to the one shown in Fig. 10. These results clearly demonstrate that kerf geometric quality is compromised whenever SD kerf formation coverage before backgrinding is incomplete. Also, it is much harder to obtain good kerf geometric quality on TEG structures as compared with bare Si, due to the different materials involved to form the TEG structures. From the results shown in Figs. 10-12, the use of three SD layers at a scan speed of 500 mm/s at 2.0 W with the positioning given above forms the core process-of-record (POR) SD recipe to realize 25-μm ultrathin die for subsequent stacking, having struck the right balance between a low SD cycle time (∼3.75 min per wafer, including alignment time) and high die kerf quality. This SD cycle time is considerably shorter than that of mechanical blade dicing through full thickness wafers.
Fig. 12. Comparison of FS SD kerf straightness and integrity against different SD multipass and scanning speed combinations. All SD processes represented here are performed at 2.0 W at Z SD1 ∼ 70 μm, Z SD2 ∼ 115 μm, Z SD3 ∼ 158 μm, and Z SD4 ∼ 201 μm, depending on the number of SD layers.
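As a quick consistency check on the POR positioning, the spacing between adjacent SD layers can be compared with the ∼32-40 μm range beyond which cleavage plane {111} defects were observed. The sketch below rests on an assumption that is not stated explicitly in the text: the interstrata distance is taken as the edge-to-edge gap, i.e., the difference between adjacent focal z-heights minus one SD layer height, with a layer height of ∼20 μm (close to the ∼19-20 μm grand mean reported for these layers in the run-to-run characterization). The function names are hypothetical.

```python
def interstrata_gaps_um(focal_heights_um, layer_height_um):
    """Edge-to-edge gaps between consecutive SD layers.

    Assumed model (not the authors' stated definition):
    gap = z-height difference between adjacent layers - one layer height.
    """
    z = sorted(focal_heights_um)
    return [upper - lower - layer_height_um for lower, upper in zip(z, z[1:])]


def gaps_at_risk(gaps_um, threshold_um):
    """Flag gaps at or above the threshold where {111} defects were observed."""
    return [gap >= threshold_um for gap in gaps_um]


POR_FOCAL_HEIGHTS_UM = [70.0, 115.0, 158.0]  # Z SD1-Z SD3 from the text
LAYER_HEIGHT_UM = 20.0                       # assumption, ~19-20 um reported
LOWER_THRESHOLD_UM = 32.0                    # conservative end of the ~32-40 um range

gaps = interstrata_gaps_um(POR_FOCAL_HEIGHTS_UM, LAYER_HEIGHT_UM)
print(gaps)                                    # [25.0, 23.0]
print(gaps_at_risk(gaps, LOWER_THRESHOLD_UM))  # [False, False]
```

Under this assumed definition, the POR focal positions give gaps of about 25 and 23 μm, in line with the ∼25 μm interstrata distance chosen for wafer-level optimization and below the lower end of the defect threshold range.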
A total of 22 runs with two SD-processed wafers per run over a period of approximately two weeks were performed to characterize the run-to-run (RtR) and WIW variation of the developed three-strata SD process on the C memory wafers. It was found that all three SD layer heights have a well-controlled grand mean of ∼19-20 μm with an RtR mean variability (one sigma) of ∼1.3-1.4 μm. As for the WIW variation, all three SD layer heights have a grand mean of ∼1.4-2.3 μm with a variability (one sigma) of ∼0.6-0.4 μm. At the same time, the SD layer focal plane z-heights for the SD1, SD2, and SD3 layers have well-controlled grand means of 69, 115, and 158 μm, respectively, and the RtR mean variability (one sigma) ranges between 3.2 and 4.0 μm. As for the corresponding WIW variation, all three SD layers have a grand mean of ∼1.9-3.4 μm with a variability (one sigma) of ∼0.6-1.3 μm. These values demonstrate the potential for SD technology and p-SDBG to enable controlled fabrication of thinned, singulated die measuring 25 μm and below in thickness, because the size and the positioning of the SD damaged layers within Si have very low variation, much lower than that of backgrinding. Fig. 13(a)-(c) shows the respective top view optical micrographs of defect-free FS surfaces of adjacent dies at the center, midradii, and edge of a 25-μm-thick 300-mm SD-processed C wafer after the optimized DDS. Meanwhile, Fig. 13(d) shows the view from the wafer backside, with a magnified view of the boxed region showing the die-to-DAF edge separation distance, which is critical for successful die pick up. In addition to the defect-free, high-quality SD kerfs shown in Fig. 13, it can be seen that, post-DDS, the kerf width enlarges to 49 and 62 μm for channels 1 and 2, respectively, from an initial recorded kerf width of ∼1-3 μm right after SD processing. This is to be expected as a result of the die separation process, which induces tensile force to widen the die separation distance via a cold expansion process.
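Returning to the RtR and WIW figures quoted above, the sketch below shows one plausible way such summary statistics could be computed from per-site measurements. The exact definitions used in the study are not given, so the choices here are assumptions: the grand mean is taken as the mean of the run means, the RtR variability as the one-sigma spread of the run means, and the WIW variation as the per-wafer one-sigma spread, summarized by its own mean and one sigma. The function name and the sample data are made up for illustration only.

```python
from statistics import mean, stdev


def variation_summary(runs):
    """Summarize RtR and WIW variation (assumed definitions).

    runs: list of runs; each run is a list of wafers;
          each wafer is a list of site measurements in um.
    """
    run_means = [mean(mean(wafer) for wafer in run) for run in runs]
    wafer_sigmas = [stdev(wafer) for run in runs for wafer in run]
    return {
        "grand_mean": mean(run_means),          # mean of run means
        "rtr_sigma": stdev(run_means),          # one sigma of run means
        "wiw_mean": mean(wafer_sigmas),         # mean within-wafer sigma
        "wiw_sigma": stdev(wafer_sigmas),       # spread of within-wafer sigma
    }


# Made-up SD1 z-height data: 3 runs x 2 wafers x 3 sites (um).
example_runs = [
    [[68.0, 70.0, 72.0], [69.0, 70.0, 71.0]],
    [[66.0, 69.0, 72.0], [70.0, 71.0, 72.0]],
    [[71.0, 73.0, 75.0], [72.0, 72.5, 73.0]],
]
summary = variation_summary(example_runs)
for name, value in summary.items():
    print(f"{name}: {value:.2f} um")
```

With 22 runs of two wafers each, the same function would take a list of 22 two-wafer runs; the interpretation of the output depends on the measurement sampling plan, which is not specified here.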
G. p-SDBG: Post Die Separation
Also, using the developed DDS process, the die-to-DAF edge separation distance ranges from 1.0 to 3.0 mm across the wafer, comfortably exceeding the 0.2 mm required for a successful die pick up process, i.e., peeling off the thinned, singulated die with DAF attached on its backside from the DT. An increase in the separation distance between the separated die edge and the DAF edge leads to a reduction in the DAF-to-DT effective contact area, and thus reduces the adhesion at the DAF-to-DT interface. This is important to prevent an excessive die pick up force impinging on the thinned die when it is peeled off for subsequent die attach. On the other hand, too large a die-to-DAF edge separation distance raises the possibility of dies flying off if excessive handling or rotation-based processes are used. For smaller die sizes, the solution space for an optimal die separation process is much smaller because of the reduced contact surface area between the DAF and the DT, which is essential to provide better tensile force coupling. The important thing to note here is that the defect-free and 100% kerf uniformity/integrity performance produced by the developed backgrind-assisted controlled fracture of SD-processed wafers remains robust post-DDS.
Fig. 14. Angled side-view SEM micrographs showing defect-free memory dies progressively stacked with single-sided bonding pads using (a) two four-die blocks and (b) one eight-die block. The inset in (a) shows magnified SEM images to illustrate the integrity of the sidewalls/edges and the flush profile across the 25-μm die to the 10-μm-thick DAF.
H. p-SDBG: Post Multidie Stacking and Testing
Finally, Fig. 14 shows angled side-view SEM micrographs demonstrating defect-free memory dies progressively stacked with single-sided bonding pads using [Fig. 14(a)] two four-die blocks (mechanical/monitor) and [Fig. 14(b)] one eight-die block (functional). The inset in Fig. 14(a) shows magnified SEM images to illustrate the integrity of the sidewalls/edges and the flush profile across the 25-μm die to the 10-μm-thick DAF, both of which are characteristics enabled by an optimal SD process and p-SDBG integration flow. As for the electrical test vehicle, which uses 2-D NAND flash memory functional wafers, Fig. 14(b) shows the architecture used to ultimately realize a prototype 64-GB flash memory product. Using the C functional wafers and the architecture shown in Fig. 14(b), a total of up to 5K unit packages in six separate lots were fabricated using the developed p-SDBG integration flow; compared with the current baseline assembly test yield, we obtain a mean increase of up to 3.5%. In addition, all units fabricated using the three-strata SD process and p-SDBG integration flow passed the POR reliability tests, which consist of various standard tests such as a minimum of 500 cycles of temperature cycle component level test from −65°C to 150°C, 96 h of highly accelerated stress test at 130°C and 85% RH, and 500 h of high-temperature stress test at 150°C.
IV. CONCLUSION
Wafers have the highest value at the die singulation step, and while singulation does not add value to the finished dies, it has a profound impact on die packaging and assembly yield and reliability. In many applications requiring fabrication of ultrathin dies for 3-D die stacking, wafer singulation has become the most challenging process in meeting both yield and cost targets, as it is one of the most important steps to maximize the number of packaged units per wafer. This work contributes a production-worthy multistrata SD technology, with thorough quantitative and qualitative process characterization and optimization. A new p-SDBG integration scheme based upon the tandem use of three-strata SD for controlled crack fracture toward the FS of the wafer followed by static loading from backgrinding to complete full kerf separation has been demonstrated to compensate for high backside reflectance wafers. In addition, demonstrations of zero-defect 8D stacks of 25-μm-thick nonfunctional memory dies and 46-μm-thick functional memory dies (64 GB) are shown, with evidence of up to 3.5% increase in assembly test unit yield and the passing of reliability tests. Future work can entail building empirical models to predict the power, speed, and the number of strata needed for different backside reflectance. 
