Abstract-Single-event effects (SEE) evaluation of five different part types of next-generation commercial trench MOSFETs indicates large part-to-part variation in determining a safe operating area (SOA) for drain-source voltage (V DS ) following a test campaign that exposed >50 samples per part type to heavy ions. These results suggest that determining an SOA from small sample sizes may fail to capture the full extent of the part-to-part variability. An example method is discussed for establishing a safe operating area using a one-sided statistical tolerance limit based on the number of test samples. Burn-in is shown to be a critical factor in reducing part-to-part variation in part response. Implications for radiation qualification requirements are also explored.
to account for trans-conductance shifts and increased leakage currents. Destructive single-event effects (DSEE) [5] can be managed by derating against a maximum drain-to-source voltage (V DS ). Possible ion microdose [6] - [9] is treated with the standard TID mitigation.
Additional consideration must be given to ensure adequate margin for part-to-part variability in commercial MOSFETs. Sample variability in commercial power MOSFETs has received limited discussion in the literature, especially for the long tails that require large numbers of devices to evaluate. Liu et al. [10] noted misinterpretations of prior trench MOSFET test results. This paper considers the statistical handling of SEE results from a large data set for establishing a safe operating area (SOA) in the "risk avoidance" approach to MOSFET qualification for space.
A. Survey of Existing Standards and Literature
A survey of existing industry SEE test standards and test guidelines is summarized here. Sample size recommendations are often lacking or, when given, allow small quantities. This may reflect an expectation that processing controls yield consistent SEE results. For example, MIL-STD-750 Method 1080 specifies a minimum of 5 samples for establishing a cross-section vs. linear-energy-transfer (LET) curve but offers limited guidance for the quantity required to establish an SOA [11] . The JEDEC JESD57 SEE standard does not give a recommended sample size or a statistical methodology for evaluating acceptable variation in response [12] . The ASTM F1192 heavy-ion test guideline does not provide sample size guidance for hard errors and points to MIL-STD-750 Method 1080 for power transistor test procedures [13] . JPL Test Guidelines [3] recommend that "at least five data points should be taken such that the threshold voltage can be determined to the precision required by the mission application". The MIL-PRF-19500 slash sheet requires that three of three samples pass to define the SOA. The European Space Agency test methods (ESCC No. 25100) recommend three devices as a minimum sample quantity [14] . This paper suggests that these sample sizes may be inadequate when evaluating commercial device DSEE performance.
B. Test Methods
Five device types from the International Rectifier G10.7 family of DirectFET™ MOSFETs were evaluated (see Table 1 ). This MOSFET family uses a trench structure to improve current handling capability and a flip chip, direct bonding of die pads to a printed circuit board to lower on-state resistance (R DS ). This line of commercial MOSFET products was not intended by the manufacturer to be used in space environments so part qualification and mitigation of space radiation effects are the responsibility of the user. All tested parts of a given device type came from a single wafer lot.
0018-9499 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
To determine the statistical part-to-part variation, more than 700 samples were tested, simultaneously exposing 63 devices on a test board using a broad (20 cm x 25 cm) ion beam at the NASA Space Radiation Effects Laboratory (NSRL). The experiment looked at tilt and rotation angles, temperature, and burn-in conditioning as factors that may influence the statistical distributions.
We conducted burn-in per the MIL-PRF-19500 JANS flow on all devices except where noted. Burn-in included 48 hours at 150 °C for High Temperature Gate Bias (HTGB) with the gate at 80% of the rated V GS and drain grounded. Burn-in also included 240 hours at 150 °C for High Temperature Reverse Bias (HTRB) with gate grounded and drain at 80% of the rated V DS . The irradiation test method was based on MIL-STD-750 Method 1080 which included post-irradiation stress testing after each exposure.
Most NSRL exposures used 197 Au at 121.5 MeV/amu with a surface linear-energy transfer (LET) of 28.5 MeV-cm 2 /mg and initial range of 2.4 mm (Si). The LET at the active layer region was calculated to be approximately 35.5 MeV-cm 2 /mg. Despite a flip-chip die orientation and a 0.25 mm thick Cu lid, the high-energy capability of NSRL beams allowed for in situ exposure of the device active layers without modification of the packaging. Beam flux was provided in 200 ms to 300 ms duration "spills" of about 1 × 10 4 ions/cm 2 coming every 5 seconds. This corresponds to an effective instantaneous flux of about 4 × 10 4 ions/cm 2 /s.
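The flux figures quoted above follow from simple arithmetic, sketched here in Python. The 250 ms spill duration is an assumed midpoint of the quoted 200 ms to 300 ms range, the target fluence is the 2 × 10^5 ions/cm² per-step value used in this test, and the function names are illustrative only.

```python
import math

def instantaneous_flux(spill_fluence, spill_duration_s):
    """Flux during a spill, in ions/cm^2/s."""
    return spill_fluence / spill_duration_s

def average_flux(spill_fluence, spill_period_s):
    """Flux averaged over the full spill cycle, in ions/cm^2/s."""
    return spill_fluence / spill_period_s

def exposure_time_s(target_fluence, spill_fluence, spill_period_s):
    """Approximate wall-clock time to accumulate a target fluence."""
    spills = math.ceil(target_fluence / spill_fluence)
    return spills * spill_period_s

SPILL_FLUENCE = 1e4    # ions/cm^2 per spill (from the text)
SPILL_DURATION = 0.25  # s, ASSUMED midpoint of the 200-300 ms range
SPILL_PERIOD = 5.0     # s between spills (from the text)

print(instantaneous_flux(SPILL_FLUENCE, SPILL_DURATION))  # about 4e4
print(average_flux(SPILL_FLUENCE, SPILL_PERIOD))          # about 2e3
print(exposure_time_s(2e5, SPILL_FLUENCE, SPILL_PERIOD))  # seconds per bias step
```

Under these assumptions each bias step needs roughly 20 spills, so the time-averaged flux, not the instantaneous one, sets the duration of a run.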
A schematic of the test circuit for each MOSFET is shown in Figure 1 . A PXI-based test controller briefly switched each DRAIN_V channel in sequence away from V DS to a voltage measurement node to monitor the drain voltage. A device failing short while V DS was applied to DRAIN_V would blow the 250mA rated drain fuse, detected as a 0 V reading when polled. The DRAIN_ON connection maintained the bias during polls and was used to perform trans-conductance sweeps between runs. Only two non-shorted failures of monitored devices were detected this way, noted below in the IRF6643 (150 V) device discussion.
Devices were exposed to a fluence of 2 ×10 5 ions/cm 2 with V GS set at −2 V (MOSFET off state) to mitigate potential threshold shifts due to total ionizing dose. After each exposure, V DS was increased by 1 V and the test sequence continued until all devices failed. This gave a failure distribution for each device type as shown in Section III. 
C. Potential Test Issues

1) Beam Uniformity:
The NSRL facility can produce a beam with 2-3% variation in uniformity over an area of 20 cm ×20 cm. The beam flux is slightly higher at the edges due to the ion focusing optics. Ten of the 63 parts on the test board were located near the edge of the radiation field and could have received doses up to 10% higher than those in the center.
2) Current Draw and Self-Heating: During beam exposures, the DRAIN_ON connection was left connected to the applied V DS supply (see Figure 1 ). This had little impact for functional devices biased OFF. Once the device failed short, the blown drain fuse caused a high current across the drain resistor. Figure 2 shows damage to a test board due to excessive heating. The board current increased as more devices failed, contributing to higher overall board temperatures toward the end of the exposures. The excess heat may have stressed the remaining MOSFETs. A few instances of excess thermal heating were noted in the run logs and these points were excluded from the analysis as indicated in later sections.
III. TEST RESULTS AND DISCUSSION
A. Results Summary
Results for DSEE sensitivity are shown below for each device along with their specific challenges. The failure distributions were fit to a normal distribution; a log-normal fit did not significantly improve the quality of the fits (chi-square).
1) IRF6646 (80 V): Figure 3 shows the failure distribution for 63 burned-in parts (all failed). A fit to a normal distribution is indicated by the solid curve. Figure 4 shows a cumulative distribution function (CDF) for the same burned-in devices. This distribution has a classic S-shape though the fit is slightly skewed to the right.
2) IRF6648 (60 V): Time limitations did not allow a full failure distribution for the IRF6648 (60 V) devices to be obtained. Instead, a lower bias limit (31 V) was established where no failures were seen. After two small bias steps (33 V and 35 V), V DS was raised to 40 V where 46 parts failed as shown in Figure 5 (14 survived). This is not an ideal test but it provided a bound to the lower limit and the width of the failure distribution. The 60 V cumulative distribution shown in Figure 6 is consistent with the 80 V CDF from Figure 4 , shown here with a dotted line for comparison. Assuming both devices have an underlying normal distribution, it is inferred that the 60 V SOA can be bounded by using the threshold and width from the 80 V device failure distribution.
3) IRF6644 (100 V): Figure 7 shows failure distributions for the IRF6644 (100 V) devices using a full board each of burned-in and non-burned-in parts from the same lot. No parts were discarded during the burn-in process. The burned-in samples (63 failed) are fit very well by a normal distribution. The four failing at V DS = 60 V were excluded from the fit based on thermal issues. For non-burned-in devices, 62 of 63 failed. Figure 8 shows CDFs for the IRF6644 (100 V) burned-in and non-burned-in devices. The solid line is a CDF for the fit to the histogram data in Figure 7 .
All tested devices showed increased leakage and lowered V GS thresholds based on I-V curves taken after each exposure as seen in Figure 9 for the burned-in devices and Figure 10 for non-burned-in devices. Both groups started out with similar leakage distributions (though non-burned-in parts had a little more variation). However, the two populations evolved very differently under irradiation. In neither case was there significant increase in leakage current for V GS = −2 V where the exposures were done. While this indicates that the devices were still able to fully turn off at that gate voltage, it is possible that dose effects could still affect the SEB sensitivity by reducing the potential barrier at the source-body junction.
Figure 9. I-V curves for IRF6644 (100 V) burned-in devices before exposure (blue circles), and after exposures at 0.8 krad (red squares), 1.6 krad (green diamonds), and 2.5 krad (gold triangles). All functional devices are plotted at each step, showing very tight distributions.
Figure 10. I-V curves for IRF6644 (100 V) non-burned-in devices before exposure (blue circles), and after exposures at 8.6 krad (red squares), 9.7 krad (green diamonds), and 18.8 krad (gold triangles). The non-burned-in parts start out with a small amount of variation but evolve with exposure to much wider distributions. These parts saw significantly more dose than those in Figure 9.
It should be noted that the curves in Figure 10 correspond to significantly higher total dose than those shown in Figure 9 even though the maximum leakage current shifts are roughly similar. This arose from an exposure at 60 degree tilt angle and low V DS (no failures seen) and a longer than usual first step at normal incidence that effectively pre-dosed the board. For the tilt angle case, the parts sampled the beam at all angles with respect to the trench structure (details follow). This complicates the interpretation. Comparing Figure 9 and Figure 10 , it seems reasonable to conclude that the non-burned-in parts show much greater variation for similar shifts in the leakage current threshold, even though those shifts required greater total dose to achieve.
Past studies have shown that burn-in increases average threshold shifts and leakage currents in irradiated MOS devices [15] . The effect depends strongly on temperature but not on applied voltage [16] . The indicated activation energy was consistent with the diffusion of molecular hydrogen during pre-irradiation stress. It is possible that such a mechanism may be changing the number or distribution of interface traps even before irradiation.
4) IRF6643 (150 V): Figure 11 shows failure distributions for the IRF6643 (150 V) devices (61 of 63 failed in beam) at normal incidence and when mounted at 60 degrees tilt angle from the beam axis (32 of 63 parts failed). All devices in each case were burned in. Two devices at normal incidence failed at 75 V bias during trans-conductance sweeps (not shown). These were excluded from analysis based on thermal issues and the fact that they still retained some ability to turn off.
The failure distribution for the 60 degree tilt case is very different from the normal incidence case. Note that both sets of parts were burned-in. The device is much less sensitive to ions at angles away from normal. Only 20 of 63 parts failed below 90V. Fourteen more parts failed at 90 V (not shown) during 5 repeated high fluence exposures. The board was very hot and these failures were also excluded from analysis. Figure 12 shows CDFs for the IRF6643 (150 V) devices. The solid line indicates the CDF for the normal distribution fit to the histogram data. The 60 degree tilt data show a dramatically higher (in V DS ) and slower-rising curve indicating a much lower sensitivity to failure compared to normal incidence.
The 60-degree tilt case offers an opportunity to also look at rotation angles. The part orientations on the board are clocked around a complete 360 degrees. At normal incidence this makes no difference but when the board is tilted with respect to the beam the parts sample the rotational angles as well. These angles are not equally represented, ranging from 3 to 8 parts in each 30 degree bin. However, when normalized to the total parts in each bin, there is a greater sensitivity for failures when the parts are oriented at 90 and 270 degrees with respect to the beam. This is shown in Figure 13 . The DirectFET devices have a trench structure that runs left to right across the die. The most sensitive rotation angles in the figure are aligned with the trench direction.
5) IRF6641 (200 V): Figure 14 shows a failure distribution for the IRF6641 (200 V) devices (41 of 63 failed). The bias was stepped in 5 V increments starting at the 82 V test level. Thermal issues may have had a role due to the high V DS bias. The data shown were collected in 13 runs for a relatively low total fluence compared to other devices in this paper. One additional bias step at 102 V saw all channels fail immediately, likely due to exceeding the 100 V rating on the drain path fuses. Even a 5% tolerance on the fuse rating could be a factor for the 97 V failures.
The CDF in Figure 15 has a sharp break in the curve above 80 V, as though from a second, distinct population of failures, perhaps thermal. We fit only the lower bias (<82 V) data.
B. General Observations From Results
The key observation (and the focus of this paper) was a wide variation in population response which underscores the need for a statistical treatment of part-to-part variation when qualifying MOSFETs. General test observations are summarized below to show which conditions influenced the results and thus are potential factors in part-to-part response variability.
Burn-in greatly reduced the part-to-part response variability in the IRF6644 devices, primarily by tightening the standard deviation σ (see Figure 7) . The V DS failure threshold did not appear substantially different between the two populations. The mean V DS failure level was lowest for tilt angles near normal. A 30° tilt resulted in only a +0.6 V mean change (6 parts tested). A 60° tilt produced a +19.5 V mean change as shown in Figure 11.
Dynamic bias switching results suggested an increase in the V DS failure onset, consistent with previous observations [17] - [20] . Devices with V GS switched from −6 V to 13 V and from 0 V to 12.5 V showed mean failure V DS levels at or above the static −6 V or 0 V cases. The trend is suggestive but not highly significant due to large V DS steps between a passing level near the static mean value and the failure level.
A photomicrograph of a failed IRF6644 sample is shown in Figure 16 (left), magnified at right. Destruction near the guard ring was consistent with an interpretation of a single-event burnout (SEB) mechanism traversing many trench cells (oriented left to right).
IV. ANALYSIS AND DISCUSSION
The results highlight the importance of ensuring that a DSEE test program uses flight-like high-temperature burn-in processing and that sufficient samples are tested to allow for statistical evaluation of tolerance limits. In our case, a 3-sample test of non-burned-in IRF6644 devices risks overestimating robustness since 78% of the non-burned-in devices exceeded the mean of the burned-in sample set. If a vendor selected radiation test samples prior to burn-in for DSEE acceptance testing it could produce an artificially large SOA. For example, if 3 devices were selected from the non-burned-in IRF6644 dataset at random, there is a 5% probability that the SOA would be established at 54 V (based on all failures occurring at 55 V or above). Applying a simple 25% safety margin factor produces a design limit of 40.5 V. And yet, 1 of 59 tested parts (1.7%) failed at 40 V and 6 of 59 (10.2%) failed at 41 V.
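The small-sample risk described above can be illustrated with a short sketch. It treats the three-part selection as a draw without replacement from the tested population. The count of 24 parts failing at 55 V or above is a hypothetical value chosen because it gives roughly the 5% figure quoted; it is not taken from the data set. The function names are likewise illustrative.

```python
from math import comb

def prob_all_samples_above(n_total, n_above, n_drawn):
    """Probability that every one of n_drawn parts, picked at random without
    replacement from n_total, comes from the n_above parts that fail at or
    above the voltage of interest (hypergeometric)."""
    return comb(n_above, n_drawn) / comb(n_total, n_drawn)

def derated_limit(soa_v, margin):
    """Design limit after applying a fractional safety margin to an SOA."""
    return soa_v * (1.0 - margin)

# HYPOTHETICAL count: 24 of 63 non-burned-in parts failing at >= 55 V.
p = prob_all_samples_above(n_total=63, n_above=24, n_drawn=3)
print(f"chance a 3-part test sees no failure below 55 V: {p:.3f}")
print(f"design limit from a 54 V SOA with 25% margin: {derated_limit(54, 0.25)} V")
```

The point of the sketch is that a pass on three parts says little about the lower tail: the probability shrinks only as the cube of the fraction of robust parts.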
The hardness assurance goal of a DSEE test on devices like these is to establish a SOA for which a system using a large number (N) of total parts can be expected to have no failures at a given confidence level in the flight operating conditions. The general procedure to establish a left tail tolerance limit from a test sample result is as follows:
• Calculate the mean (μ) and standard deviation (σ ) for the full data set from the best-fit probability distribution (at the required LET threshold).
• Determine the per-device survival probability (P DEV ) needed to meet the system level survival requirement (P SYS ) given N total devices in the system.
• Determine the one-sided tolerance limit (K TL ) such that devices operated below this limit will exceed the per-device survival probability P DEV at a given confidence level (CL) for the number of samples tested. K TL is simply the number of standard deviations (σ ) below the mean for which the remaining tail contains only (1 − P DEV ) of the total area. This is a purely statistical quantity. Values of K TL for the normal distribution can be found, for example, in MIL-HDBK-814 in Table IX C for a 95% confidence level, though readers are cautioned against errors in that reference (8 February 1994 edition).
• The SOA boundary is μ − σ × K TL .
• If desired, the SOA boundary may be multiplied by an additional derating factor to ensure the tolerance limit is not approached in an application with extremely low risk tolerance.
Consider a hypothetical space application using 100 parts with a 99% system survival requirement. To achieve this, each part must have a survival probability P DEV = (0.99)^(1/100) ≈ 0.9999. Table 2 gives the values needed to calculate a safe operating area for the four devices for which a failure distribution was measured. The K TL values were calculated based on the number of parts (N) used to determine the distribution, P DEV , and a 95% confidence level. The SOA value contains an additional 25% derating for illustration purposes. The table indicates that for an application running at 24 V max in the example system, the 80 V device would not be acceptable but the 100 V device could be used instead if the resulting loss of power efficiency were deemed acceptable.
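The arithmetic of the procedure above can be sketched in a few lines of Python. The function names are illustrative, and the μ, σ, and K TL values passed to `soa_boundary` are placeholders for demonstration, not entries from Table 2.

```python
def per_device_survival(p_sys, n_parts):
    """Per-device survival probability needed so that n_parts independent
    devices jointly meet the system survival requirement p_sys."""
    return p_sys ** (1.0 / n_parts)

def soa_boundary(mu, sigma, k_tl, derating=0.0):
    """SOA boundary mu - k_tl * sigma, optionally reduced by a fractional
    derating (0.25 means a 25% reduction)."""
    return (mu - k_tl * sigma) * (1.0 - derating)

p_dev = per_device_survival(0.99, 100)
print(f"P_DEV = {p_dev:.6f}")  # rounds to 0.9999

# PLACEHOLDER inputs for illustration only (not measured values).
print(soa_boundary(mu=60.0, sigma=2.0, k_tl=5.0, derating=0.25))
```

Because P DEV is raised to the power N at the system level, even modest part counts push the per-device requirement deep into the left tail, which is why K TL grows so quickly.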
The goal of the above steps is to achieve >99% probability of success for the full quantity of commercial MOSFETs with respect to DSEE mechanisms. This methodology is intended for "risk avoidance" where a reasonable level of immunity to DSEE is sought. This contrasts with rate-based methods that map heavy-ion cross section vs. LET and proton cross section vs. energy.
For cases where the test results align poorly to a normal distribution, an improved statistical fit may be sought, including treatments that handle asymmetrical tails. In our cases, once data believed to be affected by experimental issues were excluded, a normal distribution was not unreasonable. Fits to a log-normal distribution did not yield any significant improvement in the quality of the fit (chi-square), even for the 150V devices whose failure distribution appeared to favor an extended right tail. Fits to the CDF of an assumed distribution rather than the failure histogram offered a way to avoid large spikes in the failure histograms from unequal data taking steps. In practice this introduced challenges for normalizing the cumulative data when the complete failure distribution was not observed and did not result in any significant improvement.
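One way to compare a normal and a log-normal description of a failure histogram is a binned Pearson chi-square, sketched below with the Python standard library. This is a minimal illustration and not necessarily the fitting procedure used for the results in this paper: the fit here uses the sample mean and standard deviation rather than a least-squares fit, and the failure voltages are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def binned_chi_square(values, edges, transform=lambda v: v):
    """Pearson chi-square of binned counts against a normal fit made in the
    transformed space (identity gives normal, math.log gives log-normal)."""
    t = [transform(v) for v in values]
    fit = NormalDist(mean(t), stdev(t))
    chi2 = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        observed = sum(1 for v in values if lo <= v < hi)
        expected = len(values) * (fit.cdf(transform(hi)) - fit.cdf(transform(lo)))
        if expected > 0.0:
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# HYPOTHETICAL failure voltages (V), not measured data.
failure_v = [44, 45, 45, 46, 46, 46, 47, 47, 47, 47, 48, 48, 48, 49, 49, 50]
edges = [43.5 + i for i in range(8)]  # 1 V bins centred on 44 V ... 50 V

chi2_normal = binned_chi_square(failure_v, edges)
chi2_lognormal = binned_chi_square(failure_v, edges, transform=math.log)
print(f"normal: {chi2_normal:.2f}  log-normal: {chi2_lognormal:.2f}")
```

Similar chi-square values for the two fits would indicate, as in the text, that the log-normal offers no meaningful improvement. In practice sparse tail bins are usually merged before the statistic is interpreted.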
In general terms, increasing the test sample population will reduce sampling error and lessen the K TL factor that accounts for it. It is recognized that many efforts will lack a reasonable means to obtain a 50+ sample result, but even an expansion from 3-6 samples to 10-20 allows for a significantly improved assessment of sample variation. For the previous example, a change from 3 to 10 samples reduces K TL from 16.6 to 6.2.
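The dependence of K TL on sample size can be sketched numerically. The code below assumes the standard one-sided normal tolerance factor, which is the value k for which the sample mean minus k sample standard deviations falls below the true P DEV point with the stated confidence (equivalent to a noncentral-t quantile). The source does not spell out its formula, so this is an assumption, as are the function names and the 95% confidence level taken from the handbook table cited earlier. Only the standard library is used: the confidence is integrated over the chi distribution of the sample standard deviation and k is found by bisection.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def chi_pdf(u, nu):
    """Density of a chi-distributed variable with nu degrees of freedom."""
    if u <= 0.0:
        return math.sqrt(2.0 / math.pi) if nu == 1 else 0.0
    log_g = ((1.0 - 0.5 * nu) * math.log(2.0) + (nu - 1) * math.log(u)
             - 0.5 * u * u - math.lgamma(0.5 * nu))
    return math.exp(log_g)

def achieved_confidence(k, n, p, steps=2000):
    """P(mean - k*s <= mu - z_p*sigma) for n normal samples (Simpson's rule)."""
    nu = n - 1
    z_p = STD_NORMAL.inv_cdf(p)
    h = (math.sqrt(nu) + 10.0) / steps  # upper limit far into the chi tail
    total = 0.0
    for i in range(steps + 1):
        u = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        arg = math.sqrt(n) * (k * u / math.sqrt(nu) - z_p)
        total += weight * chi_pdf(u, nu) * STD_NORMAL.cdf(arg)
    return total * h / 3.0

def k_factor(n, p, conf):
    """One-sided tolerance factor found by bisection on achieved_confidence."""
    lo, hi = 0.0, 1000.0
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if achieved_confidence(mid, n, p) < conf:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p_dev = 0.99 ** (1 / 100)  # per-device requirement from the 100-part example
for n in (3, 10):
    print(n, round(k_factor(n, p_dev, 0.95), 1))
```

Under these assumptions the factors for 3 and 10 samples should come out close to the 16.6 and 6.2 quoted above, which illustrates how much of a small-sample K TL is sampling uncertainty rather than intrinsic part spread.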
For commercial n-type MOSFET devices, SEB continues to be the dominant failure mechanism as opposed to single-event gate rupture (SEGR) seen more often in space-grade devices. This has implications for planning a part qualification. For example, SEB may occur over a broad range of tilt angles while SEGR sensitivity tends to decrease more quickly with angle, depending on the neck width. The V DS failure voltage threshold for SEB tends not to depend on ion species where a stronger dependence is expected for SEGR [10] . SEB may also have a proton or neutron sensitivity [20] .
The SEB failure mechanism is affected by numerous fabrication variables (e.g., doping and thickness of the source and body regions; diffusion profile of the source and body regions; and doping and thickness of the epitaxial layer and epitaxial buffer layer). These factors can produce part-to-part variations across a wafer, across multiple wafers within the same processing lot, and across processing lots.
SEB variability may have a greater impact than similar variations in SEGR [21] - [22] due to the typically higher sensitivity at lower LET values and at larger angles. This is particularly relevant for commercial devices where SEB is already the dominant mechanism. Generalizations require caution, however. Because of the lower visibility of SEGR in commercial devices, the part-to-part variations are less well known. This could be investigated by looking at commercial p-type devices which tend to fail due to SEGR. Also, since commercial MOSFETs tend to have wider drain neck regions, the SEGR angle susceptibility may be wider as well.
The IRF6644 results demonstrate the importance of performing burn-in on the test sample population. Since the right tail was broadened, a reduced sample size test carries a meaningful probability of selecting samples from the broadened right tail of the sensitivity distribution and thus of failing to recognize the sensitivity risk of the full part population. It has been demonstrated that an n = 3 with 25% margin SOA test approach would have produced an unacceptable risk of industry part usage in violation of the SOA. If the origin of the large variation in the non-burned-in devices is related to dose effects in the trench structure then the need to perform burn-in before tests may apply even if the parts are not commercial-grade.
V. CONCLUSION
These results show large part-to-part variability in five different part types of advanced commercial trench MOSFETs. A methodology is proposed for establishing a safe operating area based on a one-sided tolerance limit in a Gaussian fit of the distribution of DSEE V DS onset. Any SEE testing of commercial MOSFETs that uses a small test sample size may fail to consistently capture the full extent of part-to-part variability. This could have serious consequences for space qualification if a small sample test is widely leveraged by companies for hundreds or thousands of uses. Particular caution is advised where a single-event burnout mechanism is dominant and part-to-part variation may challenge traditional MOSFET SEE qualification methods. Burn-in is a vital step in MOSFET SEE qualification. Failure to apply burn-in may result in an overstated SOA due to use of a distribution that is non-representative of the screened spaceflight product.
