Here, for the first time, advanced simulation models are used to investigate the performance advantage of Schottky source/drain ultra-thin-silicon technologies at a 25 nm gate length target. Schottky and doped source/drain MOSFETs were optimized and compared using a novel benchmark. Mixed-mode simulations of optimized devices in a two-stage NAND chain show an approximate 45% speed advantage of Schottky source/drain for one set of parameter choices. Contact requirements for Schottky source/drain, and for doped source/drain relative to ITRS targets through 2016, are discussed.
I. Introduction
CMOS technology continues its relentless drive toward 25 nm gate lengths. With length scaling comes reduced intrinsic channel resistance, increasingly emphasizing source/drain (S/D) extrinsics. Yet vertical scaling, in particular for single-gate fully-depleted silicon-on-insulator (FDSOI) and dual-gate 1 technologies, results in an increase in the sheet resistances of doped S/D regions. As dimensions shrink, the challenge of establishing and maintaining the required S/D doping gradients [2], including the inevitable atomistic stochastics [3], grows more daunting. One alternative to doping is Schottky (metal) S/D [4]. Metal S/D technology is challenging, but at L = 25 nm, so are all other options. It is thus important to understand the relative performance that can be achieved by doped and metal S/D's.
A previously published simulation study [5] compared these device types. However, the present work has several improvements. One is the use of the density gradient method [6], which accounts for quantum mechanical electrostatic effects that are important at these dimensions. Another is the use of the thermionic field emission model of Ieong et al. [7]; the approach used in the prior work neglects tunneling and thus can substantially underestimate current at metal-semiconductor contacts with large electric fields. Also, a more aggressive device design is considered here. The result is a much more realistic prediction of relative device performance, particularly considering the effect of changes in the metal S/D workfunction.
In [8], mixed-mode simulation is used to demonstrate the advantage of underlap in dual-gate metal S/D design. Here a broader view is taken, with an emphasis on a fair comparison of dual-gate and single-gate devices of each S/D type, using both discrete device benchmarking and mixed-mode circuit performance.
Other comparisons of Schottky and doped S/D ultra-thin-body MOSFET performance [9], [10] have used devices with unoptimized source/drain placement. Furthermore, threshold-voltage roll-off was neglected, and circuit performance was not directly simulated.
II. Device Structure
NFET cross-sections are shown in Fig. 1. The S/D contacts were angled to reduce coupling capacitance while allowing a contact close to the channel. Contact was to the sidewall of metal S/D's and to the silicon top surface of doped S/D's. The gate was tunable-workfunction metal, where the workfunction was set to meet the leakage current constraint.
With FDSOI, the substrate was treated as a metallic contact to the bottom of the buried oxide.
[ Fig. 1 about here.]
Doped S/D's were modeled using Gaussian profiles of arsenic (NFET) or boron (PFET): one for FDSOI, and two for dual-gate. For each, concentration was constant along a reference line segment from the edge of the device to ∆x = x SD from the edge of the gate, at ∆y = y SD from the silicon surface(s) (top only for FDSOI; both, to maintain symmetry, for dual-gate). For each reference segment, the profile was attenuated as a Gaussian of σ = σ SDy above and below the segment, and σ = σ SDx to the side. For dual-gate, the Gaussian amplitudes were reduced to maintain a net concentration of N SD on each reference segment given the contribution from the other profile in the same S/D region. Atomistic dopant effects, significant in doped devices of this scale [3], were neglected.
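As an illustration only, the profile construction described above can be sketched in Python. The separable product of lateral and vertical Gaussians, the exp(-d^2 / (2σ^2)) normalization, the coordinate convention, and every numerical value below are assumptions made for this sketch; none of them is taken from the simulation input decks.

```python
import math

def gaussian(distance, sigma):
    """Unit-amplitude Gaussian falloff (assumed form: exp(-d^2 / (2 sigma^2)))."""
    return math.exp(-0.5 * (distance / sigma) ** 2)

def segment_profile(x, y, x_end, y_ref, amplitude, sigma_x, sigma_y):
    """Concentration at (x, y) contributed by one reference segment.

    Assumed coordinates: x increases toward the gate, and the segment covers
    x <= x_end at depth y_ref, so only points beyond x_end see lateral falloff.
    """
    lateral = max(0.0, x - x_end)
    return amplitude * gaussian(lateral, sigma_x) * gaussian(y - y_ref, sigma_y)

def dual_gate_amplitude(n_sd, y_sd, t_si, sigma_y):
    """Amplitude giving a net n_sd on each segment once both profiles are summed."""
    separation = t_si - 2.0 * y_sd  # distance between the two reference segments
    return n_sd / (1.0 + gaussian(separation, sigma_y))

def dual_gate_doping(x, y, x_end, y_sd, t_si, n_sd, sigma_x, sigma_y):
    """Sum of the two symmetric profiles of a dual-gate S/D region."""
    amplitude = dual_gate_amplitude(n_sd, y_sd, t_si, sigma_y)
    top = segment_profile(x, y, x_end, y_sd, amplitude, sigma_x, sigma_y)
    bottom = segment_profile(x, y, x_end, t_si - y_sd, amplitude, sigma_x, sigma_y)
    return top + bottom

# Hypothetical values (nm and cm^-3), chosen only to exercise the functions.
params = dict(x_end=-3.0, y_sd=1.0, t_si=5.0, n_sd=1e20, sigma_x=1.0, sigma_y=1.5)
on_segment = dual_gate_doping(x=-10.0, y=params["y_sd"], **params)
```

In this sketch the amplitude reduction follows directly from requiring that the sum of the two profiles, evaluated on either reference segment, equals N SD.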
Key NFET parameters are shown in Table I . PFETs were designed with the same parameters as NFETs, but with doping types swapped, and with metal S/D workfunctions near the silicon valence band rather than the conduction band.
[ Table 1 about here.]
III. Modeling Details
A. Device Modeling
ISE/DESSIS version 8.0.5 [11] was used. Discrete device simulations used the density gradient model [12] with parameters from [13]. Metal S/D's were treated using the DESSIS implementation of the nonlocal thermionic field emission model of [7]. Light-carrier-dominated tunneling was assumed (NFET electron effective mass of 0.19, PFET hole effective mass of 0.16), with a Richardson factor of 2.1 for NFETs and 0.66 for PFETs [11] and a tunneling submesh of 100 points 2 over 30 nm. Continuity equations and the density gradient equations were both solved for minority carriers only, as bipolar effects were found to be insignificant at the current levels of interest.
Low-field transport was treated via the drift-diffusion model of [14]. Assumptions implicit in this model are invalid at these gate lengths and silicon thicknesses. But the relative effect of the S/D is the focus here, and thus a highly accurate channel transport model is not required. High-field transport was handled with a Caughey-Thomas term [16]:
$$ v = \frac{\mu\,|\nabla\phi_F|}{\left[1 + \left(\mu\,|\nabla\phi_F| / v_{sat}\right)^{\beta}\right]^{1/\beta}} $$
where v is the magnitude of the mean carrier velocity, µ is the low-field mobility, |∇φ F | is the magnitude of the quasi-Fermi potential gradient, v sat is the saturation velocity, and β is an empirical parameter. Values of v sat and β to accommodate Monte-Carlo simulation of ballistic transport in decananometer devices are described in [17]. For this work, given the lack of experimental data on high-field transport in silicon films in the t Si ≤ 5 nm range, more conventional values from [11] after [18] were used: for electrons β = 1.11 and v sat = 1.07 × 10^7 cm/sec, while for holes β = 1.21 and v sat = 0.84 × 10^7 cm/sec. If electron velocities were underestimated, then the performance advantages predicted here were also underestimated, as the extrinsic resistance would be more significant.
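To make the role of β and v sat concrete, the following minimal sketch evaluates the standard Caughey-Thomas velocity expression, v = µF / [1 + (µF / v sat)^β]^(1/β) with F = |∇φ F|, using the β and v sat values quoted above. The low-field mobility of 300 cm²/V·s and the sample field values are hypothetical and serve only to show the two limiting regimes.

```python
def caughey_thomas_velocity(mobility, field, v_sat, beta):
    """Mean carrier velocity (cm/s) for a driving field (V/cm).

    mobility is the low-field mobility in cm^2/(V s); the expression reduces to
    mobility * field at low field and approaches v_sat at high field.
    """
    drift = mobility * field
    return drift / (1.0 + (drift / v_sat) ** beta) ** (1.0 / beta)

# beta and v_sat as quoted in the text.
ELECTRON = dict(v_sat=1.07e7, beta=1.11)
HOLE = dict(v_sat=0.84e7, beta=1.21)

MOBILITY = 300.0  # hypothetical low-field mobility, cm^2/(V s)

low_field_velocity = caughey_thomas_velocity(MOBILITY, 1.0e2, **ELECTRON)
high_field_velocity = caughey_thomas_velocity(MOBILITY, 1.0e9, **ELECTRON)
```

A larger β gives a sharper transition between the ohmic and saturated regimes, which is why β and v sat are usually fitted together.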
Metal contacts were treated as constant potential surfaces whose workfunctions were specified relative to the vacuum potential. The assumed value of the silicon electron affinity χ e, 4.072 eV at T = 300 K, must be considered when comparing workfunctions specified here to those published elsewhere.
2 Reduced to 40 points for mixed-mode simulations.
3 See [15] for another example of the use of drift-diffusion to compare 25 nm ultra-thin-body MOSFETs.
B. Meshing
Meshing was performed using ISE/MDRAW version 8.0.4, with approximately 2300 to 5000 mesh vertices per device for discrete device simulations and 880 to 1900 for circuit simulations.
C. Device Benchmarking
It is critical in device benchmarking to consider the effect of device variation; the "average" device characteristics alone are insufficient. In [19] , variability in buried oxide thickness, silicon thickness, gate oxide thickness, and doping are all considered. Here the simpler approach of [20] was followed, with all device variation assumed to be in the gate length.
Leakage current was evaluated at a subnominal gate length, L = L sub, since shorter-than-average devices dominate the net circuit leakage. 5 The gate workfunction that yielded a leakage target of 20 nA/µm at L sub was used to simulate the L = L sup ≡ L nom + 2σ L supernominal device, under the assumption that the clock-limiting path among many candidates will likely have longer-than-average devices. After Equation 5.38 of [21], the supernominal current was evaluated at V G = 0.7 V DD 6 for both V D = V Dlin ≡ 50 mV (yielding I tlin) and V D = V DD ≡ 1 V (yielding I tsat). These are summarized in Table II. A harmonic mean of I tlin and I tsat, weighting in proportion to drain voltages 7, was used as the benchmark, I t.
[ Table 2 about here.]
5 The device with L = L sub has a leakage current which approximates the average leakage current of all nominally equivalent devices of nominal length Lnom, assuming gate length is normally distributed and leakage current is near-exponential with gate length [20].
6 The optimal VDD fraction is specific to a given application; [21] claims it to be a "numerical fitting parameter" and "≈ 0.7". The use of ID at VG = 0.7 VDD in Table II follows from Equation 5.38 of [21] if ID ∝ VG − VT.
7 This crudely averages the channel resistance during switching. The optimal weighting of 1/I tlin and 1/I tsat is also application-specific, and thus the weighting should be calibrated with circuit simulations.
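The statement that a single device at L = L sub can stand in for the average leakage of a population can be illustrated numerically. The sketch below assumes a strictly exponential leakage law, I(L) = I0 exp(−L/λ), and a normal gate-length distribution. Under those assumptions the mean leakage equals the leakage of a device shortened by σ L²/(2λ). The decay length λ, the value of σ L, and the function names are hypothetical, and the exact definition of L sub used in [20] may differ.

```python
import math

def subnominal_length(l_nom, sigma_l, decay_length):
    """Length whose leakage equals the population mean for I(L) = I0 * exp(-L / decay_length)."""
    return l_nom - sigma_l ** 2 / (2.0 * decay_length)

def mean_leakage(l_nom, sigma_l, decay_length, i0=1.0, samples=4001, span=8.0):
    """Trapezoid-rule average of exponential leakage over a normal length distribution."""
    low = l_nom - span * sigma_l
    step = 2.0 * span * sigma_l / (samples - 1)
    norm = 1.0 / (sigma_l * math.sqrt(2.0 * math.pi))
    total = 0.0
    for k in range(samples):
        length = low + k * step
        pdf = norm * math.exp(-0.5 * ((length - l_nom) / sigma_l) ** 2)
        weight = 0.5 if k in (0, samples - 1) else 1.0
        total += weight * i0 * math.exp(-length / decay_length) * pdf * step
    return total

# Hypothetical numbers (nm): nominal length, length sigma, leakage decay length.
L_NOM, SIGMA_L, DECAY = 25.0, 2.0, 5.0
l_sub = subnominal_length(L_NOM, SIGMA_L, DECAY)
average = mean_leakage(L_NOM, SIGMA_L, DECAY)
equivalent = math.exp(-l_sub / DECAY)
```

Because the shift grows as σ L², tighter gate-length control moves L sub toward L nom and relaxes the gate workfunction needed to meet a given leakage target.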
This approach is more complex than just looking at I Dsat of the nominal-length device. However, the advantages are:
1. It includes consideration of low-V D performance, which is needed if the circuit is to gain the full benefit of the supply voltage range. 
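One plausible reading of the benchmark, offered as a sketch and not as the published definition, is a harmonic mean of I tlin and I tsat with weights proportional to the drain voltages at which they were evaluated. The prefactor used below is an assumption; any normalization that keeps the same weighted sum of 1/I tlin and 1/I tsat ranks devices identically. The function name and the sample currents are hypothetical.

```python
def benchmark_current(i_tlin, i_tsat, v_dlin=0.05, v_dd=1.0):
    """Drain-voltage-weighted harmonic mean of the two supernominal currents.

    The denominator is the sum of the two effective channel resistances,
    v_dlin / i_tlin and v_dd / i_tsat; the numerator normalizes the weights
    so that equal currents return that same current.
    """
    return (v_dlin + v_dd) / (v_dlin / i_tlin + v_dd / i_tsat)

# Hypothetical currents in arbitrary but consistent units.
balanced = benchmark_current(1.0, 1.0)
resistive = benchmark_current(0.2, 1.0)
```

A device with a poor low-V D current is penalized even when its saturation current is unchanged, which is the behavior the first listed advantage calls for.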
D. Device Design
Parameters to be optimized were x SD for the doped S/D, and L x for the metal S/D. L x was limited to at least 3 nm to avoid S/D shorts to the gate. In doped S/D devices with x SD > 3 nm, L x was set to x SD to keep the contact in the heavily doped region.
Silicon thickness t Si was constrained to be 4 nm for the FDSOI devices, based on several considerations. One is that for t Si ≤ 3 nm, structural electron confinement severely reduces mobility [22]. Another is that the threshold voltage sensitivity to silicon thickness becomes significant as t Si drops below 5 nm [23], excessively increasing intradevice and in-
Tuned gate workfunctions of selected subnominal-length devices are shown in Fig. 2.
As the source and drain are moved closer, S/D-to-channel coupling becomes stronger, and the gate workfunction needs to be increased to maintain the off-state leakage condition. This coupling is stronger for a metal than a doped S/D, due to the lack of a depletion region in the metal. Thus the metal S/D should have a larger optimal S/D-to-gate offset.
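Setting the gate workfunction to meet a leakage target is a one-dimensional root-finding problem. The sketch below uses bisection, assuming only that NFET leakage falls monotonically as the gate workfunction rises. The function `toy_leakage` is a hypothetical stand-in for a device simulation, with an invented reference current, reference workfunction, and decade-per-80-meV slope; only the 20 nA/µm target comes from the text.

```python
def tune_gate_workfunction(leakage, target, phi_low, phi_high, tolerance=1e-6):
    """Bisect for the workfunction (eV) at which leakage(phi) equals target.

    Assumes leakage decreases monotonically with phi and that
    leakage(phi_low) > target > leakage(phi_high).
    """
    for _ in range(200):
        phi_mid = 0.5 * (phi_low + phi_high)
        if leakage(phi_mid) > target:
            phi_low = phi_mid   # still too leaky: raise the workfunction
        else:
            phi_high = phi_mid  # below target: lower the workfunction
        if phi_high - phi_low < tolerance:
            break
    return 0.5 * (phi_low + phi_high)

def toy_leakage(phi_g, i_ref=1.0e-6, phi_ref=4.4, swing=0.080):
    """Hypothetical off-current (A/um): one decade lower per `swing` eV of workfunction."""
    return i_ref * 10.0 ** (-(phi_g - phi_ref) / swing)

TARGET = 20.0e-9  # 20 nA/um leakage target
phi_tuned = tune_gate_workfunction(toy_leakage, TARGET, 4.0, 5.0)
```

In a real flow each call to `leakage` would be a full device simulation at L = L sub, so a bracketing method with few evaluations is a natural choice.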
The primary "uncontrolled" parameter was ρ SD for the doped S/D, and ∆Φ SD for the metal S/D. 9 For each value, the parameter to be optimized was varied in 1 nm increments until the maximum of I t was straddled. The optimal L x or x SD was then parabolically interpolated, and the associated device simulated to yield the optimized device benchmark. 10
An example of this optimization, with a finer sampling of L x values, for metal S/D is shown in Fig. 3. For FDSOI, optimization is a balance between reducing short channel effects with a greater S/D separation (longer extension), and maximizing drive with a reduced S/D separation (shorter extension). The optimum still had a considerable S/D-to-gate underlap. For dual-gate, the minimum L x = 3 nm was preferable to greater underlaps. A similar tradeoff is involved with x SD for the doped S/D devices.
[ Fig. 3 about here.]
9 ΦSD may differ from the value appropriate for planar junctions due to effects of quantum confinement at nanometer scales. Guo et al. [10] discuss the effect on the Si conduction band, but the metal bands, and the states responsible for Fermi level pinning [26], will also be affected.
10 If, in the case of the metal S/D, the peak It was for Lx < 3 nm, Lx = 3 nm was used instead.
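The search described in this section, stepping the offset in 1 nm increments until the maximum of I t is straddled, interpolating a parabola through the last three samples, and clamping to the 3 nm minimum, can be sketched as follows. The starting point one step below the minimum, the handling of a benchmark that is already falling at the first step, and the quadratic test function are assumptions of this sketch.

```python
def parabolic_peak(xs, fs):
    """Abscissa of the vertex of the parabola through three (x, f) samples."""
    x0, x1, x2 = xs
    f0, f1, f2 = fs
    numerator = (x1 - x0) ** 2 * (f1 - f2) - (x1 - x2) ** 2 * (f1 - f0)
    denominator = (x1 - x0) * (f1 - f2) - (x1 - x2) * (f1 - f0)
    return x1 - 0.5 * numerator / denominator

def optimize_offset(benchmark, start=2.0, step=1.0, minimum=3.0, max_steps=20):
    """Step upward until the benchmark drops, then interpolate and clamp."""
    xs = [start]
    fs = [benchmark(start)]
    for _ in range(max_steps):
        xs.append(xs[-1] + step)
        fs.append(benchmark(xs[-1]))
        if fs[-1] < fs[-2]:
            break  # the maximum is now straddled (or lies below the start)
    else:
        raise ValueError("maximum not straddled within max_steps")
    if len(xs) < 3:
        return minimum  # benchmark already falling: the peak is below the range
    return max(minimum, parabolic_peak(xs[-3:], fs[-3:]))

# Hypothetical benchmark curves with known peaks at 3.9 nm and 2.6 nm.
peak_inside = optimize_offset(lambda x: 10.0 - (x - 3.9) ** 2)
peak_clamped = optimize_offset(lambda x: 10.0 - (x - 2.6) ** 2)
```

Each benchmark evaluation stands for a full device optimization at one offset, so the coarse scan followed by a single interpolated simulation keeps the number of device runs small.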
IV. Results & Discussion
A. Discrete Device
The PFET metal S/D benchmark advantage was approximately 23%.
[ Table 3 about here.]
B. Circuit
A two-stage NAND chain with fan-out of two (see Fig. 6) was simulated using the DESSIS mixed mode capability to assess integrated behavior in a sample application. All devices had L = L sup, in order to predict the performance of the clock-limiting path, for which it is assumed there are many candidates. The gate-loaded limit was considered, so interconnect loads were omitted, and device widths were normalized. The output load was a dummy gate: a metal-SiO2-n+ (10^20 /cm^3) capacitor of the same L, t ox, and Φ G. All devices were assigned a gate workfunction which yielded I D = 20 nA/µm for V G = 0 and
V D = V DD at L = L sub, as determined in the discrete device simulations described in IV-A.
[ Fig. 6 about here.]
To accommodate the increased device count, mesh spacings were roughly doubled relative to the discrete device simulations, and the thermionic field emission submesh was reduced from 100 to 40 points. For numerical stability, the density gradient model was not used. 12 The reduced mesh density was observed to have negligible effect on device I-V characteristics.
Simulations were done for the devices from Table III . Input V in was a 1 ps transition from 0 to 1 V at t = 0, followed at t = 50 ps by a 1 ps transition back to 0 V. Sample rising-edge transient responses are shown in Fig. 7 . The output V out was considered to have switched at 80% V DD for the rising transition and 20% V DD for the falling transition. 13 Delay was calculated relative to the previous crossing of 50% V DD by V in .
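The delay definition above can be expressed as a small post-processing routine on sampled waveforms. The sketch below finds threshold crossings by linear interpolation between samples and measures the output crossing (80% V DD rising, 20% V DD falling) relative to the preceding 50% V DD crossing of V in. The function names and the synthetic ramp waveforms are hypothetical; they are not simulation output.

```python
def crossing_time(times, values, level, rising, t_after=0.0):
    """First time at or after t_after where the waveform crosses level, or None."""
    for k in range(1, len(times)):
        if times[k] < t_after:
            continue
        v0, v1 = values[k - 1], values[k]
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crossed:
            fraction = (level - v0) / (v1 - v0)  # linear interpolation
            return times[k - 1] + fraction * (times[k] - times[k - 1])
    return None

def switching_delay(times, v_in, v_out, v_dd=1.0, rising=True, t_start=0.0):
    """Delay from the 50% crossing of v_in to the 80% (rising) or 20% (falling) crossing of v_out."""
    t_in = crossing_time(times, v_in, 0.5 * v_dd, rising, t_start)
    level = 0.8 * v_dd if rising else 0.2 * v_dd
    t_out = crossing_time(times, v_out, level, rising, t_in)
    return t_out - t_in

# Synthetic waveforms (ps, V): a 1 ps input edge and a slow linear output ramp.
times = [float(k) for k in range(21)]
v_in = [0.0] + [1.0] * 20
v_out = [min(1.0, max(0.0, (t - 5.0) / 10.0)) for t in times]
delay = switching_delay(times, v_in, v_out)
```

This sketch assumes the output edge has the same polarity as the input edge, as expected for a two-stage chain; an inverting stage would need separate polarity arguments.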
Results, seen in Fig. 8 , show the metal S/D to be approximately 45% faster than doped S/D for the same device type. This is a somewhat larger improvement than was predicted using the benchmark for discrete device optimization, I t , which was uncalibrated and which neglects the effect of capacitance differences.
Fig. 3 (Φ SD = 4.05 eV). Device optimization is a tradeoff between managing short channel effects via greater source-drain separation and reducing resistance via a closer S/D placement. As extension length is increased, the V T vs. L curve flattens (b right), which retains more performance at L = L sup for Φ G set based on leakage at L = L sub (a). Setting Φ G based on leakage at L = L sup, neglecting V T roll-off, yields a preference for short extensions (b left). For single-gate FDSOI, the V T roll-off is substantial, and the optimum I t is near L x = 3.9 nm. Dual-gate MOSFET gate control is stronger, and thus with these devices the resistance advantage of a smaller L x can be exploited. In
