, provided by the authors. This consists of a PDF file which contains detailed explanations of the semi-empirical Notre Dame TFET model and 2D-FET model. This material is 169 KB in size.
I. INTRODUCTION
TUNNEL FETs have become candidates for low-power, high-frequency integrated electronic applications because of their potential for steep subthreshold slope (SS) and high transconductance at low supply voltages [1]-[6]. They can be implemented with largely depleted channels in the saturation region of their I ds -V ds curves, leading to low gate capacitance, which is favorable for high-frequency operation. In the conventional lateral tunneling structure, pictured in Fig. 1(a), it is necessary to use thin channels in order to allow gate control over the full width of the channel. Thin channels also mitigate short channel effects by shielding the source junction from the drain voltage. However, thin channels also decrease the area available for tunneling current flow and thus the maximum current available. From the standpoint of fabrication, the lateral FET structure is typically achieved either by growing nanowires or by etching epi-layers into narrow pillars. The former method usually requires accurate metal deposition for gate and drain, while the latter often suffers from poor quality at the interfaces where etching occurred.
To cope with these issues, vertical tunneling FET structures, as shown in Fig. 1(b) and (c), have been proposed by several groups. For example, Xing's group has proposed the Thin-TFET, modeled both physically [7] and using a neural network methodology [8]. It is composed of two transition metal dichalcogenide (TMD) materials layered as a stack. The van der Waals gap between the two layers serves as the tunneling gap when the two layers are biased properly. Such structures have been experimentally tested as tunneling diodes by several groups [9], [10] and shown to have band-to-band tunneling characteristics. An electrically controlled carrier density in the source region could avoid dopant-induced band tails, enhancing the steep slope characteristics of the device. Additionally, better electrostatics could be achieved in the vertical tunnel structure due to high gate efficiency to all controlled regions, as opposed to the lateral tunneling case where the middle of the channel has reduced gate control. Finally, no etching process is required at the tunneling junction, where process-induced defects could deteriorate the device performance. III-V and silicon/germanium-based materials, which have more mature epi-growth techniques, have also been proposed for fabricating similar structures based on vertical tunneling [11]-[13]. Additional combinations of materials have also been demonstrated; for example, in [14], germanium serves as the source layer and a 2-D layer of MoS₂ serves as the drain layer.
A significant consideration for vertical tunneling FETs is that the overall tunnel current tends to increase as the channel length is increased, because of the increased area available for tunneling. However, this paper shows that the lateral resistance in the drain layer, and the voltage drops associated with lateral conduction, can limit the overall current for long channel lengths. In the saturation region, the drain layer of the device becomes depleted of carriers, which results in a high lateral resistance; the voltage along the drain layer then changes rapidly, which causes a highly nonuniform tunneling current. In such a case, the assumption of uniform potential across the layer is not valid and leads to overestimates of the current density. The distributed effects introduce a complex dependence of ON-state current on channel length. The channel charging/discharging process also strongly depends on the channel carrier concentration profile and the extra resistive component due to lateral conduction. As a result, the RF characteristics are also strongly affected by device dimensions. Under conditions when the channel carriers are strongly accumulated before tunneling turn-on, a Miller effect intrinsic to the device can be observed. This paper presents a distributed model that can describe the nonuniformity of potential and tunneling current along the channel, which allows investigation of the joint effects of the vertical and horizontal limits on current flow. The model equations have been implemented in Verilog-A by using a set of serially connected unit elements, so that device and circuit performance can be evaluated with standard circuit simulators. Key results stemming from the distributed effects on dc, ac, and RF characteristics are discussed in this paper.
In Section II, the operation of the vertical tunneling FETs is described along with the governing equations. The distributed circuit structure is introduced, and models with fitted parameters are proposed for vertical tunneling and lateral conduction elements. In Section III, lateral and vertical conduction parameters are set up, and distributed effects are then described in detail in Section IV, showing band diagrams and tunneling current distributions along the channel. In Section V, dc characteristics are illustrated and the impact of distributed physics is analyzed by comparing the ON-current for varying gate length and mobility values. Section VI discusses the impact of lateral conductivity on the intrinsic capacitance values and cutoff frequencies, showing that gate length is an important factor in improving the RF performance. Extrinsic parasitic elements are also included in the simulation for practical estimation of frequency response as the intrinsic elements are scaled down.
II. DEVICE STRUCTURE AND MODELS
The operation of an n-type vertical TFET can be described briefly as follows. As the top gate bias increases toward positive values, a tunneling window starts to open between the p-type source layer and the drain layer, and electrons begin to tunnel from source valence band to drain conduction band. In the linear region (when V ds is small), only a small potential drop is expected from source terminal to drain terminal across the drain layer, since lateral conduction proceeds readily in the drain layer, which has abundant carriers; as a result, the tunneling current density is relatively uniform along the channel. As the value of V ds is increased, there is progressively more voltage drop across the drain layer associated with current flow from source to drain in the drain layer, due to limited lateral channel conduction. This changes the profile of tunneling current density along the channel, leading in general to significant nonuniformity.
In the saturation region (high values of V ds ), the region of drain layer near the drain terminal becomes depleted, leading to more vacant conduction band states for tunneling from the source layer. The lateral FET starts to operate in saturation condition. The drain layer near source terminal is not depleted, however, and as a result, the tunneling current density near the source will be lower than near the drain terminal.
VOLUME 3, 2017
A distributed circuit is used to emulate the coupled conduction mechanisms, as shown in Fig. 2(a) . Each unit cell consists of lateral conduction modeled by FETs (for drain layers) and resistors (for source layers), and vertical tunneling between drain and source layer, modeled by a tunnel FET illustrated in Fig. 2(b) . Fig. 2(c) shows a capacitive network unit cell which is connected to both sides of the unit element to emulate the distributed charge densities of the full device. Models for the unit components are described in the following. 
A. VERTICAL CURRENT COMPONENT
For the vertical tunneling current component, a semi-empirical TFET model developed by Lu et al. [15] based on Kane's model is adopted in each unit cell. The current density j tun can be expressed as
The detailed expression for each term is provided in the supplementary material; a brief overview is presented here to provide physical background. The term f (V ds ) determines the turn-on characteristics of I ds -V ds . Its physical origin corresponds to the quasi-Fermi level difference between source and channel. V TW (V gs , V ds ) describes the tunneling window modulation by both V gs and V ds . In the saturation region, V TW is controlled mainly by the gate. In the linear region, V ds will also modulate the tunneling window by decreasing the carrier concentration and thus affecting the electrostatics (known as the debiasing effect [16]). F(V ds , V gs ) is the electric field at the tunnel junction, and is approximated by a simple linear function of both V gs and V ds . The gate voltage V gs for the tunnel FET also controls the 2-D FET in the drain layer.
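For orientation, Kane-type semi-empirical tunneling models are commonly written as a product of the three bias-dependent factors named above and an exponential field term. The block below is a sketch of that assumed general form only. The prefactors a and b are placeholder material-dependent constants introduced here for illustration, and the expression actually used in this paper is the one given in the supplementary material.

```latex
j_{\mathrm{tun}} \;=\; a \, f(V_{ds}) \, V_{TW}(V_{gs}, V_{ds}) \, F(V_{ds}, V_{gs}) \,
\exp\!\left( -\frac{b}{F(V_{ds}, V_{gs})} \right)
```

In this form the exponential makes the current very sensitive to the junction field F, while V TW and f (V ds ) gate the current on through the tunneling window and the quasi-Fermi level difference, respectively.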
B. LATERAL CONDUCTION
An analytic 2-D-FET model based on drift-diffusion mechanisms is used to describe lateral conduction in the drain layer [17], [18]. It assumes quasi-equilibrium for carriers and validity of Fermi-Dirac statistics.
The conduction current can be calculated with the gradual channel approximation, leading to an analytic approximate form:
Here K drift and K diffusion are expressions determined by the terminal potentials at source and drain, whose values are derived in the supplementary material. K drift is a drift-controlled term, which dominates in the ON-state, and K diffusion is a diffusion-controlled term, which dominates in the subthreshold regime. The formulation adopted in this paper is subject to a variety of limitations. Because of the quasi-equilibrium assumption, the channel length should at least exceed the scattering mean free path (MFP). Below the MFP, a quasi-ballistic or ballistic model should be used, and the quasi-Fermi level is no longer well defined. Thus, the present model properly represents the performance of long gate length devices, and can give approximate guidelines for design as channel dimensions scale down. Even for short channel length devices, this model could indicate an upper limit of performance, as the drift-diffusion model usually overestimates the conductance of FETs compared to the quasi-ballistic or ballistic model.
Once the whole distributed system is solved, the local carrier density in the drain, which corresponds to the charge at C q in Fig. 2(c), can be determined along the channel. This carrier density can be separated into two parts: one controlled by the top gate and the other controlled by the source layer (acting as a bottom gate), which correspond to the charges at C tg and C vdw in Fig. 2(c), respectively.
Smooth models for terminal charge description guarantee good convergence and efficient simulations.
III. PARAMETER SETUP AND SOLUTION APPROACH
The parameter set and representative values are summarized in Table 1. The values of mobility for lateral conduction may vary from material to material. 2-D-TMD based FETs have been demonstrated to have a field-effect mobility ranging from several tens to about 200 cm²/Vs [19]-[21] at room temperature. In this paper, as a representative example we choose SnSe₂ as the drain layer and WSe₂ as the source layer, using for both a mobility of 250 cm²/Vs for lateral conduction unless otherwise specified. The parameters for the vertical tunnel FET are fitted to the physics-based n-type Thin-TFET (SnSe₂/WSe₂) model reported in [7] with a 0.35 nm van der Waals gap and the lattices of the two layers perfectly aligned (assumed for optimized current conduction). It should be noted that, theoretically, an increase as small as 0.3 nm in the van der Waals gap thickness will decrease the current density by one order of magnitude. Without confirmation from experiments, the gap thickness is often treated as a fitting parameter within a reasonable range. Rotational misalignment may be another factor that degrades current density for vertical TFETs built from 2-D materials, including both TMDs [22] and graphene [23].
The system of equations has been solved in MATLAB with 300 unit cells serially connected. The problem becomes a set of coupled current continuity equations whose Jacobian matrix dimensions are determined by the number of unit cells. The system has also been solved using Advanced Design System (ADS), a circuit simulator for RF, microwave, and high-speed applications, with 10 unit cells serially connected. The basic models for vertical and lateral conduction utilized in ADS are implemented in Verilog-A code, with analytic expressions used throughout to guarantee a high level of calculation efficiency.
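The structure of such a solve can be illustrated with a deliberately simplified sketch. The Python code below is not the MATLAB or Verilog-A implementation used in this work. It assumes an equipotential source layer at 0 V, replaces the 2-D-FET elements by a constant lateral conductance `g_lat`, and replaces the semi-empirical TFET element by a placeholder saturating characteristic `i0 * tanh(v / v0)`. All names and parameter values are hypothetical and in arbitrary units. What it does show is the coupled current continuity equations, one per unit cell, and a Newton iteration whose Jacobian dimension equals the number of unit cells.

```python
import numpy as np


def solve_ladder(n_cells=300, v_ds=0.3, g_lat=50.0, i0=0.01, v0=0.05,
                 tol=1e-12, max_iter=50):
    """Newton solve of current continuity on a 1-D ladder of unit cells.

    Node k is the drain-layer potential of cell k. The last node connects to
    the drain terminal (held at v_ds) through g_lat; node 0 is the floating
    end next to the source terminal. Every node leaks a placeholder vertical
    current i0*tanh(v/v0) to an equipotential source layer at 0 V.
    Requires n_cells >= 2.
    """
    idx = np.arange(n_cells)
    # Constant lateral-coupling (discrete Laplacian) matrix.
    lap = np.zeros((n_cells, n_cells))
    lap[idx, idx] = -2.0 * g_lat
    lap[0, 0] = -g_lat                      # floating end has one neighbor
    lap[idx[:-1], idx[:-1] + 1] = g_lat
    lap[idx[1:], idx[1:] - 1] = g_lat
    # Boundary term: current injected from the drain terminal.
    b = np.zeros(n_cells)
    b[-1] = g_lat * v_ds

    v = np.zeros(n_cells)                   # initial guess: no lateral drop
    for _ in range(max_iter):
        resid = lap @ v + b - i0 * np.tanh(v / v0)   # net current into each node
        if np.max(np.abs(resid)) < tol:
            break
        # Jacobian of the residual: lateral coupling minus local vertical conductance.
        jac = lap - np.diag(i0 / v0 / np.cosh(v / v0) ** 2)
        v = v - np.linalg.solve(jac, resid)
    return v, i0 * np.tanh(v / v0)


v_nodes, i_vertical = solve_ladder()
```

With these placeholder values the potential decays from the drain terminal over a few tens of cells, so most of the vertical current is concentrated near the drain end. This is only a qualitative analogue of the nonuniformity discussed in this paper, since the real unit cells are bias-dependent FET and TFET elements.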
IV. LATERAL DISTRIBUTIONS OF POTENTIAL, CURRENT AND CHARGE
The vertical tunneling current density is largely controlled by two factors: the tunneling window and the difference of quasi-Fermi levels between the source and drain layers. The former factor is controlled by the gate bias and the latter by the drain bias. In the saturation region, the large difference of quasi-Fermi levels between source and drain layers will deplete the drain layer such that empty states in the drain are available for electrons from the source layer to tunnel into. However, the lateral 2D-FET component is also affected by the depletion of the drain layer: the more depleted the drain layer is, the higher its resistivity and therefore the lower its lateral current will be. These two conduction mechanisms compete with each other until a steady-state current is reached.
In Fig. 3(b) , a variety of distribution profiles in ON-state (V gs = 0.3 V, V ds = 0.3 V) are shown, along with a corresponding structure of the vertical TFET in Fig. 3(a) . Band alignments at three different positions along the channel are also illustrated. It can be found that both tunnel window and quasi-Fermi level difference are larger near source and drain terminals than in mid-channel. This results in higher tunnel current contributions near source and drain terminals. It is noted that near the source terminal, the carriers in the drain layer are not fully depleted and therefore there is less net tunnel current contribution than near the drain terminal. Low mobility necessitates a higher carrier concentration in the drain layer to maintain a balance between vertical and lateral conduction. On the other hand, infinite mobility results in a fully depleted and equi-potential drain layer while allowing conduction of the tunnel current. In the mid-channel region the quasi-Fermi levels in drain and source layers almost line up, thus this region contributes negligible tunnel current even though the tunnel window opens. Under this bias condition, the E-field near drain terminal is calculated to be 0.13 MV/cm.
In Fig. 3(c), illustrating the OFF state (V gs = 0 V, V ds = 0.3 V), the tunneling window closes throughout the device, and the current tunneling from source layer to drain layer is very small. The band alignment does not vary along the channel, as also shown in the schematics in Fig. 3(c). There is little voltage drop laterally because of the extremely low lateral current flow in the OFF state. Ideally, when the tunneling window closes, there is no tunneling current and the two layers should be isolated. In practice, leakage mechanisms such as band tails or trap-assisted tunneling (TAT) can occur even when there is no tunneling window. The semi-empirical model takes this into account with an Urbach factor that controls the current around the subthreshold region [15].
V. DC CHARACTERISTICS
DC characteristics have been calculated both within the ADS framework using 10 unit cells and within MATLAB using 300 unit cells, and the results are in good agreement. Fig. 4(a) shows the I ds -V gs curves for the nominal parameter set. Also plotted in Fig. 4(a) are the tunneling-limited current, which is the aggregate tunneling current between source and drain assuming equipotentials for these layers, and the 2D-FET-limited current, which is the lateral conduction current along the drain layer without any distributed tunneling current from the source layer and with an ideal current supply at the source side of the drain layer. It can be seen that in the subthreshold region, the tunneling-limited current is much lower than the 2D-FET-limited current. The distributed vertical TFET characteristics follow the tunneling-limited current and preserve the steep subthreshold slope. On the other hand, in the ON state, where the tunneling-limited current is much higher than the 2D-FET-limited current, the distributed vertical TFET is limited by lateral conduction.
I ds -V ds characteristics are shown in Fig. 4(b) for the distributed model. The figure includes behavior at negative V ds , for which there is a negative differential resistance region well known in tunneling FETs. It is noted that here the negative differential resistance region is tilted due to series resistance stemming from lateral conduction in both the source layer and the drain layer.
To further characterize the effects of lateral conduction on dc characteristics, ON-state and OFF-state (at fixed bias conditions) current densities versus gate length are plotted in Fig. 5(a) and (b) using two sets of parameters. The first set is the same as stated in Section III based on the assumed TMD material system. TMD materials are expected to reach an intrinsic mobility up to 500 cm²/Vs based on density functional theory formalism [24]; therefore, a mobility of 250 cm²/Vs is a reasonable estimate. The second set of parameters has a higher mobility of 2000 cm²/Vs, which is closer to a III-V-based material (such as a 15-nm layer of InAs achieving field-effect mobility up to 2300 cm²/Vs [25]). Both plots show opposite trends of current variation with decreasing gate length for the limiting mechanisms: 2D-FET-limited current is inversely proportional to channel length for long channel FETs, while tunneling-limited current is proportional to channel length. Although short channel effects and velocity saturation will prevent the inverse proportionality of the 2D-FET-limited current from being strictly valid as the channel length scales down, the overall trend of increasing current with decreasing gate length will still hold. Furthermore, thanks to the ultra-thin body of the 2-D material, the FET device is rather resistant to short channel effects [26]. The figure shows that the trend for the overall distributed model follows the mechanism with lower current density. Thus, when µ = 250 cm²/Vs, the 2D-FET-limited current is always less than the tunneling-limited current, and the distributed model follows the trend of the pure 2D-FET, with current increasing as the gate length shrinks down to 20 nm in Fig. 5(a). On the other hand, when µ = 2000 cm²/Vs in Fig. 5(b), the tunneling-limited current is lower than the 2D-FET-limited current below L g = 40 nm, and here the overall current of the distributed model decreases with decreasing channel length, following the trend of tunneling-limited current. Beyond 40 nm, increasing channel length decreases the current for the distributed model, following the trend of 2D-FET-limited current. It is found that the OFF-state current always scales up with increasing channel length. This is because the tunneling mechanism is the bottleneck of conduction in the subthreshold region. Hence, in terms of ON/OFF ratio, a vertical tunnel device with shorter gate length is more desirable. In Fig. 6(a), the plot illustrates a convergence of OFF-state current for devices with different channel lengths while the ON-state current varies. A more practical mobility of 20 cm²/Vs for TMD material at an early stage of development is also simulated and compared with the ideal TMD mobility of 200 cm²/Vs and III-V MOSFET mobility of 2000 cm²/Vs in Fig. 6(b).
The simulations indicate that channel length can be an important parameter for optimizing the ON-state current of vertical TFETs. They also show that the dependence of overall current on channel length follows the limiting conduction mechanism with the smaller current density. We expect that if the tunneling-limited component is more than one order of magnitude lower than the lateral-conduction-limited component, the effect of the 2D-FET conduction limit is not significant. This corresponds to the case discussed in [27].
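The crossover between the two limits can be estimated with a back-of-the-envelope scaling argument. The sketch below assumes idealized scaling laws that are not taken from the model equations: a 2D-FET-limited current I_fet = k·µ/L and a tunneling-limited current I_tun = c·L, with hypothetical constants k and c. Setting them equal gives a crossover length L_cross = sqrt(k·µ/c), which scales as the square root of the mobility. The function name and arguments are illustrative only.

```python
import math


def crossover_length_nm(l_cross_ref_nm: float, mu_ref: float, mu_new: float) -> float:
    """Rescale a known crossover length to a different mobility.

    Assumes I_fet = k*mu/L and I_tun = c*L, so L_cross = sqrt(k*mu/c)
    and therefore L_cross is proportional to sqrt(mu).
    """
    return l_cross_ref_nm * math.sqrt(mu_new / mu_ref)


# Reference point from the text: crossover near 40 nm at 2000 cm^2/Vs.
l_cross_250 = crossover_length_nm(40.0, 2000.0, 250.0)
print(f"estimated crossover at 250 cm^2/Vs: {l_cross_250:.1f} nm")
```

Under these assumptions the crossover for µ = 250 cm²/Vs falls near 14 nm, below the shortest gate length considered, which is consistent with the observation that the low-mobility device remains limited by lateral conduction down to 20 nm. The estimate ignores velocity saturation, short channel effects, and the bias dependence of both mechanisms.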
VI. AC AND RF CHARACTERISTICS
AC characteristics have been calculated by incorporating the capacitance network of Fig. 2(c) together with the dc components in each unit cell. As noted in the figure, the capacitance network includes the quantum capacitance
which is a nonlinear function of V c , and in general varies considerably along the channel.
For any two-port device, the small signal characteristics can be described by the corresponding Y-matrix
The matrix elements can be described as a sum of real parts (conductances and transconductances) and imaginary parts (capacitances and transcapacitances)
Note that C gd derived in this manner from the Y-matrix with distributed physics is more complex than the case for a simple C gd bridged between gate and drain in a lumped equivalent circuit model. It is still an important parameter for circuit effects, such as the Miller effect on input capacitance.
The capacitance values are calculated quasi-statically
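The small-signal relations referred to in the preceding paragraphs can be summarized in their standard textbook form. The block below is a sketch under common conventions, not a reproduction of the numbered equations of this paper. It assumes a common-source configuration with gate and drain as the two ports, terminal charges Q_m, and the convention in which the imaginary part of each Y-parameter is written directly as ωC_mn. Many compact models instead define the off-diagonal capacitances with an additional minus sign.

```latex
\begin{pmatrix} i_g \\ i_d \end{pmatrix}
=
\begin{pmatrix} Y_{gg} & Y_{gd} \\ Y_{dg} & Y_{dd} \end{pmatrix}
\begin{pmatrix} v_g \\ v_d \end{pmatrix},
\qquad
Y_{mn} = g_{mn} + j\,\omega\,C_{mn},
\qquad
C_{mn} = \frac{\operatorname{Im}(Y_{mn})}{\omega}
       = \frac{\partial Q_m}{\partial V_n},
\qquad m, n \in \{g, d\}
```

In the quasi-static picture, the derivatives are evaluated from the steady-state terminal charges of the distributed solution at each bias point.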
A variety of behaviors of C mn versus bias can be found depending on the threshold voltages for tunneling V TUN th and for lateral conduction V FET th . The overall small signal capacitances at different biases are shown in Fig. 7. Fig. 7 also shows an asymptotic curve of capacitance as mobility approaches infinity with V TUN th = 0.12 V. The asymptotic curve shows the extreme condition when lateral conductivity is so high that a uniform drain layer potential is formed. Such a condition is assumed in [7], and our capacitance results are in reasonable agreement with the results in [7] after scaling to the same gate length.
It can be observed in Fig. 7(a) that the gate capacitance for the case of infinite mobility is lower than for the case of realistic lateral conduction mobility. For the case of asymptotically high mobility, carriers do not accumulate in the drain layer. As soon as they tunnel into it, they are swept by drift out of the drain terminal; only carriers in the source layer contribute to the total capacitance. When mobility is finite, carriers accumulate in the drain layer even in the saturation condition, as already depicted in Fig. 3(b). Thus, they produce an extra capacitance contribution to C gg compared to the asymptotic mobility curve. This effect is further enhanced when the tunnel device turns on at a higher voltage (higher V TUN th ). It is noteworthy that the curve with V TUN th = 0.2 V in Fig. 7(a) shows a peak at V gs around the turn-on condition for tunneling. This peak of C gg is caused by the compound effect of lateral and vertical conduction. For V gs values below turn-on, the drain layer is an equipotential and exhibits a relatively high carrier density; tunneling-limited conduction with a very low current value dominates the transport. Once the device is turned on, the drain layer becomes depleted by the positive drain bias, inducing a decrease of potential in the drain layer. These opposite trends of change in gate bias and drain layer potential result in an enhancement in input capacitance similar to the Miller-multiplication effect. This feedback effect is less prominent if carriers accumulate to a lesser extent when the current is turned on, as shown in Fig. 7(a) for V TUN th = 0.12 V. C gd with infinite mobility is greater than with finite mobility in the linear region, as shown in Fig. 7(b). This is because the drain layer charge can be easily modulated by the drain voltage when carriers are accumulated and the drain layer is extremely conductive.
In the saturation region, lateral conduction induces higher C gd compared to the asymptotic curve, which may be a concern for an enhanced Miller effect for practical operation.
With the appropriate modeling of both current and charge, intrinsic RF characteristics can be determined. The RF performance is illustrated by evaluating the cutoff frequency f T using the equation
To study the RF performance of vertical TFETs, a comparison of vertical tunnel devices (µ = 250 cm²/Vs for both drain and source layer) with varying gate length (100 and 20 nm) is made in Fig. 8 by plotting color maps of f T overlaid with I ds -V ds characteristics. The f T color map is useful to reveal the physical limiting factors for high performance devices as well as to determine the optimal operating region. As illustrated in Fig. 8, a drastic improvement in intrinsic cutoff frequency is accomplished for the vertical TFET design by shrinking the gate length (from 350 GHz to 2 THz). To further explain the enhanced intrinsic f T with shrinking gate length, transconductance (g m ) and input capacitance (C gg ) are plotted versus gate bias at fixed V ds = 0.3 V in Fig. 9. An improvement in transconductance g m (by a factor of approximately 1.5 at 0.3 V) and a decrease in capacitance (from 1.7 fF/µm to 0.5 fF/µm at 0.3 V) together contribute to the improvement in cutoff frequency. The improvement in transconductance originates from the predominance of lateral conduction when µ = 250 cm²/Vs, as stated in Section V. The high capacitance value of the 100 nm vertical TFET mainly originates from the nondepleted drain layer in saturation, which degrades the speed of the vertical TFET. Shrinking the gate length effectively shrinks the area that accumulates carriers in the vertical TFET structure, and consequently leads to a much lower input capacitance. In contrast, for lateral TFET structures, transconductance and capacitance are more weakly dependent on gate length [16]. It is noteworthy that scaling the gate length down will not necessarily improve the transconductance g m , especially when vertical conduction becomes the limiting mechanism. However, C gg should always scale down with decreasing gate length.
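The quoted numbers can be cross-checked with a short calculation. The sketch below assumes the common first-order expression f_T = g_m / (2π C_gg), which may differ in detail from the cutoff frequency equation used in this paper, and it assumes that the quoted 350 GHz applies at the same bias as the quoted capacitances, which the text does not state explicitly. The function names are illustrative.

```python
import math


def cutoff_frequency(gm: float, cgg: float) -> float:
    """First-order intrinsic cutoff frequency, f_T = gm / (2*pi*Cgg)."""
    return gm / (2.0 * math.pi * cgg)


def implied_gm(ft: float, cgg: float) -> float:
    """Transconductance implied by a given f_T and input capacitance."""
    return 2.0 * math.pi * ft * cgg


# 100 nm device: f_T = 350 GHz with Cgg = 1.7 fF/um (values quoted in the text).
gm_100 = implied_gm(350e9, 1.7e-15)      # S per um of width
# 20 nm device: gm about 1.5x higher and Cgg = 0.5 fF/um.
gm_20 = 1.5 * gm_100
ft_20 = cutoff_frequency(gm_20, 0.5e-15)
print(f"estimated f_T at 20 nm: {ft_20 / 1e12:.2f} THz")
```

Under these assumptions the estimate comes out near 1.8 THz, which is in line with the roughly 2 THz quoted for the 20 nm device. About a factor of 3.4 of the improvement comes from the lower capacitance and a factor of 1.5 from the higher transconductance.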
In the asymptotic limit of infinite mobility, transconductance g m and input capacitance C gg will both scale linearly with gate length. The intrinsic cutoff frequency thus becomes independent of the gate length, L g . Despite the drastic improvement in intrinsic f T from gate length scaling, extrinsic elements will ultimately become the bottleneck for improvement. To appropriately take these into account, extrinsic elements such as parasitic capacitances and contact resistances are tabulated in Table 2. A schematic diagram showing the relationship between the elements and the device structure is also pictured in the inset of Fig. 10. The parasitic capacitance values are estimated according to Boeuf et al. [28], with the device dimensions also listed in Fig. 10. Note that channel length L g has little effect on parasitic capacitance (less than 10% when varying from 20 to 100 nm), and we have chosen the largest value, obtained when L g = 100 nm. The cutoff frequency with varying gate length at V gs = V ds = 0.2 V in Fig. 10 clearly illustrates the limiting effect from extrinsic elements at shorter gate length. The extrinsic f T evaluated at 20 nm reads 800 GHz at this bias condition, which is desirable for low-power, high-frequency applications. The vertical tunnel FET cutoff frequency at V gs = V ds = 0.2 V is 600 GHz. This can be compared to a 25-nm gate length finFET with f T = 300 GHz reported in [29] with slightly higher parasitics of 0.5 fF/µm. The finFET uses a higher bias of V gs = 0.5 V and has higher power consumption.
VII. CONCLUSION
In this article, the effect of distributed physics on vertical tunnel FETs is studied for dc, ac, and RF characteristics. Because of the competing mechanisms of lateral and vertical conduction, there exists an optimal channel length for the highest dc current. The distributed physics of the vertical TFET also results in a high capacitance in saturation for long channel lengths, because the drain layer is not fully depleted. A shorter gate length improves the RF performance by drastically reducing the capacitance without degrading the transconductance, which is different from the lateral TFET design case. Even after the inclusion of extrinsic elements, the cutoff frequency can still be improved significantly by shrinking the gate length, which is desirable for low-power, high-frequency applications.
