Channel length scaling in graphene field effect transistors (GFETs) is key in the pursuit of higher performance in radio frequency electronics for both rigid and flexible substrates. Although two-dimensional (2D) materials provide a superior immunity to Short Channel Effects (SCEs) than bulk materials, they could dominate in scaled GFETs. In this work, we have developed a model that calculates electron and hole transport along the graphene channel in a drift-diffusion basis, while considering the 2D electrostatics. Our model obtains the self-consistent solution of the 2D Poisson's equation coupled to the current continuity equation, the latter embedding an appropriate model for drift velocity saturation. We have studied the role played by the electrostatics and the velocity saturation in GFETs with short channel lengths . Severe scaling results in a high degradation of GFET output conductance. The extrinsic cutoff frequency follows a 1⁄ scaling trend, where the index fulfills ≤ 2. The case = 2 corresponds to long-channel GFETs with low source/drain series resistance, that is, devices where the channel resistance is controlling the drain current.
Introduction
Graphene field-effects transistors (GFETs) are expected to become very relevant in radio frequency (RF) electronics [1] [2] [3] because of the exceptional intrinsic properties of the graphene: a carrier mobility over 10 5 cm 2 V -1 s -1 and a saturation velocity of about 10 8 cm s -1 [4] . Theoretical studies have predicted GFETs to be able to work above the THz regime [5] . Relevant figures of merit (FoMs) in the field of high frequency electronics are the cutoff frequency and the maximum oscillation frequency max , the latter even more important than the former in practical applications. Reaching frequencies of 1 THz in these FoMs would mean large improvements in fields such as high speed communications or medicine [6] . GFETs have experimentally demonstrated a of 427 GHz for a 65-nm-long channel [7] , which competes with other RF transistors. The record currently belongs to GaAs based high electron mobility transistors (HEMT), which have shown a of 688 GHz at a channel length of 40 nm [8] . Besides, GFET maximum f T has increased at a high pace during the last years. By contrast, max has only reached 105 GHz for GFETs [9] , while InP HEMTs have surpassed 1 THz [10] . In spite of the better RF performance of HEMTs and Si transistors, their application is limited to rigid electronics since the substrates can only withstand strains up to 1.08% [11] .
Graphene presents high strain limits (up to 25%), so this two-dimensional (2D) material promises to stand out in the field of flexible electronics [12] .
The upscaling of these FoMs of RF electronics implies a progressive shrinking of the channel length.
Depending on the circuit application, the GFET is biased at the appropriate DC point that, for instance, general, SCEs worsen the FoMs and can have an important effect, as we will demonstrate along this work.
This way, a model taking into account these effects becomes essential to make realistic predictions on the RF FoMs of scaled GFETs. Fabrication of short channel GFETs remains a major challenge because of the degradation of graphene mobility and a contact resistance comparable to channel resistance. Different approaches based on the self-alignment process can be applied to fabricate ultra-fast transistors, e. g. with a nanowire gated GFET [13] , a pre-deposited gold protection layer on graphene [9] or transferring the whole gate stack [7] .
Since the fabrication of the first GFET [14] , multiple analytical models have been developed to describe its transistor effect [15] [16] [17] [18] [19] [20] . These models give insight into the physics of graphene but they do not take account of the 2D electrostatics. In this work, we have developed a model that solves self-consistently the 2D Poisson's equation together with the current continuity equation in order to analyze short channel transistors using a drift-diffusion transport theory, which is valid as long as the transistor remains out the ballistic regime [21] . This means that the channel length must be larger than the mean free path of the carriers within graphene, estimated to be 10-100 nm [22] [23] [24] . The solution of the 2D potential equation allows for SCEs to be investigated, including the electrostatic influence of the drain in determining the carrier distribution along the channel. We have not considered impact ionization or hot carrier effects in the model. Impact ionization could result in an "up-kick" of the current-voltage characteristics, implying a worsening of [25] . In that case the FoMs predicted by our model should be regarded as an upper limit. For their part, hot carriers may arise in graphene from photoexcitation [26] or from a very thin insulator as in the graphene base transistor [27] , but none of these characteristics are present in the GFETs we are considering. High electric fields could also provide high energy to carriers and this would imply a quasi-ballistic conductance that falls out of the scope of our model. Nevertheless, we find a good agreement between current curves from our numerical model (without impact ionization) and the experiment of a 70 nm short-channel GFET [28] (see section 3.5). Thus, we suspect that the strength of impact ionization and hot carriers depends not only on the critical field but on the quality of the graphene sample and substrate interaction effects. This model is also able to reproduce the negative differential resistance (NDR), observed experimentally in GFETs [29, 30] , which is a valuable property for many applications [29] .
The paper is divided as follows: in Section 2, we explain the model and the way RF behavior is calculated. 
Methods

GFET model
As other models, our method of analyzing the electrical behavior of GFETs takes account of the basic physical principles of thermal statistics, electrostatics and current continuity [31, 32] . However it adds the contribution of 2D electrostatics coupled to the one dimensional (1D) drift-diffusion transport equation. Figure 1 (a) presents the physical structure of the double-gate GFET considered in this work. The graphene sheet plays the role of the active channel between the source and the drain. The device width ( ) along the -axis is large enough to consider that the transistor is uniform in that direction. To get the electrostatic behavior, Poisson's equation needs to be solved in a 2D region as the one depicted in figure 1 (b). Direction goes from the back gate to the top gate, while direction extends from source to drain along the channel length ( ). Poisson's equation then takes the following form [33] :
where free is the free charge density, is the relative dielectric permittivity of the medium and 0 is the vacuum permittivity; is the electrostatic potential, which is directly related to the local position of the Dirac energy = − , and is the elementary charge. In figure 1 (b Von Neumann homogeneous boundary conditions are assumed at the rest of domain boundaries. At the boundaries of graphene with the source and drain, von Neumann conditions ensure charge neutrality since the electric potential profile can float at those points [35] .
Our method includes the 1D expression of the drain current in the diffusive regime, which results from the quasi-Fermi potential (or local Fermi energy) = − gradient [31] :
The values and correspond to electron and hole mobilities, respectively, and can vary along the channel. In equation (2), a single quasi-Fermi level has been considered for both electrons and holes. This is a valid approximation for graphene since generation/recombination times are very short (1-100 ps) [36] [37] [38] and thus electron and hole quasi Fermi levels cannot deviate too much from each other [31] . The boundary conditions state that ( ) must be zero at the source ( = 0) and at the drain ( = ). The difference between electrostatic and electrochemical potentials is called the chemical potential = − , which is directly related to the carrier concentration inside graphene. To calculate electron and hole mobilities under high fields, a standard velocity saturation model has been considered under the form:
where 0 and 0 are the low-field mobilities for electrons and holes respectively and their values include the influence of the surface scattering or any other scattering mechanism, sat, and sat, are saturation carrier velocities, − ⁄ is the electric field along the direction, and and are adimensional saturation coefficients for both kinds of carriers. This model was proposed for silicon transistors [39] and it has been successfully applied to graphene transistors [4, 19] . It is important to note that must fulfill the continuity condition, so, if we consider stationary situations, must be constant along the channel
Our method solves equations (1) and (2) in a self-consistent way starting from an initial guess of the graphene charge density 0 ( ) (with 0 < < ). The algorithm is fully detailed in the appendix. It is able to find the drain current , the electrostatic and electrochemical potentials ( , ) and ( ), and the carrier distributions, ( ) for electrons and ( ) for holes, for any given bias point.
High frequency performance calculation
GFET high frequency performance has been studied through the following FoMs: , the frequency at which the current gain becomes unity; max , the frequency at which the power gain becomes unity; and the intrinsic voltage gain . These three magnitudes can be extracted from a small signal model explained in ref. [40] and can be calculated from the following expressions:
As defined in ref. [40] , the parameters used in equations (4)- (7) are the transconductance = ⁄ , the drain conductance = ⁄ , the gate-source capacitance = ch ⁄ , and the gate-drain capacitance = ch ⁄ . The intrinsic cutoff frequency , just refers to the intrinsic device. On the other hand, the extrinsic cutoff frequency , and max depend on the series resistances at source and drain, and . Moreover, max depends on the series resistance at the gate and the internal resistance . The latter has been neglected for the sake of simplicity. In order to obtain the capacitances and , we must calculate the total charge in the channel ch for each bias point by means of the following equation:
The aforementioned FoMs strongly depend on the bias conditions. In the supplementary material, we
show the values of the parameters , , and for a reference device (figure S1) and we explain how to select the bias point to get these FoMs (figure S2).
Results and discussion
In order to test the model, we have simulated a GFET fabricated on a 400 nm thick SiO 2 (on a highly doped Si wafer that serves as back gate), with a silicon oxide of 40 nm as top gate insulator. Table 1 summarizes the reference GFET parameters used in this work. The temperature was taken as 300 K along this study. Section 3.5 benchmarks our model against experimental results of a GFET with a 70 nm long channel from ref. [28] . Table 1 . GFET parameters used through this work. In case one parameter is varied, it is specified in the text. In ref. [41] , S. Hen-Jan et al. found that, when the channel length is short, the Dirac voltage (the gate voltage that corresponds to the minimum current for a constant ) shifted towards lower voltages when increasing . They related this to a weak gate control in short-channel devices. We expected to reproduce this result but Dirac voltages always shifts towards higher voltages with this model -see figure 2 (d), (e) and (f). It shows that the follows the rule = 0 + 1 2 ⁄ that is usually found [30] .
Parameter
To get some insight into the physics of the GFET, we have represented in figure 3 the chemical potential and energy profiles along the channel for different lengths at several bias points. In all cases, the top-gate voltage has been assumed to be 1.2 V. For the long-channel GFETs at low drain bias no pinch-off occurs in the channel, that is, all carriers are of the same type (electrons in this case) and thus does not change its sign. As the drain voltage increases, holes start to appear close to the drain. When the drain voltage is 0.6 V, the pinch-off point (where = 0) is approximately in the middle of the channel. For very short channels (30 nm), a drain voltage as low as 0.2 V is enough to cause pinched-off in the channel. In this case, carrier concentrations close to the source and drain are also much higher. Due to the short channel, the drain (e) and (f), for the mentioned channel lengths, respectively. For = 1 µm, the curves are compared with the results obtained by the 1D model in ref. [17] , using the same parameters (dashed curves).
voltage affects the electrostatic potential within the entire device, even in the region close to the source. So the SCEs in GFETs manifest in the following ways: (1) an increase of the carrier concentration close to the source and drain, (2) a more abrupt chemical potential slope along the channel and (3) a shift of the pinchoff point inside the graphene towards the middle of the channel.
In figure 4 , the electrostatic potential distribution is represented for several channel lengths. All three devices show the pinch-off point around the middle of the channel. The potential profile varies as channel length scales down: the isopotential lines (black lines) accumulate close to the graphene for shorter channel lengths. The higher electric field in the graphene leads to the higher carrier concentrations and the more abrupt transitions of around the pinch-off. To end this subsection, it is worth noting that punch-through effect is not observable in GFETs. In Si based devices, it occurs when source and drain depletion regions touch each other and causes a total loss of gate modulation of drain current [42] . Graphene channels, by contrast, present no depletion regions.
3.2 Impact of top oxide thickness and permittivity thickness and its relative permittivity are increased, keeping the same EOT. We would expect that this did not affect the -and -curves, as it happens with long-channel Si-based devices [42] . In the Figure 5 . Influence of top oxide parameters on current characteristics. In all graphs, filled symbols correspond to GFETs with = 100 nm and the parameters given in Table 1 . Open symbols show the effect of increasing on transfer characteristics (a) and output characteristics (d). Graphs (b) and (e)
show the effects when is increased. The results represented in (c) and (f) depict the effects of increasing both and but maintaining the same EOT.
supplementary material ( figure S4 ) it is shown that this is the same case for long-channel GFETs, so this dependence shown in figures 5 (c) and (f) has to do with SCEs.
Effect of carrier velocity saturation
In order to assess the influence of the velocity saturation, we have simulated devices assuming a constant low-field carrier mobility in equation (2) . Figure 6 shows the -characteristics and energy profiles at different gate voltages for channel lengths of 1 µm and 100 nm, comparing the results with those that include velocity saturation. Energy profiles coincide for long channel regardless of saturation, as shown in figure 6 (b). However, the velocity saturation clearly influences the energy profiles for short lengths. It results in a different carrier distribution within the channel that affects the -characteristics. In fact, the only effect of velocity saturation in long-channel devices is a reduction in of a factor = 1 + 0 ( ) ⁄ , which can be interpreted as an effective channel length eff = . By contrast, in shortchannel GFETs, there is an additional loss of current not explained just by this eff . Velocity saturation also modifies the carrier distribution along the channel, and should be taken into account for accurate calculations.
High frequency performance
In this sub-section, we deal with the impact that SCEs have in the RF behavior. Specifically, figure 7 depicts , , , , max and as a function of the channel length. We reach lengths as low as 10 nm, but always assuming that the device is working in a diffusive regime. The understanding of drift-diffusion at short channels could help in a further study of the transition to the ballistic regime [43] . In figure 7 (a), , is represented for two (0.1 and 0.6 V). The dashed black lines represent , , which is the physical limit that no cutoff frequency could surpass for GFETs [20, 31] . It comes out from the supposition that maximum group velocity achievable in graphene is the Fermi velocity , following the trend , ≈ (2 ) ⁄ . Dotted lines correspond to the extrapolation to lower channel lengths of the , calculated according to the 1D model from ref. [17] . It scales as ~1/ with = 2 since the transconductance is proportional to 1/ while the gate capacitance ( = + ) is proportional to . This trend is followed by , for channel lengths down to around 1 µm. Both and follow the expected trends for long channels (see figure S3 in the supplementary material). It is worth noting that the oxide capacitance dominates over the quantum capacitance in this device.
In short channel MOSFETs , scales as 1⁄ because of channel velocity saturation, making the transconductance independent of . In the GFETs we have studied, velocity saturation is not complete, causing ∝ 1⁄ with 0 < < 1, as shown in figure S3 of the supplementary material, and therefore
, ∝ 1/ with 1 < < 2. As the drain bias is reduced, the field in the channel decreases and the carriers are driven farther from velocity saturation, thus having a lower deviation from the 1/ 2 behavior -see the 0.1 V vs the 0.6V curve in figure 7(a) . Therefore, a low drain bias helps to reduce this deviation, so higher cutoff frequencies can be reached.
SCEs can be reduced at short channel lengths by an appropriate choice of : figure 7 (a) shows that, for = 10 nm, a reduction in the top insulator thickness can help improve the cutoff frequency. Decreasing from 40 nm to 20 nm (or even 10 nm) does not increase , at = 0.1 V since it is close to the physical limit. However it improves from 3.6 THz to 10 THz at = 0.6 V. Thus, a thin insulator is important to get the highest possible cutoff frequency. Similarly to silicon oxide in silicon-based devices, the lower limit of the insulator thickness will be set by the tolerable leakage current, which would affect both the power consumption and likely device reliability.
The extrinsic cutoff frequencies have been calculated assuming and of 200 Ω µm, which is currently a typical value of metal-graphene contact [44] , and a higher value of 5000 Ω µm. When series resistance is low, its effects start to be important, as compared to , , for channel lengths below 100 nm.
However, the case = = 5000 Ω µm implies that the current is dominated by the series resistance instead of the channel resistance. This reduces , reaching values near 1 for low drain bias. Interestingly, Yin et al. found a cutoff frequency that scales with = 2 [45] , which was attributed to a carrier transport limited by the channel resistance [46, 47] . Later, the same group found a behavior of = 1 [48, 49] , which was attributed to a current limited by the source/drain contact resistance [47] . A brief summary of the scaling discussion can be found in Table 2 .
Regarding max , this magnitude is represented for the reference GFET in figure 7 (c). The loss of transconductance combined with a poorer current saturation in short channel devices limits the highest frequencies that can be achieved. It turns out that a low drain bias is helpful to get the highest maximum ). This scaled gate series resistance shows much lower max at short channels and illustrates the importance of minimizing [20] , which can be done in practice by adopting a T-gate architecture [50, 51] . We find that max predicted by our model can go beyond 1 THz for channel lengths shorter than 50 nm. Table 2 . GFET cutoff frequency scaling.
shows maximum as a function of the channel length for the reference device. A maximum of around 7 dB can be found for a 1 µm GFET and then it degrades as becomes shorter. Such a degradation with has been experimentally found in previous works [52] and is mainly due to the loss of current saturation when the channel is made short.
Model comparison to experimental values
We have benchmarked the model of this work against an experimental 70 nm long GFET from ref. [28] .
To simulate the device, we have used the parameters extracted experimentally. They can be found in Table   3 . We have supposed a puddle concentration of 1.25·10 12 cm -2 , that is, a minimum carrier density, and a constant temperature of 300 K was taken. In figure 8 we represent both the simulation and the experimental and experiment show quantitative agreement even assuming in the simulation that the device is working in the drift-diffusion regime. We believe that the quality of the sample and/or interaction effects with the substrate is preventing the device to work in the ballistic regime. 
Negative differential resistance
We have found that the model can describe NDR behavior, which has been experimentally observed in GFETs [29, 30] . constant while the diffusion contribution starts to decrease. The hole contributions are still negligible, so this trend is the origin of the NDR. Then, the hole contribution starts to grow when the pinch-off moves away from the drain towards the source. The NDR effect is then caused because of the electron diffusion current reduction when the pinch-off point is around the drain. We have calculated the bias regime where the NDR comes out -shown in figure 9 (c) . Because of SCEs, some differences arise with respect to the simple 1D model used in ref. [30] . 
Conclusions
In this work, we have developed a model that self-consistently solves the 2D Poisson's equation and the 1D drift-diffusion transport equation to investigate both the stationary and frequency response of GFETs.
With this model we have studied SCEs, unveiling the role played by both the electrostatics and the velocity saturation effect. We have found that a degradation in both the transconductance and the output conductance in short-channel devices strongly affect the transistor RF behavior. We have found an extrinsic cutoff frequency scaling law that can be locally expressed as 1⁄ with 1 < ≤ 2, where the scaling index has the mixed signature of SCEs and series resistance. The case = 2 corresponds to long-channel GFET with low series resistance, while = 1 takes place when the current is dominated by the series resistance. At high applied drain bias and short channels, the scaling index could even be < 1, which severely limits the HF performance. Assuming state-of-the-art values for the source/drain and gate series resistances, we have found that cutoff and maximum oscillation frequencies above 1 THz needed a channel length well below 100 nm for our reference device. We have also studied the influence by the gate series resistance on the max optimization and we have demonstrated that the impact of SCEs could be reduced by an appropriate choice of the top oxide thickness. Finally, it has been checked that the model compares very well to short channel experimental results and that it is able to capture NDR behavior and we have explained its origin.
Appendix. Self-consistent algorithm
We start the explanation of our model by recalling the principles on which it is based. According to the thermal statistics of graphene, the sheet concentrations and can be calculated by the following equations [31, 32] :
In equations (A.1) and (A.2), is the elementary charge, is Boltzmann's constant, is the absolute temperature of the device, is the effective graphene sheet density of states, given by equation (A.3), and ℱ ( ) is the complete Fermi-Dirac integral with index , given by equation (A.4).
Here, ℏ is the reduced Planck's constant, is the Fermi velocity of carriers in graphene (10 8 cm/s) and Γ( ) is the gamma function. Equations (A.1) and (A.2) are deduced from the linear dispersion relation of graphene and the Fermi-Dirac thermal distribution [31, 53] .
In the Poisson's equation, equation (1), we have considered that the graphene has a thickness of = 0.6 nm and a relative dielectric permittivity of = 3.3 [54] . The free charge free ( , ) is then:
free ( , ) = � ( ) − < < 0 and 0 < < 0 inside the dielectrics (A.5)
Our method follows the steps as presented in the flow diagram of figure A.1. From an initial guess 0 ( ) (with 0 < < ), the system is solved for a bias point (fixed , and ). The initial guess is obtained from previous 1D long channel models [17] or from a solution of the same device at the same gate voltage but at a lower .
Charge distribution ( ) at the iteration is used in Poisson's equation to determine the electric potential profile ( , ) within the whole domain. Equation 1 is solved inside the 2D region of figure 1 (b) by the advanced Finite Element Method of the Matlab ® partial differential equation solver.
Then, the electrostatic profile ( ) = (0, ) is introduced into the drift-diffusion equation, equation (2), to determine the quasi-Fermi level profile ( ) together with the drain current , , which acts as a parameter in the first-order Ordinary Differential Equation (ODE). The integration of this equation is carried out by a first-order predictor-corrector method [55] . We use the shooting method to ensure that the quasiFermi potential reaches at = , using an adapted bisection algorithm to find the parameter , . 
If the algorithm has not converged yet, a new guess must be calculated to reintroduce it in Poisson's equation. This +1 ( ) is obtained from an optimized linear combination of ( ) and new, ( ) of previous iterations, following Pulay's method [56, 57] . The procedure to calculate it is thoroughly explained in the supplementary material. This way, the algorithm is able to find the drain current , the electrostatic and electrochemical potentials ( , ) and ( ), and carrier distribution, ( ) and ( ), for any given bias point. 
Pulay mixing
As stated in the main text, the initial guess of the iteration + 1, +1 ( ), is obtained from an optimized linear combination of ( ) and new, ( ) from the previous iterations by means of Pulay mixing [1, 2] .
This method was developed to accelerate the convergence of large systems of equations. First, the residual Δ must be defined.
We must recall that the initial guess ( ) is introduced in equations (1) and (2) in order to obtain the potential profiles ( ) and ( ). From these profiles, equations (A.1) and (A.2) allow us to obtain new, ( ) (see figure A.1). For the th iteration we must build a symmetrical × matrix according to this equation:
The indexes and run the iterations that precede iteration + 1, that is, , = − ( − 1), … , .
The linear combination of the previous iterations is obtained from the following equation:
where is a parameter of the method, fulfilling 0 < < 1. The coefficients are obtained from the inverse of the matrix according to:
The new initial guess is introduced again in equations (1) and (2) . The loop continues until the residual , defined in equation A.6, decreases below the tolerance min . Parameters of the method used in this work are summarized in Table S1 . Figure S1 (a) depicts the typical values of the transconductance for the reference device (Table 1) for a 100 nm channel length. This magnitude is represented as a function of for two different (0.1 and 0.6 V). In the same way, output conductance is represented in figure S1 (b). Figure S1 (c) shows both the source and drain capacitances ( and , respectively). They are compared with the top gate oxide capacitance , as calculated by = 0 ⁄ . The dashed line correspond to the top oxide capacitance . The reference GFET with parameters given in Table 1 was considered with = 100 nm.
Small-signal parameters of the GFET
Regarding the voltage gain , the extracted values, represented in figure 8 (d) , correspond to the maximum value at = 2 V.
Capacitance, transconductance and output conductance scaling
In order to get a deeper insight into the scaling behavior of GFET, we have represented in figure 1⁄ trend continues for at short lengths when the bias is as low as 25 mV, as it can be seen in red symbols of figure S3 (b), but it deviates for higher biases. Downscaling also produces a degradation of .
SCE are especially important at high bias (0.6 V): the transconductance saturates and even decreases for < 100 nm. At low bias, the output conductance saturates as scales, while it further increases at high bias. 
EOT influence on long-and short-channel devices
In the main text, we state that the current characteristics of a short-channel GFET are influenced not only by the gate capacitance but also by the top oxide thickness and permittivity value themselves, because of SCE. Transfer characteristics are shown in figure S4 , where we have compared two GFETs with the same EOT but with different oxide thickness and permittivity for both long channel (a) and short channel (b) cases. The long-channel transistor maintains its characteristics since the gate capacitance is kept constant. However, the drain current is influenced in a dissimilar way by either or in the short-channel device. 
