ABSTRACT: Understanding the operation mode of a two-dimensional (2D) material-based field-effect transistor (FET) is one of the most essential issues in the study of electronics and physics. The existing Schottky barrier-FET model for devices with a global back gate and metallic contacts overemphasizes the metal-2D contact effect and cannot explain the widely observed residual conductance. Here, an accumulation-mode FET model, which directly reveals 2D channel transport properties, is developed based on a partial top-gate MoS2 FET with metallic contacts and a channel thickness of 0.65 to 118 nm. The operation mechanism of an accumulation-mode FET is validated and clarified by carefully performed capacitance measurements. A depletion capacitance-quantum capacitance transition is observed. From the analysis of the MoS2 accumulation-mode FET, we confirm that most 2D-FETs show accumulation-mode behavior. A universal thickness scaling rule of 2D-FETs is then proposed, which provides guidance for future research on 2D materials.
INTRODUCTION
An accurate understanding of the operation mechanism of electronic devices is critical, especially for new channel materials, because the extraction of physical properties and the further control of the device characteristics are based on the operation mode. In recent years, transition metal dichalcogenide (TMD) field-effect transistors (FETs) have attracted significant attention due to their potential application in ultimately scaled devices. [1] [2] [3] [4] [5] Typical TMD-FET devices are composed of metallic source and drain contacts, and the metal/channel interfaces are under gate control, that is, a typical global back gate structure, as shown in Figure 1a. One of the key performance-limiting factors in 2-dimensional (2D) FETs is the 2D/metal contact. 6 Based on this idea and on the historical background of similar structures for carbon nanotubes, 7 ultrathin silicon on insulator (SOI), 8 and silicon nanowire FETs, 9 the Schottky barrier FET (SB-FET) model has been proposed and developed to explain the 2D-FET operation mechanism. [10] [11] [12] Since tunneling transport at the SB junction is dominant in this model, studies on achieving low contact resistance by choosing metal types, inserting van der Waals materials, and so on [13] [14] [15] have been promoted. The most important success of the SB-FET model is the explanation of the ambipolar behavior. However, this model oversimplifies the channel effect in many cases. Although the carriers injected from the contact will inevitably be scattered in the commonly used micrometer-long channel, scattering is often neglected in the SB-FET model. Moreover, the residual conductance observed in most multilayer 2D-FETs above a critical thickness [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] cannot be explained by the SB-FET model alone, suggesting a 2D depletion nature. 18, 28 These contradictions suggest the existence of an additional operation mode focusing on the channel properties.
Here, to clarify this channel depletion-related operation mode, we focus on partial top-gate FETs with ohmic metallic contacts, where the 2D/metal contact is not modulated, as shown in Figure 1b. 4, 32, 33 This type of device structure is often explained as an accumulation-mode (ACCU) FET in Si nanowires, 34 as shown in Figure 1a, where the gate controls the on and off states via accumulation and depletion of the majority carriers in the partial gate region. The unipolar behavior is achieved due to p/n junction formation. However, the accumulation-mode mechanism in 2D-FETs and Si nanowires has not been systematically investigated.
The fundamental technique to directly detect carrier density and interface states in semiconductors is the capacitance-voltage (C-V) measurement, [35] [36] [37] which provides critical insights to elucidate the 2D-FET operation mechanism. Although C-V measurements are quite informative, blindly applying this method, which was established for a bulk metal-oxide-semiconductor (MOS) FET/capacitor, to an ACCU-FET can lead to incorrect conclusions. Experimentally, C-V measurements on small-area 2D materials are very sensitive and always suffer from several difficulties. 5, [38] [39] [40] [41] A systematic study of the parasitic capacitance resulting from an n⁺-Si/SiO2 substrate and of the channel resistance effect in C-V is necessary. Theoretically, the quantum capacitance in monolayer MoS2 has been clarified in our previous work. 33 However, a study of the capacitance transition from monolayer to bulk MoS2 is still lacking.
In this work, mechanically exfoliated MoS2 with a thickness from 0.65 nm (monolayer) to 118 nm is selected as the channel material for top-gate FET devices. Systematic C-V and I-V measurements are carried out on the same samples. For C-V, the parasitic capacitance is totally suppressed by using a quartz substrate. Frequency dispersion for low-mobility thin 2D channels mainly comes from the channel resistance effect. A transition from quantum capacitance (CQ) to depletion capacitance (CD) is observed from monolayer to bulk MoS2. Having clarified the electrostatic field-effect control mechanism of carriers by C-V, the electrical transport data are explained by the ACCU-FET model for all channel thicknesses. The thickness scaling rule is proposed based on the ACCU-FET mechanism, which provides a complete picture of the transport properties for most 2D materials.
RESULTS AND DISCUSSION
I-V characterization; increase in on-state conductivity and residual conductance. Figure 1a, b shows a schematic drawing and optical image of the Al2O3 top-gate MoS2 FET on the insulating quartz substrate. Figure 2a shows the typical conductivity (σ)-top gate voltage (VTG) characteristics at VDS = 0.1 V for MoS2 thicknesses (tMoS2) of 0.65, 16, 44, and 58 nm. It should be noted that σ is normalized by the channel width and length but not by the thickness of the MoS2 flakes. Monolayer MoS2 shows clear off and subthreshold regions. Two distinct features are observed with increasing MoS2 thickness. One is the increase in the on-state conductivity for the 16-nm-thick sample, which gradually saturates for thicker MoS2 samples. The other is the abrupt increase in the residual conductance for the 44-nm-thick sample.
To focus on these two features, the maximum conductivity and the ratio of on-state to off-state current (ION/IOFF) are shown in the range of tMoS2 = 0.65 to 118 nm in Figure 2b. The maximum conductivity is controlled by the conductivity of the for the 58-nm-thick sample. Therefore, it is discussed similarly to the mobility analysis in MoS2. Instead of intrinsic phonon scattering, Coulomb scattering due to interfacial impurities is found to be dominant in the scattering mechanism of ultrathin MoS2. [42] [43] [44] The extrinsic Debye length (LD) is used here as the screening length of Coulomb scattering since most 2D materials are inherently charged by defects and impurities:

$$L_D = \sqrt{\frac{\varepsilon_0 \varepsilon_{MoS_2} k_B T}{e^2 N_D}} \qquad (1)$$
εMoS2, kB, T, and e are defined as the dielectric constant of MoS2 in the direction normal to the basal plane, the Boltzmann constant, the temperature, and the elementary charge, respectively, and ε0 is the vacuum permittivity. ND is the density of donors (density of acceptors NA for p-type 2D materials). 2×LD is used in the following discussion to account for both the top and bottom interfaces. MoS2 with tMoS2 > 2LD is screened from the interfaces, so the maximum conductivity saturates. Consideration of the quantum-mechanical effect on the accumulation capacitance (CA) would give a more accurate carrier distribution in thick MoS2 flakes. 45 For ION/IOFF, two regions are clearly observed: ION/IOFF > 10⁵ for tMoS2 ≈ 0.65-35 nm and ION/IOFF < 10 for tMoS2 > 60 nm. The transition occurs at tMoS2 ≈ 48-55 nm. For an ACCU-FET, 46 the conduction comes from "body current flow", which is modulated by the depletion region in the channel. The screening length (λACCU-FET) is determined by the maximum depletion width (WDm), which can be expressed as follows:

$$\lambda_{ACCU\text{-}FET} = W_{Dm} = \sqrt{\frac{4 \varepsilon_0 \varepsilon_{MoS_2} k_B T \ln(N_D/n_i)}{e^2 N_D}} \qquad (2)$$
where ni is the intrinsic carrier density. λACCU-FET is independent of the oxide capacitance (Cox). An increase in the residual conductance occurs (e.g., the 44-nm-thick sample in Figure 2a) when tMoS2 becomes close to WDm because the gate control is screened. The present data indicate that WDm is ~48-55 nm. It should be noted that this WDm is roughly consistent with that in the previous data for global back gate MoS2 FETs. 18, 19 Generally, the global back gate 2D layered channel FET has been considered an SB-FET. In the case of SB-FETs, the off-state is achieved by controlling the barrier height at the MoS2/metal contact, independent of the channel thickness. The thickness-dependent residual conductance observed here is therefore inconsistent with the SB-FET model. Moreover, for global back gate 2D devices, the degradation of the subthreshold swing (S.S.) with increasing channel thickness has been claimed as evidence for SB-FETs. 12 However, a similar degradation of S.S. is clearly observed here due to the reduction in gate control when tMoS2 ~ WDm. In the following discussion, we use "bulk" for MoS2 with tMoS2 > WDm and "multilayer" for tMoS2 < WDm for simplicity.
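As a rough numerical check of the two length scales discussed above, the sketch below evaluates the textbook expressions for the extrinsic Debye length and the maximum depletion width. It is a minimal illustration rather than the authors' code: the function names are hypothetical, the dielectric constant (6.3) and donor density (2.5×10¹⁷ cm⁻³) are taken from the values quoted later in the text, and the intrinsic carrier density of 10⁹ cm⁻³ is an assumed placeholder that the source does not state.

```python
import math

# Physical constants in SI units.
E_CHARGE = 1.602176634e-19   # elementary charge, C
K_B = 1.380649e-23           # Boltzmann constant, J/K
EPS_0 = 8.8541878128e-12     # vacuum permittivity, F/m


def debye_length(eps_r, n_d_cm3, temp_k=300.0):
    """Extrinsic Debye length in metres: sqrt(eps*kB*T / (e^2 * N_D))."""
    n_d = n_d_cm3 * 1e6  # cm^-3 -> m^-3
    return math.sqrt(EPS_0 * eps_r * K_B * temp_k / (E_CHARGE ** 2 * n_d))


def max_depletion_width(eps_r, n_d_cm3, n_i_cm3, temp_k=300.0):
    """Maximum depletion width in metres (textbook strong-inversion limit)."""
    n_d = n_d_cm3 * 1e6  # cm^-3 -> m^-3
    log_term = math.log(n_d_cm3 / n_i_cm3)
    return math.sqrt(
        4.0 * EPS_0 * eps_r * K_B * temp_k * log_term / (E_CHARGE ** 2 * n_d)
    )


if __name__ == "__main__":
    eps_r = 6.3       # bulk MoS2 value quoted in the text
    n_d = 2.5e17      # cm^-3, middle of the quoted 2-3e17 range
    n_i = 1e9         # cm^-3, ASSUMED placeholder (not given in the source)
    print(f"2*L_D = {2 * debye_length(eps_r, n_d) * 1e9:.1f} nm")
    print(f"W_Dm  = {max_depletion_width(eps_r, n_d, n_i) * 1e9:.1f} nm")
```

With these inputs the sketch gives LD of about 6 nm and WDm of about 53 nm, inside the 48-55 nm window reported above. WDm depends only logarithmically on the assumed ni, so the placeholder has a modest effect on the result.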
C-V characterization; CQ & CD. To gain further insight into the operation mechanism of ACCU-MoS2 FETs, C-V measurements in the frequency range of 1 kHz to 1 MHz are also conducted for the same devices. It should be noted that MoS2 flakes with a large area (> 30 μm²) were selected to improve the signal-to-noise ratio in the capacitance measurement. The full equivalent circuit used to model the top-gate MoS2-FETs is shown in Figure 3a. Here, Cpara1 and Cpara2 are the two types of commonly observed parasitic capacitance. Raccess is defined as the sum of the MoS2/metal contact resistance and the MoS2 resistance at the access region indicated in Figure 1b, and it is constant. Rchannel is the MoS2 channel resistance just below the top gate electrode and is modulated by the top gate bias. Cit and Rit are the capacitance and resistance of the interface states, respectively, which account for the carrier capture and emission processes. CD(A) and CQ are the focus of this paper, and they determine the carrier density in the conduction band (CB). RD is the resistance that models the supply of carriers to the depletion layer when CD(A) dominates the capacitance.
Several pitfalls are first discussed for MoS2-FET-based C-V. The first pitfall is the parasitic capacitance effect, which comes from the widely used n⁺-Si/SiO2 substrate. As indicated previously, 47 in the double-gated geometry, there is capacitive coupling between the back and top gates through the large contact pad area, which induces a large parasitic capacitance. Cpara1 refers to the parasitic capacitance that is charged or discharged through the constant Raccess. This induces a large frequency dispersion (>Cox) in C-V and corresponding peaks in the conductance-frequency (Gp/ω-f) measurement (Supplementary Figure S1). Cpara2 refers to the parasitic capacitance that could shift the baseline of the C-V curve. A quartz substrate is used in this paper to totally remove these parasitic capacitances (Supplementary Figure S2). In this situation, Cox can be obtained in the strong accumulation region when no frequency dispersion is observed.
The second pitfall is the access resistance effect, which could induce frequency dispersion in the accumulation region in C-V. Raccess is experimentally measured as the residual impedance at the high-frequency limit in the strong accumulation region, where the other resistances are shunted. The measured Raccess is on the order of ~10 kΩ in most of the samples because MoS2 is naturally n-doped and the contact resistance with Ni is low. As shown in Supplementary Figure S3 and Note S1, the Raccess effect in our measured frequency range can be neglected since Raccess is smaller than ~5×10⁴ Ω. We have to mention that Raccess can still severely affect capacitance measurements at low temperature and for other 2D materials with higher resistance. Now, the equivalent circuit can be simplified (Figure 3a), where the experimentally measured source/drain-to-gate capacitance is defined as Ctotal. Since the parasitic capacitance is totally removed by using the quartz substrate, the minimum capacitance plateau observed for tMoS2 = 58 nm results from the contribution of CD at WDm. That is, the inversion layer is formed, resulting in a constant depletion width. The electrical communication still passes through free electrons at the edge of the depletion region because a p-n junction is formed between the inversion layer and the ungated n-channel region. This C-V curve is consistent with that of a 1-μm-thick MoS2 capacitor, 38 which also supports that WDm is shorter than tMoS2 = 58 nm. This cannot occur in an SB-FET but is unique to the depletion behavior in an ACCU-FET. As a result, an undepleted MoS2 layer always remains, which results in residual conductance and low ION/IOFF in I-V. On the other hand, for monolayer MoS2, CQ contributes to Ctotal instead of CD. It originates from the partially occupied density of states (DOS) of the CB modulated by the Fermi energy (EF) in the Fermi-Dirac distribution. 48, 49 Distinct from CD, one of the main features of CQ is that it decreases exponentially with respect to EF when EF is modulated within the band gap. Due to the large band gap of MoS2, CQ can reach a small value, which results in an extremely low carrier density. This is experimentally observed as a decrease to almost zero in C-V (Figure 3b) and a clear subthreshold/off region in I-V (Figure 2a). Although Figure 3c, d shows the transition from CQ to CD, it is somewhat complicated. Therefore, it will be discussed later with the help of the quantitative analysis.
Frequency dispersion by channel charging effect in C-V. Before considering the transition from CQ to CD with increasing tMoS2, Rchannel is discussed since it could induce frequency dispersion in the depletion region in C-V. Shockley-Read-Hall (SRH) theory is the basis for studying the carrier capture and emission process by traps. 50 Based on this theory, a series Rit-Cit network is modeled in the equivalent circuit, and experimental impedance spectroscopy always tries to capture this Rit-Cit-induced signal by excluding other capacitance or resistance effects. Large frequency dispersion is widely observed in the capacitance measurement of thin MoS2 and other 2D-FETs. 5, 33, 39, 41 It is often treated as an Rit-Cit-induced signal. However, other resistance effects could also introduce frequency-dependent signals. Rchannel is always parasitic in the FET structure and cannot be avoided. In this section, the Rchannel effect is studied quantitatively. Monolayer MoS2 is selected here because it shows the largest frequency dispersion and the simplest CQ expression. Experimentally, when a resistance exists in the equivalent circuit, it forms an R-C circuit with a characteristic time constant (τ). Ctotal will decay from C1 for ωτ > 1, where ω is the angular frequency. τRch and τit are defined as the time constants from Rchannel and Rit, respectively. Figure 4b shows the measured Ctotal as a function of frequency (C-f) at different VTG for the monolayer device in Figure 3b. The clear decay of Ctotal at a specific frequency indicates that the capacitance is limited by one type of resistance.
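The decay of the measured capacitance for ωτ > 1 can be illustrated with a lumped series R-C branch. This is a deliberate simplification of the distributed channel treated in the text, and the function name and the example R and C values below are hypothetical.

```python
import math


def parallel_capacitance(c_farad, r_ohm, freq_hz):
    """Equivalent parallel capacitance of a series R-C branch.

    The admittance is Y = j*w*C / (1 + j*w*R*C), so the value an LCR meter
    reports in parallel (Cp-G) mode is Cp = Im(Y)/w = C / (1 + (w*tau)^2).
    """
    omega = 2.0 * math.pi * freq_hz
    admittance = 1j * omega * c_farad / (1.0 + 1j * omega * r_ohm * c_farad)
    return admittance.imag / omega


if __name__ == "__main__":
    c1 = 1e-12   # 1 pF, illustrative value only
    r = 1e7      # 10 MOhm, illustrative value only (tau = R*C = 10 us)
    for f in (1e3, 1e4, 1e5, 1e6):
        ratio = parallel_capacitance(c1, r, f) / c1
        print(f"{f:9.0f} Hz  Cp/C1 = {ratio:.3f}")
```

In this lumped picture the capacitance falls to half of its low-frequency value at ωτ = 1, which is why a single decay step in a C-f curve points to one dominant resistance.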
τRch is derived from a transmission line model 51, 52 as follows (Supplementary Figure S4, Note S2):
where L is the channel length and RS,channel is the sheet resistance of the MoS2 channel. The drift current model 49 is applied to express RS,channel. Because the channel is on the order of micrometers in length and the drain bias is small, the diffusion current is negligible. Moreover, the drift current model describes free carrier transport in the conduction band, which enables us to correlate C-V with I-V in the next part. Thus,

$$R_{S,channel} = \frac{1}{e\, n_{ch}\, \mu} \qquad (4)$$

where nch is the channel carrier density and μ is the drift mobility. Cit and μ are extracted from the I-V characteristics 33 (Supplementary Figure S5). A higher Cit means that more states need to be charged, which results in a larger τRch. On the other hand, τit is calculated based on SRH theory 53 in a 2-dimensional system as follows:

$$\tau_{it} = \frac{1}{\sigma_{capture\text{-}2D}\, v_{th}\, n_{ch}} \qquad (5)$$
where σcapture-2D is the capture cross section of the interface states, which largely depends on the type of interface states. For point defects (e.g., sulfur vacancies), it would be close to the atomic size of ~0.3 nm. For band tail interface states induced by bond bending of Mo-d orbitals, 33 it could be on the order of 10 nm. 54 Therefore, σcapture-2D is assumed to be in the range of 0.3 to 10 nm. vth is the thermal velocity of ~1.2×10⁷ cm/s at room temperature, obtained by taking the electron effective mass of monolayer MoS2 as m* = 0.6 m0, where m0 is the electron mass in vacuum.
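The quoted thermal velocity can be checked with a short sketch. It assumes the two-dimensional estimate vth = sqrt(2 kB T / m*), which is one common form and reproduces the stated magnitude, and it assumes that the capture time takes the SRH-like form 1/(σ vth nch) with the 2D cross section expressed as a length. The function names and the carrier density used in the example are hypothetical.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
M_0 = 9.1093837015e-31    # electron rest mass, kg


def thermal_velocity_2d_cm_s(m_eff_ratio, temp_k=300.0):
    """Thermal velocity sqrt(2*kB*T/m*) in cm/s (ASSUMED 2D form)."""
    v_m_s = math.sqrt(2.0 * K_B * temp_k / (m_eff_ratio * M_0))
    return v_m_s * 100.0  # m/s -> cm/s


def capture_time_s(sigma_nm, v_th_cm_s, n_ch_cm2):
    """Capture time 1/(sigma * v_th * n_ch); sigma is a length in a 2D system."""
    sigma_cm = sigma_nm * 1e-7  # nm -> cm
    return 1.0 / (sigma_cm * v_th_cm_s * n_ch_cm2)


if __name__ == "__main__":
    v_th = thermal_velocity_2d_cm_s(0.6)
    print(f"v_th = {v_th:.2e} cm/s")
    n_ch = 1e12  # cm^-2, illustrative carrier density only
    for sigma in (0.3, 10.0):  # nm, the range assumed in the text
        print(f"sigma = {sigma:4.1f} nm -> tau_it = {capture_time_s(sigma, v_th, n_ch):.2e} s")
```

Because the capture time scales as 1/nch, it inherits the exponential dependence of the carrier density on EF.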
The calculated time constant as a function of EF is shown in Figure 4c. τRch with Cit is ~3 orders of magnitude larger than τit and is distributed across the measured frequency range of 1 kHz to 1 MHz, which indicates that the time constant due to the channel charging effect is the origin of the frequency-dependent capacitance behavior in Figure 4b. It is noted that both τit and τRch with Cit have a similar exponential EF dependence because both include the parameter nch. VTG is calculated as follows:
VTG,mid-gap is a fitting parameter to compensate for the MoS2 n-doping effect. Equation (6) will be used to correlate EF with VTG. Later, we will study multilayer MoS2, where the surface potential ψS is used instead of EF/e. Cit is included in equation (6) since the interface states always respond to the direct current (dc) VTG. The simulation reproduces the experimental data quite well (Figure 4d, e), suggesting that the experimentally observed frequency dispersion in C-V does not result from the electron capture/emission process at the interface traps but from the channel charging effect. Based on the above study of the Rchannel effect, let us review our previous work on the C-V study of monolayer MoS2. 33 CQ with a clear temperature dependence is correctly extracted since it is obtained in the strong accumulation region, where Rchannel is shunted. Although the band-tail type energy distribution of the interface states remains qualitatively valid, the widely used high-low frequency method in 2D-FET-based C-V 5, 33, 39, 41 will not reveal the true Cit value quantitatively because the extracted time constant is in fact τRch instead of τit.
To provide guidance on how to avoid the channel charging effect in all 2D-FET-based C-V for thicknesses from monolayer to multilayer, a universal expression is derived. The region where CQ << Cox should be considered since the Rchannel effect is severe there due to the low carrier density. We assume that Cit is smaller than CQ; that is, attention should always be paid to improving the interface. In this case, C1 = CQ. Then, based on equations (3, 8) and the definition σ = e nch μ, τRch will have a constant maximum, which is similar to τRch without Cit in the monolayer case (Figure 4c). This is because the nch terms in CQ and RS,channel cancel each other. This constant maximum is shown as follows:
The maximum τRch is shown as a function of μ for various L in Figure 4f. To avoid the channel charging effect, 1/τRch should lie above the measured frequency range. For L = 1 μm, the allowable μ can be as low as 1 cm² V⁻¹ s⁻¹. However, due to both experimental difficulty and the small signal-to-noise ratio, L is usually in the range of 5 to 20 μm in our samples. In this case, μ is very important. μ is usually low in monolayer 2D materials, i.e., < 100 cm² V⁻¹ s⁻¹ at room temperature, while multilayer 2D materials have a higher μ, which has the potential to avoid the channel charging effect. This has been confirmed in our 16-nm-thick device, which shows suppressed frequency dispersion in Figure 3d. On the other hand, for graphene-based FETs, this effect can usually be neglected due to the extremely high μ, which accounts for the recently observed frequency dispersion-free Cit in a bilayer graphene/h-BN/graphite heterostructure.

Moreover, channel resistance-induced frequency dispersion is totally suppressed because Rchannel is shunted by the unmodulated conductive MoS2 region, which results in a low charging resistance RD. Thus, the equivalent circuit can be simplified as a lumped circuit, and the conventional CD(A) analysis method can be applied. This enables us to extract parameters such as ND and εMoS2 of bulk MoS2. The minimum CD is given as $C_{D,min} = \varepsilon_0 \varepsilon_{MoS_2} / W_{Dm}$. By considering that WDm is 48 to 55 nm and the minimum CD is ~0.1 μF/cm², the bulk εMoS2 is extracted as 6.3. This is roughly consistent with the calculated bulk εMoS2 in the z direction. 56 Based on equation (2), ND is determined to be 2-3×10¹⁷ cm⁻³. With these parameters, by using the conventional CD expression (Equation (11) in Supplementary Note S3) and equation (6) without Cit, the C-V of 58-nm-thick MoS2 is fitted (Figure 5a). The simulated C-V fits well with the experimental data. The slight deviation is due to Cit-induced distortion and the stretch-out effect.
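The extraction of the bulk dielectric constant from the minimum depletion capacitance can be sketched as follows, assuming the parallel-plate relation between the minimum CD and WDm. The function name is hypothetical, and the inputs are the rounded values quoted in the text, so the output is only expected to land near the reported 6.3.

```python
EPS_0 = 8.8541878128e-12  # vacuum permittivity, F/m


def dielectric_constant_from_cmin(c_min_uf_cm2, w_dm_nm):
    """Relative permittivity from C_min = eps0*eps_r/W_Dm (parallel-plate form)."""
    c_min_f_m2 = c_min_uf_cm2 * 1e-6 / 1e-4  # uF/cm^2 -> F/m^2
    w_dm_m = w_dm_nm * 1e-9                  # nm -> m
    return c_min_f_m2 * w_dm_m / EPS_0


if __name__ == "__main__":
    for w_dm in (48.0, 55.0):  # nm, window reported in the text
        eps_r = dielectric_constant_from_cmin(0.1, w_dm)
        print(f"W_Dm = {w_dm:.0f} nm -> eps_r = {eps_r:.2f}")
```

For the quoted window this gives roughly 5.4 to 6.2, close to the extracted value. The remaining difference is within the rounding of the quoted ~0.1 μF/cm².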
CD-CQ transition always occurs in multilayer MoS2 FET-based C-V. Firstly, free electrons at the edge of the depletion region still communicate electrically with S/D through the ungated n-channel region. By modulating VTG negatively, the depletion width will reach tMoS2 (16 nm). As a result, the electrical communication in C-V occurs between S/D and the very small density of free electrons in the "depletion region". Based on this scenario, when the depletion width reaches tMoS2, the CD-CQ transition can be considered to occur, since the carrier density in the "depletion region" is then controlled by CQ. Therefore, the C-V curve goes to zero even for the multilayer. After the whole channel is depleted, the surface potential continues to increase as VTG is decreased further. Finally, the inversion layer will be formed. However, the inversion capacitance corresponding to the p-branch in I-V cannot be observed because of the p-n junction, as schematically illustrated in Figure 5b. Now, let us reproduce the C-V curve for tMoS2 = 16 nm by simple analytical calculation. Since the expression for CD(A) is already obtained, the surface potential (ψs) is calculated in order to obtain the expression for CQ of multilayer MoS2. The boundary condition of zero electric field at z = tMoS2 is used for the solution of the one-dimensional Poisson equation. This is the intrinsic condition for the present 2D-FET structure, where the channel is always surrounded by the insulator or another insulating environment. The calculated potential distribution is shown in the inset of Figure 5b. When the surface potential is changed by ∆ψs, the potential changes everywhere in the channel (∆ψz) by the same value, that is, ∆ψz = ∆ψs. This indicates that the whole channel can be fully controlled by ψs and that ψs plays a role similar to that of EF in the monolayer CQ in modulating nch. With the calculated potential distribution, CQ is shown below (Supplementary Note S3):
where NQ is a constant independent of ψs. It is not surprising that CQ in the multilayer has a form similar to that in the monolayer case, with the same exponential e/kBT dependence. 49 Then, using CQ and CD(A) without Cit, the experimental data are well fitted, as shown in Figure 5b. The cross point indicates the transition from CD to CQ at tMoS2 = WD. The slight deviation from the analytical CQ comes from the contribution of Cit. In Figure 3b-e, the transition from CD to CQ is clearly seen with decreasing MoS2 thickness. Moreover, it is interesting that the frequency dispersion is observed only in the CQ-dominant region. This is because the charging resistance RD is low enough in the CD(A)-dominant region, while Rchannel is quite high in the CQ-dominant region.
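The transition condition tMoS2 = WD can be illustrated with the textbook depletion approximation, in which free carriers inside the depletion region are neglected. This is a hedged sketch rather than the expression used in Supplementary Note S3: the function names are hypothetical, surface potentials are treated as magnitudes, and the inputs (εMoS2 = 6.3, ND = 2.5×10¹⁷ cm⁻³, tMoS2 = 16 nm) are taken from the values quoted in the text.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS_0 = 8.8541878128e-12     # vacuum permittivity, F/m


def depletion_width_m(psi_s_volt, eps_r, n_d_cm3):
    """Depletion width for a surface potential magnitude |psi_s|."""
    n_d = n_d_cm3 * 1e6  # cm^-3 -> m^-3
    return math.sqrt(2.0 * EPS_0 * eps_r * abs(psi_s_volt) / (E_CHARGE * n_d))


def full_depletion_potential_v(thickness_nm, eps_r, n_d_cm3):
    """|psi_s| at which the depletion width equals the flake thickness."""
    n_d = n_d_cm3 * 1e6      # cm^-3 -> m^-3
    t = thickness_nm * 1e-9  # nm -> m
    return E_CHARGE * n_d * t ** 2 / (2.0 * EPS_0 * eps_r)


if __name__ == "__main__":
    psi = full_depletion_potential_v(16.0, 6.3, 2.5e17)
    print(f"|psi_s| at W_D = t_MoS2 (16 nm): {psi * 1e3:.0f} mV")
    print(f"check: W_D = {depletion_width_m(psi, 6.3, 2.5e17) * 1e9:.1f} nm")
```

Under these assumptions a 16-nm-thick flake is fully depleted once the surface potential magnitude reaches roughly 0.09 V, far below the value needed for inversion, which is consistent with the CD-to-CQ crossover appearing before any minimum-capacitance plateau.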
A large advantage of C-V on the FET structure is that it directly estimates the carrier density involved in the transport measured by I-V. In contrast, for C-V on the capacitor structure, the potential distribution in the channel is affected by the additional back metal contact, and the fully depleted channel cannot be obtained. 57 Having theoretically calculated all of the components in Ctotal, it is possible to further reproduce the I-V characteristics by introducing the drift current model. nch is calculated as $n_{ch} = \int (C_Q/e)\, d\psi_S$. The VTG-ψs relation is again calculated from equation (6).

Figure 6. With scaling of the 2D thickness from bulk to monolayer, three regions are observed. They are divided by WDm and 2LD. Different types of 2D materials are shown here as a function of ND (NA). Most of the 2D materials summarized here were prepared by mechanical exfoliation, which gives a relatively stable ND (NA). This situation might be different when using synthesis approaches.

The drift current model used above gives a complete picture of carrier band transport under the gate field-effect condition. Now, the transport mechanism is divided into three regions as a function of tMoS2. When tMoS2 > WDm, the channel is only partially controlled by the gate and shows band transport. The existence of residual conductance is the sign of this region. When 2LD < tMoS2 < WDm, the channel is fully controlled with optimized mobility because interfacial Coulomb scattering is screened. Band transport also dominates in this region. When tMoS2 < 2LD, band transport is still dominant at room temperature, but it often suffers from mobility degradation due to prominent interfacial Coulomb scattering. In the subthreshold region at low temperature, transport via localized states, such as hopping, will become dominant. [58] [59] [60] It should be noted that both WDm and 2LD are independent of Cox, which enables us to propose the thickness scaling rule of transport properties for various 2D materials as a function of ND (NA) (Figure 6). The 2D materials summarized here have a band gap of 1 to 2 eV and similar dielectric constants. A WSe2 global back-gate FET on a SiO2 (90 nm)/n⁺-Si substrate was fabricated and characterized for comparison (Supplementary Figure S6). Although the observation of both n- and p-branches is explained by the SB-FET model, the increase in the maximum conductance and residual conductance with increasing WSe2 thickness also reveals ACCU-FET behavior. As mentioned before, the transport properties of the present top-gate MoS2 FET are consistent with those of global back gate MoS2 devices. Therefore, almost all of the data on WDm and LD in Figure 6 are obtained from global back gate devices in the previous literature. In the high ND (NA) region (>10¹⁹ cm⁻³), WDm decreases substantially, resulting in a small thickness window for "fully controlled band transport", that is, full depletion. In fact, full control of the channel is lost when the 2D thickness becomes greater than WDm. Moreover, the control is degraded further when heavy doping effects such as band gap narrowing are considered. 61 This explains why well-controlled FETs with high ION/IOFF are difficult to achieve in recently studied heavily doped 2D materials such as PtS2, PtSe2, SnS, and SnSe. Meanwhile, 2LD is scaled down to a thickness of just several atomic layers. This strong electrostatic confinement effect combined with increased ND (NA) will introduce strong scattering. Band transport is difficult to achieve in atomically thin flakes of these heavily doped 2D materials, and Anderson localization has been suggested to occur. 62 Moreover, in terms of the 2D/metal contact, heavily doped 2D materials generally show low contact resistance because of the thin Schottky barrier width.
From the above analysis, well-controlled doping approaches for 2D crystals are in great demand for improving the performance of 2D ACCU-FETs.
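The three-region picture can be summarized as a small classification helper. This is only an illustrative sketch: the function name and regime labels are hypothetical, the example WDm of 50 nm lies in the 48-55 nm window reported above, and the LD of 6 nm is an assumed value that the text does not quote directly.

```python
def transport_regime(thickness_nm, debye_length_nm, w_dm_nm):
    """Classify a 2D channel by comparing its thickness with W_Dm and 2*L_D."""
    if thickness_nm > w_dm_nm:
        # Gate cannot deplete the whole body: residual conductance remains.
        return "partially controlled band transport"
    if thickness_nm > 2.0 * debye_length_nm:
        # Fully depletable, and the interior is screened from the interfaces.
        return "fully controlled band transport"
    # Fully depletable, but interfacial Coulomb scattering degrades mobility.
    return "fully controlled, interface-scattering limited"


if __name__ == "__main__":
    w_dm = 50.0   # nm, inside the reported 48-55 nm window
    l_d = 6.0     # nm, ASSUMED illustrative value
    for t in (0.65, 16.0, 44.0, 58.0, 118.0):
        print(f"t = {t:6.2f} nm -> {transport_regime(t, l_d, w_dm)}")
```

Because both boundaries depend on ND (NA) but not on Cox, the same comparison applies to other 2D semiconductors once their doping density and dielectric constant are known.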
CONCLUSION
In conclusion, the top-gate MoS2-FETs are found to operate in accumulation mode, and their operation mechanism is clarified by capacitance measurements performed with special precautions. The ACCU-FET study here provides a new platform and analytical model for the electronics and physics of novel nanomaterials. Moreover, the universal thickness scaling rule of 2D-FETs is proposed in terms of WDm and LD, which is applicable to most semiconducting 2D materials.
METHODS
MoS2 films were mechanically exfoliated onto the insulating quartz substrate from natural bulk MoS2 flakes. Ni/Au was deposited as the source/drain electrodes. Then, Y metal with a thickness of 1 nm was deposited via thermal evaporation from a PBN crucible in an Ar atmosphere with a partial pressure of 10⁻¹ Pa, followed by oxidation in the laboratory atmosphere to form the buffer layer. 63, 64 The Al2O3 oxide layer with a thickness of 10 nm was deposited via atomic layer deposition, followed by the formation of the Al top-gate electrode. Raman spectroscopy and atomic force microscopy (AFM) were employed to determine the flake thickness. I-V and C-V measurements were conducted using a Keysight B1500 and a 4980A LCR meter, respectively. All electrical measurements were performed in a vacuum prober with a cryogenic system.
Supplementary Figure S4. 1 Drift mobility is slightly higher than the conventional experimentally extracted field-effect mobility because Cit reduces the carrier controllability of the gate. However, field-effect mobility extraction is still one of the fastest ways to study carrier transport properties. The drift mobility μ is assumed to be independent of EF, with a constant value, for simplicity. For the monolayer MoS2 studied here, μ = 2.2 cm² V⁻¹ s⁻¹. This is a rough assumption because μ usually depends on the carrier density through the screening effect. However, the dominant factor in determining IDS is the carrier density rather than the drift mobility, especially in the subthreshold region. This explains why a good I-V fit is obtained even with a constant μ.

Note S2. MoS2 channel resistance effect on capacitance measurement. The transmission line model has been applied to study the channel resistance effect on C-V for Si MOSFETs. 2, 3 Here, the transmission line model is also used for the MoS2 FET to model the Rchannel effect (Supplementary Figure S4). Note that the substrate is insulating in the MoS2 FET, which simplifies the mathematical expressions of the equivalent circuit because charge supply from the substrate can be neglected. Assume that all variables are phasor quantities. v0 refers to the small ac variation. RS,channel refers to the sheet resistance of Rchannel. i refers to the current from one side (source or drain) of the electrode, so the total current from both sides, iD,S, is 2×i. Firstly,
Differentiating eq. 2 with respect to x and substituting it into eq. 3, 
where $\gamma = \sqrt{j \omega R_{S,channel} C_1}$. The solution of eq. 5 is
Based on eq. 2 at x = 0, the source/drain current is:
The propagation constant λ and the channel resistance-limited time constant τRch are given below:
So the experimentally measured equivalent parallel capacitance and conductance, Ctotal and Gtotal, are:
Note that EF, which is often used in the discussion of monolayer MoS2, is unsuitable for multilayer and bulk MoS2 since the potential ψ changes from the surface to the body. Instead, the surface potential ψS is used. By neglecting free carriers in the depletion region, the depletion layer width is given as:
For bulk MoS2 (thickness > 55 nm), WDm is obtained when ψS saturates in the strong inversion region. For multilayer MoS2 (thickness < 35 nm), the transition from depletion capacitance to quantum capacitance occurs when WD reaches the MoS2 body thickness. The transition condition for ψS is given below:
By integrating eq. 19 over z from 0 to tMoS2 and replacing ψz with eq. 18, we get the total free electron density per unit area in the channel as:
To simplify the calculation of NQ, the capacitance continuity condition is finally used. That is
