ABSTRACT Even though the hysteresis in the gate transfer characteristics of two-dimensional (2D) transistors is a frequently encountered phenomenon, the physics behind it are up to now only barely understood, let alone modeled. Here, we demonstrate that the hysteresis phenomenon can be captured accurately by a previously established non-radiative multiphonon model describing charge capture and emission events in the surrounding dielectrics. The charge transfer model is embedded into a drift-diffusion based TCAD simulation environment, which was adapted to 2D devices. Our modeling setup was validated against measurement data on a back-gated single-layer MoS 2 transistor with SiO 2 as a gate dielectric. We use the modeling approach to gain a thorough understanding of the hysteresis, which will help to control this problem in future devices.
I. INTRODUCTION
Molybdenum disulfide (MoS 2 ) is a 2D material of the large group of transition metal dichalcogenides (TMDs), which has received a lot of attention over the past few years. It combines a superior electrostatic control over the channel due to its ultimate thinness, leading to a sufficient suppression of shortchannel effects [1] , with an inherent, comparatively large transport band gap [2] . This renders it an ideal candidate for applications in digital electronics [3] - [5] , as it enables high current on/off ratios and a large transconductance [6] .
However, up to now, MoS 2 based field effect transistors (FETs) have not met the high expectations for example when judging device performance in terms of mobilities. When accurately accounting for the non-negligible contact resistances [7] , the mobility extracted for MoS 2 layers using multi-terminal measurements at room temperature does not exceed about 100 cm 2 /Vs [8] . Besides that, while this mobility value lies in a range comparable with standard silicon technology, the large variability observed in the device characteristics and performance issues like the frequently observed hysteresis in the gate transfer (I D (V G )) characteristics [9] - [13] and the typically large drifts of the threshold voltage (V th ) over time [14] remain critical obstacles inhibiting industrial applications of MoS 2 FETs.
Complementing our previous experimental works on the hysteresis and drift of 2D FETs [14] - [16] , here we present a detailed study on the main mechanisms governing the hysteresis phenomenon in the I D (V G ) characteristics of MoS 2 FETs. Our study is based on a drift-diffusion TCAD model [17] , which is adapted to 2D devices in the first part of this paper using experimental results to test our model.
In the second part the degradation of the devices due to charge trapping in the underlying gate dielectric is discussed by coupling the model to a four-state non-radiative multiphonon (NMP) model [18] . This model was originally developed and established for charge transfer reactions in silicon (Si) technologies [19] , [20] and is applied here to devices based on 2D materials such as MoS 2 . Our results confirm that the ubiquitous charge trapping at oxide traps is the main reason for the hysteresis in MoS 2 FETs [14] and provide new insights into the details on the occurrence of a hysteresis. We demonstrate that the hysteresis is caused by the same defect bands in silicon dioxide (SiO 2 ) that also govern the degradation in silicon devices. However, the degradation in MoS 2 -based devices is more pronounced due to the unfavorable band alignment of the SiO 2 defect bands to the band edges of MoS 2 .
II. DEVICES AND MEASUREMENTS
A description of the device fabrication and measurement techniques we use here to calibrate our model have been reported in detail elsewhere [14] . For the sake of completeness, we give a short summary of the details which will be important for understanding the simulation results later on. We study a back-gated MoS 2 FET on a 90 nm thick thermal SiO 2 serving as a back gate dielectric. The FET consists of a single layer (SL) flake of MoS 2 (around 6.5 Å), obtained via mechanical exfoliation and selection in an optical microscope [21] . The titanium/gold (Ti/Au) electrodes for the source and drain contacts were deposited using electron beam lithography and metal evaporation [9] , forming a device with the dimensions W = 6.8 µm and L = 1 µm. As a final fabrication step, the device was annealed in vacuum for 12 hours (< 5 × 10 −6 Torr, T = 120 • C) in order to reduce the contact resistances and to remove adsorbed impurities. While the sweep time t sw is an important parameter since it determines the time scale on which the hysteresis occurs, the primary property of the voltage step size V is that it limits the measurement resolution of the hysteresis widths.
III. SIMULATION OF INITIAL DEVICES
In this section the general simulation methodology using drift-diffusion based TCAD [17] is verified against measured characteristics. The drift-diffusion equations [22] can be used because the lateral dimensions of our devices are in the micrometer range (W × L ≈ 7.0 µm 2 for the device discussed here). As a consequence, there is a large number of scattering centers in the channel region resulting in scattering-dominated drift-diffusion charge transport. In several recent works [23] - [25] compact models describing devices based on 2D channel materials with drift-diffusion equations have been developed. The usage of drift-diffusion based TCAD has many advantages, among them a high computational efficiency [26] and a good adaptability to variations in device design.
The agreement between measurement and simulation obtained with our method is presented in Fig. 1 . Our model is able to capture all aspects of the I D (V G ) and the I D (V D ) characteristics on a logarithmic scale as well as on a linear scale.
Critical for this good agreement is the correct choice of material parameters which are given in Table 1 . The list of parameters is divided into two sections. The first section contains material constants which are extracted from the band structure calculated with density functional theory. The band structure of TMDs in the single-layer form has been studied in detail by the group of Thygesen [2] , [27] , [28] , [30] , [33] , [39] . The shape of the band structure and thus also the parameters of this section depend on the dielectric surroundings and the layer number [39] , thus the data in Table 1 refers only to the case of SL-MoS 2 on SiO 2 .
The second section contains material parameters which are strongly influenced by defects in the channel region and are therefore processing dependent or which are determined by VOLUME 6, 2018 973 the choice and the processing conditions of the metal contacts [7] , [35] . For these parameters we only give meaningful ranges according to literature, within which the values should be chosen. At the current stage of research, where there are no generally acknowledged and standardized processing conditions, these parameters have to be treated as fitting values and have to be adjusted to each device separately. The impact of the most important material parameters on the I D (V G ) characteristics is illustrated in Fig. 2 . The central Fig. 2(c) demonstrates the quality of the established fit and Fig. 2(a) shows the effects of the mobility. The mobility (μ) is degraded by impurities in the channel, which act as scattering centers and determine the overall current level (Fig. 2(a) ).
While the mobility is the most important parameter of the drift-diffusion model, the impact of interface defects was considered by using the standard Shockley-Read-Hall (SRH) model [40] , coupled to the drift-diffusion based TCAD simulator [17] . The density of interface traps (D it ) strongly affects the subthreshold slope ( Fig. 2(b) ). This parameter has been studied in detail by Takenaka et al. [38] , who associated the typical density of interface traps observed for MoS 2 FETs with sulfur vacancies in the MoS 2 layers.
Contrary to standard CMOS devices, MoS 2 FETs are accumulation devices and the switching process is governed by the modulation of the Schottky barriers at source and drain [41] , [42] . The work function difference between the metal contacts and the MoS 2 layer is an important parameter as it defines the Schottky barrier height. The impact of the work function difference is illustrated in Fig. 2(e) , demonstrating that already small variations of this parameter can decrease the saturation current level (I D,sat ) by several orders of magnitude.
In Fig. 2(d) the impact of different models for describing the current transport across Schottky barriers is shown. In general one distinguishes between thermionic emission (TE), thermionic-field emission (TFE) and field emission (FE), depending on whether the thermionic current over the barrier or the tunneling current through the barrier dominates [43] , [44] . The TFE model, where the total current over the Schottky barrier is given by the sum of the thermionic component and the tunneling component is the most accurate model and was used in our work. It was 974 VOLUME 6, 2018 stated previously [7] , [45] that the reduced Schottky barrier width in a 2D layer gives rise to large tunneling currents, which is confirmed by Fig. 2(d) , where the tunneling component dominates. In summary, our modeling approach for the characteristics of 2D based devices combines the drift-diffusion equations, which describe the scattering dominated transport through the channel, with a model for the current modulation by the Schottky barriers at source and drain, similar to the one outlined by Appenzeller et al. [42] and Penumatcha et al. [46] . With this approach a versatile model is obtained, which can serve as a first step towards enabling TCAD-aided device design.
IV. DEFECT MODELING
Having successfully developed a model to describe the current through a fresh device we now take the next step towards modeling the hysteresis. In order to see a shift in the threshold voltage between the up sweep and the down sweep of an I D (V G ) curve, there has to be charge trapping in the vicinity of the channel. While several groups claim that charge trapping takes place at the interface [12] , [47] , [48] , we argue here in accordance with our previous works [14] - [16] that the fact that the largest hysteresis is observed for the largest sweep time is a strong argument in favor of oxide traps, as they usually have larger time constants than interface traps. For example, according to the SRH model with a band gap of 2.2 eV SL-MoS 2 provides a maximum time constant of 1 µs for a typical capture cross section of 1 × 10 −15 cm 2 .
Moreover, oxide traps are located at a finite distance from the interface, the most important ones for the charge transfer processes lying typically within the first few nanometers [49] . This leads to an increased bias dependence, as observed in the hysteresis in MoS 2 FETs. The enhanced bias dependence of charge trapping in the oxide in comparison to charge trapping at the interface is illustrated in Fig. 3 . Interface traps provide trap levels inside the band gap, thus as the Fermi level sweeps across the band gap for increasing gate voltages, the interface traps become discharged. This charging process is five orders of magnitude faster than the sweep time, thereby effectively reducing the subthreshold slope. Once the Fermi level reaches the conduction band edge (roughly at V G ≈ V th ), it remains pinned there due to the high concentrations of injected electrons. To a first approximation there are no trapping and detrapping events at interface states above threshold voltage, which is confirmed by comparing the charge state of the interface traps in Fig. 3(b) and 3(c) . However, for increasingly positive gate voltages, the oxide defect band is bent downwards, leading to more trapping events in the oxide, consistent with the experimentally observed increase in the hysteresis for increased high levels of the gate voltage V G,H [16] . The charge capture and emission events for gate voltages above the threshold voltage cause an effective shift of the whole I D (V G ), which produces the observed hysteresis. For the modeling of the hysteresis we use the four-state non-radiative multiphonon (NMP) model, which accurately describes charge transfer reactions in conventional Si/SiO 2 devices [18] . It does not only account for the energy of the transferred electrons, as it is usually done when using the SRH model [40] , but it also considers the energetic relaxation of the structure around the defect, where the electron is captured or emitted [18] , [52] . Depending on the microscopic nature of the defect, which has been studied in great detail for SiO 2 based on Si/SiO 2 FETs [53] , [54] , one usually speaks either of hole or of electron trapping. As the charge transfer process is exactly the same in both cases, the two processes can only be distinguished by the charge change of the trapping defect in the oxide, which either goes from positive to neutral (hole/ donor-like trap) or from neutral to negative (electron/ acceptor-like trap). Therefore, the difference between electron traps and hole traps is only visible in an offset of the transfer characteristic, as it only changes the balance of fixed charges.
Thus, in order to explain the hysteresis in MoS 2 FETs we use the two defect bands of SiO 2 known from silicon technologies [20] , [50] , [51] , with the first being a donor-like hole trapping band located at E L T ±σ E L T = 4.6±0.3 eV below the conduction band edge of SiO 2 [20] , and the second being an acceptor-like electron trapping band at E U T ±σ E U T = 3.0± 0.2 eV below the conduction band edge of SiO 2 . The second defect band has already been observed for Si-based devices with dielectric gate stacks [50] , [51] and has been used in our previous works for the modeling of the hysteresis and of biastemperature instabilities in FETs based on MoS 2 [14] , [16] and black phosphorus [15] . The first defect band is located in the lower half of the band gap of SL-MoS 2 and does not contribute to the degradation in nMOS devices, where the Fermi level only scans across the upper half of the band gap. Therefore, only the second defect band has to be taken into account and is shown in Fig. 3 .
Besides the bias dependence of the charge trapping process the second important aspect is the temporal behavior of the responsible defects, as determined by the four-state If for V G < V th it holds that τ e < τ c and for V G > V th it holds that τ e > τ c , this trap can in principle contribute to the hysteresis. In Fig. 4 (a) and (b) the simulated hysteresis for two different frequencies is shown and compared to measurement data. There were two subsequent measurement rounds to demonstrate that the hysteresis is a reproducible phenomenon and there is no general drift of the characteristics interfering with the hysteresis. Our simulation results clearly corroborate the previously observed [14] PBTI-like hysteresis. PBTI stands for positive bias temperature instability and PBTI-like hysteresis means that the down sweep is shifted to higher threshold voltages just as the threshold voltage is increased under positive stress bias. When comparing Fig. 4(a) and (b) there is an apparent dependence of the hysteresis width on the sweep frequency. In fact, the traps contributing to the hysteresis at different frequencies are not the same. The time for one up-sweep or one down-sweep (2t sw = 1/f ) sets to a first approximation an upper limit (τ cut ) for the time constants of the contributing traps. If either τ e < τ cut at the maximum negative V G or τ c < τ cut at maximum positive V G , this trap can change its charge state for the first time towards the end of the up sweep and for the second time towards the end of the down sweep. If this condition is not fulfilled, it is very unlikely that this trap changes its charge state at all during the sweeping process. In addition, the trap also must not change its charge state too quickly. To be precise, it cannot contribute to the hysteresis if it captures a charge before the hysteresis width is measured at V G = V th . This corresponds to meeting the condition τ c > t(V th ) at threshold voltage, with t(V th ) being the sweep time until threshold voltage is reached.
In Fig. 4 (c) and (d) the bias dependence of the time constants of all traps fulfilling all of the above mentioned criteria and thus causing the hysteresis are displayed. The electron capture time shows a larger bias dependence than the emission time. At shorter sweep times, less defects meet the criteria, reducing the hysteresis width.
To quantify the hysteresis phenomenon, we extracted the hysteresis width at the threshold voltage for all measured sweep frequencies. In Fig. 5 the hysteresis width is shown as a function of the sweep frequency for the measurements and for the simulations. The error bars denote the minimal measurement errors given by the voltage stepping used in the measurements. Additionally, the offset between subsequent measurement rounds is displayed to make sure that no permanent shift of the characteristics interferes with the hysteresis. Good agreement between measured and simulated data is obtained, clearly demonstrating that our simulation methodology captures the time behavior of the hysteresis phenomenon correctly. 
V. CONCLUSION
A general drift-diffusion based TCAD simulation methodology for devices based on 2D materials such as MoS 2 was established. By coupling a non-radiative multiphonon model to this simulation framework, charge capture events at oxide traps can be described. As oxide traps show a larger bias dependence and larger time constants than interface traps, 976 VOLUME 6, 2018
we concluded that oxide traps are responsible for the hysteresis observed in the I D (V G ) characteristics of back-gated SL-MoS 2 FETs. The NMP model correctly captures the sweep time dependence of the hysteresis width, confirming the suitability of our approach.
