Abstract: The majority of nanocrystal floating gate research has been done at the device level. Circuit-level research is still in its early stages because of the lack of a physical device model appropriate for circuit simulations. In this study, a comprehensive and accurate SPICE-compatible physical equation-based model of nanocrystal floating gate devices is developed based on uniform direct tunnelling and Fowler-Nordheim tunnelling. The main contribution is a Verilog-A module that captures the physical behaviours of programming and erasing the device. A predictive NMOS model is then used for modelling the conduction channel to determine the behavioural I-V characteristics. The proposed model uses only explicit formulae resulting in fast computation appropriate for circuit simulation and can be used in any SPICE simulator supporting Verilog-A. It interacts dynamically with the rest of the circuit and includes charge leakage which enables power consumption analysis. The simulation results of the proposed model fit well to experimental results of various fabricated devices. Additionally, it is verified in HSPICE, demonstrating a significant speedup and good agreement with a numerical device simulator. This study is important in bridging the gap between device-and circuit-level research.
Introduction
Nanocrystal floating gate (NCFG) devices have not only drawn much attention in the non-volatile memory market but have also recently been proposed for use in other applications such as a tunable gain operational amplifier, programmable reference circuits and DRAM [1, 2] . Compared to continuous floating gate devices of equivalent dimensions they are less susceptible to stress-induced leakage current (SILC) because of the isolated storage nodes [3 -6] . This enables thinning of the oxide, which results in a more efficient charge transfer. The potential benefits can be summarised as: reduced operational voltage, reduced power consumption, higher programming/erasing speed and/or lower programming/erasing voltage with a comparable retention time to the continuous floating gate device [3 -8] . In addition, the reduction of drain to floating gate coupling, and the increased gate capacitance because of the thinner oxide, reduces the device susceptibility to drain-induced barrier lowering (DIBL). This improves channel length scalability of the NCFG device compared to its counterpart [3, 4, 9] .
To facilitate circuit designers using NCFG devices and enable them to explore the advantages of the aforementioned features of these devices, a compact physical device model for circuit simulation is required. Several NCFG device models have been proposed in the past. Many use a numerical approach either by incorporating Monte Carlo simulations [9 -11] or by solving Schrödinger -Poisson equations (4) - (8) which require long computation time.
Therefore these device models are not suitable for dynamic simulations of large circuits. Alternatively, a circuit simulation model is reported in [12] , which is easy to integrate in the existing SPICE-based infrastructure, but since it is not a physical device model it cannot be used for a circuit's power consumption computation.
The objective of this work is to provide a dynamic, accurate, configurable and fast computing model of NCFG devices, which is suitable for both DC and transient circuit simulations. It is behavioural in nature based on the physics of the device and no correction factors or curve fitting are used. This proposed model is parameterised in terms of process and geometry to provide configurability, which allows circuit designers to explore the trade-offs in speed, power and retention time for different design choices. Only analytical and physical equations are used to help reduce computation time. In addition, the proposed model interacts dynamically with the rest of the circuit throughout the entire transient circuit simulation. Therefore, charge leakage is included and power consumption of the entire circuit can be analysed. The accuracy of the proposed model is compared with a reference model with varying tunnel oxide thicknesses from Silvaco's device simulator ATLAS which uses accurate and robust numerical discretisation techniques [13] . It is also benchmarked to experimental results from a variety of fabricated nanocrystal (NC) devices in terms of threshold voltage shift, flat-band voltage shift and retention time. In addition, a run-time analysis of the proposed model is provided to demonstrate its relative fast computation. Application of this model in shifting the switching threshold of an inverter is then shown in an HSPICE DCsimulation.
Physical device model
The concept of the physical device model is shown in Fig. 1 . The block diagram consists of a Verilog-A module and a predictive NMOS transistor [14] . The model is approximated by assuming that the floating gate of the device can be treated as the gate of a traditional MOS device with very similar effects from the point of view of typical MOS characteristics, as in continuous floating gate devices [15] . Therefore the floating gate and threshold voltage of the device replace the gate and threshold voltage of an MOS transistor [16] . The input for the Verilog-A module is the control gate voltage V CG . The module contains the physical behaviour of programming and erasing the device, and it refreshes the voltage on the floating gate V FG dynamically with respect to time. This voltage is then applied to the gate terminal of any SPICE-compatible NMOS model, which is designed with an oxide thickness equal to the tunnel oxide thickness d tox of the NCFG device. This work uses a predictive NMOS transistor which is based on physical models, published data of new generation NMOS and model parameters from previous well-characterised technologies [14] . It predicts the channel I-V characteristics on advanced technology nodes which include emerging physical effects and correlations among model parameters [14] .
Algorithm for computing V FG
At the starting point t ¼ 0 s, it is assumed that there is no charge Q FG on the floating gate. V CG is the only input-state variable and a constant time step Dt is chosen by the user. Given this information the voltages across the tunnel oxide V tox and control gate oxide V cox are calculated, which is further explained in Section 2.3 along with the derivation of V FG . Next, the tunnel current densities in the tunnel oxide and the control gate oxide are computed. Depending on V CG and Q FG , tunnel current flows either to the floating gate or out of the floating gate as indicated in Fig. 2a . The total tunnel current density J total into the floating gate is given by
J in,sub2FG and J out,FG2sub represent the tunnel current densities through the tunnel oxide for both directions, whereas J in,CG2FG and J out,FG2CG represent the tunnel current densities through the control gate oxide for both directions.
The proposed model includes the carrier transport through the tunnel and control gate oxide because of uniform direct tunnelling (DT) and Fowler -Nordheim tunnelling (FNT) of electrons from conduction band to conduction band that can be observed in Fig. 2b . It shows the conduction band diagram of the designed device in the reference model from Section 3.1 with a platinum (Pt) control gate and a palladium (Pd) floating gate. DT and FNT are the dominant charge transfer mechanisms when the source and drain terminals of the device are grounded. If a voltage is applied to either one of them along with a relatively high-control gate voltage, the floating gate may also be programmed because of hot carrier injection (HCI). HCI is primarily localised in the drain or source terminal and is not applicable for uniform tunnelling [5] . It is not included in the proposed model and the model only applies for uniform tunnelling in the program mode with no lateral field across the channel, that is, negligible impact ionisation is present. This implies that the proposed model does not take charge transfer between the NCs into account, which may occur when the device is programmed because of HCI where the NCs that are in close proximity to the drain or source terminal are highly charged compared to the adjacent NCs. The physical device model assumes that all NCs are equally charged or discharged by utilising uniform tunnelling.
DT is dominant for ultra-thin oxides and low electric fields with qV tox , f b,Si and qV cox , f b,Pd , in the case of programming the device. The electrons tunnel through a rectangular SiO 2 barrier and the current density of DT is given by [15] 
with
where m si ≃ m o represents the mass of the free electron in the silicon and m ox,n ¼ 0.4m o [17] is the effective mass of the tunnelling electron in the oxide, which is assumed to be constant for both the control gate and tunnel oxides. f b is the conduction band barrier height between the interfaces of the materials, which are illustrated in Fig. 2b along with the uniform tunnelling mechanisms DT and FNT. For the Si/SiO 2 interface a barrier height of f b,Si ¼ 3.1 eV [17] is chosen. For the interface between the metals, and the SiO 2 it is determined by the difference between the work function (WF) of the metal, and the energy level corresponding to the SiO 2 conduction band with respect to the vacuum level, which is given by 0.9 eV [18] . Hence, the barrier height at the Pd/SiO 2 interface is f b,Pd ¼ 4.22 eV (assuming a Pd WF of 5.12 eV [19] ) and at the Pt/SiO 2 interface f b,Pt ¼ 4.75 eV (assuming a Pt WF of 5.65 eV [19] ). Note that the barrier height at the interface between the NCs and the oxide depends on the nature of the NC surface and may vary for different dot sizes [20] . The quantum confinement effect has an increasing impact on smaller dot sizes and shifts the NC energy level upward which lowers the barrier height [21] . The designer can then configure the barrier height and is not constrained to use the values given in this paper. For example, is reported in Guan et al. that reducing the dot size of nickel (Ni) NCs from 2 nm to 1 nm results in a shift of approximately 0.2 eV [22] . In this case the designer can reduce the barrier height from 3.6 eV [22] to 3.4 eV. FNT is applied for qV tox , f b,Si and qV cox , f b,Pd as well but plays the dominant role for high electric fields qV tox . f b,Si and qV cox . f b,Pd , or thicker oxides. Hence, the electrons tunnel through a triangular SiO 2 barrier and the current density is given by [15] 
Equations (2) - (5) account for erasing the device as well, where the barrier heights are now given by f b,Pd and f b,Pt . The next step is to compute the new Q FG after one time step. It is the sum of the charge after the previous time step plus the additional charge during the current time step [23] 
Area dot is the surface area of the NC and R dot is the floating gate surface portion covered by the NCs. A variation in the number of NCs in the floating gate can be modelled with R dot . However, the proposed model does not include a potential displacement of NCs nor does it account for neighbouring NCs contacting each other. It assumes a uniform distribution of equally sized NCs that are isolated from each other. In theory, the physical device model may also be used for a continuous floating gate device with a filling rate of R dot ¼ 100%. However, continuous floating gate devices are typically programmed because of FNT and HCI; the latter is not included in the model. Thus, it is left to the designer to use the model for continuous floating gate devices while programming it solely because of uniform tunnelling mechanisms. Finally, the output V FG is updated and the algorithm repeats for the next time step. The key advantage of this approach is that V FG of the device is always dynamically refreshed throughout the entire transient circuit simulation. Thus, this physical device model considers charge leakage, and possible unintended charging of the device during operation mode, as well as other potential disturb conditions.
Surface potential
Very accurate computations of the surface potential V surface can be obtained by solving Schrödinger -Poisson equations through an iterative numerical approach, such as in the device simulator [13] or in other model papers [4 -11] . However, this requires heavy computations, which makes it unsuitable for circuit simulation. In this work an explicit model for surface potential calculation is used. The analytical formulae for the surface band bending in the accumulation, weak, and strong inversion regions are well known for MOSFETs [24, 25] , and it is assumed that they are also valid for NCFG devices, where V FG acts as the gate voltage of a MOSFET [16] . Therefore the voltage difference between the gate voltage and the flat-band voltage of a MOSFET is replaced by the voltage difference between V FG and a modified flat-band voltage V FB . Since the NCs are surrounded by the insulator, the contact potentials at the interfaces between the NCs and the insulator cancel out everywhere. Therefore the flat-band voltage is independent of the contact potential and modified to
which includes the Fermi potential in the channel w F and the charge per unit area Q ′ 0 at the Si/SiO 2 interface that the user may specify in the parameter list. Fig. 5 in Section 3.1 illustrates the accuracy of the surface potential computation.
Program and erase mode
Figs. 3a and b show typical band diagrams of the reference model from Section 3.1 in the program mode and erase mode, respectively. When applying a positive voltage on the control gate, electrons uniformly tunnel from the channel to the floating gate as long as the condition V FG . V surface is met. As the floating gate is being programmed, the tunnelling electrons contribute to an increasing negative Q FG and therefore V FG decreases. As V FG decreases it is possible that in the operation mode V FG , V surface , such that the electrons tend to tunnel back to the channel, which results in charge leakage. Erasing of the device occurs when a negative voltage is applied on the control gate. Electrons uniformly tunnel back from the floating gate to the channel as long as the condition V FG , V surface is met. Additionally, in both the program and erase mode, the electrons may tunnel through the control gate oxide if V cox is sufficiently high. Based on DT and FNT equations the high field may overcome the relatively thick control gate oxide and enable electron tunnelling through the SiO 2 barrier. This is not desired and therefore scaling of d cox and V CG is limited.
The tunnelling current densities for DT and FNT are unique functions of the voltages across the tunnel and control gate oxides [26] , which can be derived from the band diagrams as
Next, considering charge neutrality on the floating 
The second term takes into account the transferred charges from the channel and Q FG represents the initial charge trapped on the NCs. The oxide capacitances C cox and C tox are calculated based on the geometry of the NCs and w m is the contact potential of the control gate. Note that (10) differs from those reported in [16, 27] , which do not consider Q FG , w m , and V surface . Furthermore, the equations derived for V surface and V FG are interdependent. Therefore the value for V surface calculated from the previous time step is used in computing V FG at the current time step. The physical device model now includes every possible scenario of uniform electron tunnelling because of DT and FNT. The results of its accuracy and speed are investigated in the following section.
Results and discussion
The physical device model is compared with simulation results from the device simulator and experimental results from [2, 28, 29] . A timing comparison between the proposed model and the device simulator is provided. In addition, a DC-simulation of an inverter with the NCFG device is demonstrated.
Comparison to the device simulator
The NCFG device is designed and simulated in the device simulator to obtain a benchmark for the proposed model not only for the computation of the threshold voltage shift but also of the current densities, charge on the floating gate, surface potential etc. Fig. 4 shows a two-dimensional structure of a device with a gate length of 45 nm. Pt is chosen for the control gate which has a WF of 5.65 eV [19] to obtain better erase characteristics while program characteristics are negligibly affected [30] . Metal is also used for the NCs in the floating gate because they are superior to their semiconductor counterparts because of higher density of states, stronger coupling with the channel, better scalability and the design freedom of engineering the WF to optimise device characteristics [31 -33] . In the past work, controlled Pd NC deposition has been demonstrated [34] and is proposed for use in the floating gate. Pd's relatively high WF of 5.12 eV [19] creates a deep potential well for efficient data retention [33] . The device is projected 1 mm deep into the third dimension, which is the default in the device simulator. The oxide is composed of SiO 2 for both the control gate and tunnel oxide. The doping profile is generated using analytical doping distributions. The p-type substrate is uniformly doped with boron (B) at a concentration of 5 × 10 16 cm
23
, and the channel has a Gaussian doping profile using B ions with a peak concentration of 4.5 × 10 17 cm
. The n-type source and drain also have a Gaussian distribution and are highly doped with arsenic (As). The tunnel oxide thickness d tox chosen in the reference model varies from 2 to 3.2 nm. For thin oxides, less programming/erasing voltage and time is required to achieve the same threshold voltage shift DV T than for thicker oxides. The drawback is increased charge leakage and reduced retention time. The control gate oxide thickness d cox is kept constant at 4 nm. Reducing this thickness would result in significant charge leakage through the control gate oxide, which is not desired [28] . Fig. 5 illustrates how well the surface potential computation in the proposed model is matched with the reference model in device simulator. The process and geometry parameters are identical for both models. The shapes of the curves are similar and the maximum deviation in accuracy compared to the reference model in the accumulation and strong inversion regions, where erasing and programming of the device takes place, is less than 3.45 and 3.71%, respectively.
Figs. 6a and b demonstrate a close matching of the physical device model to simulations obtained from the device simulator in the charging and discharging behaviour of the NCFG device. A DV T against V CG plot of programming the device for various d tox and a programming time of 10 ms is illustrated in Fig. 6a . Simulations have shown that the plots for programming times of 0.1, 1 and 100 ms look very similar. For d tox ¼ 3.2 nm FNT is dominating, whereas for d tox ¼ 2 nm DT is dominating. In the case of d tox ¼ 2.6 nm a minimal discrepancy can be observed for lower biases on the control gate. This offset may be due to the combination of DT and FNT, where the device simulator demonstrates a slightly lower programming V CG than the proposed model. For high fields FNT is dominating and the curves merge together again. Fig. 6b also demonstrates good agreement between the physical device model and the device simulator when applying erase voltages for 1 ms. The floating gate has been precharged to obtain an initial DV T of 2 V. Owing to the deep asymmetric potential well in the floating gate the electrons require more time and/or energy to tunnel through the higher SiO 2 barrier compared to the program mode. Therefore a longer erasing time of 1 ms is chosen. Similar agreement has been observed for other erasing times of 0.1, 10 and 100 ms. The curves show a slight offset for a lower negative V CG where the proposed model predicts less erasing current. Again, it is anticipated that this may be because of the combination of DT and FNT. For higher negative voltages only FNT is present and the curves align very well.
Comparison to experimental data from fabricated devices
Comparisons between the physical device model and various experimental results are demonstrated in Figs. 7 -9 in terms of threshold voltage shift DV T , retention time and flat-band voltage shift DV FB , respectively. The geometry and process parameters in the model are matched in each case to the corresponding fabricated device to provide a fair comparison. Fig. 7 illustrates V T curves of a fabricated NCFG device with respect to time for different programming voltages ranging from 8 V to 14 V and an erasing voltage of 214 V [2] . The control gate is composed of n + -polysilicon and the NCs of Si, which have a surface coverage of 50%. The device has a d tox and d cox of 4 and 6 nm, respectively. It can be observed that the V T curves are saturating for longer times. This is a result of the tunnel currents from the channel to the NCs and from the NCs to the control gate being in balance [2] . The device is then simulated using the proposed model with the same parameters as the fabricated device in [2] . The barrier height at the interface between the Si NCs and the oxide is now chosen to be f b,Si ¼ 3.1 eV [17] . For the control gate a WF of 4.17 eV is assumed, and therefore a barrier height of f b,n-Poly ¼ 3.27 eV is used at the interface between the control gate and the oxide. The initial V T0 is then modelled in the predictive NMOS transistor and DV T is computed in the Verilog-A module of Fig. 1 . The V T curves obtained from this proposed model are inserted in Fig. 7 within the experimental measurement. The maximum offset has been measured for the 8 V programming curve at 100 ms where the physical device model predicts a threshold voltage of 1.8 V, which is 20% less than the experimental data measured at 2.25 V.
To test the retention time, experimental data and its corresponding extrapolation line are taken from [29] and compared with simulation results from the physical device model. In each case the retention time is measured at a charge loss of 25%. The fabricated devices use n + -polysilicon for the control gate and contain Si NCs, which have a diameter of 6 nm and a density of 3 × 10 11 cm 22 . This results in a surface coverage of the NCs in the floating gate of 8.5%. The NCFG device is then modelled with the equivalent geometry parameters and a barrier height of f b,Si ¼ 3.1 eV [17] is chosen again for the interface between the channel and the oxide and between the NCs and the oxide. The barrier height between the control gate and the oxide is selected by f b,n-Poly ¼ 3.27 eV. Figs. 8a and b illustrate the experimental data along with their extrapolation lines from [29] . The results obtained from the proposed model are directly inserted in each graph. In Fig. 8a , the retention time is shown for different d tox ranging from 2.5 to 6.5 nm with a fixed V CG of 0 V. The data points are obtained Fig. 8b where the retention time for the data point V CG ¼ 24 V reaches 4000 s; however, the proposed model predicts a V CG of 23.7 V to achieve this retention time, resulting in a deviation of 7.5%.
A comparison between the proposed model and a fabricated NC MOSCAP [34] is shown in Fig. 9 in terms of the flat-band voltage shift DV FB . Gold (Au) NCs are embedded in the oxide, which have a WF of 5.10 eV [19] . They have a diameter of 10 nm and a density of 4 × 10 11 cm
22
, and therefore a surface coverage of 31%. The bulk of the device is n-type and the control gate is composed of molybdenum (Mo) with a WF of 4.60 eV [19] . A voltage sweep on the control gate from 25 V to 5 V and back in steps of 0.05 V is performed. In each step the applied bias lasts for 0.5 s. The C-V curves in Fig. 9 show hysteretic characteristics during the voltage sweep. In the fabricated device DV FB has a value of 0.8 V, whereas in the physical device model, DV FB is at 0.85 V.
Note that the slopes, however, are slightly different. This may be due to the assumptions made in this work that the silicon is uniformly doped and that there are no trapped charges in the Si/SiO 2 interface. In reality, trapped charges, material impurities and defects in the silicon are inevitable, and cause a reduction in the slope of the C -V curve [35] .
Run-time analysis
The speed of the proposed model is simulated on one NCFG device with d cox ¼ 4 nm and d tox ¼ 2 nm in HSPICE and compared to the reference model from the device simulator. The control gate is constantly biased with 4.5 V for 100 ms. The mesh of this reference model consists of a total of 8790 grid points and 17 220 triangles. Care has been taken to create a good triangulation where long and thin triangles as well as obtuse triangles are minimised because they may impair accuracy and robustness [13] . The amount of obtuse triangles in the reference model is as low as 0.12%. Theoretically accurate simulation results are achieved by defining a fine mesh as thin as 100th of a nanometer at all material interfaces. The robustness has been verified by running multiple simulations with a different mesh. In comparison, if a mesh as thin as a 50th of a nanometer is used at all material interfaces, the grid points and triangles are reduced to 6794 and 13 286, respectively, which results in an average decrease in the computation time of 20%, and in an average deviation in DV T of 1.3%. Therefore the performed device simulations are based on robust test structures.
The simulations are executed multiple times for different time steps on a 2.66 GHz Intel Core i7 server with 12 GB DDR3 and the average results are presented in Table 1 . The device simulator fully utilises the 8 parallel process threads of this CPU, whereas the physical device model is only utilising a single-process thread. Table 1 shows that the computation time of the physical device model is significantly reduced compared to its counterpart while only slightly sacrificing accuracy (2.43% deviation for a time step of 1 × 10 27 s, or 100 ns). The results also suggest use of a time step of 100 ns to achieve relatively fast and accurate simulations of complex circuits that include a large number of NCFG devices. Reducing this time step will result in longer simulation times while only gaining slightly better accuracy.
HSPICE circuit simulation with implementation of the physical device model
To demonstrate both the fast computation and the SPICE compatibility of the physical device model, an inverter is implemented in HSPICE and a DC-simulation is tested. The NMOS transistor is replaced by an NCFG device as shown in the inset of Fig. 10 Fig. 10 . The NCFG device is initially uncharged and then programmed for 500 ms by a voltage pulse of 3 V. After the applied voltage is taken, the device is kept in a retention state for 1 s with V CG ¼ 0 V where a large amount of charge is leaking because of the ultra-thin tunnel oxide. In the last step an erasing voltage pulse of 21 V is applied for 100 ms. The voltages used in this particular simulation are relatively low such that oxide breakdown of the PMOS transistor (and thus simulation failure) is avoided. The supply voltage is set to 1 V.
Conclusions
This paper presents a physically inspired behavioural model of NCFG devices. Owing to explicit analytical formulas, no heavy computations are required, which makes the proposed These promising results in accuracy, speed and flexibility suggest that the developed physical device model can be used in large and complex circuit simulations. Therefore this work is important in bridging the gap between deviceand circuit-level research.
