Abstract-We developed a Multi-Subband Ensemble Monte Carlo simulator for non-planar devices, taking into account twodimensional quantum confinement. It couples self-consistently the solution of the 3D Poisson equation, the 2D Schrödinger equation, and the 1D Boltzmann transport equation with the Ensemble Monte Carlo method. This simulator was employed to study MOS devices based on ultra-scaled Gate-All-Around Si nanowires with diameters in the range from 4 nm to 8 nm. The transfer characteristic was simulated and explained in terms of the charge distribution and the computed mobility. Finally the scaling behavior, down to the channel length of 8 nm was analyzed.
I. INTRODUCTION
Non-planar MOS transistors with multiple gates have already entered mass production [1] , [2] and represent the most promising solution to the ultimate scaling of CMOS technology [3] . For these devices, the effect of two-dimensional quantum confinement is essential; to properly take it into account it is necessary to solve the Schrödinger Equation (SE) in the cross section of the device. The Multi-Subband Ensemble Monte Carlo (MS-EMC) approach employs the semiclassical Monte Carlo (MC) method to solve the Boltzmann transport equation, coupled with the solution of the SE in the perpendicular direction. The MC method, compared to common transport framework such as Drift-Diffusion, takes directly into account non-equilibrium carrier transport. On the other hand, compared to full quantum approaches, MC allows a simpler implementation of scattering mechanisms and a (relatively) reduced computational effort.
In this work, we describe an MS-EMC simulator for 3D devices developed by our group, and show the possibilities it offers by studying the scaling properties of ultimate Gate-AllAround (GAA) MOS transistors with ultra-thin Si nanowires. Section II is devoted to the description of the simulator, while in the following one (Section III) we describe the simulated devices and report the obtained results. Finally, conclusions are drawn in Section IV II. SIMULATOR DESCRIPTION To simulate a non-planar device, such as GAA MOSFETs, we employ the space-mode approach [4] , where the SE is solved in several cross sections perpendicular to the transport direction z. Thus, we obtain the eigen-energies, E i,ν (z), and the wave functions, ξ i,ν (x, y, z), for different values of z, where i and ν are the valley and subband indices, respectively.
The MC method is employed to model the resulting 1D carrier transport in each subband (specified by its index i and its valley ν), where the driving force is computed through the derivative of E i,ν (z) with respect to z. Self-consistency is achieved by solving the 3D Poisson Equation (PE) in the whole structure, where the charge is obtained by the population sampling of the MC particles, weighted by the corresponding distribution function |ξ i,ν (x, y, z)| 2 . The MS-EMC method has been widely and successfully employed for the simulation of planar devices [5] , [6] , [7] , [8] .
In this case, the device under consideration is described employing a 3D finite element mesh obtained by extruding a 2D triangular mesh, so that it can be used to discretize the SE in different cross sections. The finite element mesh allows a good representation of complex geometries (e.g. round nanowires, rounded corners, leaning sidewalls in FinFETs) and a natural formulation of the equations (PE, SE) near material boundaries. Non-parabolic electron band-structure is taken into account after the solution of the SE by correcting the energy levels and the subband structure as described in [9] . The MC simulation includes carrier scattering by acoustic and optical phonons [10] , taking into account Pauli exclusion principle [11] . To improve the MC statistics, especially in the sub-threshold regime, we employ a variance reduction technique based on non-uniform super-particle weight: the weight is determined based on energy [12] .
For a given value of the gate bias V GS , the simulation starts with V DS = 0 and the non-linear PE is solved with the SE at equilibrium (i.e. employing the Fermi-Dirac distribution): to speed-up the convergence to a self-consistent state we employ a predictor-corrector scheme based on the one described in [13] . Then, MC carriers are generated in the whole device employing the equilibrium population and a selfconsistent loop including MC, PE and SE is started. Here, the non-linear PE is employed to improve the convergency of the self-consistent loop [14] . In a first stage the boundary conditions at the drain are gradually changed in order to reach the desired value of V DS . After that, the boundary conditions are kept fixed while a self-consistent solution is reached within a prescribed tolerance. Finally, the drain current, I D , is computed keeping the self-consistency, until the required accuracy for the value of I D is achieved.
It is also possible to compute the low-field mobility, µ, by a similar procedure. Only the (infinitely long) channel of the device is considered and small values of a uniform electric field are applied in the longitudinal direction. As before, an equilibrium (i. F , are applied; mobility is extracted by fitting the obtained velocity values with the linear relation v = µF . To improve the performance of the simulator, a high level of parallelism is employed. In the MC code, the independent super-particle flights are simulated in a parallel fashion through the use of OpenMP; particular care is needed for synchronization to inject and delete particles at contacts and to gather particle statistics. On the other hand, specialized and optimized sparse matrix parallel routines are employed for the solution of the PE and the SE.
III. RESULTS
We simulate GAA FET transistors based on cylindrical Si nanowires with channel along the 100 direction, with diameter D ranging from 4 nm to 8 nm. The chosen gate oxide (SiO 2 ) thickness is T ox = 1 nm. The channel is considered undoped (N A = 1 × 10 13 cm −3 ) and midgap metal is assumed for the gate (with work function 4.56 eV). The source and drain doping density is N SD = 1 × 10 20 cm −3 , with an underlap of L sp = 2 nm and Gaussian distribution with σ = 0.8 nm. The simulated devices have nominal channel length L G ranging from 14 nm down to 8 nm, while the total length of the source and drain regions, including extensions, is L SD = 14 nm each. As stated in Section II, we employ a 3D mesh obtained by extrusion of a 2D mesh, ash the latter is used for the cross sections. This 2D mesh makes use of a variable number of triangles depending on the nanowire diameter: the minimum is ∼ 1500 for D = 4 nm and the maximum is ∼ 3500 for D = 8 nm. The mesh spacing in the transport direction is finer in the channel region (∆z min = 0.5 nm) and gradually grows in the source and drain regions (up to ∆z max = 2 nm). quantum confinement. A strong V th roll-off is observed for the larger values of the diameter (6 nm and 8 nm), while only a small drift of V th is present in the case of D = 4 nm.
Restricting our analysys to the longest devices (i.e. with L G = 14 nm), if we plot the drain current vs. the gate voltage overdrive, V G − V th , the curves corresponding to devices with different diameters collapse (see Figure 3 ) except for very large applied biases. This means that the sub-threshold behavior is independent of the diameter. To understand the cause of the differences at large values of gate voltage overdrive, we plot the linear electron density n inv , that is the charge density integrated in a cross section, near the middle of the channel. We can see in Figure 4 that the curves corresponding to different values of D collapse; only very small differences can be observed for values of V G − V th larger than approximately 0.2 V. Such behavior can be explained by considering the local charge distribution, n, along a cross section in the middle of the gated region as it is shown in Figure 5 . In such narrow nanowires the charge concentration is almost always peaked around the center of the nanowire. Only for the widest device (D = 8 nm) and large gate bias the maximum value of n is located at a position different from the geometrical center of the nanowire. In particular, due to the anisotropy of the Si conduction band valleys, four maxima appear along the principal crystallographic directions.
The previous analysis shows that the differences in the drain current observed in Figure 3 for a gate overdrive larger than 0.1 V cannot be attributed exclusively to the inversion charge and must stem from other causes. To shed light on this issue we also computed the phonon limited mobility µ, shown in Figure 6 , where it can be seen that µ is severely degraded for decreasing D.
Finally, we turn to the analysis of short channel effects (SCEs). We can expect the narrower devices to possess better electrostatic control of the channel, especially for larger channel lengths. However, the MS-EMC simulator allows us to obtain quantitative results which properly take into account lateral quantum confinement. A first indication of the SCEs is given by the Drain Induced Barrier Lowering (DIBL), a measure of the variation of the threshold voltage shift caused by the drain bias:
In this case, we employed V D1 = 0.05 V and V D2 = 0.5 V to compute the DIBL values shown in Figure 7 . Here, we can observe that for D = 4 nm the DIBL hardly increases for the shortest devices, reaching its largest value (as small as 58 mV/V) at L G = 8 nm. For devices with D = 6 nm and D = 8 nm, channel length scaling produces a larger increase of the DIBL, which exceeds 100 mV/V at L G = 8 nm and L G = 10 nm, respectively. Similar conclusions can be drawn by observing the subthreshold swing (SS), represented in Figure 8 . The longest devices, with L G = 14 nm, show the same SS value for every D (within the numerical accuracy): SS 65 mV/dec., which is close to the ideal MOSFET limit at room temperature. However, the scaling behavior is quite different: while the narrowest device keeps SS values lower than 70 mV/dec. for all the considered channel lengths, the device with D = 8 nm shows a linear increase reaching SS 110 mV/dec. at L G = 8 nm. 
IV. CONCLUSION
This paper describes a numerical simulator that selfconsistently solves the 3D Poisson equation, the 2D Schrödinger equation, and the 1D Boltzmann transport equation through the Ensemble Monte Carlo method. It can be used to study the static characteristics of ultra-scaled MOS-FET devices, also in the sub-threshold region thanks to the implementation of a variance reduction technique based on variable super-particle weights. It also allows us to inspect the carrier distribution taking into account quantum confinement effects and to compare the phonon limited mobility and drain current including the short-channel and high-field effects, both LG (nm) subthreshold swing (mV/dec.) computed with the MC method and the same code, employing the same scattering models.
Applying this simulator to narrow Si nanowire MOSFETS, we showed that cylindrical devices with D = 4 nm can keep a good control of the short channel effects down to L G = 8 nm.
