Figure S1. The main fabrication processes of 2DVFETs. For the fabrication of the source-insulating spacer-drain (SID) patterns, either an etching approach for hBN (self-aligned etching) or conventional patterned deposition for oxides (self-aligned continuous deposition) can be used. Both layered hBN and deposited oxides can serve as the insulating spacer and the gate insulator.
Table S1. Technical summary of typical short channel 2D semiconductor transistors.
EBL

Self-aligned 1D gate
Reference CM Hu, et al. [1] CM Hu, et al. [2] CM Hu, et al. [3] P. D. Ye, et al. [4] Eric Pop, et al. [5] K Banerjee, et al. [6] Lei Liao, et al. [7] Ali Javey, et al. [8] Schematic of device structure
Key strategy
Hybrid Si/TMD [9] Jun He, et al. [10] A Nourbakhsh, et al. [11] GY Zhang, et al. [12] A Nourbakhsh, et al. [13] Chuan Wang, et al. [14] This work
Table S2. Comparison between 2DVFETs and Si FinFETs.
2DVFETs Intel 14 nm [15] Intel 10 nm [16] Channel materials [8] , and for the spacer, in ideal case, it is also able to be 1 nm to avoid leakage) ? 13.4~16.8 nm [17] Channel thickness ~0.31 nm [18, 19] (Here we didn't consider the van der Waals gap) (supposing with the same EOT of 0.9 nm and consider the thermionic limit of 60 mV dec⁻¹) [20] ~65 mV dec⁻¹ ~70 mV dec⁻¹
(supposing with the same EOT of 0.9 nm) Rc is approximately 1.85 kΩ μm.
Table S3. Summary of main issues to realize ideal 2DVFETs.
Hysteresis
Key issues for ideal 2DVFETs
Channel
1) To avoid short-channel effects, the channel should be sufficiently thin. This is the key advantage of atomically thin 2D semiconductors as FET channel materials. [8]
2) Ideally, if the channel length shrinks to the mean free path of the carriers, scattering in the channel becomes minimal and carrier transport approaches the ballistic regime, which significantly improves the on current. The Fermi function, carrier velocity, transmission probability and density of energy states then become the key material-related parameters for channel materials. [25]
3) Wafer-scale fabrication requires wafer-scale channel materials. For 2D semiconductors, wafer-scale growth and transfer are becoming possible. [26] [27] [28] [29]
4) The channel material should adhere closely to the SID pattern, especially at the step, to realize the short channel length defined by the spacer thickness. The flexibility of 2D semiconductors [30] is another advantage from the fabrication point of view. Reducing the SID pattern height and post-annealing after etching the 2D channel to its defined size will improve the adhesion.
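The link between channel length, mean free path and transmission probability in point 2 can be illustrated with the textbook quasi-ballistic expression T = λ/(λ + L). This is a minimal sketch only: the expression is a standard approximation and is not taken from this work, and the mean free path value below is a hypothetical number chosen purely for illustration.

```python
def transmission(channel_length_nm: float, mean_free_path_nm: float) -> float:
    """Quasi-ballistic transmission probability T = lambda / (lambda + L).

    Assumption: simple textbook approximation; not a model fitted to 2DVFET data.
    """
    if channel_length_nm < 0 or mean_free_path_nm <= 0:
        raise ValueError("channel length must be >= 0 and mean free path > 0")
    return mean_free_path_nm / (mean_free_path_nm + channel_length_nm)


# Hypothetical mean free path (nm), for illustration only.
MEAN_FREE_PATH_NM = 10.0

for length_nm in (1.0, 10.0, 100.0):
    t = transmission(length_nm, MEAN_FREE_PATH_NM)
    print(f"L = {length_nm:6.1f} nm -> T = {t:.3f}")
```

In this approximation T approaches 1 only when the channel is much shorter than the mean free path, and equals 0.5 when the two lengths are equal.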
Insulating spacer
1) The spacer layer should also be thin and flat to realize a short channel length. Layered hBN [28] is a promising candidate, and nanometer-thin oxides deposited by ALD [31] are further candidates that are already available to industry.
2) Leakage current should be small enough, which requires high-quality insulators of appropriate thickness.
3) Wafer-scale fabrication also requires wafer-scale insulating spacer materials. Wafer-scale growth of hBN [28] and ALD of oxides [31] are available.
4) Roughness issue.
The edge roughness of the insulating spacer induces roughness in the channel. In our experiments, the roughness of the hBN edge is mainly determined by the etching process (reactive ion etching with SF6 in the present work). Such a rough hBN edge can reduce the mobility of our devices.
A feasible way to reduce the channel roughness is to form a quasi-suspended 2D semiconductor vertical channel, as described in the following figure. The hBN underneath the top electrode can be selectively etched further in the lateral direction to form a partially empty spacer, and the suspended vertical channel is formed after transferring the 2D semiconductor. A similar approach has been demonstrated in Nature Nanotechnology 2019, 14, 579. [32]
Schematic of feasible processes to generate a quasi-suspended 2D semiconductor vertical channel. a) Original source-insulating spacer-drain pattern. b) Further etching of the spacer underneath the top electrode. c) Suspended channel after transferring 2D semiconductors.
Contact electrode
1) Ohmic or near-Ohmic contact is required to realize a high on-current density with small contact resistance.
Metallic materials with a work function matched to the 2D channel material are required. Van der Waals type contacts between the metal and the 2D semiconductor in 2DVFETs will help to avoid Fermi-level pinning and to realize such matching. [33, 34] Fabrication in a glove box would be one way to avoid surface oxidation of low-work-function metals. Graphene [35] or other metallic layered materials [36] are also candidate contact materials.
2) The contact electrode should also be thin and flat to help maintain the adhesion between the channel material and the SID pattern. Ultra-thin metallic electrodes would be possible with layered metallic materials [35, 36] or with conventional metals deposited by ALD [37].
Table S3 continued. Summary of main issues to realize ideal 2DVFETs.
Gate
1) The gate insulator should be thin, with a small EOT, to realize small SS values for the device. High-k metal gates are already widely used in industry [38] and can also be fabricated on 2D semiconductors [39].
2) Gate leakage current should also be small enough.
3) Gate length and overlap capacitance issue.
Gate length is as important as the channel length in high-performance and low-power logic transistors.
Scaling down only the channel length, and not the gate length, leads to a large overlap between the gate and the source/drain electrodes, which introduces parasitic capacitance. This capacitance increases the total capacitance and affects both the intrinsic gate delay (CV/I) and the energy-delay product (CV/I × CV²) of the transistor. [17, 40] In the current work, however, we mainly focused on demonstrating the concept of the new device structure for vertical-type short channel transistors, and the optimization of the gate length was not investigated.
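The effect of overlap capacitance on the two figures of merit named above can be made concrete with a short sketch. It only evaluates the definitions CV/I and CV/I × CV². All numerical values, and the split of the total capacitance into a gate-channel part plus an overlap part expressed as a multiple of it, are hypothetical assumptions for illustration and are not measured values from this work.

```python
def gate_delay(c_total: float, v_dd: float, i_on: float) -> float:
    """Intrinsic gate delay CV/I (seconds if C is in F/um and I in A/um)."""
    return c_total * v_dd / i_on


def energy_delay_product(c_total: float, v_dd: float, i_on: float) -> float:
    """Energy-delay product (CV/I) x (CV^2)."""
    return gate_delay(c_total, v_dd, i_on) * c_total * v_dd ** 2


# Hypothetical values, for illustration only.
C_GATE_CHANNEL = 1.0e-16  # F/um
V_DD = 1.0                # V
I_ON = 1.0e-4             # A/um

for overlap_factor in (0.0, 1.0, 4.0):
    # Overlap capacitance modelled as a multiple of the gate-channel part.
    c_total = C_GATE_CHANNEL * (1.0 + overlap_factor)
    delay = gate_delay(c_total, V_DD, I_ON)
    edp = energy_delay_product(c_total, V_DD, I_ON)
    print(f"overlap x{overlap_factor:.0f}: delay = {delay:.2e} s, EDP = {edp:.2e} J s/um")
```

At fixed voltage and current the delay grows linearly with the total capacitance while the energy-delay product grows quadratically, which is why gate overlap penalizes the latter more strongly.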
Here we propose one feasible way to scale down the gate length in our 2DVFETs. As shown in the following figure, the step-shaped feature of 2DVFETs allows us to make use of the sidewall etching technique commonly used for spacer patterning in industry [15, 41] and to etch back the gate electrode, decreasing the gate length toward the channel length.
As discussed above, a long gate length can strongly limit the speed of the devices. To estimate the speed limit of the device, we consider the intrinsic gate delay, defined as τ = CV/I ~ C/Gm, where C is the total capacitance including the overlap capacitance and Gm is the transconductance, Gm = dIds/dVg. For a simplified estimate, we consider that the total capacitance contains mainly two parts: gate- […] was reduced to approach the channel length, the limit is at the level of 10⁰~10¹ ps. This is more than one order of magnitude larger than the ultimate intrinsic gate delay of Si FETs (~0.1 ps). [42] This value should improve with a higher transconductance, which is mainly limited by the contact resistance.
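The estimate τ ~ C/Gm can be evaluated with a few lines of code. The sketch below uses the simulated short-gate capacitance of 1.6×10⁻¹⁶ F/μm quoted in this section; pairing that value with this delay estimate is our assumption, and the transconductance is a hypothetical placeholder rather than a measured value.

```python
def intrinsic_delay_ps(c_f_per_um: float, gm_s_per_um: float) -> float:
    """Intrinsic gate delay tau ~ C / Gm, returned in picoseconds."""
    return c_f_per_um / gm_s_per_um * 1e12


def required_gm_us_per_um(c_f_per_um: float, target_delay_ps: float) -> float:
    """Transconductance (uS/um) needed to reach a target delay for a given C."""
    return c_f_per_um / (target_delay_ps * 1e-12) * 1e6


C_SHORT_GATE = 1.6e-16      # F/um, simulated short-gate value quoted in the text
GM_HYPOTHETICAL = 1.0e-4    # S/um (100 uS/um), assumed for illustration only

tau_ps = intrinsic_delay_ps(C_SHORT_GATE, GM_HYPOTHETICAL)
gm_needed = required_gm_us_per_um(C_SHORT_GATE, 0.1)
print(f"tau ~ {tau_ps:.2f} ps at Gm = {GM_HYPOTHETICAL * 1e6:.0f} uS/um")
print(f"Gm needed for 0.1 ps: {gm_needed:.0f} uS/um")
```

Under these assumptions the delay falls in the picosecond range, and reaching the ~0.1 ps level cited for Si FETs would require a transconductance more than an order of magnitude higher at the same capacitance.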
Taking into account the local enhancement effect discussed in Figure S7, the capacitance in our 2DVFET is reduced by a factor of approximately 5, from 8.46×10⁻¹⁶ F/μm in the long gate length case (300 nm) to 1.6×10⁻¹⁶ F/μm in the short gate length case (20 nm). The difference is expected to come from the parasitic capacitance. However, the simulated ratio is much smaller than the ratio expected from a simple length-scaling approximation (300 nm/20 nm = 15). This discrepancy originates from the local field enhancement, which suggests that local field enhancement could improve the effective gate-channel capacitance.
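The comparison between the simulated capacitance ratio and the simple length-scaling expectation is plain arithmetic on the numbers quoted above, sketched here. The variable names are ours, and the "excess" factor is merely a convenient way to express the gap between the two ratios, not a quantity defined in this work.

```python
# Values quoted in the text.
C_LONG_GATE = 8.46e-16    # F/um, gate length 300 nm
C_SHORT_GATE = 1.6e-16    # F/um, gate length 20 nm
L_LONG_NM = 300.0
L_SHORT_NM = 20.0

capacitance_ratio = C_LONG_GATE / C_SHORT_GATE
length_ratio = L_LONG_NM / L_SHORT_NM

# Capacitance the short gate would have if C scaled linearly with gate length.
c_short_linear = C_LONG_GATE / length_ratio
# How much larger the simulated short-gate capacitance is than that estimate.
excess = C_SHORT_GATE / c_short_linear

print(f"simulated capacitance ratio: {capacitance_ratio:.2f}")
print(f"gate length ratio:           {length_ratio:.1f}")
print(f"short-gate C vs linear scaling: x{excess:.2f}")
```

The simulated short-gate capacitance is close to three times what linear scaling with gate length would give, consistent with the attribution to local field enhancement.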
a) Simplified schematic illustration to show the gate length and overlap capacitance issue. b, c) Schematic illustration of sidewall etching to reduce the gate length. The basic idea is to take advantage of the step-shape feature of 2DVFETs to post-etch the deposited electrode. 
