I. INTRODUCTION

S
INGLE-FLUX-QUANTUM (SFQ) superconductor electronics (SCE) has well documented advantages over semiconductor electronics for energy efficiency, clock speed, and potential for reversible computing. Over the years, however, these advantages have largely remained unrealized because of low scale of integration and limited sophistication in functionality of superconducting circuits. While the semiconductor industry maintained an exponential growth of electronic circuit density, the density of JJ circuits has remained nearly stagnant for the last 25 years. This stagnation has been attributed in part to the gross disparity in funding of SCE compared with CMOS and the resultant lack of access to modern fabrication tools and systematic process development [1] - [3] .
Achieving very large scale integration (VLSI) with 10 6 JJ/circuit and beyond requires decreasing the scale of all components to deep submicron dimensions. While there have been a number of reports on submicron JJs [4] - [11] with both low and high critical current density, J c , the JJ definition was produced with e-beam lithography in circuits with a small number of JJs on small-size wafers [12] . There are no data and statistics on submicron JJs defined by modern steppers and scanners with 248 nm and 193 nm exposure wavelength, on a very large scale and on large-size wafers.
The goal of this work has been to develop a robust fabrication process for Nb/Al-AlO x /Nb junctions, using 200-mm production class tools of a typical CMOS foundry. In this paper we report on JJ fabrication as well as yield, JJ quality, repeatability and uniformity statistics on a large set of JJs to assess the JJ technology for VLSI circuits. This work builds on a decade of fully planar, niobium-trilayer process development targeting superconducting qubits for quantum information science and technology applications. In the present context, however, we focus solely on aspects related to the development of a robust niobium JJ fabrication process for classical digital and mixed signal applications. Further, the broader effort includes increasing the number of niobium wiring layers from 8 in the current process [13] to 10, and decreasing the size of inductors [14] , resistors, and vias [15] to achieve circuits densities > 1 M JJs/cm 2 .
II. JOSEPHSON JUNCTION FABRICATION PROCESS
A. General Process Description
High-resolution photolithography requires high planarity of circuit layers because of the small focus depth ∼λ/(NA) 2 of modern photolithography tools using a short exposure wavelength, λ, and a high numerical aperture, NA, objective lenses. Therefore, in the MIT LL fabrication process all layers are planarized using chemical mechanical polishing (CMP) of dielectric layers. This ensures all metal layers are deposited on a nearly flat surface with topography height less than ∼40 nm, except near etched vias in the dielectric which provide interlayer connections. A planarized stud-via process has also been developed [15] but is not used in this work.
Planarization also allows the junction layer to be placed at any process level, since the layers are virtually identical, such that the choice may be dictated by SFQ circuit design considerations. For example, in our 4-metal-layer (4M) process, JJs are placed over one Nb ground plane layer, whereas in the 8-metallayer (8M) process the JJ trilayer is above 4 planarized layers of Nb wiring, as shown in Fig. 1 .
Nb/Al-AlO x /Nb trilayers were deposited in-situ in a cluster tool with separate chambers for magnetron sputtering of Nb, Al, and for AlO x formation by thermal oxidation. Nb base and counter electrodes were 150 nm and 250 nm, respectively, and the Al layer was 8-nm thick. The trilayer was deposited over a 200-nm thick planarized SiO 2 layer covering the bottom wiring layers M0-M4 of the 8M process or just one layer, M4, of the 4M process. 200-mm oxidized Si wafers were used with, typically, eight wafers per lot. Oxygen exposure (pressure·time) for oxidation was adjusted to target the following Josephson critical current densities: 10 kA/cm 2 , 20 kA/cm 2 , and 50 kA/cm 2 [16] . Most of the data presented hereafter were obtained using our main, 10-kA/cm 2 process, from hundreds of processed wafers. The data for higher J c junctions were obtained on a much smaller set of wafers used to explore these J c values as the targets for future process nodes.
B. Photolithography of Submicron Josephson Junctions
We used a Canon FPA-3000 EX4 stepper with 248 nm KrF excimer laser exposure wavelength, 5x reduction and NA = 0.6 for photolithography of all layers. We used a positive photoresist and antireflection coating to minimize standing waves caused by the high reflectance of Nb layers.
The optical patterning of JJs reduces to photolithography of posts or contact holes in semiconductor manufacturing. The important differences are: first, a high accuracy and precision of the area definition is required as the critical current I c is proportional to JJ area, and second, a broad range of JJ sizes needs to be printed in a manner preserving the proper relationship between their areas and critical currents.
We used a circular-shaped design for JJs, as it retains circular symmetry after diffraction and gives the minimum area for a given critical dimension (CD). The square shape maximizing the area and used in [17] - [19] requires resolution enhancement techniques (RET) to correctly print corners rounded by diffraction effects, making the area control more difficult. In practice, the JJs were circular-shaped polygons (manhattans) defined on a 5-nm grid (25-nm grid on 5x photomask). Hereafter we give all dimensions in the 1x wafer scale after the 5x reduction.
The theoretical resolution limit for a line-space structure with pitch p = w + s is p min = (λ/NA) · 1/(1 + σ), where w, s, and σ are the linewidth, space, and exposure coherence factor, respectively [20] . This gives p min = 413 nm for coherent exposure (σ = 0) and 250 nm for partially coherent exposure with σ = 0.65. There is no theoretical resolution limit for single circles or squares. A practical limit exists, however, and it is set by the photoresist properties, focus depth budget, and the desired linearity of the relationship between the printed size on the wafer, d w , and the drawn size on the mask, d.
Typical SEM images of photoresist features used for JJ definition are shown in Fig. 2 . After photoresist exposure and development, the diameter of obtained features, i.e., the photoresist mask for JJ etching, was measured using a Hitachi CD-SEM by taking the mean of the measurements of the feature diameter at 49 points around the feature perimeter, as shown by the dotted white line in the right panel of Fig. 2 . The measured dependence of the photoresist features on the wafer, d w , on the drawn size, d, is shown in Fig. 3 . It is similar to the dependence we reported earlier in [15] for printing Nb posts in the context of our stud-via technology. Over a very wide range of JJ diameters, the dependence is very well approximated by
for d > d c , and d w = 0 otherwise, where d c is the minimum printable size (i.e., the photolithography cut-off) and b is the process bias. The typical value of d c we observed over a several hundred wafers is 245 nm ± 20 nm, and the minimum value was ≈225 nm. The origin of the cut-off is as follows. The requirement to print a range of JJ sizes sets an exposure dose-dose to print-that gives the proper sizing of the desired features. Diffraction of light on the mask features creates a nonzero exposure of the photoresist in the "dark" region of the geometric shade. This unwanted exposure increases as the feature size decreases, Fig. 4 . At a certain size, d c , the exposure exceeds a critical level at any point on the wafer and renders all the photoresist completely soluble in the developer; the photoresist mask develops away in the development process.
To provide a mathematical description of the image size for comparison with the measurements, we use the standard theory of image formation in optical imaging systems based on Fraunhofer scalar diffraction theory and Fourier optics [21] , [22] . In this theory, the image formed by an objective is a Fourier transform of the diffraction pattern of the optical mask filtered by the finite numerical aperture of the objective (exit pupil). For coherent illumination, the amplitude of light in the image plane on a wafer can be written as whereÕ(f, g) is the mask spectrum, i.e., the Fourier transform of the mask pattern,P (f, g) is the so-called pupil function describing the response of the projection system to spatial frequencies (f, g), and the term exp[i2π(fx + gy)] represents the interference of light rays propagating in a direction θ given by | sin θ| = λ f 2 + g 2 [23] .
For an opaque disk with diameter d on the photomask (a drawn JJ), the mask transmittance function is
where r = x 2 + y 2 . The Fourier transform of the circ(r) function is an Airy function [22] , so the mask spectrum is
where q = f 2 + g 2 , δ is Dirac delta-function, and J 1 (u) is a first-order Bessel function of the first kind. For a circular-symmetric imaging system with coherent illumination, the pupil function is P (f, g) = 1 if f 2 + g 2 ≤ NA/λ, and 0 otherwise. (5) That is, an image is formed only by the light rays arriving at angles with | sin θ| less than the numerical aperture of the lens, and all higher spatial frequencies are filtered out. This sets the limits of integration in (2) from −NA/λ to +NA/λ.
In general, (2) requires numerical integration. To get an analytical expression, we consider the case of very small mask diameters d λ/N A and use a zeroth-order expansion of the Airy function J 1 (u)/u = 1/2. Then, the mask function in (4) becomesÕ
and the integral in (2) can be evaluated [22] to obtain the field distribution in the image plane, i.e., the aerial image,
which retains the circular symmetry of the photomask, and ρ is the distance from the center of the aerial image. The normalized intensity of light at any point is I(ρ) = E 2 (ρ). To find the size of the image realized in the photoresist, we apply the standard model of an infinitely thin, threshold resist [23] , [24] . In this model, the photoresist is developed away at any point on the wafer that receives an exposure dose higher than a threshold dose corresponding to the normalized light intensity I th and remains on the wafer otherwise. The contour of the image in the image plane is then given by points satisfying the solution of the equation
which reduces to E(ρ) = √ I th in our circular-symmetric case. A solution of this equation exists only if the mask diameter (JJ drawn size) is larger than the minimum size d c , given by
Otherwise, the light intensity at any point of the image is larger than I th and the resist will be developed away. Indeed, at the center of the image where 2J 1 (u)/u = 1, the light intensity is
2 , and (8) can only be satisfied if d ≥ d c as given by (9) . A similar treatment for square posts can be found in [25] , giving for the minimum printable size of squares an extra factor of 2/ √ π in (9). For mask diameters slightly larger than d c , the size of the image in photoresist can be found by expanding the Airy function in (7) to the next leading order, 2J 1 (u) = 1 − u 2 /8 to obtain, using (8)
with d c given by (9) . (10) gives the same functional dependence as the empirical relation (1). Note, that according to (7) the light intensity grows as
An intensity threshold I th = 0.3 is a good representation of the typical photoresists [23] - [25] . This gives d c = 177 nm for our exposure conditions, a bit lower than the typical values of d c > 220 nm observed in this work. The difference is attributable to a partially coherent exposure used, finite photoresist thickness, defocusing, and other photoresist effects.
Light intensity distributions following from (7) For a wider range of sizes, approximation (6) for the mask function (4) is insufficient and (2) needs to be integrated numerically. Numeric results obtained using the photolithography The simulated photoresist image diameter at σ = 0.65 (used in this work) is very well fitted by the empirical function (1) that was derived to fit the photoresist feature diameters measured in the SEM, see Figs. 3 and 5. This gives yet another justification for using (1) in further discussions of JJ properties.
C. Etching and Anodization of Josephson Junctions
After patterning the photoresist mask, the JJ counter electrode was etched using a high-density plasma etcher and then anodized to prevent degradation of the exposed AlO x barrier [27] - [29] . The sizes of the etched JJs were measured before and after the anodization step, and are shown in Fig. 3 along with the size of the photoresist features used as an etch mask. For statistics, multiple JJs of the same nominal size were measured on the same test chip at different locations on the wafer.
We developed an etch recipe that provides a constant positive bias Δb ≈ 80 nm to the etched features, thereby obtaining a larger diameter of the etched JJs than the photomask diameter to compensate for a consumption of Nb in the anodization process. The post-etch diameter of JJs is shown in Fig. 3 by blue dots along with the fits to (1), shown by blue curves. The post-etch diameters differ from the etch-mask diameters, shown in black, by a size-independent bias-i.e., all the data points are shifted up by the same amount.
Anodization adds yet another bias component, increasing the JJ diameter by about 50 nm, Fig. 3 . This increase is caused by the Nb 2 O 5 layer grown on the sidewalls of JJs during the anodization. Its thickness was measured using cross-section TEM, and was found to be ≈50 nm, see Fig. 6 . During the anodization, about 0.9 nm/V of Nb is consumed and converted in about 2.5 nm/V of niobium oxide. Therefore, the actual diameter of niobium in the counter electrode of JJs, i.e., the final JJ size as seen in SEM, is given by d w = d anod − 100 nm, where d anod is the JJ diameter after the anodization. This dependence is shown in Fig. 3 (a) and (b) by a green dashed curve; it corresponds to the final bias b in the range from 50 nm to 70 nm in (1) .
The final bias is the sum of the photolithography, etching, and anodization components: b = b photo + b etch + b anod − 100 nm. The purpose of the described bias settings was to obtain the final diameter of Nb in the counter electrode of JJs within ±50 nm from the diameter of the photoresist etch mask. This is convenient from the in-line process control standpoint. Note that etching and anodization do not affect d c , which is determined solely by the photolithography process.
D. Planarization and Further Processing of JJs
After patterning the base electrode of the JJs using the photolithography and high-density plasma etching, the junction mesa was planarized by depositing a thick layer of SiO 2 and polishing it down to the level of the JJ counter electrode by using CMP. This creates a flat surface of the interlayer dielectric, on to which a Nb wiring layer is deposited and patterned for contacting the tops of the JJs. The SiO 2 thickness was measured using an ellipsometer. The surface roughness of SiO 2 layers after the CMP was measured by an AFM and found to be less than 0.4 nm over a 10 μm × 10 μm window.
Further processing steps involve depositing and patterning the resistor layer, adding an interlayer dielectric above the resistor layer, opening contact holes to the buried JJ base electrode and resistors, and depositing and patterning the wiring layer used to contact the JJs and the resistors; see the final process cross-section in Fig. 1 . A cross-section TEM of a fully processed 500-nm JJ is shown in Fig. 6 .
E. Electrical Characterization of Josephson Junctions
The electric resistance of individual JJs on the processed wafers was measured by semi-automated wafer probers, using 4-probe measurements of JJs in a cross-bridge Kelvin geometry (CBKR). For a description of CBKR technique see [15] , [31] , [32] , [61] , [62] and references therein. A JJ design with a minimum surround by the counter electrode and the top wiring layer was used to minimize its parasitic contribution of the JJ tunnel barrier resistance. For the JJ sizes used in this work, this parasitic contribution is negligible, but it may become noticeable at d > 5 μm and even less for higher-J
The test structures from the same set, but with d > d c , were then measured at LHe temperature after dicing the wafers and wire bonding devices on individual chips. Current-voltage (I-V ) characteristics of single JJs and 1000-JJ series arrays were measured in a magnetically shielded cryoprobe allowing for automated testing of up to fourteen 5 mm by 5 mm chips in liquid He.
III. EXPERIMENTAL RESULTS AND DISCUSSION
A. General Approach, Justification, and Statistics
Our approach is to evaluate continually the reproducibility (between wafers) and repeatability (within a wafer) of the fabrication process and its components, e.g., JJs, on a very large scale. Cryogenic measurements of individual JJs on this scale, however, are practically prohibitive due to expense and time investment. Also, critical currents of small junctions with I c less than about 20-50 μA cannot be directly measured at 4.2 K because their switching is affected by noise. Procedures for extracting I c in such cases exist, e.g., measuring the switching statistics and subsequent fitting to a theoretical dependence, or measuring the devices in a dilution refrigerator at much lower temperatures. However, these techniques even more time consuming and expensive for the task at hand. Instead, our approach is to establish a clear connection between waferscale statistical data (on a large number of devices per wafer) acquired at room temperatures and cryogenic measurements (fewer devices per wafer) that are used to validate and supplement them.
It has been very well established during the last 50 years that the critical current of a Josephson junction and its normal state conductance G N = 1/R N are related, because both depend on the very same property of the junction -the barrier transparency, for a review see, e.g., [47] . This is commonly expressed by the statement that I c R N product at a given temperature is a constant for the given type of JJs. For tunnel junctions the relationship is known as Ambegaokar-Baratoff (AB) formula [33] 
where V g is the gap voltage (Δ 1 + Δ 2 )/e, Δ 1 (T ) and Δ 2 (T ) are the superconducting energy gaps in the junction electrodes, and g is a numerical factor of order unity which accounts for deviations from the ideal model [33] , such as barrier nonideality, proximity between Nb and Al overlayer in the base electrode, strong coupling effects in Nb, etc. We determined V AB at 4.2 K for our junctions and used it to infer the critical current of JJs from their conductance. Also, we experimentally established the relation between the normal state conductance G N at 4.2 K, entering into the AB formula (11) , and the JJ conductance G 300 at 300 K. This chain of relationships allows us to use G 300 to characterize the value of I c of the junctions at 4.2 K.
However, when considering the relative statistical quantities, e.g., on-chip and on-wafer standard deviation of I c and run-torun reproducibility, the knowledge of the exact value of V AB is not strictly required in all cases of interest.
For example, if quantities G N and V AB are independent statistical variables X and Y , the variance of their product is
2 , where E(X) and E(Y ) are the expectation values, i.e., their mean values [48] . Hence, a normalized value of I c standard deviation
1/2 /E(I c ) can be written as
where σ G and σ V ab are the normalized standard deviations of the JJ conductance and the Ambegaokar-Baratoff voltage-I c R N product,-respectively. In turn, the σ V ab is the same as the relative standard deviation of the gap voltage σ V g in the ensemble of junctions since the variation of the hyperbolic tangent term in (11) is negligible at 4.2 K where V g k B T . Thus, the standard deviations σ G and σ V g can serve as a good estimator of σ Ic by using (12) , and a complex problem of the exact determination of I c and its statistics, especially in small JJs, is replaced by the simpler task of finding σ G and σ V g from relatively straightforward measurements of the gap voltage and JJ conductance.
The results of the implementation of the approach described above are given in the subsequent sections and followed by a discussion.
B. 10 kA/cm
2 Process Fig. 7 shows the relation between the room-temperature JJ resistance R 300 and JJ normal state resistance R N extracted from I-V characteristics measured at 4.2 K. The relation is linear, R 300 = kR N with k = 0.905 ± 0.010 and, respectively, G N = kG 300 . Values of k less than unity are consistent with the tunnel barrier and indicate negligible contribution of Nb leads to the junction resistance [30] - [32] . I-V characteristics also show high junction quality as indicated by a relatively large subgap resistance R sg /R N > 10 and gap voltage V g = 2.75 mV. The AB voltage I c R N = V AB (T ) at 4.2 K was determined to be 1.75 mV for our 10 kA/cm 2 JJs. The linear relation between R 300 and R N (and respective conductances G 300 = 1/R 300 and G N ) enables us to use the automated wafer-prober measurements of R 300 to screen rapidly of a very large number of JJs on 200-mm wafers, in lieu of performing time consuming measurements at 4.2 K. This enables a statistical assessment of JJ uniformity and reproducibility on the wafer scale, and it provides an accurate prediction of JJ critical currents and trilayer's J c based on room-T measurements.
The gap voltage standard deviation σ V g was measured using single JJs and series arrays of 100 and 1000 JJs with drawn sizes from 500 nm to 2200 nm. It was determined to be less than 0.4%, limited by the accuracy of our voltage measurements (about ±10 μV). Consequently, the first and the third term in (12) can be neglected, and the critical current standard deviation is primarily determined by the standard deviation of JJ conductance.
The JJ conductance, both G N and G 300 , and the critical current I c are proportional to JJ area
where parameters d 0 and d 1 account for the deviation in the junction diameter as determined by electrical measurements at, respectively, 300 K and 4.2 K from its physical diameter d w as measured by the SEM. Since d w is described by (1), parameters d 0 and d 1 simply modify the size bias b. Parameters G 0 and J c characterizing the tunnel barrier can be found by fitting √ G 300 and √ I c to (13) with d w given by (1) with G 0 , J c , and b taken as fitting parameters. An example of such a fitting is shown in Fig. 8 for nine locations on a 200-mm wafer. All data are fitted exceptionally well by accounting for the nonlinear relationship between the JJ electric size and the drawn size due to the optical process bias described by (1) . Simple fits assuming solely a constant bias, the so-called "missing diameter," fail to fit the data in the range of sizes below ∼800 nm, where the optical bias is most nonlinear. Consequently, a manifestly "liner fit" would predict an incorrect (and size-dependent) value of the specific conductance G 0 and, hence, J c . Typical histograms of the conductance of nominally identical junctions at room-T are shown in Fig. 9 for all JJs tested on a wafer. The distributions are fitted well by a Gaussian distribution with a standard deviation σ G that depends on the junction size. The data in Fig. 9 represent aggregate distributions of JJs located on 9 chips over 200-nm wafers, and so the σ G values represent wafer-level spreads of JJ conductance. Local, on-chip spreads, which are more important for SFQ circuits, are in fact smaller as will be discussed below. The wafer spread is a convolution of several factors: local variations of JJ area and tunnel barrier transparency within a 5 × 5 mm 2 chip, global variation of the mean area caused by focus variations from one exposure field to another, and global variation of the trilayer tunnel barrier specific conductance on the wafer scale.
We have explained above that the distribution of the critical currents, the relative standard deviation σIc, is expected to be dominated by the distribution of the JJ conductance, characterized by σ G . To verify this, we measured the I-V characteristics of series arrays of 1000 JJs, Fig. 10 , and determined the distribution of the switching currents of JJs in the arrays, Fig. 11 .
In general, the distribution of switching currents I s in the array of JJs is a convolution of the switching distribution of a single JJ [49] and the distribution of the I c of all JJs in the array. At sufficiently large I c , i.e., for I c 2πk B T eff /Φ 0 , where Φ 0 is the flux quantum and T eff is the effective noise temperature seen by the junction, the switching current I s is very close to the actual I c . In this limit, the I s distribution is practically a good estimate for the I c distribution.
Conversely, for JJs with small I c , the switching current can be significantly affected by thermal and external noise, leading to an I s distribution wider than the I c distribution if there is no sympathetic switching. Therefore, the width of the I s distribution for an array of JJs should be viewed as an upper boundary. We have found good agreement between the on-chip variations of JJ conductance at room-T , characterized by standard deviation σ G , and the on-chip variations of the switching current of JJs characterized by standard deviation σ Is for JJ drawn sizes above ∼700 nm. For smaller sizes, down to the smallest size of 300 nm measured, the switching distributions at 4.2 K get progressively wider than the conductance distribution for the same size of JJs, as expected.
Inthefollowingdiscussions,wewillusetheconductancedistribution as an estimator for the distribution of JJ critical currents both on-chip and on-wafer. We furthermore will use relative standard deviations expressed as a percent of the mean value.
The conductance distribution standard deviation depends on the size of the junctions. This dependence is shown in Fig. 12 for the on-chip and wafer-averaged conductance standard deviation. If we assume that all variations of JJ conductance arise solely from variations of the JJ area and take the barrier to be absolutely uniform, we can develop a simple model to describe the data. The JJ area is A = (π/4)d 2 w , and its fluctuations are twice the fluctuations of JJ diameter (to leading order in the fluctuating term). If we assume that the area fluctuations come from fluctuations of the JJ area on the chrome photomask, socalled mask errors, we obtain using (1)
where δd is the variation of the JJ diameter on the chrome photomask introduced by the photomask fabrication process. Since the address size used to e-beam-write the junction photomask is 5 nm, it is reasonable to assume δd to be approximately this value and treat it as a fitting parameter in fitting (14) to the data in Fig. 12 .
To confirm this model, we measured the on-chip variations of the JJ diameters at different steps of JJ fabrication, using a Hitachi CD-SEM as described previously. Unfortunately, the accuracy of the SEM measurements suffers at JJ sizes larger than about 1 μm, because at the relatively low magnification needed to be used to image the JJ, nm-scale variations become undetectable. In contrast, 200k magnification was used for the smallest JJ sizes.
The measured relative standard deviation of the diameter, σ d / d , where d is the mean diameter, was converted into the relative standard deviation of the area σ A / A . The results shown in Fig. 13 are for a typical wafer from the 8M process run. The measured area variation can be explained by the model (14) at δd ≈ 4 nm. This value is very close to the grid size of 5 nm used to define the polygons approximating the circular JJs, corresponding to the address size used for e-beam writing of the photomask patterns. While this value is consistent with a "mask error" model, it does not account explicitly for any imperfections introduced during the etch process. It is worth noting that we did not observe any significant difference in the area variations of JJ features after different processing steps, except perhaps for very small JJs.
The best fit to the JJ conductance variations gives a 50% larger δd ≈ 6 nm, see the solid lines in Fig. 12 . Although the range of mask errors is small and likely independent of the JJ mask size, the strong enhancement of the area fluctuations observed near the minimum printable size is due to the strongly nonlinear relation between the drawn and realized sizes (1), see in Section II-B. This enhancement makes printing of sizes near the cut-off size d c impractical if a precise area control is required.
The JJ sizes that can be implemented in VLSI SFQ circuits with J c = 10 kA/cm 2 range from 700 nm to 2000 nm, giving the I c range from 38 μA to 314 μA. Smaller critical current (i.e., smaller sizes) would give unacceptable bit error rates in Fig. 12 . Relative standard deviation of JJ conductance, in percent of the mean, as the function of JJ drawn diameter on the photomask for the two typical wafers. The actual diameter of the JJs on the wafers is shown on the right axis. The data were obtained by measuring 9 chips per wafer and the minimum of 36 to 60 JJs of the same size per chip. The wafer-averaged data are shown by squares, whereas the data from the best locations on the wafer are shown as dots. The expected conductance variations following from the model of JJ area variations (14) at δd = 4 nm as measured in SEM, see Fig. 13 , is shown by the dashed curve. The best fits to the best data are shown by the solid curves and give δd = 6 nm and δd = 6.5 nm for w6 (a) and wafer 8 (b), respectively. logic cells, and this determines the lower bound. For this range, the on-chip spreads of JJ conductance and I s are less than 2%, see Fig. 12 , and the on-wafer spread is less than 3%, see Fig. 9 . We believe that these spreads are sufficiently low for yielding SFQ circuits with 10 6 JJs and beyond, and they meet the requirements specified in [1] and [46] .
C. 20 kA/cm
2 Process
Further increasing the speed of SFQ circuits may require J c higher than 10 kA/cm 2 (the present target of our 8M and 10M processes) and correspondingly smaller JJ sizes to maintain a fixed I c . To make a preliminary evaluation of the expected parameter spreads, we made a process-development run targeting J c of 20 kA/cm 2 and 50 kA/cm 2 . We used the same set of photomasks and the same processing as was used for the 10 kA/cm 2 process presented above. The I-V characteristics of individual JJs and 1000-JJ arrays were measured at 4.2 K and are shown in Fig. 14 . The switching distributions were extracted from the I-V s, and the room-T Fig. 13 . Relative standard deviation of JJ feature area variations at different steps of JJs fabrication: photoresist mask, etched counter electrode, anodized counter electrode, and un-anodized part of niobium counter electrode. The diameters of multiple JJs were measured by CD-SEM to calculate the standard deviation. The accuracy of the measurements decreases at large sizes because progressively lower magnification needs to be used to image the features. Equation (14) at δd = 3.5 nm is shown by a solid curve. measurements on multiple individual junctions were done on the same set of JJs as was used in Section III-B.
D. 50 kA/cm 2 Process
The same fabrication and electrical measurements were done for a 50 kA/cm 2 process run. The typical I-V curves for single JJs are shown in Fig. 15 .
An "excess" current can be seen in I-V curves defined as the zero voltage intercept of the solid line in Fig. 15 . It is about 0.33I c . This "excess" current results from the time averaging of Josephson oscillations above the gap voltage of the currentbiased JJs [34] , [35] .
The advantage of the 50 kA/cm 2 process is that JJs at this current density intrinsically become almost nonhysteretic as can be seen in Fig. 15 . The return current is about 85% of the critical current, indicating self-shunting with the damping parameter 1 < β c < 2. Therefore, the junctions can be used in SFQ circuits without external resistive shunts. Their damping is provided by the internal subgap resistance R n and the resultant I c R n product is about 2 mV, a factor of 2 higher than for the externally shunted JJs in the 10 kA/cm 2 process at the critical damping β c = 1. Such junctions would enable much faster and denser circuits at future technology nodes.
The JJ conductance spreads and I s spreads for JJs with 24 kA/cm 2 and 45 kA/cm 2 are shown in Fig. 16 . They are somewhat larger than the typical spreads shown in Section III-B for the 10 kA/cm 2 process, but follow similar trends. The spreads can be fitted by (14) with δd = 10.5 nm. Since we used the same photomasks and fabrication process, this difference may indicate the existence of another factor that contributes to the spreads of the effective, electrical, area of the JJs in addition to the geometric fluctuations described by (14) and that scales with JJ diameter in the same manner. Although the nature of a higher variability of high-J c JJs has not yet been firmly established, we anticipate that these initial spreads can be reduced to the same level as those for the 10 kA/cm 2 with additional fabrication runs and process maturity.
E. Discussion
In the prior discussion we left aside other possible contributions to JJ variability beyond the enhanced mask errors described by (14) . The first is the variation of the cut-off size d c caused by variations of focus, local photoresist chemistry, etc. This contribution is also enhanced by the strongly nonlinear projection present near the cut-off. Indeed, differentiating (1) with respect to d c we get a similarly diverging expression
where δd c is the variation of the cut-off size. The contribution of (15) to on-chip variability should be much smaller than (14) , due to the much smaller numerator-δd c within the die should be very small. Nonetheless, it may contribute to cross-wafer variation, due to focus variation between stepped exposure fields, which could produce δd c in the range of ±10 nm across the wafer. The second is variations of electric biases d 0 and d 1 in (13), which is equivalent to variations of b in (1) ∂A A = 2(δb)
These δb variations could be caused by contamination of Nb around JJ sidewalls, e.g., hydrogen absorption during etching [36] - [39] , that affects superconducting properties of Nb near the Nb/Nb 2 O 5 interface. The size of this potential variability, or if it even exists, is not known. The total area variance is the sum of all variances described by (14) - (16) .
Yet another potential source of JJ variability is fluctuations of the barrier transparency, especially in high-J c junctions. The possibility of the existence of nanoscale regions with high barrier transparency, often called pinholes or quantum point contacts (QPCs), in ultrathin AlO x tunnel barriers was assumed from the early days of transport measurements on tunnel junctions, see [52] and references therein. It was also invoked to explain the details of the subgap transport, see, e.g., [7] , [8] , [35] , [50] - [54] . Recently, wide distributions of AlO x barrier thickness in Al/AlO x /Al junctions were found using scanning transmission electron microscopy [55] , [56] , confirming the existence of nanoscale regions with much thinner barrier than the mean thickness.
The existence of these QPCs does not necessarily imply a higher variability of high-J c junctions. It depends on their spatial and transparency distribution functions. For instance, for junctions effectively saturated with QPCs channels, if the distribution of QPC transparencies is very broad and universal [57] , as was assumed in [35] and [58] , the distribution-averaged conductance and I c should be self-averaging quantities that do not vary significantly from junction to junction. On the other hand, if the QPCs transparency distribution is quite narrow and fabrication-dependent [50] , [54] , such that the transport properties are governed by QPCs with the transparency close to 1, the number of these QPCs, n, in a high-J c junction is expected to be on the order of a few hundred [51] , [53] , and scale proportionally to the junction area. Statistical fluctuations in the number of high-transparency QPCs are expected to be on the order of n −1/2 and scale as A −1/2 . This would result in σ G dependence similar to (16) with a quantity in the numerator characterizing the mean spacing between the QPCs. We leave a more detailed investigation of these issues for a separate study.
As we have shown above, the on-chip and cross-wafer spreads that we have achieved in our 10 kA/cm 2 processes with 4 and 8 metal layers are at a sufficient level to enable the fabrication of VLSI superconducting circuits with 10 6 JJs and beyond. The spreads reported here are significantly lower than those reported for the best industrial [41] - [45] , or advanced research processes [17] , [19] , [40] . However, further lowering of JJ spreads is always desirable, especially for higher-J c processes with smaller JJs. The results obtained in this work show three clear tactics to accomplish this.
The first tactic is the reduction of "mask errors"-fluctuations of JJ area on the photomask-by implementing a smaller grid size for JJ design and a smaller address size for photomask e-beam writing. We are planning to utilize photomasks with a 1-nm design grid and assess improvements in future process runs. This should also allow us to verify our model (14) .
The second tactic is the minimization of mask error enhancement caused by the proximity of JJ sizes to print to the photolithography cut-off size d c at which the area fluctuations diverge. This can be done by transitioning to 193-nm photolithography, which significantly reduces d c . Indeed, using (9), we can estimate the cut-off size for our 193-nm exposure tool with NA = 0.75 to be 110 nm for the threshold resist and coherent illumination, and about 130 nm for actual resists. This is a factor of 1.6-2 less than for 248-nm photolithography used in this work. Then, the ratio of the minimum JJ size we need to print d min ∼ 500 nm, e.g., as could be used in a 50 kA/cm The third tactic is the identification and elimination of the extra JJ variability that accounts for approximately a 50% increase of the conductance spreads with respect to the geometric area fluctuations described by (14) , see Figs. 12 and 16. We leave the implementation of these approaches for future work.
IV. CONCLUSION
We have developed a fabrication process for fully planarized Nb/Al-AlO x /Nb Josephson junctions on a 200-mm-wafer tool set available in a CMOS foundry with 248-nm photolithography. We have measured the on-chip and cross-wafer spreads of the junction conductance and critical current for junction sizes ranging from 200 nm to 1.5 μm in diameter and critical current densities up to 45 kA/cm 2 . We have determined explicitly the relationship between the drawn junction size, the size realized on the wafer, and the minimum printable size for circularshaped junctions. We have shown that a model accounting for the enhancement of mask errors near the minimum printable size accurately describes the variations of JJ parameters as a function of the drawn size. For the range of JJ sizes intended for use in SFQ circuits, we have achieved levels of JJ repeatability and yield consistent with the large-scale integration of superconducting circuits comprising 10 6 JJs and beyond. The junctions and 8-metal-layer fully planarized fabrication process described in this work have been utilized to fabricate with high yield and successfully demonstrate RQL shift registers with 72800 JJs per 5 mm by 5 mm chip [59] . A truncated, 4-metallayer, version of this process has been used to successfully demonstrate a new type of ac-powered SFQ circuits with 32 800 Josephson junctions [60] . To the best of our knowledge, these are the largest SFQ circuits demonstrated up to date.
