Quantum bits, or qubits, are an example of coherent circuits envisioned for next-generation computers and detectors. A robust superconducting qubit with a coherent lifetime of O(100 µs) is the transmon: a Josephson junction functioning as a non-linear inductor shunted with a capacitor to form an anharmonic oscillator. In a complex device with many such transmons, precise control over each qubit frequency is often required, and thus variations of the junction area and tunnel barrier thickness must be sufficiently minimized to achieve optimal performance while avoiding spectral overlap between neighboring circuits. Simply transplanting our recipe optimized for single, standalone devices to wafer-scale (producing 64, 1x1 cm dies from a 150 mm wafer) initially resulted in global drifts in room-temperature tunneling resistance of ± 30%. Inferring a critical current I c variation from this resistance distribution, we present an optimized process developed from a systematic 38 wafer study that results in < 3.5% relative standard deviation (RSD) in critical current (≡ σ I c / I c ) for 3000 Josephson junctions (both fixed frequency and asymmetric SQUIDs) across an area of 49 cm 2 . Looking within a 1x1 cm moving window across the substrate gives an estimate of the variation characteristic of a given qubit chip. Our best process, utilizing ultrasonically assisted development, uniform ashing, and dynamic oxidation has shown σ I c / I c = 1.8% within 1x1 cm, on average, with a few 1x1 cm areas having σ I c / I c < 1.0% (equivalent to σ f / f < 0.5%). Such stability would drastically improve the yield of multi-qubit chips with strict frequency requirements, such as processors utilizing the cross resonance gate [1] and photodetectors relying on ensembles of identical qubits [2] .
Introduction
Josephson junctions, fabricated by isolating two superconductors with a thin insulating barrier, are the core circuit element for superconducting solid state quantum coherent devices. When shunted with a capacitor, the non-linear inductance from the junction forms an anharmonic oscillator making energy levels individually addressable [3], a necessary requirement for quantum processors or other qubit-based detectors, such as single microwave photon detectors [2, 4, 5, 6]. While arrays of large, weakly nonlinear Josephson junctions are often utilized in amplifiers [7, 8], in this work we specifically focus on the reproducibility of shadow-evaporated sub-micron Al/AlO x /Al Josephson junctions common to nearly all current qubits.

Fig. 1 Device geometry for the asymmetric SQUID used in this study. Left: Sketch of resist stack for "Manhattan Style" junctions. Developed features are designed to be deeper than their width to allow metal to reach the substrate when evaporated parallel to a given channel, but block metal in orthogonal channels. Thermal oxidation of layer 1 occurs before rotating the substrate and depositing layer 2. A third layer, rotated by φ = 180° relative to layer 2 (not shown), is needed to make the second SQUID junction. A high sensitivity resist (MMA) results in an undercut of the high resolution top layer (CSAR) to improve liftoff quality. Middle Left: Optical microscope image of the developed resist stack. Middle Right: Optical microscope image of the final SQUID structure. Right: Scanning electron microscope images of the two Josephson junctions forming the 8:1 asymmetric SQUID on wafer 37. (Color figure online.)
The critical current, I c , of a Josephson junction, inversely proportional to its inductance, is tuned by either varying the critical current density, J c , or the junction area. The former involves modifying the tunnel barrier thickness via the oxidation time or pressure when using a thermally grown barrier. Our wafer-scale fabrication process produces 64, 1 cm 2 dies from a 150 mm wafer, the maximum size accommodated by our evaporator. The junctions are located within the central ≈ 49 cm 2 of the die array and thus high uniformity is desired over this length scale. Previous works describe two types of Josephson tunnel junctions: large junctions, with I c of O(µA), typically realized with a Nb/AlO x /Nb trilayer process suitable for superconducting digital electronics or microwave amplifiers; and small junctions, with I c of O(nA), typically realized with Al/AlO x /Al suitable for qubits. Regarding the former, 2-4% intrachip variations have been reported [9] and ≈ 15% variation is observed across a wafer [10, 11]; a notable exception is [12] where 8.2% and 2.9% variation in resistance is reported for 300 nm and 800 nm diameter junctions, respectively, across a 200 mm wafer. Junctions with sizes ranging from 0.015 to 3.27 µm 2 mentioned in [13] had variations of 2.3% on 39 mm 2 chips. For qubits, it is advantageous to reduce the physical size of the junction to minimize the inclusion of noisy two level defects [14]. Authors fabricating deep sub-micron junctions typically report fluctuations of ≈ 5% within chips smaller than 50 mm 2 [15] and fluctuations of 2-3% for 0.04 µm 2 junctions patterned with hard masks across 50 mm wafers [16].
In this work, we strive to further improve this absolute level of resistance variation, and to realize it over a larger wafer size in order to increase the yield of functional multi-qubit chips which have tight tolerances on qubit frequency. Furthermore we investigated designs where a SQUID replaces a single junction and the magnetic flux-tunability of the circuit inductance is limited by introducing asymmetry in the SQUID junction areas (≥ 5:1) to reduce the susceptibility to flux noise [3, 17] . As such, we produced small junctions over a range of areas spanning 0.0036 to 0.013 µm 2 . As a note, in such SQUIDs, the smaller junction only affects the tuning range and we focus on tight control over the critical current of the larger junction.
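The effect of area asymmetry on flux tunability can be illustrated with the standard asymmetric-SQUID expression for the Josephson energy, E_J(Φ) = E_J(0) · sqrt(cos²(πΦ/Φ0) + d² sin²(πΦ/Φ0)), where d is the junction asymmetry. The following is a minimal sketch, not the analysis used in this study: it assumes both junctions share the same J c (so that the E_J ratio equals the area ratio) and uses the transmon-limit scaling f ∝ sqrt(E_J). All function names are hypothetical.

```python
import math


def asymmetry_parameter(area_ratio):
    """d = (E_J1 - E_J2) / (E_J1 + E_J2).

    Assumes equal critical current density in both junctions, so the
    E_J ratio equals the junction area ratio (e.g. 6 for a 6:1 SQUID).
    """
    return (area_ratio - 1.0) / (area_ratio + 1.0)


def ej_fraction(flux_quanta, area_ratio):
    """E_J(Phi) / E_J(0) for an asymmetric SQUID.

    flux_quanta is the applied flux in units of the flux quantum Phi0.
    """
    d = asymmetry_parameter(area_ratio)
    phase = math.pi * flux_quanta
    return math.sqrt(math.cos(phase) ** 2 + (d * math.sin(phase)) ** 2)


def frequency_tuning_fraction(area_ratio):
    """Approximate f_min / f_max = sqrt(d).

    Uses f proportional to sqrt(E_J) and neglects the charging-energy
    correction, so this is a leading-order estimate only.
    """
    return math.sqrt(asymmetry_parameter(area_ratio))


if __name__ == "__main__":
    for ratio in (5, 6, 8):
        print(ratio, round(frequency_tuning_fraction(ratio), 3))
```

Under these assumptions a larger area ratio gives d closer to 1 and therefore a narrower tuning range, which is the mechanism by which asymmetry reduces the sensitivity to flux noise.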
Methods and Observations
For this study, both 100 and 150 mm wafers were used. Junctions were fabricated using the bridge-free "Manhattan Style" [18, 19] on > 8000 Ω-cm intrinsic (100) Si using e-beam lithography, see Fig. 1. Bridgeless junctions have an advantage over bridged designs, such as Dolan style [20], in that the junction area is independent of resist thickness. Layouts were generated in Python with GDSpy [21], proximity effect corrected with Beamer from GenISys, and exposed with 100 keV electrons in a Raith EBPG 5150. The EBPG is housed in an enclosure made by MCRT within a class 100 cleanroom. The enclosure re-filters the air to at least class 10 and stabilizes temperatures to ± 0.05 °C over month-scale time frames. A Spicer Consulting SC24 provides active 3-axis magnetic field cancellation from DC-13 kHz, measured at a single point next to the e-beam column. The environmental stability of the setup, combined with the Raith EBPG 5150 self-calibration protocol, provides highly reproducible lithography. Once exposed, samples are developed and subsequently coated with e-beam evaporated Al in a Plassys MEB550s with a base pressure of 3 × 10 −8 mbar. After liftoff, junctions were individually probed to measure their room temperature resistance, from which I c can be inferred using the Ambegaokar-Baratoff formula [22]. These values can be converted into a qubit frequency using an estimate of the shunt capacitance. Initially, wafers were probed by hand; for the last 11 wafers, a Micromanipulator P200L semi-automatic probe station was used to gather statistics on a larger number of junctions. Plots highlighting improvements made during this study can be found in Fig. 2.
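The conversion from room-temperature resistance to critical current and qubit frequency can be sketched as follows. This is an illustrative example under stated assumptions rather than the procedure used for the data in this work: it uses the zero-temperature Ambegaokar-Baratoff relation I c = πΔ/(2eR n), an assumed superconducting gap of 180 µeV for thin-film Al, the transmon-limit approximation f ≈ (sqrt(8 E_J E_C) − E_C)/h, and a hypothetical shunt capacitance. The function names and example values are not from the source.

```python
import math

# Physical constants in SI units (exact by definition since 2019).
E_CHARGE = 1.602176634e-19  # elementary charge, C
PLANCK_H = 6.62607015e-34   # Planck constant, J s


def critical_current(r_n_ohm, gap_ev=180e-6):
    """Zero-temperature Ambegaokar-Baratoff estimate of I_c in amperes.

    gap_ev is the superconducting gap in eV; 180 ueV is an assumed
    value for thin-film Al, not a number taken from the text.
    """
    gap_joule = gap_ev * E_CHARGE
    return math.pi * gap_joule / (2.0 * E_CHARGE * r_n_ohm)


def transmon_frequency(i_c, c_shunt_farad):
    """Approximate qubit frequency in Hz in the transmon limit.

    E_J = hbar * I_c / (2e) and E_C = e^2 / (2C);
    f = (sqrt(8 E_J E_C) - E_C) / h.
    """
    e_j = PLANCK_H * i_c / (4.0 * math.pi * E_CHARGE)
    e_c = E_CHARGE ** 2 / (2.0 * c_shunt_farad)
    return (math.sqrt(8.0 * e_j * e_c) - e_c) / PLANCK_H


def frequency_rsd(ic_rsd):
    """Leading-order frequency spread: f ~ sqrt(I_c), so the relative
    spread in f is half the relative spread in I_c."""
    return 0.5 * ic_rsd


if __name__ == "__main__":
    i_c = critical_current(8.0e3)          # hypothetical 8 kOhm junction
    f01 = transmon_frequency(i_c, 80e-15)  # hypothetical 80 fF shunt
    print(f"I_c = {i_c * 1e9:.1f} nA, f = {f01 / 1e9:.2f} GHz")
```

Because f scales as the square root of I c to leading order, a relative spread of 1.0% in I c corresponds to roughly 0.5% in frequency, consistent with the equivalence quoted in the abstract.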
Resist/Exposure
The resist bi-layer was spun with a Laurell Technologies WS-650-23B spin coater. MicroChem MMA-EL13 (copolymer in ethyl lactate) was used as the high sensitivity bottom undercut layer for all wafers. Zeon Corp. ZEP 520A-7, MicroChem 950k PMMA A4, and AllResist GmbH AR-P 6200.9 (CSAR) were all tested as the high resolution upper layer. It was found that the small (≈ 20 mm diameter) vent hole in the top of the spin coater had to be covered to create a uniform spin of the MMA, which was unnecessary for the CSAR and ZEP likely because of the differences in viscosity of anisole and ethyl lactate. We initially had difficulties spinning defect free CSAR on MMA, behavior which was not observed with ZEP. This issue was solved by degassing the CSAR to equalize pressure and humidity by opening the lid and letting it sit at room temperature for 2 hours. CSAR was ultimately selected as the resist of choice over PMMA because of the flexibility it offered having (mostly) orthogonal development chemistry to MMA and over ZEP because of its lower cost. For our developers, described below, MMA and CSAR had an optimal dose of 180 and 1100 µC/cm 2 respectively. We note that partial clearing of CSAR in MMA developer was observed for doses above 1100 µC/cm 2 when immersed for extended times. Proximity effect correction (PEC) in Beamer was first optimized by observing the uniformity (or lack) of residual undercut as the MMA provides a sensitive indicator of long range substrate backscattering compensation. The software's 3D-Edge mode of 3D PEC was chosen due to its ability to simultaneously proximity effect correct both resist layers which require different doses and a default point spread function (PSF): 500 nm PMMA on Si at 100 keV (Z-Position: 0.325) was used initially. Before the addition of short range corrections to this PSF, we had low yield of sub 100 nm features with CSAR which we did not observe with ZEP. 
The short range corrections that were added to improve yield were: an effective short range blur FWHM of 50 nm, a short range separation value of 5 µm, and a mid-range activation threshold of 2%. A 200 pA beam and 200 µm aperture (calculated spot size = 2 nm) were used with a 1 nm beam step size to ensure that designed area variations on the order of a few nm were reproduced. Probe pads (which are not written on device wafers) were written 130 µm away from the junctions (∼ 4x the backscattering parameter for 100 keV electrons on Si) so that backscatter dosing from the pads did not alter the junctions, ensuring that test wafers created junctions equivalent to device wafers.
SEM observations of as-evaporated junctions showed worse line edge roughness (LER) on the second evaporation compared to the first (see Fig. 1). Our theory is that Al deposited on the sidewall of the CSAR in the first evaporation introduces additional LER for subsequent evaporations. A trilayer resist (MMA/CSAR/MMA) was briefly considered in an attempt to reduce this effect, utilizing the top layer of MMA to shield the CSAR during off-axis evaporations. We did observe an improvement in LER, but since it did not reduce global I c variations, the trilayer was abandoned because of its added complexity and because the additional forward beam scattering from the top MMA would increase developed linewidths [23], limiting achievable SQUID asymmetry ratios.
Development
Cold development with manual agitation (or ultrasonication for wafer 36) was used for CSAR and ZEP. A Thermo Scientific PC200 immersion circulator filled with 50:50 H 2 O: Propylene Glycol was used to chill N-amyl acetate (NAA) baths to 0 ± 0.02 °C. NAA from Zeon Corp. (ZED-N50) was used initially and AllResist GmbH AR 600-546 was used after wafer 26. No difference was noted between these nominally identical developers. The MMA was developed at room temperature. Puddle development was briefly considered, but it led to many CSAR constrictions and so was abandoned in favor of immersion development on PTFE wafer holders. Initially IPA:MIBK was used to develop the MMA, but we observed many open junctions due to small resist bridges constricting the CSAR near the junction, especially for < 0.01 µm 2 junctions. Our hypothesis was that swollen, gel-like MMA removed by the developer [24] was the cause of these constrictions. Studies with PMMA (which has much higher molecular weight than MMA) showed that the co-solvent IPA:H 2 O was a superior developer, resulting in reduced swelling, and the addition of sonication was shown to increase the rate at which developed resist is removed [25, 26, 27, 28]. Although the switch of developer alone did not drastically improve small junction yield, the addition of sonication did. Care had to be taken to attenuate the ultrasonication power to prevent collapse of the CSAR overhang. This was accomplished by using the lowest bath power and, crucially, lining the bath with a polyurethane/vinyl sound absorbing foam, leaving the central 1x1 cm open to allow some power transmission.
After development, oxygen plasma ashing of the newly opened channels is performed. We used a Plasma Etch PE-50 with a 50 kHz pure oxygen plasma (80 s, ≈ 500 mbar, ≈ 60 W). It was found that large, non-radially symmetric I c gradients were reduced and made approximately radially symmetric by splitting a single ashing step into four 20 s steps with 90° substrate rotations between steps. In an attempt to further improve the ashing uniformity, the sample was rotated four times in each corner of the chamber, for a total of 16 x 5 s ashes. This resulted in the best wafer-scale statistics at the time: σ I c / I c = 3.5% for single junctions across 49 cm 2 . Following this wafer, one with no ashing was made. σ I c / I c degraded and, most interestingly, J c halved, which is strong evidence that residual organics have an effect on tunnel barrier properties.
After implementing 16x ashing, the dominant source of non-uniformity was found to be junction area variations which showed ∼ radial dependence. First, the effect was reduced simply by increasing the junction area (and decreasing J c to keep I c constant). Then, as it seemed most likely to be caused during development, the manual agitation in NAA was replaced by ultrasonication for wafer 36 due to its assumed higher uniformity and contrast improvements seen in [29] . However, this showed no improvement and a ∼ 1 cm 2 patch of abnormally low J c on the wafer caused an overall σ I c / I c degradation. Pinpointing the cause of, and a solution to, the area fluctuations is the path towards better wafer-scale uniformity in this process. To this end, a hard mask process would be helpful as it should not warp during evaporation or diagnostic post-development SEM imaging.
Evaporation and Oxidation
Before the strong effect of ashing uniformity was discovered, rotations were added to evaporation steps wherever possible to smooth out the possible source of non-uniformity. Although weakly motivated, rotations during pump down, gettering, and oxidation were all employed and kept once they were added. Further studies will be performed to systematically remove these steps to determine if they are indeed helpful. The evaporation rate was changed motivated by the hypothesis that high energy electrons and UV radiation released during the evaporation could warp or distort the resist non-uniformly, possibly being the source of the observed area fluctuations. With a modest increase in beam current, evaporation rate can be increased significantly (in our case an order of magnitude) leaving the wafer exposed to sources of warp for much less time. Since it seemed motivated and did not seem to hurt, this faster rate was kept for many wafers, but ultimately a rate of 0.3 nm/s was selected as it showed better uniformity, likely because of better averaging within the junction that occurs with smaller grains since oxide thickness is not uniform grain to grain or at grain boundaries [30, 31] . Dynamic and static oxidations were also A/B tested. In a static oxidation, the chamber is filled with oxygen (in our case 95%/5% Ar/O) to a set pressure and then evacuated after a set time. In a dynamic oxidation, gas is continuously introduced and pumped out with rates balanced such that the pressures are the same as the static oxidation case. Interestingly, we found dynamic oxidation produced a lower J c and since it provided better uniformity, it was used for the remainder of the study.
Results
Wafers (which each had 1000 fixed frequency junctions, 1000 6:1 SQUIDs, and 1000 8:1 SQUIDs patterned in alternating rows of 50) made after delivery of the automated probe station are summarized in Table 1 . The process used to make wafer 35 (which had the highest uniformity) is listed below and its properties are plotted in Fig. 3 . 
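The distinction between wafer-scale and chip-scale uniformity can be made concrete with a moving-window calculation of the relative standard deviation (RSD = σ I c / I c). The sketch below is an assumption-laden illustration, not the analysis code used for Table 1 or Fig. 3: the window step, the minimum junction count per window, and the synthetic wafer (a linear global drift plus local scatter) are all hypothetical choices made so that the example is self-contained.

```python
import numpy as np


def relative_std(values):
    """Sample standard deviation divided by the mean."""
    values = np.asarray(values, dtype=float)
    return float(np.std(values, ddof=1) / np.mean(values))


def windowed_rsd(x_mm, y_mm, ic, window_mm=10.0, step_mm=5.0, min_count=10):
    """RSD of I_c inside each square window slid across the wafer.

    x_mm, y_mm are junction positions in mm; ic holds the matching
    critical currents. Windows with fewer than min_count junctions
    are skipped. Returns an array with one RSD per accepted window.
    """
    x_mm, y_mm, ic = (np.asarray(a, dtype=float) for a in (x_mm, y_mm, ic))
    x_starts = np.arange(x_mm.min(), x_mm.max() - window_mm + 1e-9, step_mm)
    y_starts = np.arange(y_mm.min(), y_mm.max() - window_mm + 1e-9, step_mm)
    results = []
    for x0 in x_starts:
        for y0 in y_starts:
            inside = ((x_mm >= x0) & (x_mm <= x0 + window_mm)
                      & (y_mm >= y0) & (y_mm <= y0 + window_mm))
            if inside.sum() >= min_count:
                results.append(relative_std(ic[inside]))
    return np.array(results)


def synthetic_wafer(n=3000, size_mm=70.0, seed=0):
    """Hypothetical test data: +/-5% linear drift in x, 1% local scatter."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, size_mm, n)
    y = rng.uniform(0.0, size_mm, n)
    drift = 1.0 + 0.05 * (x - size_mm / 2.0) / (size_mm / 2.0)
    ic = 30e-9 * drift * (1.0 + 0.01 * rng.standard_normal(n))
    return x, y, ic


if __name__ == "__main__":
    x, y, ic = synthetic_wafer()
    local = windowed_rsd(x, y, ic)
    print(f"global RSD = {relative_std(ic):.4f}")
    print(f"mean 1x1 cm RSD = {local.mean():.4f}")
```

On data with a smooth global drift, the windowed RSD is smaller than the wafer-scale RSD because each window samples only a small portion of the drift, which is the reason the 1x1 cm figure is the more relevant estimate for a single qubit chip.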
Qubit Coherence
Many measurements are still needed to correlate these improvements in junction uniformity with ultimate device performance. Here we describe a few exemplar measurements to show that the processing steps described to improve uniformity do not come at the expense of qubit coherence. Two different architectures with qubits between 5-6 GHz were made with a process close to wafer 28/29 and characterized at 10 mK. The first architecture is an 8-qubit quantum processor with coupling resonators between nearest neighbor qubits forming a ring topology. Such a device was made following wafer 29's process except MIBK was used instead of H 2 O for MMA development. Characterization was performed without individual qubit control lines connected by driving qubit control pulses through the readout bus. Four qubits had a mean energy relaxation time, T 1 , equal to 73-80 µs with time dependent fluctuations of ≈ 15%. The dephasing time measured with a Ramsey experiment, T * 2 , was found to take an average value of 74-91 µs depending on the qubit and by echoing away low frequency noise with a π pulse in the middle of the Ramsey evolution, the average T 2Echo was measured to be 115-130 µs, again depending on the qubit, with time dependent fluctuations of ≈ 20%. Two qubits on this chip were outliers with T 1 equal to 6 and 31 µs and two qubits were not found in time domain. The second architecture tested with a near final junction recipe was a single microwave photon detector which has four tunable qubits made with asymmetric SQUIDs. For this device, lifetimes are short by design when operated as a detector due to the strong coupling to the environment. Thus this device requires dedicated cooldowns to measure T 1 and permanent destruction as a detector to measure T * 2 by shorting out the input waveguide. 
One chip made with the same process as the chip above underwent destructive testing and had a T 1 , T * 2 , T 2Echo mean values between 20-48 µs, 16-30 µs and 37-58 µs, respectively, at the flux sweet spot. Another photodetector made with wafer 28's exact process which did not undergo destructive testing had T 1 means between 37-48 µs. 
Conclusions
Table 1 Summary of modified process variables and uniformity results for the 11 wafers measured using automatic probing. The aluminum crucible was refilled after wafer 34. 2P, 2t refers to double oxidation pressure and time compared to unspecified cases. Agitation during MMA development is a gentle manual agitation of the wafer in the ultrasonic bath. Junction design size specifies nominal relative junction areas, useful when comparing average I c between wafers. Wafer-to-wafer repeatability can be evaluated by comparing wafers 35/36 and 37/38.

Motivated by the challenging task of maintaining high Josephson junction uniformity when scaling quantum coherent circuit fabrication beyond a few qubits, we undertook a systematic study to identify and rectify sources of I c variation. We have developed a process which has shown a σ I c / I c as low as 3.1% over 49 cm 2 for single junctions. Looking within a chip sized 1 cm 2 window to remove global drift, an average σ I c / I c = 1.8% was measured with some areas <1.0%. To accomplish this, a reliable resist stack was found by changing proximity effect correction parameters and studying different development strategies, of which ultrasonication played a key role in producing high yield structures. The large gradients introduced by non-uniform ashing were mitigated by adding rotations into that process, and the use of an asher with a higher frequency power supply may provide uniformity without requiring substrate rotations. Smaller crystal grains from slower evaporations and dynamic oxidations were then shown to further improve uniformity. Our current uniformity may be improved by minimizing the observed junction area fluctuations, whose origin is not currently understood.
However, since σ I c / I c within chip sized areas is small, detunings between qubits on a single chip can be accurately set and the non-zero global I c drift can be used to target absolute frequencies; a useful capability as tolerances become tighter for quantum processors and microwave photon detectors growing in complexity, size, and qubit number.
