We demonstrate the application of the Google Sycamore superconducting qubit quantum processor to discrete optimization problems with the quantum approximate optimization algorithm (QAOA). Like past QAOA experiments, we study performance for problems defined on the connectivity graph of our hardware; however, we also apply the QAOA to the Sherrington-Kirkpatrick model and 3-regular MaxCut, both high dimensional graph problems requiring significant compilation. Experimental scans of the QAOA energy landscape show good agreement with theory across even the largest instances studied (23 qubits) and we are able to perform variational optimization successfully. For problems defined on the planar graph of our hardware we obtain an approximation ratio that is independent of problem size and observe, for the first time, that performance increases with circuit depth. For problems requiring compilation, performance decreases with problem size but still provides an advantage over random guessing for circuits involving several thousand gates. This behavior highlights the challenge of using near-term quantum computers to optimize problems on graphs differing from hardware connectivity. As these graphs are more representative of real world instances, our results advocate for more emphasis on such problems in the developing tradition of using the QAOA as a holistic benchmark of quantum processors.
I. INTRODUCTION
The Google Sycamore superconducting qubit platform has been used to demonstrate computational capabilities surpassing those of classical supercomputers for certain sampling tasks [1] . However, it remains to be seen whether such processors will be able to achieve a similar computational advantage for problems of practical interest. Along with quantum chemistry [2, 3] , machine learning [4] , and simulation of physical systems [5] , discrete optimization has been widely anticipated as a promising area of application for quantum computers.
Beginning with a focus on quantum annealing [6] and adiabatic quantum computing [7], the possibility of quantum enhanced optimization has driven much interest in quantum technologies over the years. This is because faster optimization could prove transformative for diverse areas such as logistics, finance, machine learning, and more. Such discrete optimization problems can be expressed as the minimization of a quadratic function of binary variables [8, 9]. One can visualize these cost functions as graphs with the binary variables as nodes and (weighted) edges connecting pairs of bits whose (weighted) products sum to the cost function value. For most industrially relevant problems these graphs are non-planar, and many ancillae would be required to embed them in planar or quasi-planar graphs matching the qubit connectivity of most hardware platforms [10]. This limits the applicability of scalable architectures for quantum annealing [11] and corresponds to increased circuit complexity in digital quantum algorithms for optimization.
The quantum approximate optimization algorithm (QAOA) is the most studied gate model approach for optimization using near-term devices [12] . While the prospects for achieving quantum advantage with QAOA remain unclear, QAOA prescribes a simple paradigm for optimization which makes it amenable to both analytical results and implementation on current processors [13] [14] [15] [16] [17] [18] [19] [20] [21] . For these reasons, QAOA has also become popular as a system-level benchmark of quantum hardware. This work builds on prior experimental demonstrations of QAOA on superconducting qubits [22] [23] [24] [25] , ion traps [26] , and photonics systems [27] . We compare results from these past experiments in Appendix B.
We are able to experimentally resolve, for the first time, increased performance with greater QAOA depth and apply QAOA to cost functions on graphs that deviate significantly from our hardware connectivity. Owing to the low error rates of the Sycamore platform, the trade-off between the theoretical increase in quality of solutions with increasing QAOA depth and additional noise is apparent for hardware-native problems. We also apply the algorithm to non-native graph problems with their necessary compilation overhead and study the scaling of solution quality and problem size. Our results reveal that the performance of QAOA is qualitatively different when applied to hardware native graphs versus more complex graphs, highlighting the challenge of scaling QAOA to problems of industrial importance.
For this study, we used a "Sycamore" quantum processor which consists of a two-dimensional array of 54 transmon qubits [1] . Each qubit is tunably coupled to four nearest neighbors in a rectangular lattice. In this case, all device calibration was fully automated and data was collected using a cloud interface to the platform programmed using Cirq [28] . Our experiment was restricted to 23 physical qubits of the larger Sycamore device, arranged in a topology depicted in Figure 1a .
The combinatorial optimization problems we study in this work are defined through a cost function C(z) with a corresponding quantum operator C given by

$$C = \sum_{j<k} w_{jk} Z_j Z_k, \tag{1}$$

where z is a classical bitstring, Z_j denotes the Pauli Z operator on qubit j, and the w_jk are scalar weights with values in {0, ±1}. Because these clauses act on at most two qubits, we are able to associate a graph with a given problem instance: if w_jk ≠ 0, there is an edge between j and k in the graph. We study three families of problem graphs, depicted in Figure 1.

FIG. 1. Three families of optimization problems studied in this work. a. Hardware Grid problems with a graph matching the qubit connectivity of the 23 qubits used in this experiment. b. MaxCut on a random 3-regular graph (shown is an example of the largest MaxCut problems implemented with 22 qubits). c. The fully connected Sherrington-Kirkpatrick (SK) model shown at the largest size (17 qubits) that we optimize experimentally. d. QAOA uses p applications of problem and driver unitaries to approximate solutions to optimization problems. The parameters γ and β are shared among qubits in a layer but different for each of the p layers.
The shallowest depth version of QAOA consists of the application of two unitary operators. At higher depths the same two unitaries are sequentially reapplied but with different parameters. We denote the number of repeated applications of this pair of unitaries as p, giving 2p parameters. The first unitary prescribed by QAOA is

$$U_C(\gamma) = e^{-i\gamma C} = \prod_{j<k} e^{-i\gamma w_{jk} Z_j Z_k}, \tag{2}$$

which depends on the parameter γ and applies a phase to pairs of bits according to the problem-specific cost function. The second operation is the driver unitary

$$U_B(\beta) = e^{-i\beta B} = \prod_{j} e^{-i\beta X_j}, \qquad B = \sum_j X_j, \tag{3}$$

where X_j denotes the Pauli X operator acting on qubit j. This unitary drives transitions between bitstrings within the superposition state. These operators can be implemented by sequentially evolving under each term of the cost function, as suggested by Eq. (2) and Eq. (3). For depth p and n qubits we prepare the state parameterized by γ = (γ_1, ..., γ_p) and β = (β_1, ..., β_p),

$$|\gamma, \beta\rangle = U_B(\beta_p)\, U_C(\gamma_p) \cdots U_B(\beta_1)\, U_C(\gamma_1)\, |+\rangle^{\otimes n}, \tag{4}$$

where |+⟩^⊗n is the symmetric superposition of all 2^n computational basis states. The application of the QAOA circuit to this initial state is depicted in Figure 1d. For a given p, we can find parameters to minimize the expectation value of the cost

$$\langle C \rangle \equiv \langle \gamma, \beta |\, C\, | \gamma, \beta \rangle. \tag{5}$$

For comparison among problem instances, we divide by C_min = min_z C(z), which is negative for all problems we study, so we are in fact maximizing ⟨C⟩/C_min.
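As an illustration of the state preparation and cost expectation described above, the following is a minimal brute-force statevector sketch in plain Python. It assumes the conventions U_C(γ) = exp(−iγC) and U_B(β) = exp(−iβ Σ_j X_j); the function names and the dictionary format for the weights are hypothetical choices made for this example, and the exponential memory cost restricts it to a handful of qubits.

```python
import cmath
import math


def cost_values(n, weights):
    """C(z) = sum_{j<k} w_jk s_j s_k, with s = +1 for bit 0 and -1 for bit 1."""
    def spin(z, j):
        return 1 - 2 * ((z >> j) & 1)
    return [sum(w * spin(z, j) * spin(z, k) for (j, k), w in weights.items())
            for z in range(1 << n)]


def qaoa_expectation(n, weights, gammas, betas):
    """<C> for the depth-p QAOA state; weights maps (j, k) -> w_jk (assumed format)."""
    dim = 1 << n
    cost = cost_values(n, weights)
    # Start in the uniform superposition |+>^n.
    state = [complex(1.0 / math.sqrt(dim))] * dim
    for gamma, beta in zip(gammas, betas):
        # Problem unitary: a phase exp(-i gamma C(z)) on each basis state.
        state = [amp * cmath.exp(-1j * gamma * c) for amp, c in zip(state, cost)]
        # Driver unitary: exp(-i beta X) applied to every qubit in turn.
        cb, sb = math.cos(beta), math.sin(beta)
        for j in range(n):
            bit = 1 << j
            for z in range(dim):
                if (z & bit) == 0:
                    a0, a1 = state[z], state[z | bit]
                    state[z] = cb * a0 - 1j * sb * a1
                    state[z | bit] = cb * a1 - 1j * sb * a0
    return sum(abs(amp) ** 2 * c for amp, c in zip(state, cost))
```

For a single edge with w = 1 and p = 1, these conventions give ⟨C⟩ = sin(4β) sin(2γ), so `qaoa_expectation(2, {(0, 1): 1}, [math.pi / 4], [-math.pi / 8])` reaches C_min = −1 up to rounding.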
II. COMPILATION AND PROBLEM FAMILIES
We approach compilation as two distinct steps: routing and gate synthesis. The need for routing arises when simulating U C for a cost function C defined on a graph that is not a subgraph of our planar hardware connectivity. To simulate such U C we perform layers of swap gates (forming a swap network) which permute qubits such that all edges in the problem graph correspond to an edge in the hardware graph at least once, at which point the corresponding cost function terms can be implemented. An example of such a swap network is depicted in Figure 2a .
The final compilation step, gate synthesis, involves decomposing arbitrary one- and two-qubit interactions into physical gates supported by the device (see, e.g., Figure 2b). The physical gates used in this experiment are arbitrary single-qubit rotations and a two-qubit entangling gate native to the Sycamore hardware which we refer to as the syc gate and define in Figure 2c. Through multiple applications of this gate and single-qubit rotations, we are able to realize arbitrary entangling gates. Compilation details can be found in Appendix A. The average two-qubit gate fidelity on this device was 99.4% as measured by cross entropy benchmarking [1], and the average readout fidelity was 95.9% per qubit. We now discuss compilation for the three families of optimization problems studied in this work.
Hardware Grid Problems. Swap networks are not required when the problem graph matches the connectivity of our hardware; this is the main reason for studying such problems despite results showing that problems on such graphs are efficient to solve on average [9, 29]. We generated random instances of hardware grid problems by sampling w_ij to be ±1 for edges in the device topology (and zero otherwise). Gates are scheduled so that the degree-four interaction graph can be implemented in four rounds of two-qubit gates by cycling through the interactions to the left, right, top and bottom of each interior qubit. Each two-qubit ZZ interaction can be synthesized with two layers of hardware-native syc gates interleaved with γ-dependent single-qubit rotations. In total, each application of the problem unitary is effected with eight layers of syc gates.

Sherrington-Kirkpatrick (SK) Model. A canonical example of a frustrated spin glass is the Sherrington-Kirkpatrick model [30]. It is defined on the complete graph with w_ij randomly chosen to be ±1. For large n, optimal parameters are independent of the instance [21]. The SK model is the most challenging model to implement owing to its fully-connected interaction graph. Optimal routing can be performed using the linear swap networks discussed in Ref. [31] and depicted in Figure 2. This requires n layers of the composite e^{−iγwZZ} · swap interaction, each of which can be synthesized from three syc gates with interleaved γ-dependent single-qubit rotations. Thus, one application of the problem unitary can be effected in 3n layers of syc gates.
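The four-round schedule for Hardware Grid problems can be sketched as an edge partition of a grid graph. The example below is only illustrative: it assumes an idealized rectangular `rows x cols` lattice rather than the actual 23-qubit layout of Figure 1a, and the function names are hypothetical. It checks that each round touches every qubit at most once, and recovers the count of eight syc layers from four rounds with two syc layers per ZZ interaction.

```python
def grid_edge_rounds(rows, cols):
    """Split the edges of a rows x cols grid into four rounds of disjoint edges."""
    rounds = [[], [], [], []]
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                # Horizontal edges, grouped by the parity of the starting column.
                rounds[c % 2].append(((r, c), (r, c + 1)))
            if r + 1 < rows:
                # Vertical edges, grouped by the parity of the starting row.
                rounds[2 + r % 2].append(((r, c), (r + 1, c)))
    return rounds


def is_matching(edges):
    """True if no qubit appears in more than one edge of the round."""
    nodes = [q for edge in edges for q in edge]
    return len(nodes) == len(set(nodes))


SYC_LAYERS_PER_ZZ = 2  # two syc layers per ZZ interaction, as stated in the text

rounds = grid_edge_rounds(4, 6)
syc_layers_per_problem_unitary = len(rounds) * SYC_LAYERS_PER_ZZ
```

By the same counting, the SK model needs 3n syc layers per problem unitary, since each of the n swap-network layers uses three syc gates.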
MaxCut on 3-Regular Graphs. MaxCut is a widely studied problem. A polynomial-time algorithm due to Goemans and Williamson [32] guarantees a certain approximation ratio for all graphs, and it is an open question whether QAOA can efficiently match or beat it [33]. Unlike the previous two problem families, all edge weights are set to 1, and we sample random 3-regular graphs to generate instances. The connectivity of the problem Hamiltonian's graph differs for each instance. While one could use the fully-connected swap network to route these circuits, this is wasteful. Instead, we used an automated routing algorithm called t|ket⟩ to insert swap operations which move logical assignments to be adjacent [34]. These compiled circuits are of roughly equal depth to those from a fully-connected swap network, but the number of two-qubit operations is roughly quadratically reduced.
III. ENERGY LANDSCAPES AND OPTIMIZATION
QAOA is often realized as a variational quantum algorithm where circuit parameters (γ, β) are optimized using a classical optimizer, but function evaluations are executed on a quantum processor [15, 35, 36]. First, one repeatedly constructs the state |γ, β⟩ with fixed parameters and then samples bitstrings to estimate ⟨C⟩. On our superconducting qubit platform we can sample roughly five thousand bitstrings per second. One can then modify the parameters to decrease the observed expectation value, via a classical "outer-loop" optimizer. We note that if problem instances come from a suitable distribution the optimal parameters are nearly instance independent, so once good parameters have been found for one instance they can be used for others [37, 38].
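The sampling step can be illustrated with a short sketch that turns measured bitstrings into an estimate of ⟨C⟩ together with its standard error. The function name and input format (bitstrings as tuples of 0/1, weights as a dictionary keyed by qubit pairs) are assumptions made for this example and are not taken from the experiment's software.

```python
import math


def estimate_cost(bitstrings, weights):
    """Sample mean and standard error of C(z) over measured bitstrings."""
    values = []
    for bits in bitstrings:
        spins = [1 - 2 * b for b in bits]  # bit 0 -> +1, bit 1 -> -1
        values.append(sum(w * spins[j] * spins[k]
                          for (j, k), w in weights.items()))
    mean = sum(values) / len(values)
    # Unbiased sample variance (guarding the single-sample case).
    variance = sum((v - mean) ** 2 for v in values) / max(len(values) - 1, 1)
    return mean, math.sqrt(variance / len(values))
```

The standard error shrinks as one over the square root of the number of repetitions, which is why each landscape point or optimizer evaluation uses tens of thousands of circuit repetitions.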
For p = 1, we can visualize the cost function landscape as a function of the parameters (γ, β) = (γ_1, β_1) in a three-dimensional plot (where we drop the subscripts and label the axes γ and β). The presence of features like hills and valleys in the landscape gives confidence that a classical optimization can be effective. Comparison of simulated and empirical p = 1 landscapes is a common qualitative diagnostic for the performance of experiments [23] [24] [25] [26] [27].

For classical optimization to be successful the quantum computer must provide accurate estimates of ⟨C⟩. Otherwise, noise can overwhelm any signal, making it difficult for a classical optimizer to improve the parameter estimates. Issues such as decoherence, crosstalk, and systematic errors manifest as differences (e.g., damping or warping) from the ideal landscape. Figure 3 contains simulated theoretical and experimental landscapes for selected instances of the three problem families evaluated on a grid of β ∈ [−π/4, π/4] and γ ∈ [0, π/2] parameters with a resolution of 50 points along each linear axis. Each expectation value was estimated using 50,000 circuit repetitions with efficient post-processing to compensate for readout bias (see Appendix C).

The hardware grid problem shows clear features at the maximum size of our study, n = 23. For the other two problems performance degrades with increasing n and so we show data at n = 14 for the 3-regular graph problem and n = 11 for the SK model. We highlight the correspondence between experimental and theoretical landscapes for problems of large size and complexity. Prior experimental demonstrations have presented landscapes for a maximum of n = 20 on a hardware-native interaction graph [26] and a maximum of n = 4 for fully-connected problems like the SK model [24].
In Figure 3 we also overlay a time-trace of a classical optimizer's progress as a red line on the landscape plots. We used a classical optimizer called Model Gradient Descent (MGD) which has been shown numerically to perform well with a small number of function evaluations [39] . It uses a quadratic surrogate model of the objective function to estimate the gradient. Details are given in Appendix D. We initialized the optimization from an intentionally bad parameter setting and observed that MGD was able to enter the vicinity of the optimum in 10 iterations or fewer. Each of these iterations consists of 6 energy evaluations to estimate the gradient, and each energy evaluation used 25,000 circuit repetitions.
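To make the idea of a quadratic surrogate concrete, here is a minimal sketch of gradient estimation for two parameters. It is not the implementation of Ref. [39] or Appendix D: the sampling scheme (uniform points in a square of half-width `radius`), the least-squares fit via normal equations, and all names are assumptions chosen for illustration. One observation that motivates the default of six points is that a quadratic in two variables has six coefficients.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def surrogate_gradient(f, center, radius=0.1, num_points=6, seed=0):
    """Fit a quadratic model to f near `center` and return its gradient there."""
    rng = random.Random(seed)
    rows, values = [], []
    for _ in range(num_points):
        u, v = rng.uniform(-1, 1), rng.uniform(-1, 1)  # scaled offsets
        rows.append([1.0, u, v, u * u, u * v, v * v])
        values.append(f(center[0] + radius * u, center[1] + radius * v))
    # Least squares through the normal equations (A^T A) c = A^T y.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(6)] for i in range(6)]
    aty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(6)]
    coeffs = solve_linear(ata, aty)
    # Linear coefficients are derivatives with respect to the scaled offsets.
    return coeffs[1] / radius, coeffs[2] / radius
```

A descent step would then move the parameters against this gradient, for example `gamma -= rate * g[0]` and `beta -= rate * g[1]` for a hypothetical step size `rate`. With noisy evaluations, using more sample points than coefficients averages out part of the shot noise.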
IV. HARDWARE PERFORMANCE OF QAOA
As the name implies, noisy intermediate-scale quantum (NISQ) processors are noisy devices with high error rates and a variety of error channels. Thus, NISQ circuits are expected to degrade in performance as the number of gates is increased. Here, we study the performance of QAOA as implemented on our quantum processor at different n and p using an application-specific metric: the normalized observed cost function ⟨C⟩/C_min. A value of 1 is perfect and 0 corresponds to the performance we would expect from random guessing. In order to distinguish the effects of noise from the robustness provided by using a classical outer-loop optimizer, here we report results obtained from running circuits at the theoretically optimal (β, γ) values.

FIG. 4 (caption fragment). Each point represents a distinct instance and some have been perturbed along the x-axis to avoid overlap. While Hardware Grid problems show n-independent noise, we observe that experimental SK model solutions approach those found by random guessing as n is increased.
In Figure 4, we observe that the ⟨C⟩/C_min achieved for the hardware graph seems to saturate to a value that is independent of n. This occurs despite the fact that circuit fidelity is decreasing with increasing n. In fact, this is theoretically anticipated behavior that can be understood by moving to the Heisenberg operator formalism and considering an observable Z_i Z_j. This operator is conjugated by the circuit unitary involving p applications of the instance graph. This gives an expression for the expectation value of Z_i Z_j which only involves qubits that are at most p edges away from i and j. Thus for fixed p, the error for a given term is asymptotically unaffected as we grow n. Recall that C is a sum of these terms, so the total error scales linearly with n; but C_min ∝ n, so ⟨C⟩/C_min is constant with respect to n. Note that non-local error channels or crosstalk could potentially remove this property. For non-local graphs such as the SK model, errors quickly propagate to the entire cost function graph, destroying the property of n-independent noise. Furthermore, compilation of these problems to the hardware topology yields circuit depth scaling with n, causing performance to deteriorate. We see experimental evidence of this in the scaling for the SK model approximation ratios in Figure 4. However, we still observe performance exceeding that of random guessing for problem sizes up to 17 bits, even with relatively deep (p = 3) circuits.
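The locality argument can be checked with a small graph computation: count how many qubits lie within p edges of either endpoint of a given edge. The sketch below assumes an idealized square grid as a stand-in for the hardware graph and a complete graph for the SK model; both helper constructors and all names are hypothetical.

```python
from collections import deque


def light_cone_size(adjacency, edge, p):
    """Number of nodes within p edges of either endpoint of `edge`."""
    dist = {edge[0]: 0, edge[1]: 0}
    queue = deque(edge)
    while queue:
        node = queue.popleft()
        if dist[node] == p:
            continue  # do not expand beyond distance p
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return len(dist)


def grid_graph(side):
    """Adjacency lists of a side x side square grid (idealized hardware graph)."""
    adj = {(r, c): [] for r in range(side) for c in range(side)}
    for (r, c) in adj:
        for nb in ((r + 1, c), (r, c + 1)):
            if nb in adj:
                adj[(r, c)].append(nb)
                adj[nb].append((r, c))
    return adj


def complete_graph(n):
    """Adjacency lists of the fully connected graph on n nodes (SK model)."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}
```

For an interior grid edge the count stays fixed as the grid grows, whereas on the complete graph it already equals n at p = 1, which mirrors the different noise scaling of the two problem families.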
In a noiseless case, the quality of a QAOA solution can be improved by increasing the depth parameter p. However, the additional depth also increases the probability of error. We study this interplay between noise and algorithmic power in Figure 5 . Previously, improved performance with p > 1 had only been experimentally demonstrated for an n = 2 problem [25] . For larger problems (n = 20), performance for p = 2 was shown to be within error bars of the p = 1 performance [26] . Figure 5 shows the p-dependence averaged across all 130 instances where n > 10. The mean finds its maximum at p = 3, although there are variations among the instances comparable in scale to the experimental p-dependence. The relatively flat dependence of performance on depth suggests that the experimental noise seems to nearly balance the increase in theoretical performance for this problem family. For a more meaningful aggregation of the many random instances across problem sizes, we consider each instance individually and identify which value of the hyperparameter p maximizes performance for that particular instance. A histogram of these per-instance maximal values is inset in Figure 5 , showing that performance is maximized at p = 3 for over half of instances larger than ten qubits. Note finally that our full dataset (see Appendix E) includes experiments that are not plotted here, including results for 3-regular MaxCut.
V. CONCLUSION
Discrete optimization is an enticing application for NISQ devices owing to both the potential value of solutions as well as the viability of heuristic low-depth algorithms such as QAOA. While no existing quantum processors can outperform classical optimization heuristics, the application of popular methods such as QAOA to prototypical problem instances can be used to benchmark and compare hardware platforms.
Previous demonstrations of QAOA have primarily optimized problems tailored to the hardware architecture at minimal depth (p = 1). Using the Google "Sycamore" platform, we explored these types of problems, which we termed Hardware Grid problems, and demonstrated robust performance at large numbers of qubits. We showed that the locations of maxima and minima in the p = 1 diagnostic landscape match those from the theoretically computed surface, and that variational optimization can still find the optimum with noisy quantum objective function evaluation. We also applied the QAOA to various problem sizes using pre-computed parameters from noiseless simulation, and observe an n-independent noise effect on the approximation ratios for Hardware Grid problems with sizes from n = 10 to 23 bits. This is consistent with our theoretical prediction that the noise-induced degradation of every term in the objective function remains constant in the shallow-depth regime where correlations remain local. Furthermore, we report the first clear cases of performance maximization at p = 2 and p = 3 for QAOA, owing to the low error rate of our hardware.
Most real world instances of combinatorial optimization problems cannot be mapped to hardware-native topologies without significant overheads. Instead, problems must be compiled by routing qubits with swap networks. This additional overhead can have a significant impact on the algorithm's performance. We studied random instances of the fully-connected SK model. We again show a diagnostic p = 1 landscape for the SK model with good agreement to the theoretical values. Although we report non-negligible performance for large (n = 17), deep (p = 3), and complex (fully-connected) problems, we see that degradation in performance increases with problem size for such instances.
The promise of quantum enhanced optimization will continue to motivate the development of new quantum technology and algorithms. Nevertheless, for quantum optimization to compete with classical methods for real-world problems, it is necessary to push beyond contrived problems at low circuit depth. Our work demonstrates important progress in the implementation and performance of quantum optimization algorithms on a real device, and underscores the various challenges in applications to problems beyond those that are natively realized by hardware interaction graphs.
Appendix A: Hardware and Compilation Details
In this section, we discuss detailed compilation of the desired unitaries into the hardware native gateset, particularly the syc gate defined in Figure 2c. The syc gate is similar to the gate used in Arute et al. [1] but with the conditional phase tuned to be precisely π/6. A √iswap gate is simultaneously calibrated and available but has a longer gate duration and requires additional (physical) Z rotations to match phases. The required interactions for this study are compiled to an equivalent number of syc and √iswap, so syc was used in all circuits. Single-qubit microwave pulses enact "Phased X" gates PhX(θ, φ) (alternatively called XY rotations or the W gate) with φ = 0 corresponding to R_X(θ) and φ = π/2 corresponding to R_Y(θ) (up to global phase). Intermediate values of φ control the axis of rotation in the X-Y plane of the Bloch sphere.

Arbitrary single-qubit rotations can be applied by a PhX(θ, φ) gate followed by an R_Z(ϑ) gate. As a compilation step, we merge adjacent single-qubit operations to be of this form. Therefore, our circuit is structured as a repeating sequence of: a layer of PhX gates; a layer of Z gates; and a layer of syc gates. All Z rotations of the form exp[−iθZ] can be efficiently commuted through syc and PhX to the end of the circuit and discarded. This leaves alternating layers of PhX and syc gates. The overheads of compilation are summarized in Table S1.
Problem             Routing         Interaction         Synthesis
Hardware Grid       WESN            ZZ(γ)               2
3-regular MaxCut    t|ket⟩          ZZ(γ), swap         2, 3
SK Model            Swap Network    e^{−iγZZ} · swap    3

TABLE S1. Compilation details for the problems studied. "Routing" gives the strategy used for routing, "Interaction" gives the type of two-qubit gates which need to be compiled, and "Synthesis" gives the number of hardware native 2-qubit syc gates required to realize the target interaction. "WESN" routing refers to planar activation of West, East, etc. links.
Compilation of ZZ(γ). These interactions (used for Hardware Grid and MaxCut problems) can be compiled with 2 layers of syc gates and 2+1 associated layers of single qubit PhX gates. We report the required number of single-qubit layers as 2+1 because the initial (or final) layer from one set of interactions can be merged into the final (initial) single qubit gate layer of the preceding (following) set of interactions. In general, the number of single qubit layers will be equivalent to the number of two-qubit gate layers with one additional single-qubit layer at the beginning of the circuit and one additional single-qubit layer at the end of the circuit. The explicit compilation of ZZ to syc is available in Cirq and a proof can be found in the supplemental material of Ref. [1] . Here we reproduce the derivation in slightly different notation but following a similar motivation.
The syc gate is an fSim(π/2, π/6), which can be broken down into a cphase(π/6), cz, swap, and two S gates according to Figure S1. We analyze the KAK coefficients for a composite gate of two syc gates sandwiching arbitrary single-qubit rotations, depicted in Figure S2, to determine the space of gates accessible with two syc gates. Any two-qubit gate is locally equivalent to the standard KAK form [40]. The coefficients in the KAK form are equivalent to the operator Schmidt coefficients of the two-qubit unitary. To find the Schmidt coefficients, we introduce the matrix representation of two-qubit gates in terms of Pauli operators, i.e., the jk-th matrix element equals the corresponding coefficient of the Pauli operator P_j ⊗ P_k, where P_{0,1,2,3} = I, X, Y, Z:

$$O_M = \sum_{j,k=0}^{3} M_{jk}\, P_j \otimes P_k.$$

The Schmidt coefficients of O_M equal the singular values of M. Any single-qubit gate G_{1,2} can be decomposed into Z-X-Z rotations; the Z rotations commute with the cz and the cphase, and they do not affect the Schmidt coefficients of the two-qubit operation defined in Figure S2. We neglect the Z rotations and simplify G_{1,2} to single-qubit X rotations
The Pauli matrix representation of
where c_{1,2} = cos θ_{1,2} and s_{1,2} = sin θ_{1,2}. The rank of the matrix A is one, representing a product unitary. After being conjugated by the cz gates, i.e., O → cz O cz, the matrix A becomes
where we use the relations for O → cz O cz,
The cphase part in the syc gate is
where φ = −π/24. An arbitrary operator O left and right multiplied by cphase part is expressed as
Applying the operation O → 1 2 (Z ⊗2 O + OZ ⊗2 ) to the operator B, we have
Applying the operation O → Z ⊗2 OZ ⊗2 to the operator B, we have
The resulting two-qubit gate at the output of the circuit in Figure S2 takes the form
Two singular values of M are cos(2φ) c_1 c_2 and cos(2φ) s_1 s_2, corresponding to the diagonal matrix elements M_{0,0} and M_{2,2}, and the magnitudes of these two singular values are bounded by the angle φ. Consider the two-dimensional subspace of the matrices B, C, and D with the two known singular values removed
The Pauli representation matrix in the reduced space is
To calculate the singular values of a 2 × 2 matrix
we used the formula
where 
We have solved all the four singular values of the 2-qubit unitary at the output of Figure S1 ,
For the case s_1 = 0 and c_1 = 1, we have λ_1 = λ_3 = 0 and the other two singular values

Since cos(2φ) ≈ 0.966 > 1/√2, we can implement any cphase gate using only two syc gates. This is achieved by matching the Schmidt coefficients of e^{−iθZZ/2} to λ_0 and λ_2. If |cos(θ)| > cos(2φ) then we can reset c_{1,2} and s_{1,2} appropriately to select out the other pair of singular values.
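The Pauli-coefficient matrix M used in this derivation can be computed numerically for any 4 × 4 unitary. The sketch below, in plain Python with hypothetical helper names, builds M from M_jk = Tr[(P_j ⊗ P_k) U]/4 and applies it to the target gate e^{−iθZZ/2}. For that gate M is diagonal, so its singular values can be read off directly as |cos(θ/2)| and |sin(θ/2)|; a general M would need a numerical singular value decomposition, which is not included here. The last line checks the numerical bound quoted above for φ = −π/24.

```python
import cmath
import math

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I2, X, Y, Z]


def kron(a, b):
    """Kronecker product of two 2 x 2 matrices as a 4 x 4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def pauli_coefficients(u):
    """M[j][k] = Tr[(P_j x P_k) U] / 4, so that U = sum_jk M[j][k] P_j x P_k."""
    m = []
    for pj in PAULIS:
        row = []
        for pk in PAULIS:
            p = kron(pj, pk)
            trace = sum(p[r][c] * u[c][r] for r in range(4) for c in range(4))
            row.append(trace / 4)
        m.append(row)
    return m


def zz_rotation(theta):
    """Matrix of exp(-i theta ZZ / 2), diagonal in the computational basis."""
    zz = [1, -1, -1, 1]
    return [[cmath.exp(-1j * theta * zz[r] / 2) if r == c else 0
             for c in range(4)] for r in range(4)]


m = pauli_coefficients(zz_rotation(math.pi / 3))
bound_ok = math.cos(2 * (-math.pi / 24)) > 1 / math.sqrt(2)
```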
Compilation of swap. A swap gate requires three applications of syc and is used for the 3-regular MaxCut problem circuits. The swap gate was numerically compiled by optimizing the angles of the circuit in Figure S3 to match the KAK interaction coefficients for the swap gate.

FIG. S3. Circuit used to match the KAK coefficients of the swap gate. The R_XY(φ)(θ) gate is a rotation of θ around an axis in the XY-plane defined by φ. This is implemented in Cirq as a PhasedXPowGate.

After the angles in the circuit depicted in Figure S3 are determined to match the KAK coefficients of the swap gate, we add single-qubit rotations to make the circuit fully equivalent to swap.
Compilation of e −iγwZZ · swap. This composite interaction can be effected with three applications of syc and is used for SK-model circuits. The syc gate KAK coefficients are (π/4, π/4, π/24) which is locally equivalent to a cphase(π/4 − π/24) followed by a swap. Therefore, to implement a ZZ(γ) followed by a swap we need to apply a single syc gate followed by the composite cphase(γ − π/24 + π/4). The total composite gate now involves 3 syc gates, a single Rx gate and two Rz gates.
Scheduling of Hardware Grid gates. An efficient planar graph edge-coloring can be used to schedule as many simultaneous ZZ interactions as possible. We activate links on the graph in the following order: 1) horizontal edges starting from even nodes; 2) horizontal edges starting from odd nodes; 3) vertical edges starting from even nodes; 4) vertical edges starting from odd nodes. Viewed as cardinal directions and choosing an even node as the central point, this corresponds to a west, east, south, north (W, E, S, N) activation sequence.

Fully Connected Swap Network. All-to-all interactions can be implemented optimally with a swap network in which pairs of linear-nearest-neighbor qubits are repeatedly interacted and swapped. Crucially, the required swap and e^{−iγZZ} interactions between all pairs mutually commute, so we are free to re-order all two-qubit interactions to minimize compiled circuit depth. After n layers of e^{−iγwZZ} · swap interactions (alternating between even and odd qubits), every qubit has been involved in a ZZ interaction with every other qubit and the logical qubit indices have been reversed. This can be viewed as a (parallel) bubble sort algorithm initialized with a reverse-sorted list of logical qubit indices. An example at n = 5 is shown in Figure S4. If p is even, two applications of the swap network return qubit indices to their original mapping. Otherwise, post-processing can reverse the measured bitstrings.
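The bookkeeping of the linear swap network can be sketched in a few lines. The function below (a hypothetical helper that tracks logical indices only, with no gates or angles) alternates between even and odd neighbouring positions, records which logical pairs interact in each layer, and swaps them.

```python
from itertools import combinations


def linear_swap_network(n):
    """Return (layers, final_order) for n layers of interact-and-swap on a line."""
    order = list(range(n))  # order[position] = logical qubit index
    layers = []
    for layer in range(n):
        pairs = []
        # Even layers act on positions (0,1), (2,3), ...; odd layers on (1,2), (3,4), ...
        for pos in range(layer % 2, n - 1, 2):
            pairs.append((order[pos], order[pos + 1]))
            order[pos], order[pos + 1] = order[pos + 1], order[pos]
        layers.append(pairs)
    return layers, order


layers, final_order = linear_swap_network(5)
interacted = {frozenset(pair) for layer in layers for pair in layer}
all_pairs = {frozenset(pair) for pair in combinations(range(5), 2)}
```

For n = 5 the five layers cover all ten logical pairs exactly once and leave the logical indices in reversed order, so measured bitstrings must be reversed (or the network applied a second time) to restore the original mapping.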
This expression tells us how to adjust the measured observable to compensate for the readout error. In the above analysis, we can replace p 0 and p 1 by their average (p 0 + p 1 )/2 if we perform measurements in the following way: for half of the measurements, apply a layer of X gates immediately before measuring, and then flip the measurement results. In this case, the corrected observable is
We estimated the value of p_{0,i} on the device by preparing and measuring the qubit in the |0⟩ state 1,000,000 times and counting how often a 1 was measured; p_{1,i} was estimated in the same way but by preparing the |1⟩ state instead of the |0⟩ state. This estimation was performed periodically during the data collection for Figure 3 to account for drift following automated calibration. We measure each qubit via the state-dependent dispersive shift it induces on its corresponding harmonic readout resonator, as described in Arute et al. [1], supplementary information section III. We interrogate the readout resonator frequency with an appropriately calibrated microwave pulse (e.g., in frequency, power, and duration). When demodulated, the readout signal produces a 'cloud' of in-phase and quadrature (IQ) voltage points which are used to train a state discriminator. Often, we find that optimal single-qubit calibrations extend to the case of simultaneous readout, but this is not always the case. For example, due to the Stark shift induced by photons in readout resonators, new frequency collisions may be introduced that are not present in the isolated readout case. Similarly, the combined power of multiplexed readout pulses may exceed the saturation power of our parametric amplifier.
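As an illustration of how the estimated error rates enter a correction of ZZ observables, the sketch below works under a simplifying assumption that is stated here rather than taken from the text: after symmetrization, each qubit is treated as suffering an independent, state-independent bit flip with probability (p_0 + p_1)/2. Under that assumption a measured ⟨Z_i Z_j⟩ is damped by (1 − p_{0,i} − p_{1,i})(1 − p_{0,j} − p_{1,j}), and dividing by this factor undoes the damping. Function names and the input format are hypothetical.

```python
def measured_zz(joint, flip_i, flip_j):
    """Exact <Z_i Z_j> after independent bit flips.

    joint maps true bit pairs (b_i, b_j) to probabilities (assumed format).
    """
    total = 0.0
    for (bi, bj), prob in joint.items():
        for ei, wi in ((0, 1 - flip_i), (1, flip_i)):
            for ej, wj in ((0, 1 - flip_j), (1, flip_j)):
                zi = 1 - 2 * (bi ^ ei)  # recorded Z value of qubit i
                zj = 1 - 2 * (bj ^ ej)  # recorded Z value of qubit j
                total += prob * wi * wj * zi * zj
    return total


def corrected_zz(raw_zz, p0_i, p1_i, p0_j, p1_j):
    """Undo the damping from symmetrized, independent readout errors."""
    return raw_zz / ((1 - p0_i - p1_i) * (1 - p0_j - p1_j))


# Example with made-up error rates: the true <Z_i Z_j> is 0.5 + 0.3 - 0.2 = 0.6.
joint = {(0, 0): 0.5, (1, 1): 0.3, (0, 1): 0.2}
raw = measured_zz(joint, (0.02 + 0.06) / 2, (0.01 + 0.05) / 2)
recovered = corrected_zz(raw, 0.02, 0.06, 0.01, 0.05)
```

Correlated readout errors of the kind discussed in this appendix fall outside this independent-flip model, so the correction should be read as a first-order sketch.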
At the time of the primary data collection for this experiment, all automated calibration routines were performed with each qubit in isolation. Subsequently, a calibration which optimizes qubit detunings during readout was implemented to mitigate these correlated readout errors caused by frequency collisions. Figure S5 shows |0 and |1 state errors for simultaneous readout of all 23 qubits (which are used to correct ZZ observables) both as they were during primary data taking for Figure 3 (top) and after implementing the improved readout detuning calibration (bottom). During primary data collection, the median isolated readout error was 4.4% as measured during the previous automated calibration. The discrepancy between these figures and the calibration values shown in Figure S5 , top can be attributed to drift since the automated system calibration in addition to the simultaneity effects described above.
Data presented in Figure 4 and Figure 5 were taken on a different date, with a median isolated readout error of 4.1% as reported in the main text. Readout correction was not used for these two figures. While automated calibrations will continue to improve, drift will likely remain an inevitability when controlling qubits with analog signals. As such, we expect the readout corrections employed here will continue to provide utility.

FIG. (caption fragment; opening not recoverable). ... 4, 22] over random 3-regular MaxCut problems as described in the main text. Points have been perturbed along the x-axis to avoid overlap. k-regular graphs must satisfy n ≥ k + 1 and nk must be even, hence only even n are considered here. Green: noiseless. Blue: experimental.
