Solving SAT and MaxSAT with a Quantum Annealer: Foundations, Encodings,
  and Preliminary Results by Bian, Zhengbing et al.
Solving SAT and MaxSAT with a Quantum Annealer:
Foundations, Encodings, and Preliminary Results
Zhengbing Biana, Fabian Chudaka, William Macreadya, Aidan Roya,
Roberto Sebastianib, Stefano Varottib
aD-Wave Systems Inc., Burnaby, Canada
bDISI, University of Trento, Italy
Abstract
Quantum annealers (QAs) are specialized quantum computers that minimize objective
functions over discrete variables by physically exploiting quantum effects. Current
QA platforms allow for the optimization of quadratic objectives defined over binary
variables (qubits), also known as Ising problems. In the last decade, QA systems as
implemented by D-Wave have scaled with Moore-like growth. Current architectures
provide 2048 sparsely-connected qubits, and continued exponential growth is antici-
pated, together with increased connectivity.
We explore the feasibility of such architectures for solving SAT and MaxSAT prob-
lems as QA systems scale. We develop techniques for effectively encoding SAT –and,
with some limitations, MaxSAT– into Ising problems compatible with sparse QA archi-
tectures. We provide the theoretical foundations for this mapping, and present encoding
techniques that combine offline Satisfiability and Optimization Modulo Theories with
on-the-fly placement and routing. Preliminary empirical tests on a current generation
2048-qubit D-Wave system support the feasibility of the approach for certain SAT and
MaxSAT problems.
1. Motivations and Goals
Quantum Computing (QC) promises significant computational speedups by ex-
ploiting the quantum-mechanical phenomena of superposition, entanglement and tun-
neling. QC relies on quantum bits (qubits). As opposed to bits, qubits can be in a su-
perposition state of 0 and 1.1 Theoretically, quantum algorithms can outperform their
classical counterparts. Examples of this are Shor’s algorithm [1] for prime-number fac-
toring and Grover’s algorithm [2] for unstructured search. Once the technology is fully
developed, it is expected that quantum computing will replace classical computing for
some complex computational tasks.
However, despite large investment, the development of practical gate-model quan-
tum computers is still in its infancy and current prototypes are limited to less than
1Superposition is perhaps the best-known and most surprising aspect of quantum physics (e.g. the famous
Scro¨dinger’s cat which is both dead and alive prior to observation).
Preprint submitted to Information and Computation November 7, 2018
ar
X
iv
:1
81
1.
02
52
4v
1 
 [c
s.E
T]
  6
 N
ov
 20
18
20 qubits. An alternative approach to standard gate-model QC is Quantum Anneal-
ing, a form of computation that efficiently samples the low-energy configurations of a
quantum system[3, 4, 5]. In particular, D-Wave Systems Inc.2 has developed special-
purpose Quantum Annealers (QAs) which draw optima or near-optima from certain
quadratic cost functions on binary variables. Since 2007, this approach has allowed
D-Wave to improve QAs at a Moore-like rate, doubling the number of qubits roughly
every 1.2 years, and reaching 2048 qubits in the state-of-the-art D-Wave 2000Q an-
nealer in January 2017 (Figure 1). These sophisticated devices are nearly-completely
shielded from magnetic fields (≤ 10−9 Tesla) and are cooled to cryogenic temperatures
(≤ 20 mK).
D-Wave’s QAs can be used as specialized hardware for solving the Ising problem:
argmin
z∈{−1,1}|V |
H(z), (1)
H(z)
def
=
∑
i∈V
hizi +
∑
(i,j)∈E
Jijzizj , (2)
where each variable zi ∈ {−1, 1} is associated with a qubit; G = (V,E) is an
undirected graph, the hardware graph, whose edges correspond to the physically al-
lowed qubit interactions; and hi, Jij are programmable real-valued parameters. H(z)
is known as the Ising Hamiltonian or Ising model. Ising problems are equivalent to
Quadratic Unconstrained Binary Optimization (QUBO) problems, which use {0, 1}-
valued variables rather than {−1, 1}-valued ones.3 In current 2000Q systems, hi and
Jij must be within the ranges [−2, 2] and [−1, 1] respectively, and G is a lattice of
16 × 16 8-qubit bipartite modules (tiles) known as the Chimera topology, shown in
Figures 2 and 3. The quadratic term in (2) is restricted to the edges of G, which is
very sparse (vertices have degree at most 6). Despite this restriction, the Chimera Ising
problem (1) is NP-hard [6].
Theory suggests that quantum annealing may solve certain optimization problems
faster than state-of-the-art algorithms on classical computers [5]. Quantum effects such
as tunneling and superposition provide QAs with novel mechanisms for escaping local
minima, thereby potentially avoiding sub-optimal solutions commonly found by clas-
sical algorithms based on bit-flip operations (including WalkSAT, simulated anneal-
ing and others [7, 8, 9]). Although practical QA systems do not return optimal solu-
tions with probability 1, the D-Wave processor has been shown to outperform a range
of classical algorithms on certain problems designed to match its hardware structure
[10, 11]. This suggests the possible use of QAs to address hard combinatorial de-
cision/optimization problems, in particular NP-hard problems like SAT and MaxSAT
[12, 13], by encoding them into the Ising problem (1).
Our goal is to exploit quantum annealing as an engine for solving SAT, MaxSAT,
and other NP-hard problems. Since current QAs have a limited number of qubits and
connections, we target problem instances which are relatively small but computation-
ally hard enough to be out of the reach of state-of-the-art classical solvers. Since QAs
2http://www.dwavesys.com
3Ising variables zi are related to QUBO variables xi through zi = 2xi − 1.
2
Figure 1: Top: Moore-like progress diagram of the development of D-Wave’s Quantum annealers.
X axis: year of release. Y axis: # of qubits. Notice the logarithmic scale of the Y axis.
Bottom: The state-of-the-art D-Wave 2000Q quantum annealer. (Courtesy D-Wave Systems Inc.)
are not guaranteed to find an optimum and hence cannot certify the unsatisfiability of an
encoded formula (§2.1), we target SAT problems such as cryptanalysis [14, 15, 16] or
radio bandwidth repacking [17] which are surely or most-likely satisfiable, but whose
solution is hard to find.
In this paper, we investigate the problem of encoding the satisfiability of an input
Boolean formula F (x) into an Ising problem (1) from both theoretical and practical
perspectives. In principle, converting SAT to Ising with an unbounded number of
fully-connected qubits is straightforward. In practice, these encodings must be done
both effectively (i.e., in a way that uses only the limited number of qubits and connec-
tions available within the QA architecture, while optimizing performance of the QA
algorithm), and efficiently (i.e., using a limited computational budget for computing
3
Figure 2: The 2048-qubit connection graph of the D-Wave 2000Q quantum annealer architecture.
the encoding). We provide the necessary theoretical foundations, in which we analyze
and formalize the problem and its properties. Based on this analysis, we then provide
and implement practical encoding procedures. Finally, we empirically evaluate the
effectiveness of these encodings on a D-Wave 2000Q quantum annealer.
We start from the observation that SATtoIsing can be formulated as a problem in
Satisfiability or Optimization Modulo Theories (SMT/OMT) [18, 19] on the theory of
linear rational arithmetic, possibly enriched with uninterpreted function symbols. SAT-
toIsing is an intrinsically over-constrained problem, so a direct “monolithic” solution,
encoding the whole input Boolean formula F (x) in one step, would typically require
the introduction of many additional ancillary Boolean variables. These extra variables,
in addition to wasting many qubits, would result in very large SMT/OMT formulas:
solving the SATtoIsing via SMT would become computationally very hard, possibly
4
Figure 3: Example of the Chimera topology: the hardware graph for system of 72 qubits in a 3-by-3 grid of
8-qubit tiles. (D-Wave 2000Q systems have 2048 qubits in a 16-by-16 grid.)
even harder than the original SAT problem.
To cope with these issues, we adopt a scalable “divide-and-conquer” approach to
SATtoIsing. First, we decompose the input Boolean formula into a conjunction of
smaller subformulas. Then, we encode each subformula into an Ising model and place
each subformula model into a disjoint subgraph of the hardware graph. Finally, we
connect the qubits representing common variables from different subformulas using
chains of qubits that are constrained to be logically identical.
To exploit the intrinsic modularity of the architecture graph (Figures 2, and 3),
we partition the input formula F (x) into subformulas which can be naturally encoded
and placed into one or two adjacent 8-qubit tiles of the architecture, so that the en-
coding of each subformula is small enough to be handled efficiently by an SMT/OMT
solver, and the encoded (sub)problems can be placed and interconnected within the
modular structure of the graph. More concretely, we generate a library of encodings of
commonly-used and relatively-small Boolean subfunctions. This library is only built
once and consequently can use a large amount of computational resources. When pre-
sented with a SAT formula F (x), we decompose it, use the library to obtain encoded
(sub)functions and use place-and-routing techniques to place and connect the encoded
(sub)functions within the QA hardware graph.
We have implemented and made publicly available prototype encoders built on top
of the SMT/OMT tool OPTIMATHSAT [20]. We present an empirical evaluation, in
which we have run SATtoIsing-encoded problems and MaxSATtoIsing-encoded prob-
5
lems on a D-Wave 2000Q system. We have chosen input problems that are small
enough to fit into the current architecture but are very hard with respect to their limited
size, requiring some computational effort using a state-of-the-art solver.
We stress the fact that this evaluation is not meant to present a comparison with
state-of-the-art of classic computing; rather, it is intended as a preliminary assess-
ment of the challenges and potential of QAs to impact SAT and MaxSAT solving.
This assessment is “preliminary” due to the limitations in number of qubits and qubit-
connections of current QAs; however novel QAs currently under development at D-
Wave have a more interconnected tile structure and higher per-qubit connectivity (de-
gree 15 instead of 6, see also §8).4
Empirical evaluation shows that most encoded SAT problems are solved by the
quantum annealer within negligible annealing time (≈ 10µs). Although preliminary,
the results confirm the feasibility of the approach. They also suggest that quantum an-
nealers run on SATtoIsing-encoded problems (and to a lower extent, MaxSATtoIsing-
encoded ones) might outperform standard algorithms on classical computers for certain
difficult classes of relevant problems as soon as QA systems contain enough qubits and
connections.
Content of the paper. The rest of the paper is organized as follows: §2 presents nec-
essary background on quantum annealing, SAT, MaxSAT, SMT and OMT; §3 presents
the theoretical foundations of this work; §4 describes SMT/OMT-based encoding en-
coding techniques for small Boolean formulas; §5 describes the process of encoding
larger Boolean formulas by formula decomposition, encoding, placement and routing;
§6 summarizes the related work; §7 presents preliminary empirical evaluation; §8 sug-
gests future developments.
Disclaimer. A preliminary and much shorter version of this paper was presented at the
11th International Symposium on Frontiers of Combining Systems, FroCoS’17 [21].
2. Background
We provide the necessary background on quantum annealing (§2.1) SAT, MaxSAT,
SMT and OMT (§2.2).
2.1. Quantum Annealing
As mentioned in §1, quantum annealers as currently implemented by D-Wave Sys-
tems are specialized chips that use quantum effects to sample or minimize energy con-
figurations over binary variables (qubits) in the form of an Ising model (1) [6, 22, 23].
The qubits are interconnected in a grid of tightly connected groups of 8 qubits, called
tiles, as displayed in Figures 2 and 3. Each tile consists of a complete bipartite graph
between two sets of four qubits: the “vertical” set, which is connected to the tiles above
and below, and the “horizontal” set, which is connected to the tiles to the left and to
4See https://www.dwavesys.com/sites/default/files/mwj_dwave_qubits2018.
pdf.
6
Figure 4: Top: implementation of two coupled qubits. (Courtesy of D-Wave Systems Inc.)
Bottom: graphical representation of the tunneling effect within an energy landscape.
the right. Each qubit is connected to at most six other qubits, so that each variable zi
occurs in (2) in at most 6 non-zero quadratic terms Jijzizj (or Jjizjzi). The graphs in
Figures 2 and 3 are known as Chimera graphs.
Single qubits zi are implemented as inter-connected superconducting rings (Fig-
ure 4, top), and a qubit’s ±1-value represents the direction of current in its ring. The
user-programmable values hi ∈ [−2, 2] (biases) and Jij ∈ [−1, 1] (couplings) in (1)
are real values within the specified interval, and are set by applying magnetic flux to
the rings.5 Overall, H(z) in (2) defines the energy landscape for a system of qubits
whose global minimum correspond to the solutions of problem (1).
During quantum annealing, the state of a qubit will be in a superposition of +1 and
−1 simultaneously. The system of |V | qubits is evolved from an initial Hamiltonian,
whose lowest energy state is an equal superposition of all 2|V | classical states, to a
final, user-defined Hamiltonian as in (2). At the end of the annealing, the system is
measured, and a single, classical state z ∈ {−1, 1}|V | is observed. In theory, if the
5We consider normalized bounds without units of measure and scale because the only relevant informa-
tion for us is that both ranges are symmetric wrt. zero and that the bounds for the his are twice as big as
these for the Jijs in (2).
7
evolution is sufficiently slow,6 then the lowest energy state (the ground state) is main-
tained throughout. As a result, the final state z is a solution to the Ising problem (1)
(with some probability, see below). Unlike classical minimization techniques such as
simulated annealing [8], the QA energy-minimization process can use quantum tunnel-
ing [24] to pass through tall, thin energy barriers, thereby avoiding trapping in certain
classical local minima (Figure 4, bottom).
QA theory shows that in the limit of arbitrarily low temperature, arbitrarily small
noise, and arbitrarily slow annealing, the probability of obtaining a minimum energy
solution converges to 1. In practice, these conditions cannot be achieved, and minimum
energy solutions are not guaranteed. Indeed, practical QA systems are physical, ana-
log devices, subject to engineering limitations, and the optimal annealing rate is often
determined empirically. Moreover, hardware performance is dramatically affected by
the choice of Ising model. Among the most relevant factors are:
Thermal and electromagnetic noise. Despite cooling and shielding, thermal and elec-
tromagnetic noise still have noticeable effects. One (approximate) model of these
effects is based on Boltzmann sampling, in which the probability of seeing a state
z with energy H(z) in (2) is proportional to e−βH(z), with β ∈ [3, 5] being ob-
served for certain problem classes [25, 26, 27].
Intrinsic parameter errors. Due to engineering limitations and sources of environmen-
tal noise, the Ising model realized in QA hardware is not exactly the one pro-
grammed by the user. A simplified model of error is that each specified hi ∈
[−2, 2] and Jij ∈ [−1, 1] value is subject to additive Gaussian noise with stan-
dard deviation 0.03 and 0.02 respectively.
Freeze-out. Because of the limited connectivity, we often use chains of several inter-
connected qubits to represent a single Boolean variable (§3.4). However, the
quantum tunneling effect on which quantum annealing is based is diminished
for chains [24], thereby reducing the hardware’s ability to find global minima.
This effect can be mitigated by constructing Ising models with chains that are as
small as possible.
Energy gaps. From the Boltzmann model, we see that a larger energy gap gmin be-
tween ground and excited states leads to a higher probability of an optimal so-
lution, as a ground state is eβgmin times more likely than a first excited state.
This suggests producing Ising models with large gmin in order to maximize the
probability of obtaining an optimal solution.
The fact that QAs are not guaranteed to return a minimum-energy solution is par-
tially addressed by taking a sequence of N samples from the same Ising model and
6Notice that here and elsewhere “slow” is intended in a quantum-physics sense, which is definitely not
“slow” from a computer-science perspective: e.g., a complete annealing process on a D-Wave 2000Q an-
nealer may typically take ≈ 10µs.
8
selecting the result with smallest energy. Distinct samples are statistically indepen-
dent, so the probability Pmin[N ] of obtaining at least one minimum solution over N
samples converges exponentially to 1 with N :
Pmin[N ] = 1− (1− Pmin[1])N . (3)
Typical annealing times and readout times are very short (≈ 10µs and ≈ 120µs re-
spectively), and many samples can be drawn from the same Ising model within a single
programming cycle, so is possible to obtain a large number of samples in reasonable
time.
2.2. SAT, MaxSAT, SMT and OMT
We assume the reader is familiar with the basic syntax, semantics and properties of
Boolean and first-order logic and theories. In the following we recall the main concepts
of interest for our purposes, referring the reader to [12, 28, 13, 18, 19] for more details.
SAT & MaxSAT. Given some finite set of Boolean variables x (aka Boolean atoms)
the language of Boolean logic (B) is the set of formulas containing the atoms in x
and closed under the standard propositional connectives {¬,∧,∨,→,↔,⊕} (not, and,
or, imply, iff, xor) with their usual meaning. A literal is an atom (positive literal)
or its negation (negative literal). We implicitly remove double negations: e.g., if l
is the negative literal ¬xi, then by ¬l we mean xi rather than ¬¬xi. A clause is a
disjunction of literals. A formula is in conjunctive normal form (CNF) iff it is written
as a conjunctions of clauses.
A truth value assignment x satisfies F (x) iff it makes it evaluate to true. If so, x is
called a model for F (x). A formula F (x) is satisfiable iff at least one truth assignment
satisfies it, unsatisfiable otherwise. F (x) is valid iff all truth assignments satisfy it.
F1(x), F2(x) are equivalent iff they are satisfied by exactly the same truth assignments.
A formula F (x) which is not a conjunction can always be decomposed into a con-
junction of smaller formulas F ∗(x,y) by means of Tseitin’s transformation [29]:
F ∗(x,y) def=
m−1∧
i=1
(yi ↔ Fi(xi,yi)) ∧ Fm(xm,ym), (4)
where the Fis are formulas which decompose the original formula F (x), and the yis
are fresh Boolean variables each labeling the corresponding Fi. (If the input formula
is itself a conjunction, then Tseitin’s transformation can be applied recursively to each
conjunct.) Tseitin’s transformation (4) guarantees that F (x) is satisfiable if and only if
F ∗(x,y) is satisfiable, and that if x,y is a model for F ∗(x,y), then x is a model for
F (x). To this extent, it is pervasively used also as a main recursive step for efficient
CNF conversion of formulas [29].
A quantified Boolean formula (QBF) is defined inductively as follows: a Boolean
formula is a QBF; if F (x) is a QBF, then ∀xiF (x) and ∃xiF (x) are QBFs. ∀xiF (x)
is equivalent to (F (x)xi=> ∧ F (x)xi=⊥) and ∃xiF (x) is equivalent to (F (x)xi=> ∨
F (x)xi=⊥) (aka Shannon’s expansion).
9
Propositional Satisfiability (SAT) is the problem of establishing whether an input
Boolean formula is satisfiable or not. SAT is NP-complete [30]. Efficient SAT solvers
are publicly available, most notably those based on Conflict-driven clause-learning
(CDCL) [28] and on stochastic local search [31]. Most solvers require the input for-
mula to be in CNF, implementing a CNF pre-conversion based on Tseitin’s transfor-
mation (4) when this is not the case. See [12] for a survey of SAT-related problems and
techniques.
Weighted MaxSAT {〈Fk, ck〉}k is an optimization extension of SAT, in which the
input formula is a (typically unsatisfiable) conjunction of subformulas F def=
∧
k Fk
such that each conjunct Fk is given a positive penalty ck if Fk is not satisfied, and an
assignment minimizing the sum of the penalties is sought. (Often F is in CNF and
the Fks are single clauses or conjunctions of clauses.) Partial Weighted MaxSAT is
an extension of Weighted MaxSAT in which some conjuncts, called hard constraints,
have penalty +∞. Efficient MaxSAT tools are publicly available (see, e.g., [13, 9]).
SMT and OMT. Satisfiability Modulo Theories (SMT) is the problem of checking the
satisfiability of first order formulas in a background theory T (or combinations of the-
ories thereof). We focus on the theories of interest for our purposes. Given x as above
and some finite set of rational-valued variables v, the language of the theory of Linear
Rational Arithmetic (LRA) extends that of Boolean logics with LRA-atoms in the
form (
∑
i civi ./ c), ci being rational values, vi ∈ v and ./ ∈ {=, 6=, <,>,≤,≥},
with their usual meaning. In the theory of linear rational-integer arithmetic with unin-
terpreted functions symbols (LRIA ∪ UF) the LRA language is extended by adding
integer-valued variables to v (LRIA) and uninterpreted function symbols. 7 (E.g.,
(xi → (3v1 + f(2v2) ≤ f(v3))) is a LRIA ∪ UF formula.) Notice that B is a
sub-theory of LRA and LRA is a sub-theory of LRIA ∪ UF . The notions of literal,
assignment, clause and CNF, satisfiability, equivalence and validity, Tseitin’s transfor-
mation and quantified formulas extend straightforwardly to LRA and LRIA ∪ UF .
Satisfiability Modulo LRIA ∪ UF (SMT(LRIA ∪ UF)) [18] is the problem of
deciding the satisfiability of arbitrary formulas on LRIA ∪ UF and its sub-theories.
Efficient SMT(LRIA ∪ UF) solvers are available, including MATHSAT5 [32].
Optimization ModuloLRIA∪UF (OMT (LRIA∪UF)) [19] extends SMT(LRIA∪
UF) searching solutions which optimize someLRIA objective(s). Efficient OMT(LRA)
solvers like OPTIMATHSAT [20] are available.
3. Foundations
Let F (x) be a Boolean function on a set of n Boolean variables x def= {x1, ..., xn}.
We represent Boolean value ⊥ with −1 and > with +1, so that we can assume that
each xi ∈ {−1, 1}. Suppose first that we have a QA system with n qubits defined on a
hardware graph G = (V,E), for instance, any n-vertex subgraph of the Chimera graph
of Figures 2 and 3. Furthermore, we assume that the state of each qubit zi corresponds
7A n-ary function symbol f() is said to be uninterpreted if its interpretations have no constraint, except
that of being a function (congruence): if t1 = s1, ..., tn = sn then f(t1, ..., tn) = f(s1, ..., sn).
10
to the value of variable xi, i = 1, . . . , n = |V |. One way to determine whether F (x)
is satisfiable using the QA system is to find an energy function as in (2) whose ground
states z correspond to the satisfying assignments x of F (x).
Example 1. Suppose F (x) def= x1 ⊕ x2. Since F (x) = > if and only if x1 + x2 = 0,
the Ising model H(z1, z2) = z1 · z2 in a graph containing 2 qubits z1, z2 joined by
an edge (1, 2) ∈ E s.t. J12 = 1 has two ground states (+1,−1) and (−1,+1), which
correspond to the satisfying assignments of F , and two excited states (+1,+1) and
(−1,−1), corresponding to the non-satisfying ones.
Because the energy H(z) in (2) is restricted to quadratic terms and graph G is
typically sparse, the number of functions F (x) that can be solved with this approach
is very limited. To deal in part with this difficulty, we can use a larger QA system
with a number of additional qubits, say h, representing ancillary Boolean variables (or
ancillas for short) a def= {a1, ..., ah}, so that |V | = n + h. A variable placement is a
mapping of the n + h input and ancillary variables into the qubits of V . Since G is
not a complete graph, different variable placements will produce energy functions with
different properties. We use Ising encoding to refer to the hi and Jij parameters in (2)
that are provided to the QA hardware together with a variable placement. The gap of an
Ising encoding is the minimum energy difference between ground states (i.e., satisfying
assignments) and the other states (i.e., non-satisfying assignments). In general, larger
gaps lead to higher success rates in the QA process [33]. Thus, we define the encoding
problem for F (x) as the problem of finding an Ising encoding with maximum gap.
Note that the encoding problem is typically over-constrained. The Ising model
(2) has to discriminate between m satisfying assignments and k non-satisfying assign-
ments, with m + k = 2n, whereas the number of degrees of freedom is given by the
number of the hi and Jij parameters, which grows as O(n+ h) in the Chimera archi-
tecture. Thus, in order to have a solution, the number of ancilla variables needed (h)
may grow exponentially with the number of x variables (n).
In the rest of this section, we assume that a Boolean function F (x) is given and
that h qubits are used for ancillary variables a.
3.1. Penalty Functions
Here we assume that a variable placement is given, placing x∪a into the subgraph
G. Thus, we can identify each variable zj representing the binary value of the qubit
associated with the jth vertex in V with either an original variable xk ∈ x or as an
ancilla variable a` ∈ a, writing z = x ∪ a.
Definition 1. A penalty function PF (x,a|θ) is an Ising model
PF (x,a|θ) def= θ0 +
∑
i∈V
θizi +
∑
(i,j)∈E
θijzizj (5)
with the property that for some gmin > 0,
∀x min{a}PF (x,a|θ)
{
= 0 if F (x) = >
≥ gmin if F (x) = ⊥
(6)
11
(a) x3 ↔ (x1 ∧ x2)
with one ancilla.
(b) x3 ↔ (x1 ⊕ x2)
with three ancillas.
(c) x4 ↔ (x3 ∧ (x1 ⊕ x2))
obtained by combining 5(b) and 5(a).
Figure 5: Mappings within the Chimera graph, penalty functions use only colored edges. 5(c) combines
5(a) and 5(b) using chained proxy variables y, y′. The resulting penalty function is obtained by rewriting
x4 ↔ (x3∧(x1⊕x2)) into its equi-satisfiable formula (x4 ↔ (x3∧y′))∧(y ↔ (x1⊕x2))∧(y′ ↔ y).
where θ0 ∈ (−∞,+∞) (“offset”), θi ∈ [−2, 2] (“biases”) and θij ∈ [−1, 1] (“cou-
plers”) such that zi, zj ∈ z, and gmin are rational-valued parameters. The largest
gmin such that PF (x,a|θ) satisfies (6) is called the gap of PF (x,a|θ).
Notice that a penalty function separates satisfying assignments from non-satisfying
ones by a gap of at least gmin. The offset value θ0 is added to set the value of
PF (x,a|θ) to zero when F (x) = >, so that −θ0 corresponds to the energy of the
ground states of (2).
To simplify the notation we assume that θij = 0 when (i, j) 6∈ E, and use PF (x|θ)
when a = ∅.
Example 2. The equivalence between two variables, F (x) def= (x1 ↔ x2), can be en-
coded without ancillas by means of a single coupling between two connected vertices,
with zero biases: PF (x|θ) def= 1 − x1x2, so that gmin = 2. In fact, PF (x|θ) = 0 if
x1, x2 have the same value; PF (x|θ) = 2 otherwise.
Penalty PF (x|θ) in Example 2 is also called a (equivalence) chain connecting x1, x2,
because it forces x1, x2 to have the same value.
The following examples show that ancillary variables are needed, even for small
Boolean functions F (x) and even when G is a complete graph.
Example 3. Consider the AND function F (x) def= x3 ↔ (x1 ∧x2). If x1, x2, x3 could
be all connected in a 3-clique, then F (x) could be encoded without ancillas by setting
PF (x|θ) = 32 − 12x1 − 12x2 + x3 + 12x1x2 − x1x3 − x2x3, so that gmin = 2. In
fact, PF (x|θ) = 0 if x1, x2, x3 verify F (x), PF (x|θ) = 6 if x1 = x2 = −1 and
x3 = 1, PF (x|θ) = 2 otherwise. Since the Chimera graph has no cliques, the
above AND function needs (at least) one ancilla a to be encoded as: PF (x,a|θ) =
5
2 − 12x1− 12x2 + x3 + 12x1x2− x1x3− x2a− x3a, which still has gap gmin = 2 and
can be embedded, e.g., as in Figure 5(a).
Example 4. Consider the XOR function F (x) def= x3 ↔ (x1 ⊕ x2). Even within a
3-clique, F (x) has no ancilla-free encoding. Within the Chimera graph, F (x) can be
12
encoded with three ancillas a1, a2, a3 as: PF (x,a|θ) = 5 + x3 + a2 − a3 + x1a1 −
x1a2 − x1a3 − x2a1 − x2a2 − x2a3 + x3a2 − x3a3, which has gap gmin = 2 and is
embedded, e.g., as in Figure 5(b).
The following fact is a straightforward consequence of Definition 1.
Proposition 1. Let PF (x,a|θ) be a penalty function of F (x) as in Definition 1. Then:
• If x,a is such that PF (x,a|θ) = 0, then F (x) is satisfiable and x satisfies it.
• If x,aminimizes PF (x,a|θ) and PF (x,a|θ) ≥ gmin, then F (x) is unsatisfiable.
Proposition 1 shows that the QA hardware can used as a satisfiability checker for
F (x) by minimizing the Ising model defined by penalty function PF (x,a|θ). A re-
turned value of PF (x,a|θ) = 0 implies that F (x) is satisfiable. If the QA hard-
ware guaranteed minimality, then a returned value of PF (x,a|θ) ≥ gmin would imply
that F (x) is unsatisfiable. However, since QAs do not guarantee minimality (§2.1), if
PF (x,a|θ) ≥ gmin then there is still a chance that F (x) is satisfiable. Nevertheless,
the larger gmin is, the less likely this false negative case occurs [33].
A penalty function PF (x,a|θ) is normal if |θi| = 2 for at least one θi or |θij | = 1
for at least one θij . In order to maximize gmin, it is important to use normal penalty
functions so that to exploit the full range of the θ parameters. Any penalty function
PF (x,a|θ) can be normalized by multiplying all its coefficients by a normalization
factor:
c
def
= min
{
min
i
(
2
|θi|
)
,min
〈ij〉
(
1
|θij |
)}
. (7)
Note that if PF (x,a|θ) is non-normal, then c > 1, so that the resulting gap c ·
gmin > gmin. Normalization also works in the opposite direction to scale down some
PF (x,a|θ) whose θ’s do not fit into the allowable ranges (in which case c < 1).
Hereafter we assume w.l.o.g. that all penalty functions are normal.
3.2. Properties of Penalty Functions and Problem Decomposition
As it will be made clear in §4.1, after a variable placement is set, finding the values
for the θ’s implicitly requires solving a set of equations whose size grows with the
number of models of F (x) plus a number of inequalities whose size grows with the
number of counter-models of F (x). Thus, the θ’s must satisfy a number of linear
constraints that grows exponentially in n. Since the θ’s grow approximately as 4(n+h),
the number of ancillary variables needed to satisfy (6) can also grow very rapidly. This
seriously limits the scalability of a solution method based on (5)-(6). We address this
issue by showing how to construct penalty functions by combining smaller penalty
functions, albeit at the expense of introducing extra variables.
The following properties are straightforward consequence of Definition 1.
Property 1. Let PF (x,a|θ) be a penalty function for F (x) and let F ∗(x) be logically
equivalent to F (x). Then PF (x,a|θ) is a penalty function also for F ∗(x) with the
same gap gmin.
13
Property 1 states that a penalty function PF (x,a|θ) does not depend on the syntac-
tic structure of F (x) but only on its semantics.
Property 2. Let F ∗(x) def= F (x1, ..., xr−1,¬xr, xr+1, ..., xn) for some index r. As-
sume a variable placement of x into V s.t. PF (x,a|θ) is a penalty function for F (x)
of gap gmin. Then PF∗(x,a|θ) = PF (x,a|θ∗), where θ∗ is defined as follows for
every zi, zj ∈ x,a:
θ∗i =
{ −θi if zi = xr
θi otherwise;
θ∗ij =
{ −θij if zi = xr or zj = xr
θij otherwise.
Notice that since the previously defined bounds over θ (namely θi ∈ [−2, 2] and θij ∈
[−1, 1]) are symmetric, if θ is in range then θ∗ is as well.
Two Boolean functions that become equivalent by permuting or negating some
of their variables are called NPN-equivalent [34]. Thus, given the penalty function
for a Boolean formula, any other NPN equivalent formula can be encoded trivially
by repeatedly applying Property 2. Notice that checking NPN equivalence is a hard
problem in theory, but it is fast in practice for small n (i.e., n ≤ 16) [35]. The process
of negating a single variable in an Ising model as in Property 2 is known as a spin-
reversal transform.
Example 5. Consider the OR function F (x) def= x3 ↔ (x1 ∨ x2). We notice that this
can be rewritten as F (x) = ¬x3 ↔ (¬x1 ∧ ¬x2), that is, it is NPN-equivalent to
that of Example 3. Thus, by Property 2 a penalty function for F (x) can be placed as
in Figure 5(a) and defined by taking that in Example 3 and toggling the signs of the
coefficients of the xi’s: PF (x,a|θ) = 52+ 12x1+ 12x2−x3+ 12x1x2−x1x3+x2a+x3a,
which still has gap gmin = 2.
Property 3. Let F (x) =
∧K
k=1 Fk(x
k) be a Boolean formula such that x = ∪kxk, the
xks may be non-disjoint, and each sub-formulaFk has a penalty functionPFk(x
k,ak|θk)
with minimum gap gkmin where the a
ks are all disjoint. Given a list wk of positive ra-
tional values such that, for every zi, zj ∈ x ∪
⋃K
k=1 a
k:
θi
def
=
K∑
k=1
wkθ
k
i ∈ [−2, 2], θij def=
K∑
k=1
wkθ
k
ij ∈ [−1, 1], (8)
then a penalty function for F (x) is:
PF (x,a
1...aK |θ) =
K∑
k=1
wkPFk(x
k,ak|θk). (9)
The gap for PF is gmin ≥ minKk=1 wkgkmin.
The choice of the set of weights wk in Property 3 is not unique in general. Also note
that gmin may be greater than minKk=1 wkg
k
min, because, for example, it might be the
14
case that gmin = wkgkmin for some unique k and no truth assignment violating Fk with
cost wkgkmin satisfies all other Fi’s.
Property 3 states that a penalty function for the conjunction of sub-formulas can
be obtained as a (weighted) sum of the penalty functions of the sub-formulas. The
weights wk are needed because penalty functions of formulas that share variables sum
up biases or couplings, possibly resulting into out-of-range values (8). If the wk’s are
smaller than 1, then the gap gmin of the final penalty function may become smaller.
Also, Property 3 requires placing variables into qubits that are shared among conjunct
subformulas. This may restrict the chances of finding suitable placements for the vari-
ables in the graph.
An alternative way of coping with this problem is to map shared variables into
distinct qubits which are connected by chains of equivalences. Consider F (x) =∧K
k=1 Fk(x
k) as in Property 3. For every variable xi and for every Fk where xi oc-
curs, we can replace the occurrences of xi in Fk with a fresh variable xik
∗, obtaining
a formula
∧K
k=1 Fk(x
k∗) such that the sets xk∗ are all disjoint. Let
F ∗(x∗) def=
K∧
k=1
Fk(x
k∗) ∧
∧
〈xik∗,xik′ ∗〉∈Eq(xi)
(xi
k∗ ↔ xik′
∗
) (10)
where x∗ = ∪kxk∗, and Eq(xi) is any set of pairs 〈xik∗, xik′∗〉 of the variables re-
placing xi such that the conjunction of equivalences in (10) states that of all of them
are equivalent. By construction, F (x) is satisfiable if and only if F ∗(x∗) is satisfiable,
and from every model x∗ for F ∗(x∗) we have a model x for F (x) by simply assigning
to each xi the value of the corresponding xik
∗s.
Now assume we have a penalty function PFk(x
k∗ ,ak|θk) for each k with disjoint
ak. We recall from Example 2 that (1 − xik∗xik′∗) are penalty functions of gap 2
for the (xik
∗ ↔ xik′∗) subformulas in (10). Thus we can apply Property 3 with all
weights wk = 1 and write a penalty function for F ∗(x∗) in the following way:
PF∗(x
∗,a|θ) =
K∑
k=1
PFk(x
k∗,ak|θk) +
∑
〈xik∗,xik′ ∗〉∈Eq(xi)
(1− xik∗xik′
∗
). (11)
Note that the θ’s stay within valid range because the xk∗s and aks are all disjoint and
the biases of the (1 − xik∗xik′∗) terms are zero, so distinct sub-penalty functions in
(11) involve disjoint groups of biases and couplings. Thus we have the following.
Property 4. PF∗(x∗,a|θ) in (11) is a penalty function for F ∗(x∗) in (10). The gap of
PF∗(x
∗,a|θ) is gmin ≥ min(minKk=1 gkmin, 2).
Thus, we can represent a single variable xi with a series of qubits connected by strong
couplings (1−xix′i). (For xi ↔ ¬x′i, we use (1+xix′i).) Notice that it is not necessary
that every copy of variable xi be connected to every other one; rather, to enforce the
condition that all copies of xi are logically equivalent, it suffices that the copies of xi
induce a connected graph. Moreover, additional copies of xi may be introduced on
15
unused vertices of the hardware graph G to facilitate connectedness. A set of qubits all
representing the same variable in this way is called a chain and is the subject of §3.4.
Thus, it is possible to implement PF∗(x∗,a|θ) in (11) by placing the distinct penalty
functions PFk(x
k∗,ak|θk) into sub-graphs and connect them with chains.
Recall from §2.2 that a formula F (x) which is not a conjunction can always be de-
composed into a conjunction of smaller formulas F ∗(x,y) by means of Tseitin’s trans-
formation (4). By Properties 3 and 4, this allows us to AND-decompose F (x) into
multiple and smaller conjuncts that can be encoded separately and recombined. The
problem thus reduces to choosing Boolean functions (yi ↔ Fi(xi,yi)) andFm(xm,ym)
whose penalty functions are easy to compute, have large gap, and whose combination
keeps the gap of the penalty function for the original function as large as possible.
Example 6. Let F (x) def= x4 ↔ (x3 ∧ (x1 ⊕ x2)). Applying (4) and (10) this can be
rewritten as F∗(x, y, y′) = (x4 ↔ (x3 ∧ y′)) ∧ (y ↔ (x1 ⊕ x2)) ∧ (y′ ↔ y). The
penalty functions of the three conjuncts can be produced as in Examples 3, 4 and 2
respectively, and summed as in Property 4:
PF∗(x, y, y
′,a|θ)
=
5
2
− 1
2
x3 − 1
2
y′ + x4 +
1
2
x3y
′ − x3x4 − y′a4 − x4a4
+ 5 + y + a2 − a3 + x1a1 − x1a2 − x1a3 − x2a1 − x2a2 − x2a3 + ya2 − ya3
+ 1− yy′
=
17
2
− 1
2
x3 + x4 + y − 1
2
y′ + a2 − a3 + x1a1 − x1a2 − x1a3 − x2a1 − x2a2
−x2a3 − x3x4 + 1
2
x3y
′ − x4a4 + ya2 − ya3 − yy′ − y′a4
Notice that there is no interaction between the biases and couplings of the three com-
ponents, only the offsets are summed up. The resulting gap is min{2, 2, 2} = 2. Then
they can be placed, e.g., as in Figure 5(c).
Overall, these facts suggest a “divide-and-conquer” approach for addressing the
SATtoIsing problem:
(i) AND-decompose the input formula, by rewriting every conjunct F (x) which is
not small enough into an equivalently-satisfiable one F ∗(x,y) as in (4) such that
penalty functions for all its conjuncts can be easily computed;
(ii) rename shared variables and compute the global penalty functions as in Property 4;
(iii) place the sub-penalty functions into subgraphs and connect by chains equivalent
qubits representing shared variables.
3.3. Exact Penalty Functions and MaxSAT
In order to encode MaxSAT, we require a stronger version of the penalty function
in Definition 1.
16
Definition 2. A penalty function PF (x,a|θ) is exact if for all x such that F (x) = ⊥,
min
{a}
PF (x,a|θ) = gmin.
That is, an exact penalty function separates satisfying assignments from all non-satisfying
ones by exactly the same gap gmin.
Example 7. The penalty function of F (x) def= (x1 ↔ x2) in Example 2 is exact,
whereas those of F (x) def= x3 ↔ (x1∧x2) and F (x) def= x3 ↔ (x1⊕x2) in Examples 3
and 4 are not exact.
Exact penalty functions allow for the encoding of weighted MaxSAT problems,
with some restrictions. The following fact is a straightforward consequence of Property
3 and Definition 2.
Proposition 2. Let F (x) =
∧K
k=1 Fk(x
k) be a Boolean formula s.t. x = ∪kxk, and
PF (x,a|θ) def=
∑K
k=1 PFk(x
k,ak|θk), where a def= ∪kak s.t. the ak are all disjoint,
each PFk(x
k,ak|θk) is an exact penalty function for Fk of gap gk. Let x,a be a truth
assignment which minimizesPF (x,a|θ). Then x is a solution for the weighted MaxSAT
problem {〈Fk, gk〉}k.
Proposition 2 allows for encoding a generic weighted MaxSAT problem {〈Fk, ck〉}k
by setting PF (x,a|θ) def=
∑K
k=1 wkPFk(x
k,ak|θk) where wk def= ckgk · c and c is a nor-
malization factor (7). Notice that in Proposition 2 the penalty functions PFk(x
k,ak|θk)
must be exact; otherwise, a solution x,a that is optimal for MaxSAT but violates some
Fk might not minimize PF (x,a|θ) if PFk(xk,ak|θk) > gk.
In §3.2 we outlined a “divide-and-conquer” approach for SATtoIsing based on the
idea of mapping shared variables into distinct qubits which are then connected by
chains of equivalences. Applying the same approach to MaxSAT is not as straight-
forward, because Property 4 cannot always be combined with Proposition 2 in a useful
way. Consider the scenario in Property 4, and suppose we want to use (11) to solve
the MaxSAT problem {〈Fk, gk〉}k as with Proposition 2. As the following example
shows, there may be minimum-energy solutions of (11) which violate some equiva-
lence (xik
∗ ↔ xik′∗) in (10) if this avoids violating one or more of the Fk’s whose
sum of gaps is greater than 2. Such a solution is not a solution of the MaxSAT prob-
lem, because it corresponds to assigning different truth values to distinct instances of
the same variable in the original problem.
Example 8. Consider the trivial MaxSAT problem {〈Fi(x), c〉}4i=1 for some penalty
value c > 0 where F1(x) = F2(x)
def
= x, and F3(x) = F4(x)
def
= ¬x. The two possible
solutions x = > and x = ⊥ are both optimum with penalty 2c and falsify F3, F4
and F1, F2 respectively. We have the following normal and exact penalty functions:
PF1(x) = PF2(x) = 2 − 2x and PF3(x) = PF4(x) = 2 + 2x, each of gap gi = 4.
Suppose we want to encode the problem in such a way to fit into a linear chain of 4
17
qubits adopting the encoding in Property 4. We introduce four copies of x, namely
x1, x2, x3, x4, and obtain:
F ∗(x1, x2, x3, x4) = x1 ∧ x2 ∧ ¬x3 ∧ ¬x4 ∧ (x1 ↔ x2) ∧ (x2 ↔ x3) ∧ (x3 ↔ x4)
PF∗(x
1, x2, x3, x4) = (2− 2x1) + (2− 2x2) + (2 + 2x3) + (2 + 2x4) +
(1− x1x2) + (1− x2x3) + (1− x3x4)
= 11− 2x1 − 2x2 + 2x3 + 2x4 − x1x2 − x2x3 − x3x4.
The minimum-energy solution to PF∗ is x1 = x2 = 1 and x3 = x4 = −1 with
PF∗(...) = 2, which violates the equivalence (x2 ↔ x3). The correct MaxSAT solu-
tions x1 = x2 = x3 = x4 = 1 and x1 = x2 = x3 = x4 = −1 both have PF∗(...) = 8.
In general, the problem arises when it is energetically cheaper to violate some
equivalence (xki
∗ ↔ xk′i
∗
) in a chain in (10) than to violate all the penalty functions
{Fk(xk) : xi ∈ xk} on one side of the equivalence. One solution to this problem is
to multiply the PFk ’s by sufficiently small weights wk < 1, at the cost reducing their
gaps gk. In the following we discuss the bounds that can be placed on wk.
Let I denote the indices of the functions Fk(xk) that use the variable xi; that is,
I = {k : xi ∈ xk}. An equivalence (xki ∗ ↔ xk
′
i
∗
) in the chain of xi splits the chain
into two subchains, and splits I into two subsets Ik and Ik′ such that (xki ∗ ↔ xk
′
i
∗
)
connects the functions of Ik to the functions of Ik′ . Assume we have a desired gap
gdesired > 0 separating solutions with broken chains from true solutions. Then a
sufficiently large gap for the equivalence (xki
∗ ↔ xk′i
∗
) is
g(k,k′) = min
∑
j∈Ik
gj ,
∑
j∈Ik′
gj
+ gdesired,
as this gap ensures that it is gdesired cheaper to violate all the constraints in Ik or Ik′
then to violate (xki
∗ ↔ xk′i
∗
). Recall from (10) that Eq(xi) is the set of variable pairs
(xki
∗
, xk
′
i
∗
) that form equivalences (xki
∗ ↔ xk′i
∗
) in the chain of xi. To ensure that all
equivalence constraints are not violated, a sufficient gap for the entire chain is
gchain = max
(xki
∗,xk′i
∗
)∈Eq(xi)
g(k,k′). (12)
Finally, recalling that each equivalence has gap 2, we update the weight definition
in Proposition 2 for each k ∈ I:8
wk =
2 · ck
gk · gchain (13)
An alternative bound on gchain is given in [36]. In the paper, the author bounds
the chain strength required to ensure that all minima of an embedded QUBO problem
8Note that the normalization factor c here is 1 as chains are normal.
18
can be mapped to a minimum of the original QUBO problem (see §3.4 below). Let
θ∗i =
∑
k wkθi be the bias value obtained by sharing the xi variable as in Property 3
9.
If xi is substituted by a chain with li endpoints, QUBO minima are preserved if the
chain gap is the following:
gchain = 2
li − 1
li
 ∑
(i,j)∈E
|θ∗ij | − |θ∗i |
+ gdesired (14)
This alternative bound is sometimes lower than (12), especially when |θ∗i | is high.
Note that, as the original paper explains, if the bound value is negative then PF∗ is
monotonic on xi. If that is the case, then xi = −sgn(θ∗i ) always minimizes PF∗ , so
we can fix the value of xi and there is no need for a chain.
In general, neither (12) nor (14) are typically very tight bounds on required chain
gap, and finding the smallest viable chain gap analytically appears to be a difficult
problem. In practice gchain is often determined empirically; this is discussed further in
§7.
Overall, the MaxSATtoIsing problem is subject to some intrinsic limitations. Firstly,
it requires the usage of exact penalty functions for its sub-formulas, which are more
difficult to obtain. Secondly, the need to re-weight penalty functions to ensure chain
equivalences are not violated typically results in smaller gaps. Thirdly, it is difficult to
directly encode hard constraints in a MaxSAT problem; this again requires re-weighting
soft constraints by very small factors, reducing their gaps accordingly.
3.4. Embedding
The process of representing a single variable xi by a collection of qubits connected
in chains of strong couplings is known as embedding, in reference to the minor em-
bedding problem of graph theory [36, 37]. More precisely, let PF (x|θ) be a penalty
function whose interactions define a graph GF (so xi and xj are adjacent iff θij 6= 0)
and let GH be a QA hardware graph. A minor embedding of GF in GH is a function
Φ : VGF → 2VGH such that:
• for each GF -vertex xi, the subgraph induced by Φ(xi) is connected;
• for all distinct GF -vertices xi and xj , Φ(xi) and Φ(xj) are disjoint;
• for each edge (xi, xj) inGF , there is at least one edge between Φ(xi) and Φ(xj).
The image Φ(xi) of a GF -vertex is a chain, and the set of qubits in a chain are con-
strained to be equivalent using (1− xik∗xik′∗) couplings as in Equation (11).
Embedding generic graphs is a computationally difficult problem [38], although
certain structured problem graphs may be easily embedded in the Chimera graph [39,
40] and heuristic algorithms may also be used [41]. A reasonable goal in embedding
is to minimize the sizes of the chains, as quantum annealing becomes less effective as
more qubits are included in chains [24].
9For simplicity, we assume to share a single xi, so each θ∗ij = wkθ
k
ij for some unique k.
19
A different approach to finding models for F (x), global embedding, is based on
first finding a penalty function on a complete graph GF on n + h variables, and sec-
ondly, embedding GF into a hardware graph GH using chains (e.g., using [39]). Fol-
lowing [33], global embeddings usually need fewer qubits than the methods presented
in this paper; however, the final gap of the penalty function obtained in this way is
generally smaller and difficult to compute exactly.
4. Encoding Small Boolean Sub-Formulas
In this section we present general SMT/OMT-based techniques to address the en-
coding problem for small Boolean formulas F (x).
4.1. Computing Penalty Functions via SMT/OMT(LRA).
Given x def= {x1, ..., xn}, a def= {a1, ..., ah}, F (x) as in Section 3, a variable place-
ment in a Chimera subgraph s.t. z = x ∪ a, and some gap gmin > 0, the problem of
finding a penalty function PF (x,a|θ) as in (5) corresponds to solving the following
problem:10
For every i j, find θi ∈ [−2, 2], θij ∈ [−1, 1] such that
∀x.
 ( F (x)→ ∃a.(PF (x,a|θ) = 0)) ∧( F (x)→ ∀a.(PF (x,a|θ) ≥ 0)) ∧
(¬F (x)→ ∀a.(PF (x,a|θ) ≥ gmin))
 . (15)
By applying Shannon’s expansion (§2.2) to the quantifiers in (15), the problem reduces
straightforwardly to solving the following SMT(LRA) problem:
Φ(θ)
def
=
∧
zi∈x,a
(−2 ≤ θi) ∧ (θi ≤ 2) ∧
∧
zi,zj∈x,a
i<j
(−1 ≤ θij) ∧ (θij ≤ 1) (16)
∧
∧
{x∈{−1,1}n|F (x)=>}
∨
a∈{−1,1}h
(PF (x,a|θ) = 0) (17)
∧
∧
{x∈{−1,1}n|F (x)=>}
∧
a∈{−1,1}h
(PF (x,a|θ) ≥ 0) (18)
∧
∧
{x∈{−1,1}n|F (x)=⊥}
∧
a∈{−1,1}h
(PF (x,a|θ) ≥ gmin). (19)
Consequently, the problem of finding the penalty function PF (x,a|θ) that maximizes
the gap gmin reduces to solving the OMT(LRA) maximization problem 〈Φ(θ), gmin〉.
Notice that, since gmin is maximum, PF (x,a|θ) is also normal.
Intuitively, (16) states the ranges of the θ; (17) and (18) state that, for every x sat-
isfying F (x), PF (x,a|θ) must be zero for at least one “minimum” a and nonnegative
for all the others; (19) states that for every x not satisfying F (x), PF (x,a|θ) must be
10As in (5), we implicitly assume θij = 0 when (i, j) 6∈ E.
20
greater than or equal to the gap. Consequently, if the values of the θ in PF (x,a|θ)
satisfy Φ(θ), then PF (x,a|θ) complies with (6); if Φ(θ) is unsatisfiable, then there is
no PF (x,a|θ) complying with (6) for the given placement.
Note that, if a = ∅, then the OMT(LRA) maximization problem 〈Φ(θ), gmin〉
reduces to a linear program because the disjunctions in (17) disappear.
To force PF (x,a|θ) to be an exact penalty function, we add the following conjunct
inside the square brackets of (15):
(¬F (x)→ ∃a.(PF (x,a|θ) = gmin)), (20)
which forces PF (x,a|θ) to be exactly equal to the gap for at least one a. Thus we
conjoin the Shannon’s expansion of (20) to Φ(θ) in (16)-(19):
... ∧
∧
{x∈{−1,1}n|F (x)=⊥}
∨
a∈{−1,1}h
(PF (x,a|θ) = gmin). (21)
4.2. Improving Efficiency and Scalability using Variable Elimination
In the SMT/OMT(LRA) formulation (16)-(19), Φ(θ) grows exponentially with
the number of hidden variables h. For practical purposes, this typically implies a limit
on h of about 10. Here, we describe an alternative formulation whose size dependence
on h is O(h2tw), where tw is the treewidth of the subgraph of G spanned by the qubits
corresponding to the ancillary variables, Ga. For the Chimera graph, even when h is
as large as 32, tw is at most 8 and therefore still of tractable size.
The crux of the reformulation is based on the use of the variable elimination tech-
nique [42] to solve an Ising problem on Ga. This method is a form of dynamic pro-
gramming, storing tables in memory describing all possible outcomes to the problem.
When the treewidth is tw, there is a variable elimination order guaranteeing that each
table contains at most O(2tw) entries. Rather than using numerical tables, our for-
mulation replaces each of its entries with a continuous variable constrained by linear
inequalities. In principle, we need to parametrically solve an Ising problem for each
x ∈ {−1, 1}n, generating O(2nh2tw) continuous variables. However, by the local na-
ture of the variable elimination process, many of these continuous variables are equal,
leading to a reduced (as much as an order of magnitude smaller) and strengthened SMT
formulation.
To describe the method, we first reformulate equations (18)-(19) by introducing
witness binary variables β(x) ∈ {−1, 1}h to enforce the equality constraints (17),
that is, PF (x,β(x)|θ) = 0. Thus, we can rewrite Φ(θ) as the SMT problem Φ(θ,β)
defined by
Φ(θ,β)
def
= (16) ∧ (18) ∧ (19)
∧
∧
{x∈{−1,1}n|F (x)=>}
∨
a∈{−1,1}h
(
(β(x) ≡ a) ∧ (PF (x,a|θ) = 0)
)
.11
11For vectors a,b, we use a ≡ b as a shorthand for (a1 = b1) ∧ (a2 = b2) ∧ (a3 = b3) ∧ . . ..
21
Consider first the case when the graph Ga has no edges. If, for i = 1, . . . , h, we
define
fi(ai|x) = θiai + ai
∑
j:ij∈E
θij xj ,
then we can write
PF (x,a|θ) = c(x) +
h∑
i=1
fi(ai|x),
where c(x) does not depend on the ancillary variables. Thus,
min
a
PF (x,a|θ) = c(x) +
h∑
i=1
min
ai∈{−1,1}
fi(ai|x). (22)
If θ is fixed, solving (22) is straightforward. However, since θ is a variable, the contri-
bution minai∈{−1,1} fi(ai|x) is a function of θ, for each i = 1, . . . , h. Each of these
minimums will be associated with a continuous variable, denoted by mi(∅|x), and re-
ferred to as a message variable (the naming will be clearer in the general case). To
relate mi(∅|x) with minai∈{−1,1} fi(ai|x), we impose the constraints
mi(∅|x) ≤ fi(−1|x) and mi(∅|x) ≤ fi(1|x).
Thus, if F (x) = ⊥, since the message variables are lower bounds on the true mini-
mums of (22), to enforce (19) we need simply add the constraints
c(x) +
h∑
i=1
mi(∅|x) ≥ gmin.
When F (x) = >, we need to ensure that the message variables take the minimums
of (22). Note that variable βi(x) identifies the value of the ancillary variable i that
achieves the minimum in (22). To relate the values of β(x) and the message variables
m(∅|x) we add the SMT constraints
βi(x)⇒
(
mi(∅|x) = fi(1|x)
)
,
¬βi(x)⇒
(
mi(∅|x) = fi(−1|x)
)
.
Finally, to impose (17) and (18), we need that
c(x) +
h∑
i=1
mi(∅|x) = 0.
Since G is usually sparse, it is likely that two binary states x and x′ agree on the bits
adjacent to a fixed ancillary variable i. In this case, it is clear thatmi(∅|x) = mi(∅|x′),
and we can use a single message variable for both states. This observation can be
extended to the general case and will be valuable to reduce the size and strengthen the
SMT problem formulation.
22
Next consider the general case when |E(Ga)| > 0. In what follows, c(x) and
fi(ai|x) are defined as above. Assume first θ is fixed. Given x, we want to solve the
Ising model mina PF (x,a|θ). Variable elimination proceeds in order, eliminating one
ancillary variable at a time. Suppose that ancillary variables are eliminated in the order
h, h − 1, . . . , 1. Each ancillary variable i is associated with a set Fi of factors, which
are functions that depend on ancillary variable i and none or more ancillary variables
with index less than i. The sets Fi are called buckets, and are updated throughout
the computation. Initially, each Fi consists of ancilla-ancilla edges12 fi,k(ai, ak) =
θik aiak for ik ∈ E(Ga), k < i. Let Vi denote the set of ancillary variables involved
in the factors of bucket Fi other than variable i itself (thus, all variable indices in Vi are
less than i, or Vi = ∅). For a fixed a and a subset of ancillary variables U , we use aU
to denote {ai : i ∈ U}. Variable h is eliminated first. Note that once variables in Vh
are instantiated to aVh , the optimal setting of variable h is readily available by solving
gh(aVh) = minah
fh(ah|x) +
∑
f∈Fh
f(aVh , ah). (23)
Here f = fi,h ∈ Fh represents an edge ih between ancillary variables i and h, i < h
(abusing notation we write f(ai, ah) as f(aVh , ah)), and Fh contains all edges adja-
cent to h. The 2|Vh| possible settings of aVh define 2
|Vh| values (23). These values
define new factor gh, a function of variables aVh , that is added to the bucket Fi of
variable i with largest index in Vh. For each instantiation of aVh we define the message
mh(aVh |x) as gh(aVh). Iteratively, eliminating variable i is accomplished by solving,
for each setting of aVi ,
gi(aVi) = minai
fi(ai|x) +
∑
f∈Fi
f(aVi , ai) (24)
generating a new factor gi, a function of aVi . For each one of the 2
|Vi| possible values
of gi we define message mi(aVi |x) to be gi(aVi). Factor gi is then added to bucket Fk
where k is the largest index in Vi. When Vi = ∅, (24) takes the form
min
ai
fi(ai|x) +
∑
f∈Fi
f(ai) (25)
that determines the optimal value of ai; the message corresponding to the value of this
minimum is mi(∅|x). All variables with Vi = ∅ can be eliminated at the same time, so
that, at termination, the value of the Ising problem mina PF (x,a|θ) is equal to
c(x) +
∑
i:Vi=∅
mi(∅|x).
which will be equal to mina PF (x,a|θ). Notice that the number of additional messages
is O(
∑
i 2
|Vi|), where each Vi corresponds to the time when variable i is eliminated.
12Ga is an undirected graph. An edge is defined by a pair of vertices, say i and k; for convenience, in this
section we associate this edge with the ordered pair ik with k < i.
23
When Ga has treewidth t, there is an elimination order for which each |Vi| ≤ t, which
typically, by our low treewidth assumption, will be much smaller than 2h.
When θ is not fixed, as in the case when there were no edges, the messages are vari-
ables. Since these message variable represent minimums, we upper bound the message
variables adding the constraints
mi(aVi |x) ≤ fi(−1|x) +
∑
f∈Fi
f(aVi ,−1)
mi(aVi |x) ≤ fi(1|x) +
∑
f∈Fi
f(aVi , 1).
As before, if F (x) = ⊥, the constraint (19) can be replaced with
c(x) +
∑
i:Vi=∅
mi(∅|x) ≥ gmin, (26)
since the message variables provide a lower bound on (24). When F (x) = >, we must
ensure that all the message variables are tight. For a subset of ancillary variables U , let
βU (x) = {βi(x) : i ∈ U}. Thus, we must have that for all aVi[
βVi(x) ≡ aVi ∧ βi(x)
] ⇒ [mi(aVi |x) = fi(1|x) + ∑
f∈Fi
f(aVi , 1)
]
[
βVi(x) ≡ aVi ∧ ¬βi(x)
] ⇒ [mi(aVi |x) = fi(−1|x) + ∑
f∈Fi
f(aVi − 1)
]
.
In this way, we can enforce that mina PF (x,a|θ) = 0 (that is, constraints (17) and
(18)), with the constraint
c(x) +
∑
i:Vi=∅
mi(∅|x) = 0. (27)
As noted in the case when Ga has no edges, some message variables will always
have the same values. In fact, significant additional model reduction can be accom-
plished by identifying message variables that have to be the same across many states
x. For instance, mi(aVi |x) = mi(aVi |x′) if their corresponding upper bounds are the
same (propagating from h down to i). Because G is sparse, the number of message
variables can typically be reduced by an order of magnitude or more in this way.
In many cases, for counter-models x, F (x) = ⊥, some constraints (19) may be
dropped or relaxed without altering the optimal solution of the original SMT problem.
For instance, we could include only constraints (19) for counter-models x that are
within Hamming distance at most d from all models of F . In our experiments, using
d ≤ 3 sufficed in most cases.
Alternatively, also for counter-models, the variable elimination lower bounds (26)
can be relaxed by weaker lower bounds such as a linear programming relaxation of
the corresponding Ising problem, that requires O(|V |+ |E|) continuous variables and
24
inequalities per x, F (x) = ⊥. For instance, a linear programming lower bound on the
QUBO formulation
min
yi∈{0,1}
∑
i∈V
ciyi +
∑
e={i,j}∈E
qe yiyj ,
is the following:
Minimize
∑
i∈V
cixi +
∑
e∈E
qe ze (28)
subject to
ze − yi − yj ≥ −1 for each e = ij ∈ E, i < j (λe)
(29)
−ze + yi ≥ 0 for each e = ij ∈ E,i < j (λhe,i)
(30)
−ze + yj ≥ 0 for each e = ij ∈ E,i < j (λte,j)
(31)
−yi ≥ −1 for each i ∈ V (αi)
(32)
yi, ze ≥ 0 (33)
Its linear programming dual is given by
Maximize −
∑
e∈E
λe −
∑
i∈V
αi (34)
subject to
λe − λhe,i − λte,j ≤ qe for each e = ij ∈ E,i < j
(35)
−
∑
e:i∈e
λe +
∑
e=ik∈E,i<k
λhe,i +
∑
e=ki∈E,k<i
λte,i − αi ≤ ci for each i ∈ V
(36)
λe, λ
h
e,i, λ
t
e,i, αi ≥ 0 (37)
Notice that if c and q are variables, the dual problem is still linear in the dual variables,
c and q. Thus, we can guarantee (in one direction only) that the value of the QUBO is
at least g with the set of linear inequalities
−
∑
e∈E
λe −
∑
i∈V
αi ≥ g (38)
(35), (36), (37) (39)
25
Note that we can always take
(−
∑
e:i∈e
λe +
∑
e=ik∈E,i<k
λhe,i +
∑
e=ki∈E,k<i
λte,i − ci)+ = αi.
To make this work for an Ising problem, the c and q have to be written as linear func-
tions of θ, which is straightforward.
4.3. Inequivalent Variable Placements and Exploiting Symmetries
Recall that a variable placement is a mapping from the input and ancilla variables
x ∪ a onto the vertices V ; the formula Φ(θ) in (16)-(21) can be built only after each
zi ∈ x ∪ a has been placed. In general there will be many such placements, but by
exploiting symmetry and the automorphism group of G, we can reduce the number of
placements that need be considered.
Let v def= (v1, ..., vn+h) denote a variable placement, so vi is the vertex of V onto
which zi is placed. Two variable placements v and v′
def
= (v′1, ..., v
′
n+h) are equivalent
if there is a graph isomorphism φ of G that point-wise maps the input variables (xi)
in v to the input variables in v′; that is, vi = φ(v′i) for all i ≤ n. If v and v′ are
equivalent, then a penalty function for v can be transformed into a penalty function for
v′ by applying φ. Therefore, in order to find a penalty function of maximal gap among
all variable placements, it suffices to consider only inequivalent ones.
Example 9. Suppose we want to encode a penalty function with n + h = 8 variables
into an 8-qubit Chimera tile. There are 8! = 40320 candidate variable placements.
However, the tile structure is highly symmetric: any permutation of v that either
(i) swaps horizontal qubits with vertical qubits, or
(ii) maps horizontal qubits to horizontal qubits and vertical qubits to vertical qubits
is an automorphism. This fact can be exploited to reduce number of candidate place-
ments to only
(
7
3
)
= 35 as follows. Let 1, ..., 4 and 5, ..., 8 be the indexes of the horizon-
tal and vertical qubits respectively. By (i), we assume w.l.o.g. that z1 is mapped into
an horizontal qubit, and by (ii) we assume w.l.o.g. that v1 = 1. Next, consider some
size-3 subset S of {v2, ..., v8}. By (ii), all placements that map S into the remaining 3
horizontal qubits and map {v2, ..., v8}\S into the vertical qubits are equivalent. Since
there are
(
7
3
)
= 35 such subsets S, there are at most 35 inequivalent placements to
consider.
This notion of equivalence of variable placements can be coarsened slightly by
taking advantage of NPN-equivalence. We define variables x1 and x2 in a Boolean
function F to be NPN-symmetric if swapping the variables, and negating some subset
of variables, produces an equivalent formula. For example, consider F (x1, x2, x3)
def
=
x3 ↔ (x1 ∧ ¬x2). Variables x1 and x2 are NPN-symmetric because F (x1, x2, x3)↔
F (¬x2,¬x1, x3). This symmetry defines an equivalence relation on the variables: for
xi and xj the same equivalence class, there is a permutation and negation of the vari-
ables that does not change F but maps xi to xj while not permuting variables outside
the equivalence class.
26
We say that two variable placements v and v′ are equivalent up to NPN-symmetry
if there is a graph isomorphism φ of G that maps the input variables in v to the input
variables in v′ up to NPN-symmetry classes. That is, for all i ≤ n, there exists a j ≤ n
such that xi and xj are NPN-symmetric and vi = φ(v′j). Again, penalty functions for
v and can be transformed into penalty functions for v′ and vice versa.
Example 10. Consider placing the function AND(x1, . . . , x4) = x1 ∧ x2 ∧ x3 ∧ x4
with h = 4 auxiliary variables on the 8-qubit Chimera tile. From Example 9, it suffices
to consider 35 variable placements. However, the variables x1, . . . , x4 in AND are all
NPN-symmetric. Therefore any two variable placements v and v′ that map the same
number xi’s to horizontal qubits are equivalent, since there is an automorphism that
will map the horizontal xi’s in v to the horizontal xi’s in v′. Moreover, a placement
mapping k ≤ 4 of the xi’s to horizontal qubits is equivalent to one mapping 4 − k of
the xi’s to horizontal qubits, by swapping horizontal and vertical qubits. As a result,
there are only 3 inequivalent variable placements to consider, in which 0, 1 or 2 of the
xi’s are mapped to horizontal qubits.
One way to to check for equivalent variable placements is to use vertex-coloured
graph isomorphisms. Two vertex-coloured graphs (G, c) and (G′, c′) are vertex-coloured
graph-isomorphic if there is a permutation φ mapping V (G) to V (G′) that preserves
edges and maps every vertex of G to a vertex of the same colour in G′ (for all v ∈ V ,
c′(φ(v)) = c(v)). Using a variable placement v and NPN-symmetry, define a vertex-
coloring c of G as follows:
c(g) =
{
s if vi = g and xi is in the s-th equivalence class of NPN-symmetry,
0 if g is not in {v1, . . . , vn}.
Similarly define a vertex coloring c′ for variable placement v′. From these definitions,
v and v′ are equivalent up to NPN-symmetry if and only if the vertex colored graphs
(G, c) and (G, c′) are vertex-colored graph-isomorphic.
In practice, we can use the graph package NAUTY [43] to compute a canonical
form for each vertex-colored graph and check if two are the same. NAUTY works with
vertex-coloured canonical forms natively as part of its graph isomorphism algorithm,
and can compute canonical forms for graphs with thousands of vertices.
4.4. Placing Variables & Computing Penalty Functions via SMT/OMT(LRIA∪UF).
As an alternative to identifying equivalent variable placements, for small formu-
lae F (x), we can combine the generation of the penalty function with an automatic
variable placement by means of SMT/OMT(LRIA ∪ UF), LRIA ∪ UF being the
combined theories of linear arithmetic over rationals and integers plus uninterpreted
function symbols (§2.2). This works as follows.
Suppose we want to produce the penalty function of some relatively small function
(e.g., so n+h ≤ 8, which fits into a single Chimera tile). We index the n+h vertices in
the set V into which we want to place the variables as V def= {1, ..., n+ h}, and we intro-
duce a set of n+ h integer variables v def= {v1, ..., vn+h} such that vj ∈ V is (the index
of) the vertex into which zj is placed. (For example, “v3 = 5” means that variable z3 is
27
Φ(θ0, b, c,v)
def
= Range(θ0, b, c,v) ∧ Distinct(v) ∧ Graph() (40)
∧
∧
{x∈{−1,1}n|F (x)=>}
∧
a∈{−1,1}h
(PF (x,a|θ0, b, c,v) ≥ 0) (41)
∧
∧
{x∈{−1,1}n|F (x)=>}
∨
a∈{−1,1}h
(PF (x,a|θ0, b, c,v) = 0) (42)
∧
∧
{x∈{−1,1}n|F (x)=⊥}
∧
a∈{−1,1}h
(PF (x,a|θ0, b, c,v) ≥ gmin) (43)
∧
∧
{x∈{−1,1}n|F (x)=⊥}
∨
a∈{−1,1}h
(PF (x,a|θ0, b, c,v) = gmin) (44)
where:
Range(θ0, b, c,v)
def
=
∧
1≤j≤n+h
(1 ≤ vj) ∧ (vj ≤ n+ h) (45)
∧
∧
1≤j≤n+h
(−2 ≤ b(j)) ∧ (b(j) ≤ 2) (46)
∧
∧
1≤j≤n+h
(c(j, j) = 0) ∧
∧
1≤i<j≤n+h
(c(i, j) = c(j, i)) (47)
∧
∧
1≤i<j≤n+h
(−1 ≤ c(i, j)) ∧ (c(i, j) ≤ 1) (48)
Distinct(v1, ..., vn+h)
def
=
∧
1≤i<j≤n+h
¬(vi = vj) (49)
Graph()
def
= ∧
∧
1≤i<j≤n+h
〈i,j〉6∈E
(c(i, j) = 0) (50)
PF (x,a|θ0, b, c,v) def= θ0 +
∑
1≤j≤n+h
b(vj) · zj +
∑
1≤i<j≤n+h
c(vi, vj) · zi · zj .(51)
Figure 6: SMT (LRIA ∪ UF ) encoding with automatic placement.
placed in vertex #5.) Then we add the standard SMT constraint Distinct(v1, ..., vn+h)
to the formula to guarantee the injectivity of the map. Then, instead of using variables
θi and θij for biases and couplings, we introduce the uninterpreted function symbols
b : V 7−→ Q (“bias”) and c : V × V 7−→ Q (“coupling”), so that we can rewrite
each bias θj as b(vj) and each coupling θij as c(vi, vj) s.t vi, vj ∈ [1, .., n + h] and
Distinct(v1, ..., vn+h).
This rewrites the SMT(LRA) problem (16)-(19) into the SMT (LRIA ∪ UF)
problem (40)-(51) in Figure 6. Equation (44) must be used if and only if we need an
exact penalty function. (Notice that (47) is necessary because we could have c(vi, vj)
s.t. vi > vj .) By solving 〈Φ(θ0, b, c,v), gmin〉 we not only find the best values of
28
1 2
4
3
1 2
4
3
1 2
4
3
x1
a1
x2
x3
x3x1
a1
x2
x1
x2
a1
x3
Figure 7: 3 possible placements of z def= {x1, x2, x3} ∪ {a1} into a 4-qubit tile fraction with 2 horizontal
and 2 vertical qubits. All 4! = 24 combinations are equivalent to one of them.
the biases b and couplings c, but also the best placement v of the variables into (the
indexes of) the qubits.
Example 11. Consider x def= {x1, x2, x3}, a def= {a1} and F (x) def= (x3 ↔ (x1 ∧ x2)),
and 4-qubit fraction of a tile with 2 horizontal and 2 vertical qubits. Let z1, z2, z3 and
z4, denote x1, x2, x3 and a1 respectively, so that each vj denotes the vertex into which
zj is placed. We consider the encoding (40)-(51), in particular we have that:
PF (x,a|θ0, b, c,v) def= θ0 + b(v1)x1 + b(v2)x2 + b(v3)x3 + b(v4)a1 +
c(v1, v2)x1x2 + c(v1, v3)x1x3 + c(v1, v4)x1a1 +
c(v2, v3)x2x3 + c(v2, v4)x2a1 + c(v3, v4)x3a1
Graph()
def
= c(1, 2) = 0 ∧ c(2, 1) = 0 ∧ c(3, 4) = 0 ∧ c(4, 3) = 0
One possible solution is given in the following tables:
g v1 v2 v3 v4
2 1 3 2 4
θ0 b(v1) b(v2) b(v3) b(v4)
b(1) b(3) b(2) b(4)
5/2 −1/2 −1/2 1 0
c(v1, v2) c(v1, v3) c(v1, v4) c(v2, v3) c(v2, v4) c(v3, v4)
c(1, 3) c(1, 2) c(1, 4) c(3, 2) c(3, 4) c(2, 4)
1/2 0 −1 −1 0 −1
which corresponds to the placing in Figure 7 (center).
4.4.1. Exploiting symmetries.
When using an SMT/OMT solver to search for penalty functions across all variable
placements as in (40)-(51), we may restrict the search space by considering only one
variable placement from each equivalence class under the automorphisms of G.
29
Example 12. In Example 9, when encoding a penalty function with n + h = 8 vari-
ables into a Chimera tile, automorphisms reduced the number of variable placements
under consideration from 8! = 40320 to
(
7
3
)
= 35. We can force the SMT/OMT solver
to restrict the search to only 35 maps by adding the following constraint to (40)-(51),
consisting into the disjunction of 35 cubes, each representing one placement.
(
Fixed︷ ︸︸ ︷
v1 = 1∧
size-3 subset of {v2,...,v8}
mapped to horizontal qubits︷ ︸︸ ︷
v2 = 2 ∧ v3 = 3 ∧ v4 = 4∧
complement of the previous subset
mapped to vertical qubits︷ ︸︸ ︷
v5 = 5 ∧ v6 = 6 ∧ v7 = 7 ∧ v8 = 8) ∨
(v1 = 1 ∧ v2 = 2 ∧ v3 = 3 ∧ v5 = 4 ∧ v4 = 5 ∧ v6 = 6 ∧ v7 = 7 ∧ v8 = 8) ∨
...
(v1 = 1 ∧ v6 = 2 ∧ v7 = 3 ∧ v8 = 4 ∧ v2 = 5 ∧ v3 = 6 ∧ v4 = 7 ∧ v5 = 8).
If we add this constraint, the first conjunction in (45) can be dropped.
Example 13. In Example 11 we have 4! = 24 possible placements on to a tile of
2 horizontal and 2 vertical qubits. If we exploit symmetries as above, we have only(
3
1
)
= 3 inequivalent placements, which are described in Figure 7. These can be
obtained by adding the constraint:
(v1 = 1 ∧ v2 = 2 ∧ v3 = 3 ∧ v4 = 4) ∨
(v1 = 1 ∧ v3 = 2 ∧ v2 = 3 ∧ v4 = 4) ∨
(v1 = 1 ∧ v4 = 2 ∧ v2 = 3 ∧ v3 = 4).
5. Encoding Larger Boolean Formulas
As pointed out in Section 3.2, encoding large Boolean functions using the SMT
formulations of the previous section is computationally intractable, as the number of
constraints in the model increases roughly exponentially with the number of variables
in the Boolean function. In this section, we describe the natural approach of pre-
computing a library of encoded Boolean functions and rewriting a larger Boolean
function F (x) as a set of pre-encoded ones
∧K
k=1 Fk(x
k). The penalty functions
PFk(x
k,ak|θk) for these pre-encoded functions may then be combined using chains
as described in Section 3.4. This schema is shown in Figure 8. In terms of QA per-
formance, this method has been shown experimentally to outperform other encoding
methods for certain problem classes [44]. We will describe each of the stages in turn
(see also [33, 44, 45]).
5.1. Library generation
In this stage, we find effective encodings of common small Boolean functions,
using the SMT methods in Section 4 or by other means, and store them in a library
for later use. Finding these encodings may be computationally expensive, but this task
may be performed offline ahead of time, as it is independent of the problem input, and
it need only be performed once for each NPN-inequivalent Boolean function.
30
Offline process
On-the-fly process
Standard cell
mapping
Library
PreprocessingSATproblem
Library
generation
Boolean
functions
Placement
and routing
Ising
model
D-Wave
QA Solution
Figure 8: Graph of the encoding process.
Note that there exist many different penalty functions PF (x,a|θ) for any small
Boolean function F (x). Penalty functions with more qubits may have larger gaps, but
using those functions may result in longer chains, so it is not always the case that larger
gaps lead to better QA hardware performance. Choosing the most appropriate function
may be a nontrivial problem. A reasonable heuristic is to choose penalty functions with
gaps of similar size to the gap associated with a chain, namely gmin = 2.
5.2. Preprocessing
Preprocessing, or Boolean formula minimization, consists of simplifying the input
formula F (x) to reduce its size or complexity. While not strictly necessary, it not only
improves QA performance by reducing the size of PF (x,a|θ) but also reduces the
computational expense of the encoding process. Moreover, the graphical representation
commonly used in preprocessing, the AND-Inverter Graph (AIG), is necessary for the
subsequent phase of encoding.
An AIG encodes F (x) as a series of 2-input AND gates and negations. More pre-
cisely, a directed acyclic graph D on vertex set z = x ∪ a = (x1, . . . , xn, a1, . . . , am)
is an AIG representing F (x) if it has the following properties:
1. Each xi has no incoming arcs and each ak has 2 incoming arcs (the inputs to ak),
and there is a unique ao with no outgoing arcs (the primary output).
2. Each arc z → a is labelled with a sign + or− indicating whether or not z should
be negated as an input to a; define a literal la(z) = z for an arc with sign + and
la(z) = ¬z for an arc with sign −.
3. For each node ak with arcs incoming from z1 and z2, there is an AND function
Ak(ak, z1, z2) = ak ↔ lak(z1) ∧ lak(z2), such that
F (x)↔
m∧
k=1
Ak(z) ∧ (ao = >). (52)
For example, the function F (x) = x1 ∧ x2 ∧ ¬x3 is represented by both of the And-
Inverter Graphs in Figure 9.
There are many And-Inverter Graphs representing a given F (x). Is F (x) is in CNF
form, we can construct an AIG by rewriting each OR clause as an AND function via
De Morgan’s Law, and then rewriting each AND function with more than 2 inputs as a
sequence of 2-input AND functions.
31
x1
a1
x2
a2
x3
ao
+
+
+
-
+
+
x1
a1
x2
x3
ao
+
+
+
-
Figure 9: Two And-Inverter Graphs representing the function F (x) = x1 ∧ x2 ∧ ¬x3.
Preprocessing is a well-studied problem with mature algorithms available [46, 47];
here, we use DAG-aware minimization as implemented by the logic optimizer ABC.13
DAG-aware minimization attempts to find an AIG with a minimal number of nodes
by repeatedly identifying a small subgraph that can be replaced with another, smaller
subgraph without changing the truth assignments of F (x).
More precisely, a cut C of node z in D is a subset of vertices such that every
directed path from an input xi to z must pass through C. The subgraph of D induced
by all paths from C to z is a candidate to be replaced by a smaller subgraph, since
the Boolean value of z is determined by C. We call this value of z as a function
of C the Boolean function represented by C. Cut C is k-feasible if |C| ≤ k and
non-trivial if C 6= {z}. For fixed k, there is an O(n)-time algorithm to identify all
k-feasible cuts in an AIG: traverse the graph from the inputs x to the primary output,
identifying the k-feasible cuts of node ai by combining k-feasible cuts of ai’s inputs.
During traversal, DAG-aware minimization identifies a 4-feasible cut C and replaces
the subgraph induced by C with the smallest subgraph representing the same Boolean
function. (There are 222 NPN-inequivalent 4-input Boolean functions, and smallest
subgraph representing each one is pre-computed.) See [48] for more details.
5.3. Standard cell mapping
In the standard cell mapping phase, F (x) is decomposed into component functions∧K
k=1 Fk(x
k) that are available in the library of penalty functions. For SAT or con-
straint satisfaction problems, this mapping may be performed naı¨vely: given a set of
constraints {Fk(xk)}Kk=1 on the variables, each Fk(xk) is found in the library (pos-
sibly combining small constraints into larger ones [33]). However, more advanced
techniques have been devised in the digital logic synthesis literature. Technology map-
ping is the process of mapping a technology-independent circuit representation to the
physical gates used in a digital circuit [48, 49]. Usually technology mapping is used to
reduce circuit delay and load, and performs minimization as an additional step. Delay
and load do not play a role in the context of QAs, but minimization is important to
simplify the placement and routing phase that follows.
In order to find an efficient decomposition, a technology mapping algorithm takes
as input costs for small Fk(xk) and attempts to minimize the sum of the costs of the
13see https://github.com/berkeley-abc/abc and https://people.eecs.berkeley.edu/ alanmi/abc/.
32
components in
∧K
k=1 Fk(x
k). We define the cost of Fk to be the number of qubits used
by the penalty model PFk , so that the cost of F (x) =
∧K
k=1 Fk(x
k) is the total number
of qubits used to represent F (x), prior to adding chains.
Here, we apply the technology mapping algorithm in [48]: the idea is to decompose
the AIG representing F (x) into a collection of cuts such that each cut represents a small
function Fk(zk) that can be found in the penalty library. A mapping M of an AIG D is
a partial function that maps a node ai of D to a non-trivial, k-feasible cut M(ai). We
say ai is active when M(ai) is defined and inactive otherwise. Mapping M is proper
if:
1. the primary output ao is active;
2. if ai is active; then every aj ∈M(ai) is active; and
3. if aj 6= ao is active; then aj ∈M(ai) for some active ai.
For each active node ak in a proper mappingM , there is a Boolean function Fk(zk)
represented by the cut M(ak), and the original Boolean function F (x) decomposes as
F (x)↔
K∧
k=1
Fk(z
k) ∧ (ao = >).
Therefore, choosing k-feasible cuts with small k, proper mappings provide decomposi-
tions of F (x) into small Boolean functions that can be found in the penalty library. One
example of a proper mapping is the trivial mapping, in which each ai is mapped to the
cut consisting of its two input nodes. Under the trivial mapping, F (x) is decomposed
into a collection 2-input AND’s.
The algorithm in [48] iteratively refines mapping M in order to improve the cost
of the decomposition, in the following way. For each node ai, maintain a list L(ai) of
k-feasible cuts, ordered by their cost. (The cost of a cut is a function of the cost of the
Boolean function it represents, taking into account the anticipated recursive effects of
having a new set of active nodes: see [48] for details.) Traverse the graph from inputs
x to primary output ao. At each ai, first update the costs of the cuts in L(ai) based
on the changes to the costs of earlier nodes in the traversal. Next, if ai is active and
the current cut M(ai) is not the cut in L(ai) of lowest cost, update M(ai). To do this,
first inactivate ai (which recursively inactivates nodes in M(ai) if they are no longer
necessary) and then reactivate ai (which reactivates nodes in M(ai), also recursively).
This process of refining the mapping by traversing the graph is repeated several times.
Given the connectivity of the Chimera hardware graph, a natural choice is to de-
compose into Boolean functions that can be modelled with a single 8-qubit tile. In
particular all 3-input, 1-output Boolean functions (all 3-feasible cuts) can be modelled
in one tile.
5.4. Placement and routing
Once F (x) is decomposed into smaller functions
∧K
k=1 Fk(x
k) with penalty func-
tions PFk(x
k,ak|θk), it remains to embed the entire formula onto the QA hardware as
33
in equation (11). This process has two parts: placement, in which each PFk(x
k,ak|θk)
is assigned to a disjoint subgraph of the QA hardware graph; and routing, in which
chains of qubits are built to ensure that distinct qubits xi and x′i representing the same
variable take consistent values (using equivalence constraints with penalty functions of
the form 1− xix′i). Both placement and routing are very well-studied in design of dig-
ital circuits [50]. Nevertheless, this stage is a computational bottleneck for encoding
large Boolean functions.
5.4.1. Placement
During placement, chain lengths can be minimized by placing penalty functions
that share common variables close together. Current QA processors have a nearly 2-
dimensional structure, which lets us measure distance between variables using planar
coordinates. (For example, for the 2048-qubit Chimera graph in Fig. 2, define the
planar coordinates of a unit cell to be its row and column index in the 16×16 grid.) One
common objective function from digital circuit design is “half-perimeter wire length”
[51]. Define the location of a Boolean function Fk(xk) to be the subgraph of G onto
which Fk(xk) is placed, and define a placement function p : {1, ...,K} → R2 which
maps each k to the planar coordinates p(k) = (ak, bk) of the location of Fk(xk). The
half-perimeter wire length (HPWL) of a variable xi is the total length and width of the
smallest box that can be drawn around the locations of functions containing x. That is,
for Si = {k : xi ∈ xk},
HPWL(xi)
def
= (max
k∈Si
ak − min
k∈Si
ak) + (max
k∈Si
bk − min
k∈Si
bk).
A placement algorithm attempts to find a placement that minimizes
∑n
i=1HPWL(xi).
Heuristic methods for placement include simulated annealing [52], continuous opti-
mization [53], and recursive min-cut partitioning [54]. These algorithms can be applied
in the present context, but require some modification as current QA architectures do
not distinguish between qubits used for penalty functions and qubits used for chains.
For example, in some algorithms, a placement is optimized on the assumption that
the resulting routing problem is feasible (possibly by expanding the planar area made
available for routing). This assumption may not necessarily hold using a fixed QA
hardware graph of limited size and connectivity. If unit cells are packed tightly with
Boolean functions, then there will be few remaining qubits available for routing. On
the other hand, reserving too many qubits for routing will have a negative impact on
hardware performance in the form of longer chains.
In the experiments in §7 we made use of mPL14, a publicly available academic
placement tool [53]. mPL is multilevel method in which the placement problem is
repeatedly coarsened (so that several PFk are clustered and treated as one), placed, and
uncoarsened with local improvements. At the coarsest level, placement is performed
using a customized non-linear programming algorithm which maps penalty functions
to real coordinates minimizing a quadratic distance function between shared variables.
14Available at http://cadlab.cs.ucla.edu/cpmo/
34
5.4.2. Routing
During routing, literals are chained together using as few qubits possible; this prob-
lem may be formalized as follows. Assume a single variable xi has been assigned to a
set of vertices Ti ⊆ V , its terminals, during the placement of small Boolean functions.
To create a valid embedding, the chain of vertices representing xi, call it Ci, must
contain Ti and induce a connected subgraph in G. Finding Ci with a minimum num-
ber of vertices is an instance of the Steiner tree problem [55] and Ci is a Steiner tree.
Given variables (x1, . . . , xn) assigned to terminals (T1, . . . , Tn), the routing problem
demands a set of chains (C1, . . . , Cn) such that each Ci contains Ti, every chain is
connected, and all chains are pairwise disjoint. Among routing solutions, we try to
minimize the total number of vertices of G used or the size of the largest chain.
Routing to minimize the total number of vertices used is NP-hard, but polynomial-
time approximation algorithms exist [56]. In practice, heuristic routing algorithms
scale to problem sizes much larger than current QA architectures [57, 58, 59, 60, 61].
Routing in the current context differs from routing used in digital circuit design
in the sense that vertices (qubits) are the sparse resource that variables compete for,
rather than edges. As a result, we make use vertex-weighted Steiner tree algorithms
rather than edge-weighted ones. This makes the problem harder, as the edge-weighted
Steiner tree problem is (1.39)-approximable in polynomial time [62], while vertex-
weighted Steiner-tree is only (log k)-approximable for k terminals in polynomial time
unless P=NP [63]. Nevertheless, in practice, simple 2-approximation algorithms for
edge-weighted Steiner tree such as the MST algorithm [64] or Path Composition [65]
also work very well for the vertex-weighted problem. In this section, we describe
a modification of the routing algorithm BonnRoute [65] for vertex-weighted Steiner
trees.
We first solve a continuous relaxation of the routing problem called min-max re-
source allocation. Given a set of vertices C ⊆ V , the characteristic vector of C is the
vector χ(C) ∈ {0, 1}|V | such that χ(C)v = 1 if v ∈ C and 0 otherwise. Let Hi be the
convex hull of all characteristic vectors of Steiner trees of Ti in G. Then the min-max
resource allocation problem for terminals T1, . . . , Tn is to minimize, over all zi ∈ Hi,
i ∈ {1, . . . , n},
λ(z1, . . . , zn)
def
= max
v∈V
n∑
i=1
(zi)v.
The vertices v are the resources, which are allocated to customers (z1, . . . , zn)15 To
recover the routing problem, note that if each zi is a characteristic vector of a single
Steiner tree, then
∑n
i=1(zi)v the number of times vertex v is used in a Steiner tree. In
that case, λ(x) ≤ 1 if and only if the Steiner trees are a solution to the routing problem.
To solve the min-max resource allocation, we iteratively use a weighted-Steiner
tree approximation algorithm to generate a probability distribution over the Steiner
trees for each xi. After a Steiner tree is generated, the weights of the vertices in that
Steiner tree are increased to discourage future Steiner trees from reusing them (see
15The original BonnRoute algorithm uses min-max resource allocation with edges rather than vertices as
resources.
35
Algorithm 1 for details). This algorithm produces good approximate solutions in rea-
sonable time. More precisely, given an oracle that computes vertex-weighted Steiner
tree approximations within a factor σ of optimal, for any ω > 0 Algorithm 1 com-
putes a σ(1 + ω)-approximate solution to min-max resource allocation problem using
O((log |V |)(n+ |V |)(ω−2 + log log |V |)) calls to the oracle [66].
Algorithm 1 BonnRoute Resource Sharing Algorithm [65].
Require: GraphG, Steiner tree terminals {T1, . . . , Tn}, number of iterations t, weight
penalty α > 1
Ensure: For each i, a probability distribution pi,Si over all Steiner trees Si for termi-
nals Ti
function BONNROUTE(G,{T1, . . . , Tn})
for each v ∈ V (G) do
wv ← 1
for each Steiner tree Si for terminals Ti, i ∈ [n] do
zi,Si ← 0
for j from 1 to t do
for each i ∈ [n] do
Find a Steiner tree Si for terminals Ti with vertex-weights wv
zi,Si ← zi,Si + 1
wv ← wv ∗ α for all v ∈ Si
Return pi,Si ← zi,Si/t
Once a solution to the min-max resource allocation has been found, we recover
a solution to the original routing problem by formulating an integer linear program
(IP), which may be solved via OMT(LRA).16 For each Steiner tree Si with non-
zero probability in the distribution returned from min-max resource allocation, define
a binary variable as follows:
xi,Si =
{
1, if Si is the selected Steiner tree for variable i;
0, otherwise.
Then minimize the number of qubits selected, subject to selecting one Steiner tree for
each i and using each vertex at most once. That is,
min
∑
i
∑
Si
|Si|xi,Si
s.t.
∑
Si
xi,Si = 1 for all i
xi,Si + xj,Sj ≤ 1 for all Si, Sj s.t. Si ∩ Sj 6= ∅.
16The original BonnRoute algorithm uses randomized rounding to recover a routing solution from min-
max resource allocation, but at current QA hardware scales this is not necessary.
36
When applying routing to the Chimera graph, because of the symmetry within each
unit tile, it is convenient to work with a reduced graph in which the horizontal qubits
in each unit tile are identified as a single qubit, and similarly for the vertical qubits. As
a result the scale of the routing problem is reduced by a factor of 4. This necessitates
the use of vertex capacities within the routing algorithm (each reduced vertex has a
capacity of 4), and variables are assigned to individual qubits within a tile during a
secondary, detailed routing phase.
In the digital circuit literature, the placement and routing stages of embedding are
typically performed separately. However, because of current limited number of qubits
and the difficulty in allocating them to either placement or routing, a combined place-
and-route algorithm can be more effective. This approach is discussed in detail in [44].
6. Related work
There have been several previous efforts to map specific small Boolean functions
(usually in the guise of constraint satisfaction problems) to Ising models. Most of
those mappings have been ad hoc, but some were more systematic (beyond [33] and
[44] as previously discussed). Lucas [67] and Chancellor et al. [68] developed Ising
models for several specific NP-hard problems, while Su et al. [45] and Pakin [69, 70]
decomposed Boolean functions into common primitives.
There are have also been several attempts to map large Boolean functions or more
generally large constrained Boolean optimization problems to D-Wave hardware. Most
of these efforts (e.g. [71, 72, 73, 74, 75, 76, 77, 78, 79]) have used global embedding, in
which an entire Ising model is minor-embedded heuristically [41] or a fixed embedding
is used [39, 40]. However Su et al. [45] used a general place-and-route approach, while
Trummer et al.[80], Chancellor et al. [68], Zaribafiyan et al. [40], and Andriyash et
al. [81] used a placement approach optimized for the specific constraints at hand.
Looking at SAT instances in particular, there have been at least two previous at-
tempts at benchmarking D-Wave hardware performance: McGeoch et al. [82] and
Santra et al. [83] looked at (weighted) Max2SAT problems, and Douglass et al. [84]
and Pudenz et al. [85] looked at SAT problems with the goal of sampling diverse so-
lutions. Farhi et al. [86] and Hen and Young [87] studied the performance of quantum
annealing on SAT problems more generally. The applicability of QAs for various SAT
formulations has also been discussed in [88, 89].
7. Preliminary Experimental Evaluation
We have implemented and made publicly available prototype encoders built on top
of the SMT/OMT tool OPTIMATHSAT [20]. In particular each SATtoIsing-specific
step outlined in Figure 8 has been implemented as a Python library. For preprocessing
we rely on the ABC tool suite [90]. The same software is capable of performing tech-
nology mapping, though a Python version is available in the TECHMAPPING library17.
17Available at https://bitbucket.org/StefanoVt/tech_mapping
37
Finally the PLACEANDROUTE library18 performs the combined placement and routing
step. Regarding the off-line part of the process, the GATECOLLECTOR library19 ex-
tracts the most common gates in a dataset of functions and generates a function library
in the ABC-compatible GENLIB format. The PFENCODING library20 is then used to
call OPTIMATHSAT to encode them for later use. Currently the most expensive step
in the on-the-fly process is the placement and routing step. In the current setup we
use ≈20 minutes on a Intel i7-5600U CPU when we encode the problems used in the
experimental evaluation. The software run-time is heavily tunable in order to trade off
efficiency and effectiveness of the place-and-route process.
We offer preliminary empirical validation of the proposed methods for solving SAT
via SATtoIsing encoding by evaluating the performance of D-Wave’s 2000Q system in
solving certain hard SAT problems (§7.1); we perform a similar evaluation also on
MaxSAT problems (§7.2), despite the limitations highlighted in §3.3.
This task is subject to some limitations. First, we require instances that can be
entirely encoded in a quantum annealer of 2000 qubits (although algorithms for solv-
ing much larger constraint satisfaction problems have been proposed; see [33, 44]).
Furthermore, SAT solvers are already quite effective on the average case, so we need
concrete worst-case problems. Another important consideration in solving [Max]SAT
instances is that the QA hardware cannot be made aware of the optimality of solution;
for example, the algorithm cannot terminate when all clauses in a SAT problem are sat-
isfied. In this way, QA hardware behaves more like an SLS solver than a CDCL-based
one. To this extent, and in order to evaluate the significance of the testbed, we solved
the same problems with the state-of-the-art UBCSAT SLS SAT solver using the best
performing algorithm, namely SAPS [9]. UBCSAT was run on a computer using a
8-core Intel R© Xeon R© E5-2407 CPU, at 2.20GHz.
Remark 1. The results reported in this section are not intended as a performance com-
parison between D-Wave’s 2000Q system and UBCSAT, or any other classic comput-
ing tool. It is difficult to make a reasonable comparison for many reasons, including is-
sues of specialized vs. off-the-shelf hardware, different timing mechanisms and timing
granularities, and costs of encoding. Instead we aim to provide an empirical assessment
of QA’s potential for [Max]SAT solving, based on currently available systems.
Reproducibility of results. To make the results reproducible to those who have access
to a D-Wave system, we have set a website where experimental data, problem files,
translation files, demonstration code and supplementary material can be accessed. 21
Notice that public access to a D-Wave 2000Q machine is possible through D-Wave’s
Leap cloud service 22.
7.1. SAT
18Available at https://bitbucket.org/StefanoVt/placeandroute
19Available at https://bitbucket.org/StefanoVt/gatecollector
20Available at https://bitbucket.org/StefanoVt/pfencoding
21https://bitbucket.org/aqcsat/aqcsat.
22https://cloud.dwavesys.com/leap/.
38
Figure 10: Median times for the best-performing SLS algorithm on two different variants of the SGEN
problem on UBCSAT (SAPS). Timeout is marked with a gray line. The figure report times on a computer
with a 8-core Intel R© Xeon R© E5-2407 CPU, at 2.20GHz.
Choosing the benchmark problems. In order to provide a significant empirical evalu-
ation, and due to the limitations in size and connectivity of current QA systems, we
require SAT problems which have a low number of variables but are nevertheless hard
for standard SAT solvers.
To this end we chose and modified the tool SGEN [91], which has been used to gen-
erate the smallest unsolvable problems in recent SAT competitions. The problems share
a structure that is suited for the problem embedding, as it contains multiple clones of
slightly complex constraints, and even problems with few hundreds variables are con-
siderably hard. The SGEN family of random generators received many improvements
over the years, but the method to generate satisfiable instances has remained the same
[92, 93]. SGEN works by setting cardinality constraints over different partitions of the
variable set. The generator operates as follows:
1. The user decides the number of Boolean variables in the problem.
2. The tool partitions the variable set into sets of 5 elements.
3. For satisfiable problem instances, the desired solution contains exactly one true
variable for each subset. For each subset we guarantee that at most one variable
is true (10 2-CNF clauses).
4. The partition is shuffled. The tool ensures that each new subset contain exactly
one true variable, and minimizes the similarity with the previous partition.
5. For each new subset we ensure that at least one variable is true (a single CNF
clause).
39
6. The previous two steps are repeated one more time, further restricting the solu-
tion space.
In Figure 10 (red plot) we can see how UBCSAT SAPS performs on these random
SGEN problems. Notice that with > 300 variables the solver reaches the timeout of
1000s. In our experiments, we modify the tool by using exactly-2-in-4 constraints on
partitions with sets of size 4 with exactly two true variables per subset. This kind of
constraint has a more efficient embedding and the modified problems are harder (see
Figure 10, blue plot, where UBCSAT reaches the timeout with > 270 variables).
Experiments and Results. To solve these SAT instances, we encode and embed them as
in §4-§5 and then draw a fixed number of samples/instance (5, 10, 20) at an annealing
rate of 10 µs per sample. Table 1(a) shows the results from the D-Wave 2000Q QA
hardware.
The QA hardware solves almost all problems with 5 samples (i.e. within 50 µs of
total anneal time), and all of them with 20 samples (i.e. within 200 µs of total anneal
time), and the rates of sampling optimal solutions remain relatively stable at this scale
of problem.
In order to evaluate the significance of the testbed, we also report the results of solv-
ing the same problems with the UBCSAT SLS SAT solver using SAPS [9]. Remark 1
applies here. Table 1(b) shows that the problems are nontrivial despite the small num-
ber of variables, and the run-times increase significantly with the size of the problem.
(See also Figure 10.)
7.2. Weighted MaxSAT solving and sampling
Choosing the benchmarks. To demonstrate the performance of the QA hardware in this
regime, we generated MaxSAT instances that have many distinct optimal solutions.
These problems were generated from the 2-in-4-SAT instances described above by
removing a fraction of the constraints and then adding constraints on single variables
with smaller weight. More precisely:
1. Beginning with the 2-in-4-SAT instances of the previous section, we remove one
of the partitions of the variable set, and change one 2-in-4 constraint to 1-in-4.
(This makes the SAT problem unsatisfiable: for an n variable problem, the first
partition demands exactly n/2 true variables, while the second demands exactly
n/2− 1.)
2. We change the SAT problem into a weighted MaxSAT problem by assigning
existing constraints a soft weight of 3 and randomly assigning each variable or
its negation a soft constraint of weight 1.
3. We repeatedly generate MaxSAT instances of this form, until we find an instance
in which the optimal solution has exactly one violated clause of weight 3 and at
least n/3 violated clauses of weight 1, and at least 200 distinct optimal solutions
exist.
40
D-Wave 2000Q
Problem size # solved5 samples
# solved
10 samples
# solved
20 samples
% optimal
samples
32 vars 100 100 100 97.4
36 vars 100 100 100 96.4
40 vars 100 100 100 94.8
44 vars 100 100 100 93.8
48 vars 100 100 100 91.4
52 vars 100 100 100 93.4
56 vars 100 100 100 91.4
60 vars 100 100 100 88.2
64 vars 100 100 100 84.6
68 vars 100 100 100 84.4
72 vars 98 100 100 84.6
76 vars 99 99 100 86.6
80 vars 100 100 100 86.0
(a)
UBCSAT (SAPS)
Problem size Avg time (ms)
32 vars 0.1502
36 vars 0.2157
40 vars 0.3555
44 vars 0.5399
48 vars 0.8183
52 vars 1.1916
56 vars 1.4788
60 vars 2.2542
64 vars 3.1066
68 vars 4.8058
72 vars 6.2484
76 vars 8.2986
80 vars 12.4141
(b)
Table 1: (a) Number of SATtoIsing problem instances (out of 100) solved by the QA hardware using 5
samples [resp. 10 and 20] and average fraction of samples from the QA hardware that are optimal solutions.
Annealing was executed at a rate of 10 µs per sample, for a total of 50 µs, [resp. 100 µs and 200 µs] of anneal
time per instance respectively. Total time used by the D-Wave processor includes programming and readout;
this amounts to about 150 µs per sample, plus a constant 10ms of overhead.
(b) Run-times inms for SAT instances solved by UBCSAT using SAPS, averaged over 100 instances of each
problem size. Computations were performed using an 8-core Intel R© Xeon R© E5-2407 CPU, at 2.20GHz.
As discussed in §3.3, determining an appropriate gap for chains in MaxSAT prob-
lems is more complicated than for SAT problems, and finding the smallest viable
chain gap may be difficult analytically. However, a gap may be found experimen-
tally by sweeping over a range of values and choosing one that results in optimal per-
formance. Chain gaps that are too small result in a large number of broken chains,
41
D-Wave 2000Q
Problem size # solved % optimalsamples
32 vars 100 78.7
36 vars 100 69.0
40 vars 100 60.2
44 vars 100 49.9
48 vars 100 40.4
52 vars 100 35.2
56 vars 100 24.3
60 vars 100 22.3
64 vars 99 17.6
68 vars 99 13.0
72 vars 98 9.6
76 vars 94 6.6
80 vars 93 4.3
(a)
MaxSAT solvers: avg time (ms)
Problem size g2wsat rots maxwalksat novelty
32 vars 0.020 0.018 0.034 0.039
36 vars 0.025 0.022 0.043 0.060
40 vars 0.039 0.029 0.056 0.119
44 vars 0.049 0.043 0.070 0.187
48 vars 0.069 0.054 0.093 0.311
52 vars 0.122 0.075 0.115 0.687
56 vars 0.181 0.112 0.156 1.319
60 vars 0.261 0.130 0.167 1.884
64 vars 0.527 0.159 0.207 4.272
68 vars 0.652 0.210 0.270 8.739
72 vars 0.838 0.287 0.312 14.118
76 vars 1.223 0.382 0.396 18.916
80 vars 1.426 0.485 0.430 95.057
(b)
Table 2: (a) Number of MaxSATtoIsing problem instances (out of 100) solved by the QA hardware using
100 samples, and average fraction of samples from the QA hardware that are optimal solutions. Annealing
was executed at a rate of 10 µs per sample, for a total of 1ms of anneal time per instance.
(b) Time in ms taken to find an optimal solution by various inexact weighted MaxSAT solvers, averaged
over 100 MaxSAT instances of each problem size. Classical computations were performed on an Intel i7
2.90GHz× 4 processor. The solvers gw2sat[94], rots[95], and novelty[96] are as implemented in UBCSAT
[9]. All classical algorithms are performed with the optimal target weight specified; in the absence of a target
weight they are much slower.
42
D-Wave 2000Q
Size anneal only wall-clock
32 vars 448.5 443.9
36 vars 607.0 579.9
40 vars 1007.9 922.0
44 vars 1322.6 1066.6
48 vars 1555.4 1111.8
52 vars 3229.0 1512.5
56 vars 2418.9 1147.4
60 vars 4015.3 1359.3
64 vars 6692.6 1339.1
68 vars 6504.2 1097.1
72 vars 3707.6 731.7
76 vars 2490.3 474.2
80 vars 1439.4 332.7
(a)
MaxSAT solvers
Size g2wsat rots maxwalksat novelty
32 vars 448.5 448.5 448.5 448.5
36 vars 607.0 606.9 606.9 606.8
40 vars 1007.7 1006.3 1005.3 1005.0
44 vars 1313.8 1307.1 1311.7 1255.5
48 vars 1515.4 1510.7 1504.9 1320.5
52 vars 2707.5 2813.0 2854.6 1616.2
56 vars 2021.9 2106.2 2186.6 969.8
60 vars 2845.6 3061.7 3289.0 904.4
64 vars 3100.0 4171.0 4770.0 570.6
68 vars 2742.2 3823.3 4592.4 354.8
72 vars 1841.1 2400.2 2943.4 212.6
76 vars 1262.5 1716.0 2059.2 116.4
80 vars 772.2 1111.1 1363.9 66.7
(b)
Table 3: Number of distinct optimal solutions found in 1 second by various MaxSAT solvers, averaged across
100 instances of each problem size.
(a) “anneal only” accounts for only the 10 µs per sample anneal time used by the D-Wave processor. “wall-
clock” accounts for all time used by the D-Wave processor, including programming and readout.
(b) Classical computations were performed as in Table 2(b).
43
while chain gaps that are too large result in gaps for problem constraints that are
smaller than the noise levels of the hardware, yielding solutions that are far from
optimal. For the MaxSAT experiments in this section, the chosen chain gap was
always in the range gchain ∈ [2, 6] (relative to penalty functions PF (x,a|θ) with
θi ∈ [−2, 2], θij ∈ [−1, 1].)
Experiments and Results. Table 2 summarizes the performance of the D-Wave pro-
cessor in generating a single optimal MaxSAT solution, as well as the run-times for
various high-performing SLS MaxSAT solvers. The QA hardware solves almost all
problems with 100 samples/instance (i.e. within 1 ms of anneal time). Remark 1 also
applies here. One of the strengths of D-Wave’s processor is its ability to rapidly sample
the near-optimal solutions: current systems typically anneal at a rate of 10 µs or 20 µs
per sample and are designed to take thousands of samples during each programming
cycle. As a result, the first practical benefits of QAs will likely come from applications
which require many solutions rather than a single optimum.
To this extent, Table 3 considers generating distinct optimal solutions. For each
solver and problem size, the table indicates the number of distinct solutions found in 1
second, averaged across 100 problem instances of that size. For the smallest problems,
1 second is sufficient for all solvers to generate all solutions, while the diversity of so-
lutions found varies widely as problem size increases. Although the D-Wave processor
returns a smaller fraction of optimal solutions for MaxSAT instances than for the SAT
instances, it is still effective in enumerating distinct optimal solutions because its rapid
sampling rate.
Alternative penalty functions. Different penalty functions can result in different QA
performance, even when those penalty functions have the same gap between ground
and excited states. As an example of this, we describe another set of MaxSAT instances
which result in better performance on the D-Wave 2000Q processor relative to classical
solvers, even though the penalty functions they use are less theoretically justified.
We call these instances “unbiased” to distinguish them from the MaxSAT instances
of the previous section. They are generated as follows. Beginning with the SGEN 2-
in-4-SAT instances, we first change one 2-in-4 constraint to 1-in-4, making the SAT
problem unsatisfiable. We then remove 5 constraints from one partition of the variable
set. This increases the total number of optimal solutions. Finally, we treat the result-
ing constraints as a MaxSAT problem in which each 1-in-4 or 2-in-4 constraint has
the same weight. Despite having many solutions, these problems become difficult for
MaxSAT solvers with a relatively small number of variables.
When solving these instances, we represent each 2-in-4-MaxSAT constraint by the
following penalty function: PF (x,a|θ) = 4 + x1x2 + x1x4 + x2x3 + x3x4 − x1a1 −
x2a2 + x3a1 + x4a2. This model satisfies:
min
a
PF (x,a|θ) =

0,
∑
i xi = 0;
2, |∑i xi| = 1;
8, |∑i xi| = 2.
Because the unsatisfiable states |∑i xi| = 1 and |∑i xi| = 2 have different minimal
energy configurations, this is not an exact penalty function as required for MaxSAT as
44
D-Wave 2000Q
Problem size # solved % optimalsamples
32 vars 100 97.5
36 vars 100 95.7
40 vars 100 92.9
44 vars 100 91.1
48 vars 100 88
52 vars 100 86.1
56 vars 100 83.5
60 vars 100 83.1
64 vars 100 80.8
68 vars 100 81
72 vars 100 79.5
76 vars 100 79
80 vars 100 75.1
(a)
MaxSAT solvers: avg time (ms)
Problem size g2wsat rots maxwalksat novelty
32 vars 0.018 0.013 0.025 0.012
36 vars 0.024 0.019 0.036 0.018
40 vars 0.037 0.030 0.052 0.024
44 vars 0.049 0.041 0.076 0.038
48 vars 0.070 0.064 0.115 0.056
52 vars 0.102 0.099 0.176 0.080
56 vars 0.153 0.161 0.262 0.117
60 vars 0.217 0.252 0.403 0.171
64 vars 0.303 0.383 0.598 0.241
68 vars 0.434 0.604 0.938 0.362
72 vars 0.620 0.964 1.448 0.551
76 vars 0.914 1.536 2.262 0.829
80 vars 1.364 2.567 3.618 1.312
(b)
Table 4: (a) Number of MaxSATtoIsing problem instances (out of 100) solved by the QA hardware using
100 samples, and average fraction of samples from the QA hardware that are optimal solutions, for the
“unbiased” MaxSAT instances. Annealing was executed at a rate of 10 µs per sample, for a total of 1ms of
anneal time per instance.
(b) Time in ms taken to find an optimal solution by various inexact weighted MaxSAT solvers, averaged
over 100 MaxSAT instances of each problem size. Classical computations were performed on an Intel i7
2.90GHz × 4 processor. gw2sat[94], rots[95], and novelty[96] are as implemented in UBCSAT [9]. All
classical algorithms are performed with the optimal target weight specified; in the absence of a target weight
they are much slower.
45
in (21). Nevertheless, this model performs well in practice, because for the unbiased
MaxSAT instances only configurations with |∑i xi| ≤ 1 are of interest.
Table 4 summarizes the performance of the D-Wave hardware and classical solvers
in finding an optimal solution for the unbiased MaxSAT instances. It is instructive
to compare these results to the “biased” MaxSAT instances in Table 2. The unbiased
instances require more time for the best classical solvers to solve, yet result in better
D-Wave hardware performance, despite the fact that the penalty function used is not
exact.
8. Ongoing and Future Work
Future QA architectures will be larger and more connected, enabling more efficient
encodings of larger and more difficult SAT problems. Faster and more scalable SMT-
based encoding methods for small Boolean functions is currently an important direction
of research. The ability to increase the number of ancillary variables can lead to larger
gaps, which in turn can make quantum annealing more reliable. Among the encoding
challenges presented in this paper, a few are of particular interest and relevance to SMT
research:
• Variable placement. Methods for simultaneously placing variables and comput-
ing penalty functions are currently less scalable, and have been less studied, than
those for fixed variable placements.
• Augmenting penalty functions. For large Boolean functions, generating penalty
functions directly from SMT becomes difficult because the number of constraints
grows much more quickly than the number of available parameters. Function de-
composition and chains provide one way around this, but chains limit the result-
ing gaps. There may be other methods of recombining a decomposed function
that are not so restrictive. Alternatively, it may be possible to augment an exist-
ing penalty function with additional qubits for the purposes of increasing its gap.
SMT formulations of these problems have not yet been explored.
• Solving (15) directly. In the field of automated theorem proving and SMT, novel
techniques for solving quantified SMT formulas are emerging. It is thus possible
to investigate these techniques for solving directly the quantified formulas (15),
avoiding thus the expensive Shannon expansion of (16)-(19).
• Better function decompositions. While Boolean function decomposition and
minimization are mature classical subjects, those algorithms can probably be im-
proved by taking into consideration the specifics of the embedding (placement
and routing onto a QA hardware graph) that follow them.
• More connected topologies. Future QA hardware graphs will be larger, have
higher per-qubit connectivity, and have less separation between clusters (tiles) of
qubits. An example of a next-generation hardware graph under development at
D-Wave is shown in Figure 11. While these changes will result in the ability to
solve larger and more difficult Ising problems, they will also require new encod-
ing strategies. In particular, new methods for problem decomposition, placing
46
Figure 11: “Pegasus”, the hardware graph of an experimental QA system under development at D-Wave
(720-qubit version). Qubits have maximum degree 15 rather than 6, and qubits do not fall into well-defined
unit tiles as in Chimera.
small Boolean functions, and penalty modelling that take advantage of additional
connectivity will significantly improve the encoding process.
Furthermore, we believe the problems presented here are not only practical, but also
complex enough to be used to challenge new SMT solvers. To encourage the use
of these problems as SMT benchmarks, we have provided example .smt files on the
website of supplementary materialı`23.
23See Footnote 21.
47
References
References
[1] P. W. Shor, Polynomial-time algorithms for prime factorization and discrete log-
arithms on a quantum computer, SIAM J. Comput. 26 (5) (1997) 1484–1509.
doi:10.1137/S0097539795293172.
[2] L. K. Grover, A fast quantum mechanical algorithm for database search, in: Pro-
ceedings of the Twenty-eighth Annual ACM Symposium on Theory of Com-
puting, STOC ’96, ACM, New York, NY, USA, 1996, pp. 212–219. doi:
10.1145/237814.237866.
[3] A. Finnila, M. Gomez, C. Sebenik, C. Stenson, J. Doll, Quantum annealing: A
new method for minimizing multidimensional functions, Chemical Physics Let-
ters 219 (5) (1994) 343 – 348. doi:10.1016/0009-2614(94)00117-0.
[4] T. Kadowaki, H. Nishimori, Quantum annealing in the transverse ising model,
Phys. Rev. E 58 (1998) 5355–5363. doi:10.1103/PhysRevE.58.5355.
[5] E. Farhi, J. Goldstone, S. Gutmann, M. Sipser, Quantum computation by adiabatic
evolution, arXiv preprint quant-ph/0001106.
[6] P. I. Bunyk, E. M. Hoskinson, M. W. Johnson, E. Tolkacheva, F. Altomare, A. J.
Berkley, R. Harris, J. P. Hilton, T. Lanting, A. J. Przybysz, J. Whittaker, Ar-
chitectural considerations in the design of a superconducting quantum annealing
processor, IEEE Transactions on Applied Superconductivity 24 (4) (2014) 1–10.
doi:10.1109/TASC.2014.2318294.
[7] B. Selman, H. Kautz, B. Cohen, Local Search Strategies for Satisfiability Testing,
in: Cliques, Coloring, and Satisfiability, Vol. 26 of DIMACS, 1996, pp. 521–532.
[8] W. M. Spears, Simulated annealing for hard satisfiability problems, in: Cliques,
Coloring, and Satisfiability, Vol. 26 of DIMACS, American Mathematical Soci-
ety, 1996, pp. 533–558.
[9] D. A. D. Tompkins, H. H. Hoos, UBCSAT: An implementation and experimen-
tation environment for SLS algorithms for SAT and MAX-SAT, in: H. Hoos,
D. Mitchell (Eds.), Revised Selected Papers from the Seventh International Con-
ference on Theory and Applications of Satisfiability Testing (SAT 2004), Vol.
3542 of Lecture Notes in Computer Science, Springer Berlin / Heidelberg, 2005,
pp. 306–320. doi:10.1007/11527695_24.
[10] V. S. Denchev, S. Boixo, S. V. Isakov, N. Ding, R. Babbush, V. Smelyanskiy,
J. Martinis, H. Neven, What is the computational value of finite-range tunneling?,
Phys. Rev. X 6 (2016) 031015. doi:10.1103/PhysRevX.6.031015.
[11] J. King, S. Yarkoni, J. Raymond, I. Ozfidan, A. D. King, M. M. Nevisi, J. P.
Hilton, C. C. McGeoch, Quantum Annealing amid Local Ruggedness and Global
Frustration, arXiv preprint arXiv:1701.04579.
48
[12] A. Biere, M. J. H. Heule, H. van Maaren, T. Walsh (Eds.), Handbook of Satisfia-
bility, IOS Press, 2009.
[13] C. M. Li, F. Manya`, MaxSAT, Hard and Soft Constraints, Ch. 19, in: Biere et al.
[12], pp. 613–631.
[14] F. Massacci, L. Marraro, Logical cryptanalysis as a sat problem, Jour-
nal of Automated Reasoning 24 (1) (2000) 165–203. doi:10.1023/A:
1006326723002.
[15] I. Mironov, L. Zhang, Applications of SAT solvers to cryptanalysis of hash func-
tions, in: A. Biere, C. P. Gomes (Eds.), Theory and Applications of Satisfiability
Testing - SAT 2006, Springer Berlin Heidelberg, Berlin, Heidelberg, 2006, pp.
102–115. doi:10.1007/11814948_13.
[16] F. Lafitte, J. N. Jr., D. V. Heule, Applications of SAT Solvers in Cryptanalysis:
Finding Weak Keys and Preimages, Journal of Satisfiability, Boolean Modeling
and Computation - JSAT 9.
[17] A. Fre´chette, N. Newman, K. Leyton-Brown, Solving the station repacking prob-
lem, in: Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence,
AAAI’16, AAAI Press, 2016, pp. 702–709.
URL http://dl.acm.org/citation.cfm?id=3015812.3015917
[18] C. W. Barrett, R. Sebastiani, S. A. Seshia, C. Tinelli, Satisfiability Modulo Theo-
ries, in: Biere et al. [12], p. 980.
[19] R. Sebastiani, S. Tomasi, Optimization modulo theories with linear rational costs,
ACM Transactions on Compututational Logics, TOCL 16 (2) (2015) 12:1–12:43.
doi:10.1145/2699915.
[20] R. Sebastiani, P. Trentin, Optimathsat: A tool for optimization modulo theories,
in: D. Kroening, C. S. Pa˘sa˘reanu (Eds.), Computer Aided Verification, Springer
International Publishing, Cham, 2015, pp. 447–454.
[21] Z. Bian, F. Chudak, W. Macready, A. Roy, R. Sebastiani, S. Varotti, Solv-
ing SAT and MaxSAT with a quantum annealer: Foundations and a prelimi-
nary report, in: C. Dixon, M. Finger (Eds.), Frontiers of Combining Systems,
Springer International Publishing, Cham, 2017, pp. 153–171. doi:10.1007/
978-3-319-66167-4_9.
[22] R. Harris, J. Johansson, A. J. Berkley, M. W. Johnson, T. Lanting, S. Han,
P. Bunyk, E. Ladizinsky, T. Oh, I. Perminov, et al., Experimental demonstra-
tion of a robust and scalable flux qubit, Physical Review B 81 (13). doi:
10.1103/physrevb.81.134510.
[23] M. W. Johnson, M. H. S. Amin, S. Gildert, T. Lanting, F. Hamze, N. Dickson,
R. Harris, A. J. Berkley, J. Johansson, P. Bunyk, E. M. Chapple, C. Enderud, J. P.
Hilton, K. Karimi, E. Ladizinsky, N. Ladizinsky, T. Oh, I. Perminov, C. Rich,
M. C. Thom, E. Tolkacheva, C. J. S. Truncik, S. Uchaikin, J. Wang, B. Wilson,
49
G. Rose, Quantum annealing with manufactured spins, Nature 473 (7346) (2011)
194–198. doi:10.1038/nature10012.
[24] T. Lanting, R. Harris, J. Johansson, M. H. S. Amin, A. J. Berkley, S. Gildert,
M. W. Johnson, P. Bunyk, E. Tolkacheva, E. Ladizinsky, N. Ladizinsky, T. Oh,
I. Perminov, E. M. Chapple, C. Enderud, C. Rich, B. Wilson, M. C. Thom,
S. Uchaikin, G. Rose, Cotunneling in pairs of coupled flux qubits, Physical Re-
view B 82 (6) (2010) 060512.
[25] M. H. Amin, Searching for quantum speedup in quasistatic quantum annealers,
Phys. Rev. A 92 (5) (2015) 052323. arXiv:1503.04216, doi:10.1103/
PhysRevA.92.052323.
[26] M. H. Amin, E. Andriyash, J. Rolfe, B. Kulchytskyy, R. Melko, Quantum boltz-
mann machine, Phys. Rev. X 8 (2018) 021050. doi:10.1103/PhysRevX.
8.021050.
[27] J. Raymond, S. Yarkoni, E. Andriyash, Global warming: Temperature estima-
tion in annealers, Frontiers in ICT 3 (2016) 23. doi:10.3389/fict.2016.
00023.
[28] J. P. Marques-Silva, I. Lynce, S. Malik, Conflict-Driven Clause Learning SAT
Solvers, Ch. 4, in: Biere et al. [12], pp. 131–153.
[29] G. S. Tseitin, On the Complexity of Derivation in Propositional Calculus,
Springer Berlin Heidelberg, Berlin, Heidelberg, 1983, pp. 466–483. doi:
10.1007/978-3-642-81955-1_28.
[30] S. A. Cook, The complexity of theorem proving procedures, in: 3rd Annual ACM
Symposium on the Theory of Computation, 1971, pp. 151–158.
[31] S. M. Majercik, Stochastic Boolean Satisfiability, Ch. 27, in: Biere et al. [12], pp.
887–925.
[32] A. Cimatti, A. Griggio, B. J. Schaafsma, R. Sebastiani, The MathSAT 5 SMT
Solver, in: Tools and Algorithms for the Construction and Analysis of Systems,
TACAS’13., Vol. 7795 of LNCS, Springer, 2013, pp. 95–109.
[33] Z. Bian, F. Chudak, R. Israel, B. Lackey, W. G. Macready, A. Roy, Discrete opti-
mization using quantum annealing on sparse ising models, Frontiers in Physics 2
(2014) 56. doi:10.3389/fphy.2014.00056.
[34] V. P. Correia, A. I. Reis, C. Porto, A. R. Brasil, Classifying n-input boolean func-
tions, in: in Proc. IWS, Citeseer, 2001.
[35] Z. Huang, L. Wang, Y. Nasikovskiy, A. Mishchenko, Fast boolean match-
ing based on npn classification, in: 2013 International Conference on Field-
Programmable Technology (FPT), 2013, pp. 310–313. doi:10.1109/FPT.
2013.6718374.
50
[36] V. Choi, Minor-embedding in adiabatic quantum computation: I. the parameter
setting problem, Quantum Information Processing 7 (5) (2008) 193–209. doi:
10.1007/s11128-008-0082-9.
[37] V. Choi, Minor-embedding in adiabatic quantum computation: Ii. minor-universal
graph design, Quantum Information Processing 10 (3) (2011) 343–353. doi:
10.1007/s11128-010-0200-3.
[38] I. Adler, F. Dorn, F. V. Fomin, I. Sau, D. M. Thilikos, Faster parameterized algo-
rithms for minor containment, in: Proceedings of the 12th Scandinavian Confer-
ence on Algorithm Theory, SWAT’10, Springer-Verlag, Berlin, Heidelberg, 2010,
pp. 322–333. doi:10.1007/978-3-642-13731-0_31.
[39] T. Boothby, A. D. King, A. Roy, Fast clique minor generation in chimera qubit
connectivity graphs, Quantum Information Processing 15 (1) (2016) 495–508.
doi:10.1007/s11128-015-1150-6.
[40] A. Zaribafiyan, D. J. J. Marchand, S. S. Changiz Rezaei, Systematic
and deterministic graph minor embedding for cartesian products of graphs,
Quantum Information Processing 16 (5) (2017) 136. doi:10.1007/
s11128-017-1569-z.
[41] J. Cai, W. G. Macready, A. Roy, A practical heuristic for finding graph minors,
arXiv preprint arXiv:1406.2741.
[42] R. Dechter, Bucket Elimination: A Unifying Framework for Probabilistic In-
ference, Springer Netherlands, Dordrecht, 1998, pp. 75–104. doi:10.1007/
978-94-011-5014-9_4.
[43] B. D. McKay, A. Piperno, Practical graph isomorphism, II, Journal of Symbolic
Computation 60 (0) (2014) 94 – 112. doi:10.1016/j.jsc.2013.09.003.
[44] Z. Bian, F. Chudak, R. B. Israel, B. Lackey, W. G. Macready, A. Roy, Mapping
constrained optimization problems to quantum annealing with application to fault
diagnosis, Frontiers in ICTdoi:10.3389/fict.2016.00014.
[45] J. Su, T. Tu, L. He, A quantum annealing approach for boolean satisfiability prob-
lem, in: 2016 53nd ACM/EDAC/IEEE Design Automation Conference (DAC),
2016, pp. 1–6. doi:10.1145/2897937.2897973.
[46] A. Mishchenko, S. Chatterjee, R. Brayton, Dag-aware aig rewriting a fresh look
at combinational logic synthesis, in: Proceedings of the 43rd Annual Design Au-
tomation Conference, DAC ’06, ACM, New York, NY, USA, 2006, pp. 532–535.
doi:10.1145/1146909.1147048.
[47] A. Mishchenko, S. Chatterjee, R. Jiang, R. K. Brayton, Fraigs: A unifying rep-
resentation for logic synthesis and verification, Tech. rep., ERL Technical Report
(2005).
51
[48] N. Een, A. Mishchenko, N. So¨rensson, Applying logic synthesis for speeding
up sat, in: J. Marques-Silva, K. A. Sakallah (Eds.), Theory and Applications of
Satisfiability Testing – SAT 2007, Springer Berlin Heidelberg, Berlin, Heidelberg,
2007, pp. 272–286. doi:10.1007/978-3-540-72788-0_26.
[49] A. Mishchenko, S. Chatterjee, R. Brayton, X. Wang, T. Kam, Technology map-
ping with boolean matching, supergates and choices.
[50] V. Betz, J. Rose, Vpr: A new packing, placement and routing tool for FPGA
research, in: International Workshop on Field Programmable Logic and Applica-
tions, Springer, 1997, pp. 213–222. doi:10.1007/3-540-63465-7_226.
[51] A. B. Kahng, J. Lienig, I. L. Markov, J. Hu, VLSI Physical Design: From Graph
Partitioning to Timing Closure, Springer Netherlands, Dordrecht, Netherlands,
2011. doi:10.1007/978-90-481-9591-6.
[52] W.-J. Sun, C. Sechen, Efficient and effective placement for very large circuits,
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Sys-
tems 14 (3) (1995) 349–359. doi:10.1109/43.365125.
[53] T. F. Chan, J. Cong, T. Kong, J. R. Shinnerl, Multilevel optimization for
large-scale circuit placement, in: IEEE/ACM International Conference on Com-
puter Aided Design. ICCAD - 2000. IEEE/ACM Digest of Technical Papers
(Cat. No.00CH37140), 2000, pp. 171–176. doi:10.1109/ICCAD.2000.
896469.
[54] J. A. Roy, D. A. Papa, S. N. Adya, H. H. Chan, A. N. Ng, J. F. Lu, I. L. Markov,
Capo: Robust and scalable open-source min-cut floorplacer, in: Proceedings of
the 2005 International Symposium on Physical Design, ISPD ’05, ACM, New
York, NY, USA, 2005, pp. 224–226. doi:10.1145/1055137.1055184.
[55] J. Byrka, F. Grandoni, T. Rothvoss, L. Sanita`, Steiner tree approximation via
iterative randomized rounding, J. ACM 60 (1) (2013) 6:1–6:33. doi:10.1145/
2432622.2432628.
[56] M. Gester, D. Mu¨ller, T. Nieberg, C. Panten, C. Schulte, J. Vygen, Bonnroute:
Algorithms and data structures for fast and good vlsi routing, ACM Trans. Des.
Autom. Electron. Syst. 18 (2) (2013) 32:1–32:24. doi:10.1145/2442087.
2442103.
[57] Y. Xu, Y. Zhang, C. Chu, Fastroute 4.0: Global router with efficient via mini-
mization, in: Proceedings of the 2009 Asia and South Pacific Design Automation
Conference, ASP-DAC ’09, IEEE Press, Piscataway, NJ, USA, 2009, pp. 576–
581.
URL http://dl.acm.org/citation.cfm?id=1509633.1509768
[58] J. A. Roy, I. L. Markov, High-performance routing at the nanometer scale, IEEE
Transactions on Computer-Aided Design of Integrated Circuits and Systems
27 (6) (2008) 1066–1077. doi:10.1109/TCAD.2008.923255.
52
[59] H. Chen, C. Hsu, Y. Chang, High-performance global routing with fast overflow
reduction, in: 2009 Asia and South Pacific Design Automation Conference, 2009,
pp. 582–587. doi:10.1109/ASPDAC.2009.4796543.
[60] M. Cho, K. Lu, K. Yuan, D. Z. Pan, Boxrouter 2.0: architecture and imple-
mentation of a hybrid and robust global router, in: 2007 IEEE/ACM Inter-
national Conference on Computer-Aided Design, 2007, pp. 503–508. doi:
10.1109/ICCAD.2007.4397314.
[61] Y. J. Chang, Y. T. Lee, T. C. Wang, Nthu-route 2.0: A fast and stable global router,
in: 2008 IEEE/ACM International Conference on Computer-Aided Design, 2008,
pp. 338–343. doi:10.1109/ICCAD.2008.4681595.
[62] J. Byrka, F. Grandoni, T. Rothvoß, L. Sanita`, An improved LP-based approxima-
tion for steiner tree, in: Proceedings of the 42nd ACM Symposium on Theory of
Computing, STOC 2010, Cambridge, Massachusetts, USA, 5-8 June 2010, 2010,
pp. 583–592. doi:10.1145/1806689.1806769.
[63] P. Klein, R. Ravi, A nearly best-possible approximation algorithm for node-
weighted steiner trees, Journal of Algorithms 19 (1) (1995) 104 – 115. doi:
10.1006/jagm.1995.1029.
[64] V. V. Vazirani, Approximation Algorithms, Springer-Verlag, Berlin, Germany,
2001.
[65] M. Gester, D. Mu¨ller, T. Nieberg, C. Panten, C. Schulte, J. Vygen, Bonnroute: Al-
gorithms and data structures for fast and good VLSI routing, ACM Trans. Design
Autom. Electr. Syst. 18 (2) (2013) 32. doi:10.1145/2442087.2442103.
[66] D. Mu¨ller, K. Radke, J. Vygen, Faster min–max resource sharing in theory and
practice, Mathematical Programming Computation 3 (1) (2011) 1–35. doi:10.
1007/s12532-011-0023-y.
[67] A. Lucas, Ising formulations of many NP problems, Frontiers in Physics 2 (2014)
5. doi:10.3389/fphy.2014.00005.
[68] N. Chancellor, S. Zohren, P. A. Warburton, S. C. Benjamin, S. Roberts, A direct
mapping of max k-sat and high order parity checks to a chimera graph., Scientific
reports. 6 (2016) 37107.
URL http://dro.dur.ac.uk/20497/
[69] S. Pakin, A quantum macro assembler, in: 2016 IEEE High Performance Ex-
treme Computing Conference, HPEC 2016, Waltham, MA, USA, September 13-
15, 2016, 2016, pp. 1–8. doi:10.1109/HPEC.2016.7761637.
[70] S. Pakin, Performing fully parallel constraint logic programming on a quantum
annealer, Theory and Practice of Logic Programming 18 (5-6) (2018) 928–949.
doi:10.1017/S1471068418000066.
53
[71] D. Venturelli, D. J. J. Marchand, G. Rojo, Quantum Annealing Implementation
of Job-Shop Scheduling, arXiv preprintarXiv:1506.08479.
[72] G. Rosenberg, P. Haghnegahdar, P. Goddard, P. Carr, K. Wu, M. L. de Prado,
Solving the optimal trading trajectory problem using a quantum annealer, in:
Proceedings of the 8th Workshop on High Performance Computational Finance,
WHPCF ’15, ACM, New York, NY, USA, 2015, pp. 7:1–7:7. doi:10.1145/
2830556.2830563.
[73] R. Dridi, H. Alghassi, Prime factorization using quantum annealing and com-
putational algebraic geometry, Scientific Reports 7 (1). doi:10.1038/
srep43048.
[74] A. Perdomo-Ortiz, J. Fluegemann, S. Narasimhan, R. Biswas, V. Smelyanskiy,
A quantum annealing approach for fault detection and diagnosis of graph-based
systems, The European Physical Journal Special Topics 224 (1) (2015) 131–148.
doi:10.1140/epjst/e2015-02347-y.
[75] E. G. Rieffel, D. Venturelli, B. O’Gorman, M. B. Do, E. M. Prystay, V. N.
Smelyanskiy, A case study in programming a quantum annealer for hard oper-
ational planning problems, Quantum Information Processing 14 (1) (2015) 1–36.
doi:10.1007/s11128-014-0892-x.
[76] B. O’Gorman, E. G. Rieffel, M. Do, D. Venturelli, J. Frank, Compar-
ing planning problem compilation approaches for quantum annealing, The
Knowledge Engineering Review 31 (5) (2016) 465–474. doi:10.1017/
S0269888916000278.
[77] K. M. Zick, O. Shehab, M. French, Experimental quantum annealing: case study
involving the graph isomorphism problem, Scientific Reports 5, 11168.
URL http://www.nature.com/srep/2015/150608/srep11168/
full/srep11168.html
[78] Z. Bian, F. Chudak, W. G. Macready, L. Clark, F. Gaitan, Experimental de-
termination of Ramsey numbers, Phys. Rev. Lett. 111 (2013) 130505. doi:
10.1103/PhysRevLett.111.130505.
[79] S. Jiang, K. A. Britt, A. J. McCaskey, T. S. Humble, S. Kais, Quantum Annealing
for Prime Factorization, arXiv preprintarXiv:1804.02733.
[80] I. Trummer, C. Koch, Multiple query optimization on the d-wave 2x adia-
batic quantum computer, Proc. VLDB Endow. 9 (9) (2016) 648–659. doi:
10.14778/2947618.2947621.
[81] E. Andriyash, Z. Bian, F. Chudak, M. Drew-Brook, A. D. King, W. G. Macready,
A. Roy, Boosting integer factoring performance via quantum annealing offsets.
URL https://www.dwavesys.com/sites/default/files/
14-1002A_B_tr_Boosting_integer_factorization_via_
quantum_annealing_offsets.pdf
54
[82] C. C. McGeoch, C. Wang, Experimental evaluation of an adiabiatic quantum sys-
tem for combinatorial optimization, in: Proceedings of the ACM International
Conference on Computing Frontiers, CF ’13, ACM, New York, NY, USA, 2013,
pp. 23:1–23:11. doi:10.1145/2482767.2482797.
[83] S. Santra, G. Quiroz, G. V. Steeg, D. A. Lidar, Max 2-sat with up to 108 qubits,
New Journal of Physics 16 (4) (2014) 045006.
URL http://stacks.iop.org/1367-2630/16/i=4/a=045006
[84] A. Douglass, A. D. King, J. Raymond, Constructing SAT Filters with a Quantum
Annealer, Springer International Publishing, Cham, 2015, pp. 104–120. doi:
10.1007/978-3-319-24318-4_9.
[85] K. L. Pudenz, G. S. Tallant, T. R. Belote, S. H. Adachi, Quantum Annealing and
the Satisfiability Problem, ArXiv e-printsarXiv:1612.07258.
[86] E. Farhi, D. Gosset, I. Hen, A. W. Sandvik, P. Shor, A. P. Young, F. Zamponi,
Performance of the quantum adiabatic algorithm on random instances of two
optimization problems on regular hypergraphs, Phys. Rev. A 86 (2012) 052334.
doi:10.1103/PhysRevA.86.052334.
URL https://link.aps.org/doi/10.1103/PhysRevA.86.
052334
[87] I. Hen, A. P. Young, Exponential complexity of the quantum adiabatic al-
gorithm for certain satisfiability problems, Phys. Rev. E 84 (2011) 061152.
doi:10.1103/PhysRevE.84.061152.
URL https://link.aps.org/doi/10.1103/PhysRevE.84.
061152
[88] V. Choi, Different adiabatic quantum optimization algorithms for the np-complete
exact cover and 3sat problems, Quantum Info. Comput. 11 (7-8) (2011) 638–648.
URL http://dl.acm.org/citation.cfm?id=2230916.2230923
[89] A. D. King, T. Lanting, R. Harris, Performance of a quantum annealer on range-
limited constraint satisfaction problems, arXiv preprintarXiv:1502.02098.
[90] R. Brayton, A. Mishchenko, Abc: An academic industrial-strength verification
tool, in: International Conference on Computer Aided Verification, Springer,
2010, pp. 24–40.
[91] I. Spence, Sgen1: A generator of small but difficult satisfiability benchmarks,
J. Exp. Algorithmics 15 (2010) 1.2:1.1–1.2:1.15. doi:10.1145/1671970.
1671972.
[92] A. V. Gelder, I. Spence, Zero-one designs produce small hard SAT instances, in:
O. Strichman, S. Szeider (Eds.), Theory and Applications of Satisfiability Testing
– SAT 2010, Lecture Notes in Computer Science, Springer Berlin Heidelberg, pp.
388–397. doi:10.1007/978-3-642-14186-7\_37.
55
[93] I. Spence, Weakening cardinality constraints creates harder satisfiability bench-
marks, J. Exp. Algorithmics 20 (2015) 1.4:1–1.4:14. doi:10.1145/
2746239.
[94] C. M. Li, W. Q. Huang, Diversification and determinism in local search for sat-
isfiability, in: F. Bacchus, T. Walsh (Eds.), Theory and Applications of Satisfia-
bility Testing: 8th International Conference, SAT 2005, St Andrews, UK, June
19-23, 2005. Proceedings, Springer Berlin Heidelberg, Berlin, Heidelberg, 2005,
pp. 158–172. doi:10.1007/11499107_12.
[95] K. Smyth, H. H. Hoos, T. Stu¨tzle, Iterated robust tabu search for max-sat, in:
Proceedings of the 16th Canadian Society for Computational Studies of Intelli-
gence Conference on Advances in Artificial Intelligence, AI’03, Springer-Verlag,
Berlin, Heidelberg, 2003, pp. 129–144.
URL http://dl.acm.org/citation.cfm?id=1760335.1760351
[96] D. McAllester, B. Selman, H. Kautz, Evidence for invariants in local search,
in: Proceedings of the Fourteenth National Conference on Artificial Intelli-
gence and Ninth Conference on Innovative Applications of Artificial Intelligence,
AAAI’97/IAAI’97, AAAI Press, 1997, pp. 321–326.
URL http://dl.acm.org/citation.cfm?id=1867406.1867456
56
