Systematic and Deterministic Graph-Minor Embedding for Cartesian
  Products of Graphs by Zaribafiyan, Arman et al.
Systematic and Deterministic Graph Minor
Embedding for Cartesian Products of Graphs
Arman Zaribafiyan · Dominic J. J. Marchand ·
Seyed Saeed Changiz Rezaei
Abstract The limited connectivity of current and next-generation quantum an-
nealers motivates the need for efficient graph minor embedding methods. These
methods allow non-native problems to be adapted to the target annealer’s archi-
tecture. The overhead of the widely used heuristic techniques is quickly proving to
be a significant bottleneck for solving real-world applications. To alleviate this dif-
ficulty, we propose a systematic and deterministic embedding method, exploiting
the structures of both the specific problem and the quantum annealer. We focus
on the specific case of the Cartesian product of two complete graphs, a regular
structure that occurs in many problems. We decompose the embedding problem
by first embedding one of the factors of the Cartesian product in a repeatable pat-
tern. The resulting simplified problem consists in the placement and connecting
together of these copies to reach a valid solution. Aside from the obvious advantage
of a systematic and deterministic approach with respect to speed and efficiency,
the embeddings produced are easily scaled for larger processors and show desirable
properties for the number of qubits used and the chain length distribution. We
conclude by briefly addressing the problem of circumventing inoperable qubits by
presenting possible extensions of our method.
Keywords graph minor embedding, Cartesian product, quantum annealing
1 Introduction
The majority of the interesting combinatorial optimization problems are hard to
solve. Graph similarity, graph partitioning, graph colouring, resource allocation,
and scheduling problems are among those combinatorial optimization problems
proven to be NP-hard [6,13,17]. Many of these problems have significant real-world
1QB Information Technologies (1QBit)
458-550 Burrard Street, Vancouver, BC, V6C 2B5, Canada
Tel.: +1.646.820.8865
E-mail: arman.zaribafiyan@1qbit.com
E-mail: dominic.marchand@1qbit.com
E-mail: seyed.rezaei@1qbit.com
arXiv:1602.04274v2 [cs.DM] 9 Jul 2016
applications which make them especially interesting. For example, determining
similarities between graphs is a challenging problem that occurs when compar-
ing the structures of different molecules and is thus of great importance in drug
design applications [7]. Many of these problems can be formulated as quadratic
unconstrained binary optimization (QUBO) problems, which can be solved on
specialized quadratic solvers.
One type of quadratic solver which has garnered considerable attention in the
past few years is the quantum annealing processor manufactured by D-Wave Sys-
tems [9,10]. In essence, the processor is a specialized quantum device that samples
low-energy configurations of a set of Ising spin variables $s \in \{-1,+1\}^n$ that serve
as quantum registers or qubits. It is designed to solve a binary input problem
formulated as an Ising Hamiltonian specified by a pair $(h, J)$, where $h \in \mathbb{R}^n$ is a
vector of local fields acting on the spin variables, and $J \in \mathbb{R}^{n \times n}$ is a symmetric
matrix of quadratic couplings between these spins. The objective function to be
minimized is specified by the energy E(s) of the spin configuration s and given by
$$E(s) = E(s, h, J) = s^T J s + s^T h. \tag{1}$$
We note that an Ising problem can be formulated as a QUBO problem through
a simple linear transformation. Therefore, the quantum annealer can equivalently
be regarded as a quadratic unconstrained binary optimizer that minimizes a
quadratic objective function, defined by a matrix Q and given by
$$E(x) = E(x, Q) = x^T Q x \tag{2}$$
over the discrete configuration space of a set of qubits $x \in \{0, 1\}^n$. The solver
has limited connectivity between its qubits such that not all pairs can be coupled
together. In other words, only a subset of the terms of J or Q are allowed to
be non-zero. For this reason, the structure of the problem to be solved must be
mapped to the architecture of the solver, a process called embedding [5].
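The linear transformation between the two formulations can be illustrated with a minimal Python sketch. It assumes the substitution s = 2x − 1 and keeps track of a constant offset, which Eq. (2) omits because it does not change the minimizer. The function names and the 3-spin instance are hypothetical and serve only as an illustration.

```python
from itertools import product


def ising_energy(s, h, J):
    """E(s) = s^T J s + s^T h, as in Eq. (1)."""
    n = len(s)
    quad = sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(n))
    return quad + sum(h[i] * s[i] for i in range(n))


def ising_to_qubo(h, J):
    """Substitute s = 2x - 1 and collect terms, using x_i^2 = x_i.

    Returns (Q, offset) such that E(s) = x^T Q x + offset.
    """
    n = len(h)
    Q = [[4 * J[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        # Linear terms are folded into the diagonal of Q.
        Q[i][i] += 2 * h[i] - 2 * sum(J[i][j] + J[j][i] for j in range(n))
    offset = sum(sum(row) for row in J) - sum(h)
    return Q, offset


def qubo_energy(x, Q):
    """E(x) = x^T Q x, as in Eq. (2)."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# Hypothetical 3-spin instance used only for illustration.
h = [1, -2, 0]
J = [[0, 1, -1],
     [1, 0, 2],
     [-1, 2, 0]]
Q, offset = ising_to_qubo(h, J)

# The two energies agree on every configuration once the offset is added.
for x in product((0, 1), repeat=3):
    s = [2 * xi - 1 for xi in x]
    assert ising_energy(s, h, J) == qubo_energy(x, Q) + offset
```

Because the map only rescales the couplings and shifts the diagonal, the set of non-zero off-diagonal entries is the same for J and Q, so both formulations lead to the same problem graph.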
In this paper, we will treat both the input problem and the solver as graphs.
An input problem of interest, formulated as either a QUBO or Ising problem, can
be represented as a graph G = (V,E), where V is a set of vertices representing
either the logical variables or physical qubits, and E is a set of edges representing
the interactions between them. For the case of an Ising problem, the vertices of
this graph are the variables $s_1, \ldots, s_n$, while the set of edges is created by adding
one edge for each pair of vertices $s_i$ and $s_j$ for which $J_{ij}$ is non-zero. On the
other hand, the processor’s architecture is described by the hardware graph C.
This graph represents the available physical qubits or registers and shows how
they are coupled together on the processor. The earlier D-Wave Two hardware
graph (see Figure 6) has 512 physical qubits and each qubit is adjacent to at most
6 others. D-Wave’s nearly regular hardware graph, the Chimera graph, is denoted
by $C_{N,M,L}$ and constructed as an $N \times M$ grid of $K_{L,L}$ bipartite blocks, as defined
in [3].
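To make the structure of the hardware graph concrete, the following Python sketch builds a Chimera-style graph and checks the qubit count and maximum degree quoted above. The node labelling (row, column, side, index) and the rule that one side of each block couples vertically and the other horizontally are assumptions made for this illustration; the authoritative definition is the one in [3].

```python
def chimera_graph(n_rows, n_cols, L):
    """Return (nodes, edges) of a Chimera-style graph C_{n_rows, n_cols, L}.

    Assumed labelling: a qubit is (row, col, side, k). Side 0 qubits couple
    to the cells above and below, side 1 qubits to the cells left and right.
    """
    nodes = [(r, c, u, k)
             for r in range(n_rows) for c in range(n_cols)
             for u in (0, 1) for k in range(L)]
    edges = set()
    for r in range(n_rows):
        for c in range(n_cols):
            # K_{L,L} bipartite block inside each unit cell.
            for k0 in range(L):
                for k1 in range(L):
                    edges.add(((r, c, 0, k0), (r, c, 1, k1)))
            for k in range(L):
                if r + 1 < n_rows:  # vertical inter-cell coupler
                    edges.add(((r, c, 0, k), (r + 1, c, 0, k)))
                if c + 1 < n_cols:  # horizontal inter-cell coupler
                    edges.add(((r, c, 1, k), (r, c + 1, 1, k)))
    return nodes, edges


def max_degree(nodes, edges):
    """Largest number of couplers attached to any single qubit."""
    deg = {v: 0 for v in nodes}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return max(deg.values())


nodes, edges = chimera_graph(8, 8, 4)
print(len(nodes), max_degree(nodes, edges))  # 512 6
```

An interior qubit has L couplers inside its block and two inter-cell couplers, which gives the degree bound of 6 for L = 4.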
For the desired Ising model to be mapped onto the processor as is, the graph G
must be a subgraph of the hardware graph C. Such a one-to-one mapping of the input graph to the target
graph is called a direct embedding. Seeking a direct embedding places stringent
constraints on the size and connectivity of the input graph. Alternatively, we can
seek a graph minor embedding, which is a specific type of mapping where we further
allow adjacent vertices of the target graph to be contracted into larger effective
vertices, called chains. In this more general case, the graph G should be a subgraph
of a graph minor of C. For a detailed description of graph minor embedding, see [5].
In simple terms, a chain is created through the addition of strong penalty terms to
the objective function such that the variables involved are forced to take the same
value. In the Ising formulation, this is achieved by applying a strong ferromagnetic
coupling between any two adjacent vertices i and j of C in the same chain.
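The conditions that a graph minor embedding must satisfy can be checked mechanically. The sketch below is an illustration only, with hypothetical names: it assumes an embedding is given as a dictionary mapping each logical vertex to a Python set of target vertices (its chain), and it verifies that chains are disjoint, that each chain is connected in the target graph, and that every edge of the input graph is realized by at least one coupler between the two chains.

```python
def is_valid_minor_embedding(source_edges, target_edges, chains):
    """Check that `chains` (logical vertex -> set of target vertices)
    realises every source edge on the target graph."""
    adj = {}
    for a, b in target_edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    # 1. Chains must be non-empty and pairwise disjoint.
    used = set()
    for chain in chains.values():
        if not chain or used & chain:
            return False
        used |= chain

    # 2. Each chain must induce a connected subgraph of the target.
    for chain in chains.values():
        start = next(iter(chain))
        seen, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adj.get(v, ()):
                if w in chain and w not in seen:
                    seen.add(w)
                    stack.append(w)
        if seen != chain:
            return False

    # 3. Each source edge needs at least one coupler between its chains.
    for u, v in source_edges:
        if not any(w in chains[v] for q in chains[u] for w in adj.get(q, ())):
            return False
    return True


# K3 is not a subgraph of a 4-cycle, but it is a minor of it.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
print(is_valid_minor_embedding(triangle, cycle, {"a": {0}, "b": {1}, "c": {2, 3}}))  # True
```

In this toy case no direct embedding exists, yet contracting the two target vertices 2 and 3 into one chain yields a valid graph minor embedding.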
In the most general case, where no assumption is made about the input and
target graphs, seeking a graph minor embedding is an NP-hard problem [4]. Prac-
tically, this means that as the size of the graphs increases, the problem of finding
a valid embedding quickly becomes prohibitively computationally expensive. To
avoid having to solve an NP-hard problem with each embedding, we could use
the fact that the structure of the solver is usually known in advance (here, it is a
Chimera graph structure possibly less a few inoperable qubits and couplers). This
means that polynomial-time solutions to the embedding problem remain achievable.
While such an exact method exists, its poor scaling still renders it unusable for
graphs larger than 10 vertices [1]. As a result, the most widely used embedding
algorithms, such as the one introduced by Cai, Macready, and Roy [4], are heuris-
tic in nature, compromising on embedding quality in order to achieve polynomial
running times with a more favourable scaling. Even then, finding an embedding is
typically very time consuming. This is further exacerbated by other limitations of
the analogue quantum device which can lead to highly variable performance, de-
pending on the quality of the embedding, prompting the need to run the heuristic
multiple times in order to select the best solution. Although less than ideal, heuris-
tic solutions have proven to be mostly satisfactory for the exploratory work con-
ducted on previous-generation quantum annealers, provided that sufficient com-
putation time could be allocated for embedding. With the recent introduction of
a 1000-qubit annealer, however, we are quickly reaching the point where more-
scalable solutions are needed. In fact, quantum annealing making the leap from
a nascent technology of purely academic interest to a useful mainstream tool is
conditional on the availability of fast embedding methods that will not eradicate
any potential quantum speed-up.
The most promising next-generation embedding methods should be systematic
and scalable. It is unlikely that such properties will be attained for truly general
approaches, and advances will come by exploiting not only the structure of the
target architecture, but also the structure of specific problems. We believe that so
long as embedding is needed, although general approaches are useful, it is with
application-specific and systematic graph embedding approaches that the full po-
tential of quantum hardware will be realized. The path to better or faster embed-
ding algorithms, therefore, lies in restricting the graph minor embedding problem
to specific cases. The triangular embedding of complete graphs [5], later general-
ized by Boothby, King, and Roy’s approach of fast clique embedding for complete
graphs [3], epitomizes this application-specific approach and creates a systematic
embedding for fully connected problems on the Chimera graph architecture. The
embeddings produced have equal-length chains and are general because any graph
is a subgraph of a complete graph. Unfortunately, this approach is wasteful for ap-
plications that do not require a fully connected graph, limiting the size of problems
embeddable with this method.
The first step in devising new embedding methods is to identify a common
structure across many problems that can be exploited advantageously. As we will
show below, a recurring graph structure which appears in the quadratic formu-
lation of many of the NP-hard optimization problems mentioned above is the
Cartesian product of two graphs. Cartesian products, being both ubiquitous and
highly structured, are attractive targets for the type of improved methods we are
advocating. One of the main contributions of this research is the analytical iden-
tification of this regularity and structure in the QUBO problem formulation of
important families of NP-hard optimization problems. One of the most important
advantages of this contribution is that it enables us to reuse the found embeddings
for problems with similar structures. The ability to reuse embeddings reduces the
computational complexity of the embedding process to a one-time cost per family
of problems.
In this paper, we describe a procedure for embedding a Cartesian product of
two graphs into a Chimera graph. The vertex set of the Cartesian product $G_1 \square G_2$
of two graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ is the Cartesian product of the
vertex sets of the individual graphs. In the resulting graph, two vertices $(v_1, v_2)$
and $(u_1, u_2)$ are adjacent if and only if either $v_1 = u_1$ and $v_2$ is adjacent to $u_2$, or $v_2 = u_2$
and $v_1$ is adjacent to $u_1$. Denoting the adjacency matrices of graphs $G_1$ and $G_2$ by
$A_{G_1}$ and $A_{G_2}$, and having $n_1 := |V_1|$ and $n_2 := |V_2|$, we can compute the adjacency
matrix of $G_1 \square G_2$, that is, $A_{G_1 \square G_2}$, in terms of the adjacency matrices of $G_1$ and
$G_2$ as follows:
$$A_{G_1 \square G_2} = I_{n_1} \otimes A_{G_2} + A_{G_1} \otimes I_{n_2} \tag{3}$$
For the sake of generality with respect to embedding, for the remainder of this
paper, we look into the Cartesian product of complete graphs.
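Equation (3) can be checked numerically on a small case. The following Python sketch, written with plain lists so that it has no dependencies, builds the adjacency matrix of a Cartesian product both from the Kronecker-product formula and directly from the adjacency rule. The helper names and the vertex ordering (v1, v2) -> v1 * n2 + v2 are assumptions made for this illustration.

```python
def kron(A, B):
    """Kronecker product of two square matrices given as lists of lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def complete_graph(n):
    """Adjacency matrix of K_n."""
    return [[int(i != j) for j in range(n)] for i in range(n)]


def cartesian_product(A1, A2):
    """Adjacency matrix of the Cartesian product, following Eq. (3)."""
    n1, n2 = len(A1), len(A2)
    first = kron(identity(n1), A2)
    second = kron(A1, identity(n2))
    return [[p + q for p, q in zip(row_p, row_q)]
            for row_p, row_q in zip(first, second)]


def product_by_definition(A1, A2):
    """Same matrix built from the adjacency rule; (v1, v2) -> v1 * n2 + v2."""
    n1, n2 = len(A1), len(A2)
    A = [[0] * (n1 * n2) for _ in range(n1 * n2)]
    for v1 in range(n1):
        for v2 in range(n2):
            for u1 in range(n1):
                for u2 in range(n2):
                    if (v1 == u1 and A2[v2][u2]) or (v2 == u2 and A1[v1][u1]):
                        A[v1 * n2 + v2][u1 * n2 + u2] = 1
    return A


A = cartesian_product(complete_graph(3), complete_graph(4))
assert A == product_by_definition(complete_graph(3), complete_graph(4))
print({sum(row) for row in A})  # {5}: every vertex has degree (3 - 1) + (4 - 1)
```

For complete factors, each vertex is adjacent to the m − 1 vertices sharing its second index and the n − 1 vertices sharing its first index, which is why the product is far sparser than the complete graph on mn vertices.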
2 Identifying the Cartesian product of complete graphs (CPCG)
The Cartesian product of graphs can appear in many application-driven problems.
To exploit this structure, however, we first need to either infer its presence from the
problem’s QUBO form or preserve it as we formulate the problem from the outset.
An alternative structure-preserving QUBO problem formulation can be found in
[2]. Very efficient algorithms for identifying Cartesian products in arbitrary graphs
have been proposed. For example, Imrich and Peterin proposed an exact algorithm
[8] with linear scaling in terms of the number of edges for both the running time
and memory requirement by using a clever edge-labelling technique. Nevertheless,
it is useful to look at how Cartesian products occur when formulating optimization
problems where doubly indexed binary variables are used. This is what we consider
below.
Suppose we have a QUBO problem where the variables are doubly indexed
binary variables $x_{ik}$, where 1 ≤ i ≤ N and 1 ≤ k ≤ K. Such a structure occurs,
for example, in the K-way graph partitioning problem. We formulate this graph
partitioning problem as follows. Given a graph G = (V,E) with N vertices, we
want to divide the vertex set into K partitions, where K is a positive integer, such
that the number of edges inside the partitions is maximized. Let A be the adjacency
matrix of the graph G built from the edge set E. For every vertex i and partition
k, the optimization variable $x_{ik}$ is 1 if vertex i is in partition k, and 0 otherwise.
Furthermore, without loss of generality, we assume that N is divisible by K, and
we let P = N/K. The objective is to find the assignment of vertices to partitions
(i.e., a 0-1 configuration of the $x_{ik}$) that maximizes the number of intra-partition
edges and satisfies the following constraints:
1. Orthogonality constraint: each vertex must be assigned to one and only
one partition
2. Cardinality constraint: each partition must have the same number of ver-
tices assigned to it
We note that for the case where N is not divisible by K, the second constraint is
relaxed such that the size of each partition should not differ from all others by more
than one vertex. This problem can be formulated as the following optimization
problem:
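$$
\begin{aligned}
\max_{x = \{x_{ik}\}} \quad & \sum_{k=1}^{K} \sum_{(i_1, i_2) \in E} x_{i_1 k}\, x_{i_2 k} \\
\text{s.t.:} \quad & \sum_{i=1}^{N} x_{ik} = P, \quad \forall\, 1 \le k \le K \\
& \sum_{k=1}^{K} x_{ik} = 1, \quad \forall\, 1 \le i \le N \\
& x_{ik} \in \{0, 1\}, \quad \forall\, i, k
\end{aligned}
\tag{4}
$$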
[Figure 1 image: matrix of the variables $x_{ik}$, with columns labelled node 1 to node N and rows labelled part 1 to part K; groups of variables are marked as orthogonality constraints and cardinality constraints.]
Fig. 1 A matrix representation of doubly indexed variables $x_{ik}$ in an example of a K-way
partitioning problem on a graph with N nodes. The matrix representation illustrates how
subsets of variables contribute to specific orthogonality and cardinality constraints.
In order to formulate this problem as a QUBO problem appropriate for the
annealer, we rewrite the objective function to be maximized as an
objective function to be minimized, and implement the equality constraints as
quadratic penalty terms. The resulting QUBO problem is equivalent to the previ-
ous constrained optimization problem for appropriately chosen penalty constants
A and B:
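$$
\begin{aligned}
\min_{x = \{x_{ik}\}} \Bigg[ & -\sum_{k=1}^{K} \sum_{(i_1, i_2) \in E} x_{i_1 k}\, x_{i_2 k}
 + A \sum_{k=1}^{K} \left( \sum_{i=1}^{N} x_{ik} - P \right)^{2} \\
& + B \sum_{i=1}^{N} \left( \sum_{k=1}^{K} x_{ik} - 1 \right)^{2} \Bigg]
\end{aligned}
\tag{5}
$$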
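The equivalence between Eq. (5) and the constrained problem of Eq. (4) can be verified by brute force on a tiny instance. The Python sketch below is only an illustration: the 4-vertex path graph is hypothetical, and the choice A = B = |E| + 1 is an assumption that is sufficient here because a single violated constraint then costs more than the largest possible reward from intra-partition edges.

```python
from itertools import product


def qubo_energy(x, edges, N, K, A, B):
    """Objective of Eq. (5); x[i][k] = 1 if vertex i is in partition k."""
    P = N // K
    intra = sum(x[i1][k] * x[i2][k] for k in range(K) for i1, i2 in edges)
    cardinality = sum((sum(x[i][k] for i in range(N)) - P) ** 2 for k in range(K))
    orthogonality = sum((sum(x[i][k] for k in range(K)) - 1) ** 2 for i in range(N))
    return -intra + A * cardinality + B * orthogonality


def brute_force_minimum(edges, N, K, A, B):
    """Enumerate all 2^(N*K) assignments and return the first minimizer found."""
    best_energy, best_x = None, None
    for bits in product((0, 1), repeat=N * K):
        x = [bits[i * K:(i + 1) * K] for i in range(N)]
        e = qubo_energy(x, edges, N, K, A, B)
        if best_energy is None or e < best_energy:
            best_energy, best_x = e, x
    return best_energy, best_x


# Hypothetical instance: a path 0-1-2-3 split into K = 2 parts of size 2.
edges = [(0, 1), (1, 2), (2, 3)]
N, K = 4, 2
A = B = len(edges) + 1  # assumed penalty weights, see the note above
energy, x = brute_force_minimum(edges, N, K, A, B)
print(energy)  # -2: parts {0, 1} and {2, 3} keep two edges inside
```

The minimizer satisfies both constraints with zero penalty, so its energy is simply the negated number of intra-partition edges.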
The Cartesian product’s structure is easily observed by constructing the QUBO
problem graph for this partitioning problem. It can be built by reading the above
QUBO objective function directly. The QUBO problem graph has a vertex for
each doubly indexed binary variable, and an edge for each quadratic term of the
objective function. The following quadratic terms are found:
1. For a fixed k, the first summation creates a quadratic term $x_{i_1 k} x_{i_2 k}$ if $(i_1, i_2) \in E$.
2. For a fixed k, the second summation creates a quadratic term $x_{i_1 k} x_{i_2 k}$ for all
$1 \le i_1 < i_2 \le N$ (after expanding the square of the sum).
3. For a fixed i, the third summation creates a quadratic term $x_{ik} x_{ik'}$ for all
$1 \le k < k' \le K$ (after expanding the square of the sum).
This correspondence between the quadratic terms and the edges of the QUBO
problem graph implies that any subset of vertices corresponding to a fixed i or a
fixed k induces a complete graph in the problem graph, and that no edge joins two
variables that differ in both indices. From this, we
conclude that the resulting QUBO problem graph is the Cartesian product of two
complete graphs, $K_N \square K_K$. Figure 1 shows how grouping terms for a fixed partition
k and for a fixed vertex i can assist in identifying the structure in the final QUBO
formulation.
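This conclusion can be reproduced by collecting the quadratic terms programmatically. The following Python sketch, with hypothetical function names and a hypothetical ring graph as input, lists the edges generated by the three kinds of quadratic terms and compares them with the edge set of the Cartesian product of two complete graphs, in which two vertices are adjacent exactly when they differ in one index.

```python
from itertools import combinations


def qubo_graph_edges(N, K, graph_edges):
    """Edges of the QUBO problem graph of Eq. (5) on variables (i, k)."""
    edges = set()
    for k in range(K):
        # Term 1: objective, one product per edge of G inside partition k.
        for i1, i2 in graph_edges:
            edges.add(frozenset({(i1, k), (i2, k)}))
        # Term 2: expanded cardinality penalty, all vertex pairs for fixed k.
        for i1, i2 in combinations(range(N), 2):
            edges.add(frozenset({(i1, k), (i2, k)}))
    for i in range(N):
        # Term 3: expanded orthogonality penalty, all partition pairs for fixed i.
        for k1, k2 in combinations(range(K), 2):
            edges.add(frozenset({(i, k1), (i, k2)}))
    return edges


def complete_product_edges(N, K):
    """Edges of K_N x K_K (Cartesian product): exactly one index differs."""
    vertices = [(i, k) for i in range(N) for k in range(K)]
    return {frozenset({u, v}) for u, v in combinations(vertices, 2)
            if (u[0] == v[0]) != (u[1] == v[1])}


N, K = 6, 3
ring = [(i, (i + 1) % N) for i in range(N)]  # hypothetical input graph
assert qubo_graph_edges(N, K, ring) == complete_product_edges(N, K)
print(len(complete_product_edges(N, K)))  # 63 = N*K*(N + K - 2)/2
```

The edges of the input graph G add nothing beyond what the cardinality penalty already creates, which is why the problem graph does not depend on G and a single embedding can be reused for every instance with the same N and K.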
A similar argument can be used for any other input problem with doubly in-
dexed variables to identify whether there exists a product graph structure in the
resulting QUBO problem graph. In general, an input problem with doubly indexed
variables where the objective function and constraints are defined on subsets of
variables where one index is fixed will end up with QUBO problem graphs which
are subgraphs of Cartesian products of complete graphs. Graph partitioning, graph
colouring, and size-constrained clustering are important examples of such prob-
lems. In addition to these problems, Cartesian product structures have applications
in error-correction for adiabatic quantum computation. Recent research has shown
that using the Cartesian product of graphs as an error-correcting scheme reduces
the time to solution for certain families of problems [19].
3 Description of CPCG Embedding
We have mentioned that a systematic embedding relies in part on the regularity
of the target graph’s architecture. Our method is general and can be adapted
to different architectures provided they can be described as a regular lattice of
unit cells. Nevertheless, it will be convenient to restrict the presentation of our
method to a specific case. The Chimera hardware graph is the obvious choice, as
[Figure 2 image: panel (a), a possible nexus choice for a $K_8$ graph, with chains labelled A1 to A8; panel (b), an embedding for $K_8 \square K_7$.]
Fig. 2 (a) A possible nexus choice for a $K_8$ graph on an ideal Chimera chip with L = 4 using
the triangular embedding method. (b) A valid CPCG embedding for the product $K_8 \square K_7$ using
the nexus shown in (a). The embedding is shown for a Chimera target graph $C_{8,8,4}$.
it describes the architecture of the only commercially available quantum annealer.
The D-Wave Two processor uses a 512-qubit Chimera graph $C_{8,8,4}$, and the newer
D-Wave 2X uses a 1152-qubit Chimera graph $C_{12,12,4}$.
We consider the Cartesian product of two complete graphs with sizes m and n,
that is, $K_m \square K_n$, as the input graph. It is noteworthy that $K_m \square K_n$ has n distinct
copies of $K_m$ as well as m distinct copies of $K_n$ as induced subgraphs. We propose
to first embed one copy of either Km or Kn, say Km, into a repeatable unit which
we call a nexus. More precisely, a nexus consists of a collection of adjacent unit
cells of the Chimera graph. The regularity of the grid architecture then allows for
the embedding of n copies of Km by simply placing one nexus instance on the grid
for each of them. We are left with the problem of choosing the exact placement of
these instances and connecting them together to realize the full Cartesian prod-
uct. We call these inter-nexus connections buses and their arrangement the bus
configuration. We have thus not only chosen a simpler high-level description of the
original embedding problem, but also implicitly decomposed the problem into two
subproblems: the nexus selection, and the nexus instances and bus configuration.
We will now look at these subproblems in more detail and describe how they can
be implemented to also achieve a scalable embedding strategy with advantageous
properties.
We use the specific case of embedding $K_8 \square K_n$ on $C_{N,N,4}$ as a demonstration.
Figure 2a shows how three adjacent unit cells of the Chimera graph, along with
their couplers, are used for embedding $K_8$, and constitute our preferred choice
of nexus for hosting $K_8$. This embedding is essentially the same as the default
triangular embedding for $K_8$ [10]. Figure 2b illustrates an embedding pattern
based on this choice of nexus and how buses were placed to realize $K_8 \square K_7$ on
$C_{8,8,4}$.
3.1 Nexus selection
The nexus shape depends on the structure of the graph to be embedded as well as
on the Chimera graph’s architecture. Embedding a nexus can be viewed as a much
smaller embedding problem with some added constraints pertaining to providing
the appropriate connections to all variables through dedicated bus interfaces. The
definition of these interfaces allows some level of encapsulation by abstracting
away the details of the nexus embedding for the rest of the method. On the target
architecture, an interface is a high-level object that stands for the couplers coming
out of the nexus and an indication of which variables are attached to them. In
the high-level description, on the other hand, it serves as an attachment point for
a bus extending a specific set of variables. Redundant interfaces can be defined
if needed. It is important to note that the definition of the nexus interfaces will
determine the optimal nexus placement and bus configuration. We can therefore
seek a nexus, and then list the interfaces available, or we can require a set of
interfaces as a constraint. To demonstrate the method, we require only that all
variables be accessible through an interface, leaving more-advanced considerations
for future work.
Solving the limited problem of finding a valid nexus could potentially benefit
from other known embedding techniques, including heuristic methods. For the
case at hand, embedding a complete graph on as few Chimera blocks as possi-
ble, the triangular embedding algorithm proposed in [10] represents an attractive
option. This choice exploits the Chimera graph’s large automorphism group (see
[4]) and its resulting high level of symmetry. Figure 2a shows that for a Chimera
graph with L = 4, three adjacent unit cells are sufficient to build a nexus for
K8. The bipartite nature of the Chimera block naturally partitions the set of
variables corresponding to the nodes of K8 into two subsets, each having four
redundant interfaces: two that face downward and to the left, and two that face
upward and to the right. For example, we partition the corresponding vertex set
{A1, . . . , A8} of the K8 nexus shown in Figure 2a into two subsets {A1, A2, A3, A4}
and {A5, A6, A7, A8}.
Each subset of variables is assigned a number of redundant interfaces available
for building the inter-nexus connections. An interface is therefore a set of con-
nection points, called terminals, corresponding to the variables of the subset and
placed on a specific face of the nexus. For example, in Figure 2a, a vertical bus
connects to an interface representing the subset of variables {A5, A6, A7, A8} and
extends them downward, and a horizontal bus connects to a second interface on the
same subset and extends them leftward. Similarly, two buses extend the subset
{A1, A2, A3, A4} in the opposite directions.
3.2 Nexus instance placement and bus configuration
We have slightly simplified the embedding problem by introducing a high-level
description involving nexuses and buses. We now need to solve that high-level
problem. Fortunately, the number of degrees of freedom has been greatly reduced
compared to that of the original problem. A tailored search algorithm can be im-
plemented based on tabu search or simulated annealing, for example. The allowed
steps or updates in configuration spaces are easily derived from the target architec-
ture and its symmetries. We leave such a general solution for future work, however,
and focus on a systematic method that works very well for embedding complete
graphs on the Chimera architecture, inspired in part by the rooks problem [11].
We call the area of the Chimera graph not occupied by nexus instances the bus
space. A bus, first introduced near the beginning of Section 3, is more precisely a
set of parallel paths leaving from a nexus interface, with one path per variable (see
Figure 2b). Each path is assigned to a specific variable. Two buses can be linked
together at a bus junction. Placing the nexus instances that host the multiple copies
of the complete graph $K_m$ on the diagonal of the Chimera graph divides the bus
space into disjoint bus spaces. A unit cell of the Chimera graph where two buses
meet can be used as a junction.
The Chimera graph’s bipartite structure and the proposed triangular embed-
ding naturally invite a partitioning of the variables of Km into two subsets. We
therefore seek a placement of the nexus instances along a line that would divide
the bus space into two bus subspaces, providing access to both subspaces to each
nexus instance. The L-shape of the nexus also lends itself to a more efficient tiling
if we place these instances along the diagonal of the Chimera graph.
Next, we need to extend the nexus interfaces to build the connectivity required
by the full Cartesian product. For each subspace, this implies connecting each
nexus instance through one of its interfaces to an interface of each other nexus
instance in the same subspace. A valid configuration inspired by the rooks problem
is to attach to each nexus instance both a vertical and a horizontal bus that run to the edge of the
chip. This creates a rectilinear grid where the buses of each pair of nexus instances meet at
a single unit cell. We use each of the created disjoint bus spaces to establish the
required connections for copies of $K_n$ in $K_m \square K_n$. This embedding of copies of $K_n$
is achieved in a distributed manner through the buses, as opposed to the copies of
Km that are fairly localized and encapsulated in a nexus.
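The rook-like arrangement of buses can be sketched at the level of unit cells. The Python snippet below is a deliberately simplified illustration rather than the full construction: it assumes each nexus instance is idealized as a single cell on the main diagonal (the actual nexus for K8 spans three cells), that rows are numbered downward, and that the function name and coordinates are hypothetical. It computes, for each pair of instances, the cell where their buses cross in each of the two bus subspaces.

```python
from itertools import combinations


def junction_cells(diagonal_positions):
    """Map each pair of nexus instances to one junction cell per bus subspace.

    Instance i is idealised as the single cell (d_i, d_i) in (row, column)
    coordinates. Below the diagonal, the downward bus of the earlier instance
    and the leftward bus of the later one cross at (d_j, d_i). Above the
    diagonal, the rightward bus of the earlier instance and the upward bus
    of the later one cross at (d_i, d_j).
    """
    lower, upper = {}, {}
    for (i, di), (j, dj) in combinations(enumerate(diagonal_positions), 2):
        lower[(i, j)] = (dj, di)
        upper[(i, j)] = (di, dj)
    return lower, upper


# Hypothetical layout: 7 instances on consecutive diagonal cells of an 8 x 8 grid.
lower, upper = junction_cells(range(7))
cells = list(lower.values()) + list(upper.values())
assert len(set(cells)) == len(cells)  # no junction cell is shared
print(len(lower))  # 21 pairs, one junction each per subspace
```

As with non-attacking rooks, each row and each column carries the bus of only one instance per subspace, so every pair of instances is assigned its own junction cell.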
We note that when attempting to embed $K_m \square K_n$, it may happen that using
the nexus for one of the complete graphs does not result in a valid embedding,
while the other choice gives an appropriate embedding of the product. We simply
choose the most promising graph and call it Km without loss of generality.
4 Discussion
In this section, we show the clear advantage of the CPCG embedding method,
CPCG Embedding, over other embedding algorithms with respect to embedding
success rates and the quality of the embeddings achieved. We then prove for a
specific case that CPCG Embedding is optimal with respect to the largest em-
beddable problem, before commenting on the scaling of the running time of the
method. We begin by assuming the Chimera structure is perfectly regular (i.e., it
has no inoperable qubits or couplers). The effect of irregularities is investigated in
the next section.
4.1 Comparison to other embedding algorithms
To showcase the advantages of our method, we compare it to the de facto heuristic
method introduced by Cai, Macready, and Roy [4]. The implementation of this
embedding algorithm, find_embedding(), is distributed with D-Wave’s API and
software tools. This function receives both the problem to be embedded and the
target solver’s graph as inputs, making no assumptions about either of them.
Given the NP-hardness of the embedding problem and the poor scaling of the
polynomial methods when fixing the target graph, find_embedding() remains
the only viable truly general alternative. The generality of the heuristic approach
also has some added benefits when inoperable qubits are present, a topic that will
be discussed in the next section. Since the Cartesian product of graphs $K_m$ and $K_n$
is a subgraph of the complete graph $K_{mn}$, the triangular systematic embedding method
[5] provides a simple, yet wasteful, approach to embedding Cartesian products and
will therefore serve as our second touchstone.
Our comparison will be restricted to the specific case of embedding Cartesian
products of the form $K_8 \square K_n$ into a square Chimera target architecture $C_{N,N,4}$
made of bipartite blocks of 8 qubits ($K_{4,4}$). The find_embedding() method is a
multi-start heuristic with a number of parameters to be specified. The algorithm
will keep searching until a valid embedding is found or until it reaches one of
its stopping criteria. The most important parameter is the maximum running
time allowed for the search, which we set to one of 1, 100, or 1000 seconds. Each
restart of the search is initiated when a maximum number of steps is reached
without observing an improvement. We leave this at its default value of 10 steps.
We further ensure that the search is not stopped prematurely (i.e., before the
maximum running time) by setting the maximum number of restarts to a large
value (e.g., 10,000 restarts given that each one takes at least 1 second).
In our comparison, we first consider the embedding success rate of the various
methods as shown in Figure 3. In the case of CPCG Embedding and the triangular
embedding, the success rate is simply 1.0 for all sizes smaller than some maximal
size, which we can express as a function of the size N of a square Chimera graph
$C_{N,N,4}$. CPCG Embedding can embed up to $K_8 \square K_{N-1}$, which means that n = 7
is the largest case with a success rate of 1.0 for a 512-qubit chip (N = 8), and n = 15 is the
largest case with that success rate for a hypothetical next-generation 2048-qubit
chip (N = 16). Beyond these sizes the success rate is 0. Similarly, for triangular embedding,
we can embed up to $K_8 \square K_{N/2}$, which results in a success rate of 1.0 for n ≤ 4
(n ≤ 8) into a 512-qubit (2048-qubit) chip, and 0 otherwise. Since the results
obtained from find_embedding() are probabilistic, we attempt to embed each
problem size 100 times for each maximum running time considered. For short run-
ning times, we find a satisfactory success rate only for the smallest problem sizes.
We can increase that probability somewhat by increasing the running time, but
even a generous 1000 seconds will not be sufficient to embed the largest Cartesian
products achievable with CPCG Embedding. Aside from the obvious effect of the
poor scaling of the running time of find_embedding() on the success rate, we
also observe the limiting effect of the target chip’s size for a fixed running time.
As we get closer to the maximum embeddable problem size for a specific chip size,
the success rate is further reduced. The product $K_8 \square K_6$, for example, is easily
embeddable into a 2048-qubit chip, but only succeeds 18% of the time with a
1000-second running time on the 512-qubit chip. With limited chip size also comes
a limited number of valid solutions, so the probability of finding a valid solution
is lower. In other words, the success rate obtained with the find_embedding()
method will get worse as the technology scales and we begin to address larger
problem instances, but even more so when we test the limits of a specific archi-
[Figure 3 image: panels (a) and (b) plot the embedding success rate (0.00 to 1.00) against the problem size ($K_8 \square K_2$ to $K_8 \square K_{16}$) for CPCG Embedding, triangular embedding, and find_embedding() with 1000 s, 100 s, and 1 s.]
Fig. 3 The embedding success rate for embedding Cartesian products of complete graphs
using find_embedding() (for 1000 seconds in light blue, 100 seconds in grey, and 1 second in
yellow), the full triangular embedding (orange), and our systematic CPCG Embedding (dark
blue) for the case of $K_8 \square K_n$ as a function of n. The last two are assumed to be produced
in much less than a second. Panel (a) shows results for the ideal case of the previous chip’s
512-qubit architecture $C_{8,8,4}$, and panel (b) shows results for a hypothetical ideal 2048-qubit
Chimera architecture $C_{16,16,4}$.
tecture. CPCG Embedding is clearly the superior choice for a perfect Chimera
chip (i.e., one where all qubits are operable), as it can embed products far larger
than the two alternatives in a very short time. We discuss the scaling of running
time in more detail in Section 4.3. In fact, we can even show in some cases that
CPCG Embedding can embed the largest possible Cartesian product of complete
graphs embeddable for a target chip size (see the next section on the discussion of
optimality).
Beyond the ability to embed a problem into a chip, the quality of the embedding
is paramount. Benchmarking for various types of optimization problems can show
a difference of a few orders of magnitude between different embeddings of the
same problem. There is currently no single first-principles metric for rating
embedding quality, and empirical ratings such as the one used in [15] remain
the most practical option. Nevertheless, quantum
annealing practitioners have used the number of physical qubits or the length of
the longest chain as a conjectural measure of the embedding quality [4]. It has
also been suggested that having heterogeneous chain lengths in an embedding
is disadvantageous since the chains tend to exhibit unpredictable chain dynamics
throughout the annealing process [3,18]. Clarifying the relative role of these various
properties in determining the quality of an embedding is beyond the scope of this
paper, so we will limit our comparison to the traditional indicators by comparing
the number of physical qubits and the chain length distribution of the CPCG
Embedding with the other alternatives.
[Figure 4. Vertical axis: number of qubits; horizontal axis: Cartesian product $K_8 \square K_n$. Legend: CPCG Embedding, find_embedding(), triangular embedding.]
Fig. 4 Average number of qubits used for embedding Cartesian products of complete graphs
for the case of $K_8 \square K_n$ as a function of n into a 2048-qubit architecture $C_{16,16,4}$. Results
are shown for D-Wave’s find_embedding() heuristic (in red), the full triangular embedding
(orange), and our systematic CPCG Embedding (dark blue). A fit for the averaged number of
qubits used in embeddings produced by find_embedding(), given by $16.63n^2 - 11.01n + 5.75$,
is shown with a dashed red line.
The number of physical qubits used is shown in Figure 4, and the chain length
distribution is shown in Figure 5. CPCG Embedding for an ideal Chimera graph for
$K_8 \square K_n$ produces chains of length $n + 2$. With $8n$ logical variables, the embedding
uses a total of $8n(n + 2)$ physical qubits. In comparison, a triangular embedding
for a complete graph $K_{8n}$ has chains of length $2n + 1$, for a total of
$8n(2n + 1)$ physical qubits. This is twice that of CPCG Embedding in the
asymptotic limit. Both of these embedding methods produce equal-length chains.
The find_embedding() method, on the other hand, produces a spread of chain
lengths for each successful embedding found. To illustrate this distribution, we
average the mean, the minimum, and the maximum chain lengths over the 100
embeddings found. The average standard deviation is also shown such that 65%
of the chains produced are found in the red shaded region. Results depend only
marginally on the maximum running time, provided that it is long enough to
find a valid embedding, so we allowed for a generous 1000 seconds. A quadratic
fit of the averaged number of qubits used is given by $16.63n^2 - 11.01n + 5.75$,
and a fit for the averaged mean chain length is given by $2.05n - 1.06$. In
[Figure 5. Vertical axis: chain length (qubits); horizontal axis: Cartesian product $K_8 \square K_n$. Legend: find_embedding(), CPCG Embedding, triangular embedding.]
Fig. 5 Chain length for embedding Cartesian products of complete graphs for the case of
$K_8 \square K_n$ as a function of n into a 2048-qubit architecture $C_{16,16,4}$. Results are shown for
D-Wave’s find_embedding() heuristic (in red and yellow), the full triangular embedding (orange),
and our systematic CPCG Embedding (dark blue). The latter two produce embeddings
with chains that are all of equal length, shown as a single line. The spread of chain lengths
produced by find_embedding() is illustrated by averaging the mean (central red line), maximum
(upper yellow line), and minimum (lower yellow line) chain length over 100 embeddings.
The average standard deviation (also in red) of the chain length is also shown such that the
red shaded region illustrates where 65% of the chains can typically be found. A fit for the
averaged maximum chain length is given by $0.12n^2 + 1.91n - 0.48$ (dashed yellow line) and a
fit for the averaged mean chain length is given by $2.05n - 1.06$ (dashed red line).
the asymptotic limit, therefore, CPCG Embedding produces chains that are less
than half of the mean length produced by find_embedding(). Consequently, we
observe that the required number of qubits is also less than half of that
required by find_embedding(). Although find_embedding() found
some embeddings for $K_8 \square K_{11}$, the statistics are not shown as they were artificially
skewed due to the smaller number of embeddings found.
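To make the comparison concrete, the closed-form counts quoted above can be tabulated next to the reported fits. The following is a minimal sketch in Python: the function names are ours, and the find_embedding() values are simply the fitted expressions from the text evaluated at a given n, not the output of the heuristic itself.

```python
def cpcg_chain_length(n):
    """Chain length of CPCG Embedding for K8 [] Kn on an ideal Chimera graph."""
    return n + 2


def cpcg_qubits(n):
    """8n logical variables, each mapped to a chain of n + 2 physical qubits."""
    return 8 * n * cpcg_chain_length(n)


def triangular_chain_length(n):
    """Chain length of a triangular embedding of the complete graph K_{8n}."""
    return 2 * n + 1


def triangular_qubits(n):
    """8n logical variables, each mapped to a chain of 2n + 1 physical qubits."""
    return 8 * n * triangular_chain_length(n)


def heuristic_qubits_fit(n):
    """Quadratic fit to the averaged find_embedding() qubit count (from the text)."""
    return 16.63 * n**2 - 11.01 * n + 5.75


def heuristic_mean_chain_fit(n):
    """Linear fit to the averaged mean find_embedding() chain length (from the text)."""
    return 2.05 * n - 1.06


if __name__ == "__main__":
    print("n  CPCG  triangular  find_embedding() fit")
    for n in range(2, 11):
        print(n, cpcg_qubits(n), triangular_qubits(n), round(heuristic_qubits_fit(n)))
```

At n = 10, for instance, the CPCG count of 960 qubits is about 0.62 of the fitted find_embedding() value. The "less than half" statements concern the leading coefficients (8 against 16.63 for qubits, 1 against 2.05 for chain length), that is, the asymptotic limit.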
We find that CPCG Embedding behaves and scales favourably compared to
both the heuristic method of find_embedding() and the systematic triangular
embedding. It can embed larger products on chips of the same size while producing
shorter chains of equal length.
4.2 Discussion of optimality
Having shown that CPCG Embedding compares favourably against other techniques,
in the following theorem we prove its optimality in certain cases.
Theorem 1 Let N be the smallest number such that CPCG Embedding can embed
$K_m \square K_n$ into the square Chimera graph architecture $C_{N,N,L}$. If one of m and n
is divisible by 2L and the other one is odd, then this embedding is optimal in the
sense that $K_m \square K_n$ cannot be embedded into a smaller square Chimera graph.
Proof By symmetry, we may assume that 2L divides m and n is odd. Let us choose
$K_m$ to be the graph embedded in a nexus. By placing the nexuses on the diagonal,
CPCG Embedding described in the previous section embeds $K_m \square K_n$ into $C_{N,N,L}$,
where
$$N = \left\lceil \frac{\lceil m/L \rceil}{2} \right\rceil (n - 1) + \left\lceil \frac{m}{L} \right\rceil = \frac{m(n + 1)}{2L}. \qquad (6)$$
Now suppose that $K_m \square K_n$ can be embedded into $C_{N',N',L}$. To prove optimality
we need only show that $N' \ge N$.
The proof uses a treewidth argument. Let tw(G) denote the treewidth of a
graph G. Since $K_m \square K_n$ can be embedded into $C_{N',N',L}$, the former graph is a
minor of the latter, so we have the following inequality between their treewidths:
$$\mathrm{tw}(K_m \square K_n) \le \mathrm{tw}(C_{N',N',L}). \qquad (7)$$
On the one hand, since n is odd, we can construct a bramble similar to that
given in the proof of [12, Lemma 3.2] to show that
$$\frac{m(n + 1)}{2} - 1 \le \mathrm{tw}(K_m \square K_n). \qquad (8)$$
On the other hand, the treewidth of $C_{N',N',L}$ is known to be $N'L$ (this statement
is given without proof in [3] and is confirmed with further explanation in
[16]).
Combining these two treewidth results with (7) gives
$$\frac{m(n + 1)}{2} - 1 \le N'L,$$
which means that
$$N' \ge \frac{m(n + 1)}{2L} - \frac{1}{L} = N - \frac{1}{L}.$$
But since $L > 1$ and $N$ and $N'$ are positive integers, this implies that $N' \ge N$, as
required.
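The two ingredients of the proof can be checked numerically for small cases. The sketch below is only an illustration under the theorem's assumptions (2L divides m, n is odd, L > 1); the helper names are ours. It evaluates the ceiling expression of Eq. (6), compares it with the closed form m(n + 1)/(2L), and computes the smallest integer N′ allowed by the treewidth inequality.

```python
def ceil_div(a, b):
    """Ceiling of a / b for positive integers, using integer arithmetic only."""
    return -(-a // b)


def cpcg_grid_size(m, n, L):
    """Side N of the square Chimera graph C_{N,N,L} given by Eq. (6)."""
    cells_per_nexus = ceil_div(m, L)
    return ceil_div(cells_per_nexus, 2) * (n - 1) + cells_per_nexus


def min_side_from_treewidth(m, n, L):
    """Smallest integer N' satisfying m(n + 1)/2 - 1 <= N' * L.

    Uses the lower bound of Eq. (8), which the text states for odd n.
    """
    if n % 2 == 0:
        raise ValueError("Eq. (8) is stated for odd n")
    return ceil_div(m * (n + 1) // 2 - 1, L)


if __name__ == "__main__":
    for L in range(2, 7):
        for k in range(1, 5):
            m = 2 * L * k                      # 2L divides m
            for n in range(1, 16, 2):          # n odd
                N = cpcg_grid_size(m, n, L)
                assert N == m * (n + 1) // (2 * L)
                assert min_side_from_treewidth(m, n, L) == N
    print(cpcg_grid_size(8, 7, 4))  # side of the grid for m = 8, n = 7, L = 4
```

For m = 8, n = 7, and L = 4, both functions return 8, which corresponds to the 512-qubit architecture $C_{8,8,4}$.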
We believe that the result above also holds for $K_m \square K_n$, where m is divisible
by 2L and $n \ge m$. However, the proof will require a complicated adaptation of the
ideas given in [12, Lemma 3.2] that is beyond the scope of this paper.
As a concrete example, we consider the Chimera structure $C_{8,8,4}$ corresponding
to a 512-qubit chip. We know that CPCG Embedding can embed $K_8 \square K_7$, and,
consequently, any product of the form $K_m \square K_n$, where $m \le 8$ and $n \le 7$. Indeed,
the treewidth of $C_{8,8,4}$ is 32, and $K_8 \square K_7$ has a treewidth smaller than or equal
to 31 and is therefore embeddable, whereas any product of the form $K_8 \square K_n$ with
$n > 7$ must have a treewidth of at least 35 and is not embeddable.
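The arithmetic behind this example can be written out explicitly. The lines below only substitute m = 8 and L = 4 into Eq. (6), the expression on the left-hand side of Eq. (8), and the treewidth N′L quoted in the proof. The n = 9 line is our own added illustration of an odd case covered by Eq. (8); the value 35 quoted in the text coincides with the same expression evaluated at n = 8.

```latex
\begin{align*}
N &= \frac{m(n+1)}{2L} = \frac{8 \cdot 8}{2 \cdot 4} = 8
  && (m = 8,\ n = 7,\ L = 4), \\
\mathrm{tw}(C_{8,8,4}) &= N L = 8 \cdot 4 = 32, \\
\frac{m(n+1)}{2} - 1 &= \frac{8 \cdot 8}{2} - 1 = 31 \le 32
  && (n = 7), \\
\frac{m(n+1)}{2} - 1 &= \frac{8 \cdot 9}{2} - 1 = 35 > 32
  && (n = 8), \\
\frac{m(n+1)}{2} - 1 &= \frac{8 \cdot 10}{2} - 1 = 39 > 32
  && (n = 9).
\end{align*}
```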
4.3 Running time
We may assume that the input to our algorithm is a polynomial in doubly indexed
variables. For example, in the problem of colouring a graph of n vertices with
m colours, the quadratic formulation of the problem contains a polynomial in
the variables $x_{ij}$, with $i \in \{1, \ldots, n\}$ and $j \in \{1, \ldots, m\}$. Then, we consider the
Cartesian product of two complete graphs $K_m \square K_n$ to be embedded into a Chimera
graph.
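The link between doubly indexed variables and the Cartesian product can be illustrated with a short sketch. It assumes the usual quadratic colouring penalty (one colour per vertex, and no shared colour across an edge), which the text does not spell out; the function names are ours. The product is built here as $K_n \square K_m$ on pairs (i, j), which is isomorphic to $K_m \square K_n$.

```python
from itertools import combinations


def cartesian_product_of_complete_graphs(n, m):
    """Edge set of K_n [] K_m on vertices (i, j), i in range(n), j in range(m).

    Two vertices are adjacent when they agree in one coordinate and differ
    in the other.
    """
    edges = set()
    for i in range(n):
        for j1, j2 in combinations(range(m), 2):
            edges.add(((i, j1), (i, j2)))
    for j in range(m):
        for i1, i2 in combinations(range(n), 2):
            edges.add(((i1, j), (i2, j)))
    return edges


def colouring_interactions(graph_edges, n, m):
    """Pairs of variables x_ij that are multiplied in the assumed colouring penalty."""
    pairs = set()
    # One colour per vertex: x_{i,j1} interacts with x_{i,j2}.
    for i in range(n):
        for j1, j2 in combinations(range(m), 2):
            pairs.add(((i, j1), (i, j2)))
    # Adjacent vertices u, v must not share colour j: x_{u,j} interacts with x_{v,j}.
    for u, v in graph_edges:
        a, b = sorted((u, v))
        for j in range(m):
            pairs.add(((a, j), (b, j)))
    return pairs


if __name__ == "__main__":
    five_cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
    product = cartesian_product_of_complete_graphs(5, 3)
    needed = colouring_interactions(five_cycle, 5, 3)
    print(len(needed), len(product), needed <= product)
```

Every interaction required by the colouring problem is an edge of the product, so an embedding of the product is also an embedding of the problem, at the cost of some unused couplings.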
In order to identify the appropriate complete graphs whose Cartesian product
contains the input graph, we need $O(n^2 m^2)$ operations. Furthermore, from our
proposed algorithm, the total number of operations needed to embed a Cartesian
product $K_m \square K_n$ into a Chimera graph is $O(n^2 m^2)$. It is worth mentioning that if
we consider the input to be a graph with e edges, the number of operations needed
to identify an appropriate Cartesian product of two complete graphs is $O(e)$.
Now suppose a graph H is to be embedded into a graph G, and both of them
are the inputs to the embedding algorithm proposed by [4]. Let $n_H$ and $e_H$ denote
the number of vertices and edges of graph H, and $n_G$ and $e_G$ be the number of
vertices and edges of graph G, respectively. The running time of the algorithm in
[4] is $O(n_H n_G e_H (e_G + n_G \log n_G))$.
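To give a feel for the gap between the two bounds, the expressions inside the O(·) can be evaluated for a specific instance. This is a rough sketch only: asymptotic bounds hide constant factors, so the numbers are not running times. The function names are ours, the Chimera vertex and edge counts assume the standard ideal layout (complete bipartite $K_{L,L}$ unit cells with L couplers between neighbouring cells), and the base-2 logarithm is an arbitrary choice.

```python
from math import log2


def chimera_size(M, N, L):
    """Number of vertices and edges of an ideal Chimera graph C_{M,N,L}."""
    vertices = 2 * L * M * N
    intra_cell = L * L * M * N                       # each unit cell is K_{L,L}
    inter_cell = L * (M - 1) * N + L * M * (N - 1)   # couplers between neighbouring cells
    return vertices, intra_cell + inter_cell


def product_size(m, n):
    """Number of vertices and edges of the Cartesian product K_m [] K_n."""
    vertices = m * n
    edges = n * m * (m - 1) // 2 + m * n * (n - 1) // 2
    return vertices, edges


def cpcg_operation_scale(m, n):
    """The expression n^2 m^2 from the CPCG bound, without constants."""
    return (m * n) ** 2


def heuristic_operation_scale(m, n, M, N, L):
    """The expression n_H n_G e_H (e_G + n_G log n_G) from the bound of [4]."""
    n_H, e_H = product_size(m, n)
    n_G, e_G = chimera_size(M, N, L)
    return n_H * n_G * e_H * (e_G + n_G * log2(n_G))


if __name__ == "__main__":
    print(chimera_size(8, 8, 4))      # ideal 512-qubit chip
    print(product_size(8, 6))
    print(cpcg_operation_scale(8, 6))
    print(heuristic_operation_scale(8, 6, 8, 8, 4))
```

For $K_8 \square K_6$ on an ideal $C_{8,8,4}$, the first expression is a few thousand while the second is of the order of $10^{10}$, which is consistent with the qualitative difference in running time discussed in Section 4.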
5 Fault tolerance and extensions
One key to our low-complexity scalable algorithm is to make use of the lattice-like
regularity in the target Chimera graph. Although the numerical results show sig-
nificant improvement over general heuristics used for embedding into a perfectly
regular Chimera graph, we have thus far not accounted for potential defects and
their impact on CPCG Embedding. One can, of course, argue that such defects
are merely a temporary nuisance which will eventually be eliminated as the tech-
nology matures. Nevertheless, for the method to be of immediate practical use,
the general case of a target graph with inoperable qubits and couplers needs to
be considered. Unfortunately, these inoperable qubits break the perfect regularity
of the Chimera graph, the very feature on which our approach is based. Figure
6 depicts an actual instance of a D-Wave chip with inoperable qubits. This chip,
which has a Chimera structure $C_{8,8,4}$ with 509 working qubits, was installed at NASA’s
Ames Research Center [14] and was only recently replaced by a newer D-Wave
2X system.
5.1 Presentation of the fault-tolerant method
The issue of having inoperable qubits can be addressed at the expense of adding
more complexity to our scalable embedding approach. One simple idea is to use
the CPCG embedding of the problem on an ideal solver as a starting point and
apply small modifications so that a valid embedding that circumvents the irreg-
ularities caused by inoperable qubits is reached. We expect that such a solution
should achieve reasonable performance on chips with high qubit yields, while a
low qubit yield would lead to substantial degradation of both the embeddability
Fig. 6 A graphical representation of the connectivity map of the 512-qubit chip with Chimera
architecture $C_{8,8,4}$, with 3 inoperable qubits and associated couplers shown in red.
and the embedding quality. We describe below how this type of extension can be
implemented.
Using the embedding pattern on a perfect chip as a starting point, we address
each nexus instance in turn. We begin with the first nexus and look at the capacity
of its constituent blocks in each direction. The capacity of block (i, j) in a given
direction is denoted by $c^{i,j}_{\mathrm{direction}}$, where (i, j) are indices on a two-dimensional grid.
The block capacities determine how many paths for variables can run through a
group of adjacent vertical or horizontal blocks to propagate a set of variables (i.e.,
the bus capacity). The presence of inoperable qubits along these directions will
usually result in a reduced capacity. We then extend the size of the nexus until
the relevant blocks along the vertical (horizontal) direction can form a bus with
sufficient capacity. Then a variant of triangular embedding is used to embed the
same complete graph in the newly extended space for the nexus. As a result of the
nexus extension, we need to shift the other nexus instances appropriately. Although
we have just described how a nexus extension can circumvent an inoperable qubit
along a bus path, this shape modification can also help with embedding a nexus
when there are inoperable qubits within the nexus boundaries. For lower qubit
yields, triangular embedding might fail to embed a nexus instance regardless of
the number of shifts and extensions. In such situations, a more complex nexus
embedding algorithm should be used to compensate for the high irregularity in
[Figure 7. Panel (a): a possible $K_8$ nexus extension with an inoperable qubit (qubit labels A1 to A8). Panel (b): a $K_8$ nexus on a chip with inoperable qubits.]
Fig. 7 (a) Example of a nexus modification using a horizontal extension to avoid an inoperable
qubit. The original nexus shape is the same as in Figure 2a and the modification is needed
to account for the inoperable qubit A7, shown in red. (b) A valid embedding for the input
problem $K_8 \square K_6$ into a 512-qubit chip $C_{8,8,4}$ with 3 inoperable qubits, formerly installed at
NASA’s Ames Research Center. L-shapes with various colours are modified copies of the nexus
chosen in Figure 2a for embedding copies of $K_8$. The red circles represent inoperable qubits.
The blue nexus is extended due to an inoperable qubit inside the nexus area and the orange
nexus is extended because of a bus capacity problem caused by an inoperable qubit.
the target graph. Figure 7a illustrates how a simple nexus extension can address
the problem caused by having an inoperable qubit within the nexus, and Figure
7b provides a more complete example by showing which modifications need to be
performed to embed $K_8 \square K_6$ on the specific 509-qubit chip in Figure 6. As the
figure illustrates, the shift-and-extension method is applied to bypass the columns
and rows of lower bus capacity caused by inoperable qubits. In the next section,
we provide the numerical analysis of the performance of this algorithm compared
to the find_embedding() heuristic [4] for this specific chip architecture. The
pseudocode in Algorithm 1 provides a few more details of this proof of concept
of the simple fault-tolerant method.
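The notion of bus capacity used above can be sketched in a few lines. This is one plausible reading rather than the authors' implementation: it assumes that qubit k of a block couples only to qubit k of the neighbouring block in the same direction (as in an ideal Chimera graph), so a single inoperable qubit interrupts lane k along the whole bus. All names are hypothetical.

```python
def block_capacity(working):
    """Capacity of one block in one direction.

    `working` holds one boolean per qubit of the block in that direction;
    the capacity is the number of operable qubits.
    """
    return sum(working)


def bus_capacity(blocks):
    """Number of lanes k that are operable in every block along the bus."""
    if not blocks:
        return 0
    return sum(all(lane) for lane in zip(*blocks))


def buses_needed(required, buses):
    """Extend over consecutive buses until their combined capacity suffices.

    Returns how many buses are used, or None if `required` is never reached.
    """
    total = 0
    for count, blocks in enumerate(buses, start=1):
        total += bus_capacity(blocks)
        if total >= required:
            return count
    return None


if __name__ == "__main__":
    L = 4
    ideal = [[True] * L for _ in range(3)]          # three blocks, all qubits working
    damaged = [[True] * L, [True, False, True, True], [True] * L]
    print(bus_capacity(ideal), bus_capacity(damaged))
    print(buses_needed(8, [ideal, ideal]))           # a K8 nexus on an ideal chip
    print(buses_needed(8, [ideal, damaged, ideal]))  # one extra bus after a defect
```

Under these assumptions, a $K_8$ nexus with L = 4 needs two adjacent buses on an ideal chip, and a single inoperable qubit on one of them forces an extension to a third, which mirrors the nexus extension described in the text.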
5.2 Comparison to other embedding methods
We have tested the simple fault-tolerant algorithm to embed the family of $K_8 \square K_n$
problems on the quantum annealer described in Figure 6. We again compare it
to the results produced by the find_embedding() heuristic run for 1000 seconds;
each problem was repeated 100 times to collect statistics. The other
parameters used are the same as in Section 4. Figure 7b shows an embedding of
the maximum problem size embeddable on this chip with CPCG Embedding. The
find_embedding() heuristic can also embed this problem size, albeit with a success
rate of about 18%. Unsurprisingly, the success rate of a heuristic method such
as find_embedding() is not greatly affected by a small number of inoperable
qubits. Figure 8 illustrates how the success probability of find_embedding()
Data: Adjacency matrix $A_{C_{M,N,L}}$ of the target Chimera graph with inoperable qubits; dimensions $(\alpha, \beta)$ of the CPCG
Result: An embedding $E_{(\alpha,\beta)}$ in the target Chimera graph
Initialization:
    C[i, j] ← calculate the capacity vector $(c^{i,j}_{\mathrm{vertical}}, c^{i,j}_{\mathrm{horizontal}})$ for each block [i, j];
    $E^*_{(\alpha,\beta)}$ ← load a scalable embedding pattern on the ideal target graph $A^*_{C_{M,N,L}}$;
    diagonals ← {(Down, East), (Up, East)};
for direction ∈ diagonals do
    (ROW, COL) ← coordinates of the corner block of the current direction;
    for nexus ∈ $E^*_{(\alpha,\beta)}$ do
        nexus_shape ← collection of blocks used by nexus ∈ $E^*_{(\alpha,\beta)}$;
        shift to the current available position (ROW, COL);
        while the nexus is not embedded do
            $C_{\mathrm{nexus}}$ ← capacity required by the triangular embedding of nexus based on nexus_shape;
            if C[ROW, COL] < $C_{\mathrm{nexus}}$ then
                extend ← identify the direction to extend based on sign($C_{\mathrm{nexus}}$ − C[ROW, COL]);
                extend by ($C_{\mathrm{nexus}}$ − C[ROW, COL]) blocks toward the extend direction;
                nexus_embeddability ← call triangular_embedding() on the updated nexus_shape;
                if nexus_embeddability then
                    locate nexus on $A_{C_{M,N,L}}$;
                    update $E_{(\alpha,\beta)}$ and (ROW, COL);
                    continue to next nexus;
                end
            else
                locate nexus on $A_{C_{M,N,L}}$;
                update $E_{(\alpha,\beta)}$ and (ROW, COL);
            end
        end
    end
end
Algorithm 1: Fault-tolerant CPCG Embedding based on shifts and extensions
still drops faster than CPCG Embedding with increasing problem size. We expect
our previous observation that the advantage of CPCG Embedding becomes more
prominent for larger chip sizes to hold for high qubit yields. In other words, CPCG
Embedding should outperform find_embedding() by a larger margin for larger
target architectures despite the presence of irregularities caused by a low density
of inoperable qubits.
We also note that the embedding quality of CPCG embeddings remains su-
perior with respect to the number of required qubits and chain length distribu-
tion. These are compared in Figures 9 and 10 for the chip with 509 qubits. Here,
too, the results are not shown for $K_8 \square K_6$ for find_embedding() because they
were skewed due to a lower embedding success rate. Again, we observe that the
find_embedding() heuristic is not very sensitive to this small density of inoperable
qubits, as the chain length distribution and qubit counts are almost identical to
the ideal case. Given that CPCG Embedding relies on the regularity of the target
graph, unlike find_embedding(), we unsurprisingly observe a higher sensitivity
to the irregularities caused by inoperable qubits. This results in some degradation
in the embedding quality. The chains in each successful embedding are no
longer of equal length because the algorithm needs to route around inoperable qubits,
which spreads out the distribution. For the same reason, we observe a
larger qubit count for the embeddings on the real chip compared to the ideal case.
Despite the changes, both the required number of qubits and the distribution of
chain lengths remain significantly superior to those of find_embedding(). It is true that
we are considering a high qubit yield, but Figures 9 and 10 indicate that CPCG
Embedding still has potentially enough of an advantage over find_embedding()
to remain the preferable method, even for lower qubit yields. Obviously, a crossing
point is expected and methods like find_embedding() remain indicated for
irregular target graphs.
We included this simple algorithm to show the possibility of modifying our ap-
proach to be used for real chips. However, this approach seems intuitively wasteful
as it readily discards large blocks of qubits. There exists an obvious trade-off
between the complexity of the fault-tolerant embedding algorithm and its per-
formance in terms of embedding success rate and embedding quality. Work on a
refined approach, still based on modifying the ideal CPCG embedding pattern,
is ongoing and will be presented elsewhere. We believe that improvements to the
techniques described herein should allow us to achieve a higher tolerance to irreg-
ularities while preserving most of the desirable features such as running time and
embedding properties.
[Figure 8. Vertical axis: embedding success rate (0.00 to 1.00); horizontal axis: problem size, from $K_8 \square K_2$ to $K_8 \square K_7$. Legend: CPCG Embedding, find_embedding() with 1000 s.]
Fig. 8 The embedding success rate for embedding Cartesian products of complete graphs into
a chip with inoperable qubits using D-Wave’s find_embedding() heuristic for 1000 seconds
(orange) and our systematic CPCG Embedding (dark blue) for the case of $K_8 \square K_n$ as a
function of n. The $C_{8,8,4}$ chip used has 509 working qubits out of 512 and is described in
Figure 6. The largest problem embedded by both approaches is $K_8 \square K_6$ with a success rate of
19% for find_embedding() and 100% for CPCG Embedding.
6 Conclusion
Motivated by several interesting combinatorial problems such as graph colouring
and graph partitioning, we proposed a systematic, deterministic, and scalable em-
bedding algorithm for embedding the Cartesian product of two complete graphs
into D-Wave Systems’ Chimera hardware graph. To develop this method, we ex-
ploited the intrinsic structure of a class of combinatorial optimization problems as
well as the structure of the Chimera graph. Although more-general (and perforce
[Figure 9. Vertical axis: number of qubits; horizontal axis: Cartesian product $K_8 \square K_n$. Legend: find_embedding() on 509-qubit chip, CPCG Embedding on 509-qubit chip, CPCG Embedding on ideal chip.]
Fig. 9 Average number of qubits used for embedding Cartesian products of complete graphs
into a chip with inoperable qubits using D-Wave’s find_embedding() heuristic for 1000
seconds (red) and our systematic CPCG Embedding (dark blue) for the case of $K_8 \square K_n$ as
a function of n. The $C_{8,8,4}$ chip used has 509 working qubits out of 512 and is described in
Figure 6. CPCG Embedding results for a perfect chip of the same size are also shown (dotted
black line).
slower) methods will remain necessary, it is with such application-specific algo-
rithms that the best performance can be achieved and we expect similar studies
to follow suit in the near future. In the case of our CPCG Embedding algorithm,
we not only showed advantageous running time scaling and how embedding patterns
can be cost-effectively scaled up for larger chip architectures, but also proved
CPCG Embedding to be optimal in specific cases. Beyond the better embedding
success rate achieved, the quality of the embeddings generated, as measured by the
usual empirical factors, is superior to other methods. Indeed, CPCG Embedding
produces equal-length chains on an ideal Chimera chip and uses far fewer physical
qubits. Such improvements in the quality of embedding can play a major role in re-
ducing the time to solution when solving problems. Given the algorithm’s reliance
on the regularity of the target architecture, it is natural to expect a degradation
of performance in the presence of defects. Although we did not explore optimal
modifications to the method to handle inoperable qubits and couplers, we presented
a simple version of the algorithm for those cases and tested it on the $C_{8,8,4}$
512-qubit NASA chip with 509 working qubits, described in [14]. The results suggest
that for high qubit yields, CPCG Embedding will retain some advantage in
both embedding success rates and quality indicators over more-general heuristic
methods.
[Figure 10. Vertical axis: chain length (qubits); horizontal axis: Cartesian product $K_8 \square K_n$. Legend: find_embedding() on 509-qubit chip, CPCG Embedding on 509-qubit chip, CPCG Embedding on ideal chip.]
Fig. 10 Chain length for embedding Cartesian products of complete graphs into a chip with
inoperable qubits using D-Wave’s find_embedding() heuristic for 1000 seconds (yellow and
red) and our systematic CPCG Embedding (blue) for the case of $K_8 \square K_n$ as a function of n.
The $C_{8,8,4}$ chip used has 509 working qubits out of 512 and is described in Figure 6. One CPCG
embedding instance for each problem size is shown in blue with the dark blue line showing the
average chain length and the shaded blue area representing the spread between the minimum
and maximum chain lengths. The spread of chain lengths produced by find_embedding() is
illustrated by averaging the mean (central red line), maximum (upper yellow line), and minimum
(lower yellow line) chain lengths over 100 embeddings. The average standard deviation
(also in red) of the chain length is also shown such that the red shaded region illustrates where
65% of the chains can typically be found.
Acknowledgements The authors are grateful to Marko Bucyk for editing the manuscript,
and to Brad Woods, Natalie Mullin, Abbas Mehrabian, and Robyn Foerster for useful discus-
sions and input.
References
1. Adler, I., Dorn, F., Fomin, F.V., Sau, I., Thilikos, D.M.: Faster parameterized algorithms
for minor containment. Theoretical Computer Science 412(50), 7018–7028 (2011). DOI
10.1016/j.tcs.2011.09.015. URL http://www.sciencedirect.com/science/article/pii/S0304397511007912
2. Alghassi, H.: The algebraic QUBO design framework. To be published
3. Boothby, T., King, A.D., Roy, A.: Fast clique minor generation in Chimera qubit connectivity
graphs. Quantum Information Processing 15(1), 495–508 (2016). DOI
10.1007/s11128-015-1150-6. URL http://dx.doi.org/10.1007/s11128-015-1150-6
4. Cai, J., Macready, W.G., Roy, A.: A practical heuristic for finding graph minors. arXiv
preprint arXiv:1406.2741 (2014)
5. Choi, V.: Minor-embedding in adiabatic quantum computation: II. Minor-universal
graph design. Quantum Information Processing 10(3), 343–353 (2011). DOI
10.1007/s11128-010-0200-3. URL http://dx.doi.org/10.1007/s11128-010-0200-3
6. Fan, N., Pardalos, P.M.: Linear and quadratic programming approaches for the general
graph partitioning problem. Journal of Global Optimization 48(1), 57–71 (2010). DOI
10.1007/s10898-009-9520-1. URL http://dx.doi.org/10.1007/s10898-009-9520-1
7. Hernandez, M., Zaribafiyan, A., Aramon, M., Naghibi, M.: A novel graph-based approach
for determining molecular similarity. arXiv preprint arXiv:1601.06693 (2016)
8. Imrich, W., Peterin, I.: Recognizing Cartesian products in linear time. Discrete Mathe-
matics 307(3–5), 472–483 (2007)
9. Johnson, M.W., Amin, M.H.S., Gildert, S., Lanting, T., Hamze, F., Dickson, N., Harris,
R., Berkley, A.J., Johansson, J., Bunyk, P., Chapple, E.M., Enderud, C., Hilton, J.P.,
Karimi, K., Ladizinsky, E., Ladizinsky, N., Oh, T., Perminov, I., Rich, C., Thom, M.C.,
Tolkacheva, E., Truncik, C.J.S., Uchaikin, S., Wang, J., Wilson, B., Rose, G.: Quantum
annealing with manufactured spins. Nature 473(7346), 194–198 (2011)
10. Kaminsky, W., Lloyd, S.: Scalable architecture for adiabatic quantum computing of NP-hard
problems. In: A. Leggett, B. Ruggiero, P. Silvestrini (eds.) Quantum Computing and
Quantum Bits in Mesoscopic Systems, pp. 229–236. Springer US (2004). DOI
10.1007/978-1-4419-9092-1_25. URL http://dx.doi.org/10.1007/978-1-4419-9092-1_25
11. Kaplansky, I., Riordan, J.: The problem of the rooks and its applications. Duke Math.
J. 13(2), 259–268 (1946). DOI 10.1215/S0012-7094-46-01324-5. URL
http://dx.doi.org/10.1215/S0012-7094-46-01324-5
12. Lucena, B.: Achievable sets, brambles, and sparse treewidth obstructions. Discrete applied
mathematics 155(8), 1055–1065 (2007)
13. Pardalos, P.M., Mavridou, T., Xue, J.: Handbook of Combinatorial Optimization: Volume
2, chap. The Graph Coloring Problem: A Bibliographic Survey, pp. 1077–1141. Springer
US, Boston, MA (1999). DOI 10.1007/978-1-4613-0303-9_16. URL
http://dx.doi.org/10.1007/978-1-4613-0303-9_16
14. Perdomo-Ortiz, A., O’Gorman, B., Fluegemann, J., Biswas, R., Smelyanskiy, V.N.: De-
termination and correction of persistent biases in quantum annealers. Scientific Reports
6 (2016)
15. Rieffel, E.G., Venturelli, D., O’Gorman, B., Do, M.B., Prystay, E.M., Smelyanskiy, V.N.: A
case study in programming a quantum annealer for hard operational planning problems.
Quantum Information Processing 14(1), 1–36 (2014). DOI 10.1007/s11128-014-0892-x.
URL http://dx.doi.org/10.1007/s11128-014-0892-x
16. Roy, A.: Private conversation
17. Venturelli, D., Marchand, D.J.J., Rojo, G.: Quantum annealing implementation of job-shop
scheduling. arXiv preprint arXiv:1506.08479 (2015)
18. Venturelli, D., Mandrà, S., Knysh, S., O’Gorman, B., Biswas, R., Smelyanskiy, V.: Quantum
optimization of fully connected spin glasses. Phys. Rev. X 5, 031040 (2015). DOI
10.1103/PhysRevX.5.031040. URL http://link.aps.org/doi/10.1103/PhysRevX.5.031040
19. Young, K.C., Blume-Kohout, R., Lidar, D.A.: Adiabatic quantum optimization with the
wrong Hamiltonian. Phys. Rev. A 88, 062314 (2013). DOI 10.1103/PhysRevA.88.062314.
URL http://link.aps.org/doi/10.1103/PhysRevA.88.062314
