There has been considerable progress in the design and construction of
quantum annealing devices. However, a conclusive detection of quantum speedup
over traditional silicon-based machines remains elusive, despite multiple
careful studies. In this work we outline strategies to design hard tunable
benchmark instances based on insights from the study of spin glasses - the
archetypal random benchmark problem for novel algorithms and optimization
devices. We propose to complement head-to-head scaling studies that compare
quantum annealing machines to state-of-the-art classical codes with an approach
that compares the performance of different algorithms and/or computing
architectures on different classes of computationally hard tunable spin-glass
instances. The advantage of such an approach lies in having to only compare the
performance hit felt by a given algorithm and/or architecture when the instance
complexity is increased. Furthermore, we propose a methodology that might not
directly translate into the detection of quantum speedup, but might elucidate
whether quantum annealing has a "`quantum advantage" over corresponding
classical algorithms like simulated annealing. Our results on a 496 qubit
D-Wave Two quantum annealing device are compared to recently-used
state-of-the-art thermal simulated annealing codes.Comment: 14 pages, 8 figures, 3 tables, way too many reference