27 research outputs found
Mapping constrained optimization problems to quantum annealing with application to fault diagnosis
Current quantum annealing (QA) hardware suffers from practical limitations
such as finite temperature, sparse connectivity, small qubit numbers, and
control error. We propose new algorithms for mapping boolean constraint
satisfaction problems (CSPs) onto QA hardware mitigating these limitations. In
particular we develop a new embedding algorithm for mapping a CSP onto a
hardware Ising model with a fixed sparse set of interactions, and propose two
new decomposition algorithms for solving problems too large to map directly
into hardware.
The mapping technique is locally-structured, as hardware compatible Ising
models are generated for each problem constraint, and variables appearing in
different constraints are chained together using ferromagnetic couplings. In
contrast, global embedding techniques generate a hardware independent Ising
model for all the constraints, and then use a minor-embedding algorithm to
generate a hardware compatible Ising model. We give an example of a class of
CSPs for which the scaling performance of D-Wave's QA hardware using the local
mapping technique is significantly better than global embedding.
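As a rough illustration of the locally-structured idea (not the authors' algorithm), the sketch below builds one penalty function per constraint and ties two copies of a shared variable together with a chain penalty. Everything here is an assumption made for the example: the two AND-gate constraints, the variable names, the chain strength of 1, and the use of 0/1 variables. In 0/1 form the chain term a + b - 2ab is, up to a constant and a scale factor, a ferromagnetic coupling between the corresponding spins. Ground states are found by brute-force enumeration rather than by annealing hardware.

```python
from itertools import product

def and_penalty(x, y, z):
    # Penalty that is 0 exactly when z == (x AND y), and >= 1 otherwise.
    return x * y - 2 * (x + y) * z + 3 * z

def chain_penalty(a, b, strength=1):
    # 0 when the two copies agree, `strength` when they disagree.
    # With spins s = 2*bit - 1 this equals a constant minus (strength/2)*s_a*s_b,
    # i.e. a ferromagnetic coupling between the copies.
    return strength * (a + b - 2 * a * b)

def energy(x, y, w1, w2, v, z):
    # Constraint 1: w1 = x AND y. Constraint 2: z = w2 AND v.
    # w1 and w2 are two copies of the same logical variable, chained together.
    return and_penalty(x, y, w1) + and_penalty(w2, v, z) + chain_penalty(w1, w2)

def ground_states():
    # Brute-force search over all 2**6 assignments (hypothetical toy size).
    states = list(product((0, 1), repeat=6))
    e_min = min(energy(*s) for s in states)
    return e_min, [s for s in states if energy(*s) == e_min]
```

In this toy model every ground state has zero energy, the two copies agree, and both gate constraints hold, which is the property a chained local mapping relies on.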
We validate the approach by applying D-Wave's hardware to circuit-based
fault-diagnosis. For circuits that embed directly, we find that the hardware is
typically able to find all solutions from a min-fault diagnosis set of size N
using 1000N samples, using an annealing rate that is 25 times faster than a
leading SAT-based sampling method. Further, we apply decomposition algorithms
to find min-cardinality faults for circuits that are up to 5 times larger than
can be solved directly on current hardware. Comment: 22 pages, 4 figures
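To make "min-cardinality fault diagnosis" concrete, here is a minimal classical sketch, assuming a hypothetical two-gate circuit (an AND gate feeding an OR gate) and a fault model in which a faulty gate's output is unconstrained. The circuit, the gate and wire names, and the exhaustive search are all illustrative assumptions; the abstract's method samples diagnoses on annealing hardware instead.

```python
from itertools import combinations, product

# Hypothetical circuit: g1 computes w = x AND y, g2 computes z = w OR v.
GATES = {
    "g1": (lambda s: s["x"] & s["y"], "w"),
    "g2": (lambda s: s["w"] | s["v"], "z"),
}

def consistent(faulty, observed, internal=("w",)):
    # A fault set explains the observation if some assignment of the
    # unobserved internal wires satisfies every healthy gate.
    for bits in product((0, 1), repeat=len(internal)):
        vals = dict(observed, **dict(zip(internal, bits)))
        if all(name in faulty or fn(vals) == vals[out]
               for name, (fn, out) in GATES.items()):
            return True
    return False

def min_cardinality_diagnoses(observed):
    # Try fault sets of increasing size; return all sets of the smallest
    # size that are consistent with the observed inputs and outputs.
    names = sorted(GATES)
    for k in range(len(names) + 1):
        found = [set(c) for c in combinations(names, k)
                 if consistent(set(c), observed)]
        if found:
            return found
    return []
```

For the observation x=1, y=1, v=0, z=0, a healthy circuit would output z=1, so at least one gate is faulty; either gate alone explains the observation, giving a min-fault diagnosis set of size N = 2.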
Nanometer VLSI placement and optimization for multi-objective design closure
In a VLSI physical synthesis flow, placement directly defines the interconnection,
which affects many other design objectives, such as timing, power consumption,
congestion, and thermal issues. With the scaling of technology, the relative interconnect
delay increases dramatically. As a result, placement has become a bottleneck
in deep sub-micron physical synthesis. In this dissertation, I propose several
optimization algorithms from global placement, placement migration, timing driven
placements, to incremental power optimizations for multi-objective VLSI design
closure. The first work is DPlace, a new global placement algorithm that scales
well to the modern large-scale circuit placement problems. DPlace simulates the
natural diffusion process to spread cells smoothly over the placement region, and
uses both analytical and discrete techniques to improve the wire length. However,
global placement is never sufficient for multi-objective design closure; a variety of
design objectives have to be improved incrementally, such as timing, routing congestion,
signal integrity, and heat distribution. Placement migration is a critical step
to address the cell overlaps appearing during incremental optimizations. To achieve
high placement stability, I propose a computational geometry based placement migration
flow to cope with placement changes, and a new stability metric to measure
the “similarity” between two placements accurately. Our placement migration algorithm
has a clear advantage over conventional legalization algorithms in that the
neighborhood characteristics of the original placement are preserved. For timing
closure in high performance designs, I present a linear programming based incremental
timing driven placement to improve the timing on critical paths directly.
I further present an efficient timing driven placement algorithm (Pyramids). Two
formulations of Pyramids are proposed, which are suitable for different optimization
stages in a physical synthesis flow. Both approaches find the optimal location
for timing of a cell in constant time, through computational geometry based approaches.
For fast convergence of design closure, placement should be integrated
with other optimization techniques. I propose to combine placement, gate sizing
and Vt swapping techniques to reduce the total power consumption, especially the
leakage power, which is becoming increasingly critical for nanometer VLSI design
closure. Electrical and Computer Engineering
On the design partitioning of 3D monolithic circuits
Conventional three-dimensional integrated circuits (3D ICs) stack multiple dies vertically for higher integration density, shorter wirelength, smaller footprint, faster speed and lower power consumption. Due to the large through-silicon-via (TSV) sizes, 3D design partitioning is typically done at the architecture level. With the emerging monolithic 3D technology, TSVs can be made much smaller, which enables potential block-level partitioning. However, it is still unclear how much benefit can be obtained by block-level partitioning, which is affected by the number of tiers and the sizes of TSVs. In this thesis, an 8-bit ripple carry adder was used as an example to explore the impact of TSV size and tier number on various tradeoffs between power, delay, footprint and noise. With TSMC 0.18 µm technology, the study indicates that when the TSV size is below 100 nm, it can be beneficial to perform block-level partitioning for smaller footprint with minimal power, delay and noise overhead. --Abstract, page iii
Inverse design of large-area metasurfaces
We present a computational framework for efficient optimization-based
"inverse design" of large-area "metasurfaces" (subwavelength-patterned
surfaces) for applications such as multi-wavelength and multi-angle
optimizations, and demultiplexers. To optimize surfaces that can be thousands
of wavelengths in diameter, with thousands (or millions) of parameters, the key
is a fast approximate solver for the scattered field. We employ a "locally
periodic" approximation in which the scattering problem is approximated by a
composition of periodic scattering problems from each unit cell of the surface,
and validate it against brute-force Maxwell solutions. This is an extension of
ideas in previous metasurface designs, but with greatly increased flexibility,
e.g. to automatically balance tradeoffs between multiple frequencies, or to
optimize a photonic device given only partial information about the desired
field. Our approach even extends beyond the metasurface regime to
non-subwavelength structures where additional diffracted orders must be
included (but the period is not large enough to apply scalar diffraction
theory). Comment: 18 pages, 8 figures
An efficient analytical placement algorithm using cell shifting, iterative local refinement and a hybrid net model
In this thesis, we present FastPlace, a fast, iterative, flat placement algorithm for large-scale standard cell designs in the fixed-die context. FastPlace is based on the quadratic placement approach. The quadratic approach formulates the wirelength minimization problem as a convex quadratic program, which can be solved analytically by efficient techniques. However, the quadratic approach in general suffers from some drawbacks. First, the resulting placement has a lot of overlap among cells. Second, the resulting total wirelength may be long, as the quadratic wirelength objective is only an indirect measure of the total linear wirelength. Third, existing net models tend to create a lot of non-zero entries in the connectivity matrix while modeling the netlist, and this slows down the quadratic program solver. These problems are handled as follows: (1) A Cell Shifting technique is proposed to generate an evenly distributed global placement from the quadratic program solution. This technique is very efficient and produces a high-quality global placement with even cell distribution. (2) An Iterative Local Refinement technique is proposed to reduce the wirelength according to the half-perimeter bounding rectangle measure. This technique is very effective as it makes use of the wirelength and cell distribution information provided by a coarse global placement. (3) A Hybrid Net Model is proposed which is a combination of the traditional clique and star models. This net model significantly reduces the number of non-zero entries in the connectivity matrix. It results in a significant speed-up of the solver as compared to using it with the traditional clique model. Experimental results show that the run-time of FastPlace is of the order O(n^1.412), where n is the circuit size given by the number of pins.
Also, the current implementation, when tested on 18 standard cell benchmark circuits, is on average 11.0 and 82.7 times faster than the existing academic placers Capo and Dragon, respectively.
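The two wirelength measures mentioned above can be sketched in a few lines. This is a toy illustration, not FastPlace itself: the function names, the one-dimensional setting, the two-pin edge list, and the Gauss-Seidel sweeps are all assumptions chosen for brevity (the abstract does not say which solver is used), and no cell shifting, local refinement, or hybrid net model is included.

```python
def hpwl(nets, pos):
    # Half-perimeter wirelength: for each net, the width plus the height
    # of the bounding rectangle of its pins, summed over all nets.
    total = 0.0
    for net in nets:
        xs = [pos[p][0] for p in net]
        ys = [pos[p][1] for p in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def quadratic_place_1d(edges, fixed, movable, sweeps=200):
    # Minimize sum of w * (x_i - x_j)**2 over weighted two-pin edges.
    # At the optimum each movable cell sits at the weighted average of its
    # neighbors; Gauss-Seidel sweeps iterate that condition to convergence.
    x = dict(fixed)
    for c in movable:
        x[c] = 0.0
    for _ in range(sweeps):
        for c in movable:
            num = den = 0.0
            for i, j, w in edges:
                if i == c:
                    num += w * x[j]
                    den += w
                elif j == c:
                    num += w * x[i]
                    den += w
            if den:
                x[c] = num / den
    return x
```

For example, with fixed pads at 0 and 3 and a chain of edges pad, a, b, pad, the quadratic objective places a and b at the evenly spaced positions 1 and 2. The quadratic optimum says nothing about overlap, which is why a spreading step is needed afterwards.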
A novel framework for multilevel full-chip gridless routing
Due to its great flexibility, gridless routing is desirable for nanometer circuit designs that use variable wire widths and spacings. Nevertheless, it is much more difficult than grid-based routing because of its larger solution space. In this paper, we present a novel “V-shaped” multilevel framework (called VMF) for full-chip gridless routing. Unlike the traditional “Λ-shaped” multilevel framework (inaccurately called the “V-cycle” framework in the literature), our VMF works in the V-shaped manner: top-down uncoarsening followed by bottom-up coarsening. Based on the novel framework, we develop a multilevel full-chip gridless router (called VMGR) for large-scale circuit designs. The top-down uncoarsening stage of VMGR starts from the coarsest regions and then proceeds down to the finest ones level by level; at each level, it performs global pattern routing and detailed routing for local nets and then estimates the routing resources for the next level. Then, the bottom-up coarsening stage performs global maze routing and detailed routing to reroute failed connections and refine the solution level by level from the finest level to the coarsest one. We employ a dynamic congestion map to guide the global routing at all stages and propose a new cost function for congestion control. Experimental results show that VMGR achieves the best routability among all published gridless routers based on a set of commonly used MCNC benchmarks. Besides, VMGR can obtain significantly less wirelength, smaller critical path delay, and smaller average net delay than the previous works. In particular, VMF is general and thus can readily be applied to other problems.