SAT-based algorithms for internal cell routing in nanoelectronic circuits by Vidal Obiols, Alexandre
SAT-based Algorithms for
Internal Cell Routing in
Nanoelectronic Circuits
Author:
Alexandre Vidal Obiols
Supervisors:
Jordi Cortadella Fortuny
Jordi Petit Silvestre
Computer Science Department
Master of Innovation and Research in Informatics
Facultat d’Informàtica de Barcelona
October 15, 2015

PROJECT DATA
Project title: SAT-based Algorithms for Internal Cell Routing in Nanoelectronic Circuits
Student name: Alexandre Vidal Obiols
Degree: Master of Innovation and Research in Informatics
Credits: 30
Supervisors: Jordi Cortadella Fortuny
Jordi Petit Silvestre
Department: Computer Science
EXAMINATION COMMITTEE (name and signature)
José Luis Balcázar
Amalia Duch
Enric Rodríguez
GRADE
Numerical grade:
Descriptive grade:
Date:

Abstract
Since the introduction of the first integrated circuits, the increase in transistor
density and changing technological constraints have created many
challenges for the circuit design process. Electronic design automation is the
category of software tools that automates parts of the process and allows the
design of such complex circuits. One of the processes automated by these
tools is the routing stage, which consists in deciding the exact location of the
wires connecting components inside the chip.
The work presented in “A Boolean Rule-Based Approach for Manufacturability-Aware
Cell Routing” (Cortadella et al., IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems, Vol. 33, No. 3,
March 2014) is a standard cell router that encodes the problem instances into
Boolean formulas and passes them to a SAT-solver. There is a valid solution
if a model for the formula exists, and there is no valid routing if the formula
is unsatisfiable. The original approach gives good results, but a lot of time
is spent in the SAT-search phase when cells become big and complex.
This thesis proposes to extend this router in order to obtain better performance.
The idea is to do so by considering more information about the routing
instance during the SAT-solving phase. Whereas the original router uses
binary variables at a wire segment level, we propose to extend the Boolean
formulation by incorporating auxiliary variables that encode entire paths
connecting components of the circuit (called highways). The approach poses
two interesting problems: deciding which paths we should encode into the
problem, and how they should be used during the search.
A SAT-solver, MiniSat, is extended in order to take maximum advantage
of the auxiliary variables we introduced. We modify several internal heuristics
of MiniSat with the idea of ensuring that it is using our auxiliary variables
as much as possible, also allowing the use of its own heuristics when deemed
more appropriate. On one side, we increase the initial priority of our auxiliary
variables in order to ensure they are used at the beginning of the search. On
the other, we respect the heuristics of MiniSat to decide which nets should
have priority at a given point of the search, but we impose highways to be
used if they are available for the nets with priority. Preliminary experimental
results show that the approach gives excellent results if we include a solution
to the problem in the set of highways, showing the approach is valid if the
auxiliary paths are good enough.
We present several techniques for generating the set of highways. One
family of methods uses the router itself to generate them. The basic idea
is to obtain a reduced version of our problem, either because we remove
some of its signals or because we consider only some parts of the original
circuit. We route these reduced problems and we consider the paths that
appear connecting components as good candidates for paths in the complete,
original problem. The routes obtained using these methods are expensive
to compute, given that they rely on using the router itself, but on the other
hand the fact that we are using the router ensures that they honor the design
rules and that sets among them fit together into a single solution. We use one
of the approaches to route the Nangate library, a real standard cell library
we take as our basic benchmark. With the other approach, especially suited
to route groups of standard cells, we route Concatlib, a library we create by
concatenating Nangate cells with themselves several times.
Our second family of methods does not rely on the router to generate
valid highways, but observes the circuit and tries to algorithmically build
highways that are interesting for our formulation. In general these highways
can be constructed faster but are not guaranteed to respect all design rules.
We propose two main strategies that use this approach.
The first strategy is based on observing the circuit and deciding on paths
that will probably appear in a routing. We study valid solutions to our
problem to find characteristics of commonly used routes, and we decide to
use paths with up to two corners connecting our pairs of components. We
present an algorithm to generate all such highways and an algorithm to de-
cide, among the generated highways, which ones should be included in our
formulation. In order to choose them we create a ranking in which we assign
a score to each highway, giving more points to the longest, most conflicting
highways. We eliminate highways from our model beginning with the ones
with highest score until we have reached a predefined number of highways
for each pair of terminals we must connect. The algorithms following this
strategy allow for several customizable parameters, such as the number of
highways to be included in the formulation and the relative weights of the
parameters that determine which highways should be chosen. An exploration
of the parameters is presented in the experiments section. We show the distribution
of routing times over the cells of the Nangate library, as well as the
times needed to route the entire library and one specific cell.
The second strategy we propose is to generate a set of possible net order-
ings. Then we use a maze router to greedily find paths, sequentially routing
the nets following one of our orderings. This method provides highways that
are fast to create and with more variation than the previous method. The
routing time when routing the entire Nangate library is compared to the
other approaches.
All of the algorithms manage to obtain speedups of about 2-3x on the SAT
search phase of the algorithm, which accounts for the majority of its running
time. Even though the algorithms present room for additional variations, we
do not expect those to yield a significantly better speedup.
We propose future lines of work in which the router plays a key role as
a tool in the physical synthesis flow. One would be connecting signals from
adjacent standard cells, which would normally be external, using only metal 1
and metal 2. Another line of work would be trying to avoid future electro-
migration problems during the routing stage. A final proposal is to use the
router in a framework allowing the generation of cells on the fly.

“The value of life lies not in the
length of days, but in the use we
make of them... Whether you
find satisfaction in life depends
not on your tale of years, but on
your will.”
Michel de Montaigne

Contents
1 Background
1.1 VLSI and EDA
1.2 Physical Synthesis
1.3 Routing
1.4 Design for Manufacturability
1.5 Boolean satisfiability
1.6 MiniSat
1.7 Conclusions
2 The Router
2.1 Previous Approaches
2.2 The Routing Problem
2.3 Routing Problem Representation
2.4 Encoding to SAT
2.5 Results
2.6 Conclusions
3 Enhanced SAT formulation
3.1 Highway Basics
3.2 Initial Priority
3.3 Enhanced Search
3.4 Experiments
3.4.1 Highway Usage Experiments
3.4.2 Routing Time
3.5 Conclusions
4 Generating Highways Using Partial Routings
4.1 Signal-Removal Algorithm
4.2 Cell-Dividing Algorithm
4.3 Experiments
4.3.1 Signal-Removal Experiments
4.3.2 Cell-Dividing Experiments
4.4 Conclusions
5 Generating Highways Without Partial Routings
5.1 2-corner Highway Generation
5.1.1 Motivation
5.1.2 All-Paths Generation Algorithm
5.2 2-corner Highway Selection
5.2.1 Motivation
5.2.2 Selection Algorithm
5.3 Greedy Highway Generation
5.4 Experiments
5.4.1 Runtime Distribution
5.4.2 Highway Selection Parameter Tuning
5.4.3 Greedy Generation
5.5 Conclusions
6 Conclusions
6.1 Final Conclusions
6.2 Future Work
Bibliography

Preface
The aim of this project is to enhance a SAT-based algorithm for the routing
of nanoelectronic circuits. We work on the routing framework introduced in
[1], which presents a Boolean formulation of the problem and uses a SAT-
solver to find valid routings for standard cells. This approach takes too much
computational time when dealing with big and complex cells. In contrast
with the original Boolean variables that encode wire segments, we propose
to extend the Boolean formulation to include auxiliary variables that encode
entire paths in the circuits. We modify MiniSat in order to exploit the use
of these variables and present several strategies to generate the paths that our
auxiliary variables will encode, considering length, conflicts among them,
design rule compliance and other parameters. Experimental results show
a speedup of about 2-3x on the SAT search phase of the algorithm, and
we consider that further exploration of the technique would only lead to
additional marginal gain.
Chapter 1 provides background on the integrated circuits design flow and
design considerations, the routing problem, the applications of the satisfi-
ability problem and the basic algorithms inside MiniSat. Chapter 2 is an
introduction to the routing framework this project aims to improve, including
a brief description of the techniques it is based on and some experimental
results. Chapter 3 presents the extension of our Boolean formulation, as
well as how we modify MiniSat in order to take maximum advantage of the
new auxiliary variables. Algorithms for the generation of entire paths for our
router are shown in Chapter 4 and Chapter 5, together with the experimental
results obtained using each technique. Final conclusions and future lines of
work are presented in Chapter 6.
Chapter 1
Background
1.1 VLSI and EDA
The complexity of integrated circuits (ICs) has been growing year after year
since they were first introduced. According to Moore’s Law [2], the density
of transistors on a chip doubles approximately every 2 years. This tendency
has been followed for the last 40 years, but of course, such a fast-paced evo-
lution of the number of transistors comes with a lot of challenges at many
levels, such as technology, design and tools. Very-large-scale integration
(VLSI) is the process that allows combining millions of transistors in a single
chip such as a microprocessor. This field has been constantly evolving,
trying to make faster chips and integrate more transistors generation after
generation. As the number of transistors has dramatically increased over the
years, the complexity of circuits has also increased enormously; and with it,
the challenges associated with the design of such circuits.
The design of VLSI circuits is therefore a very complex process that re-
quires automation. Electronic design automation (EDA) is a category of
software tools for designing electronic systems such as ICs. This aid has
been evolving together with the needs of VLSI design since the mid-70s.
Nowadays, given the level of complexity that VLSI design has reached, EDA
tools play a very important role in the fabrication of ICs.
Current workflows for the fabrication of chips are very modular, and an
overview of the process can be found in Figure 1.1. When designing a chip,
the first step is to obtain a design specification with the details of what
needs to be built, how the system should behave, etc. This step yields a high-level
description of the system in the form of a specification document that will
be handed down to the designers.
Figure 1.1: VLSI Design Flow
The following step is functional design, in which the design described in
the specification document is modeled in the register-transfer level (RTL)
design abstraction, which focuses on the flow of digital signals between hard-
ware registers. Hardware description languages such as Verilog and VHDL
can be used to create such high-level representation of circuits. The output
of this step is HDL code describing the functionality of the circuit.
Next comes logic synthesis, which is an automated process done by the
EDA tools that consists in taking the RTL description of the circuit and
converting it into Boolean expressions. After many logical optimization steps,
the resulting expressions are mapped into physical components and the gate
netlist, describing the connectivity of the electronic circuits, is produced.
The process continues by doing physical synthesis, which takes the in-
tended circuit design and decides the final physical layout of the circuit.
This is also an automated process done by the EDA tools and includes many
interesting steps which we will discuss in Section 1.2. The result of physical
synthesis is the set of masks that will be used by photolithography, as we will
see in Section 1.4.
The last step in the process is fabrication itself, which takes place in
the fabrication plants. A lot of steps are omitted and simplified in this ex-
planation, including the increasingly expensive verification phases that take
place between every pair of steps in the process to ensure no malfunctions
are introduced. It is important to note that, given a high-level specification,
multiple final circuits can be considered valid, allowing for a lot of decisions
and optimization to be done in order to generate a good physical design of
a circuit according to the desired goals: low area, low power consumption,
failure tolerance or a combination of all of them.
1.2 Physical Synthesis
Physical synthesis is the process that takes a circuit description as netlists
and defines the physical layout of the design. Figure 1.2 shows some of the
most important steps during physical design.
Figure 1.2: VLSI Physical Synthesis Flow
Floorplanning
This is the first step in the physical design flow. Floorplanning consists
in identifying structures that should be placed close together, capturing
relative positions rather than fixed coordinates. It can be considered a
generalization of placement, a first draft of how the components will be
allocated in the chip, allowing transformations of the components such
as rotations and modifying their shapes. Simulated annealing, trees
and slicing structures, as well as dynamic programming for floorplan-
ning optimization [3], are widely used in this area. A floorplan can be
optimized for metrics such as area, wirelength, routability and others.
Figure 1.3, taken from [4], shows two different floorplans for a given set
of components. The floorplan on the left is optimal in area while the
one on the right introduces white spaces.
Figure 1.3: (a) Optimal area floorplan (b) Non-optimal area floorplan
Partitioning
The netlist of the functions to implement can be very large. Partition-
ing is the process of dividing the chip into smaller blocks so that later
placement and routing are easier, using a divide-and-conquer strat-
egy to tackle design complexity. Many variants of partitioning exist,
such as two-way partitioning (one of the first approaches, [5]), multi-
way partitioning, which can be seen as an extension of the min-cut for
two-way partitioning, and multi-level partitioning, where the result is
represented by a tree structure. More partitioning approaches can be
found in [6].
Placement
This step consists in assigning cells to positions in the chip according
to some cost functions while preserving legality (for example, with no
overlapping). The inputs are the netlists and the goal is to find the best
position for each module considering wirelength, routability, density,
power and other metrics. Many placement styles exist depending on
the design methodology it is integrated with (such as building blocks,
standard cells or gate arrays). This step is tightly related to the next
phase, routing. Some placement paradigms are:
• Constructive algorithms, such that when the position of a cell
is fixed, it is not modified anymore. Some examples are cluster
growth, min-cut [7], or quadratic-placement algorithms (such as
Hall placement [8], the first analytical placer).
• Iterative algorithms, where intermediate placements are modified
in order to improve some cost function. This would include analyt-
ical methods such as force-directed placement. Figure 1.4, taken
from [4], shows several phases of the placement in a force-directed
algorithm. The elements approach their final position iteration
after iteration.
• Nondeterministic approaches, including metaheuristics like simu-
lated annealing [9] and genetic algorithms [10].
All these methods can be combined to obtain a more accurate result.
Additionally, other methods can be considered, for example a flow consisting
of a global placement and legalization phase followed by a
detailed placement step. There are many interesting research directions
in placement such as manufacturability-aware placement, but probably
the most interesting for our project would be approaches considering
routability-driven placement [11].
Figure 1.4: Placement of a chip
Routing
The routing process determines the precise paths for nets on the chip
layout, respecting a set of design rules to ensure that the chip can be
correctly manufactured. It requires a physical placement of the layout,
the netlists and the design rules required by the fabrication process.
The main aim is to complete all required connections on the layout,
although other objectives such as reducing total wirelength or meeting
timing requirements have become of essential relevance in modern chip
design. The routing phase represents a very complex combinatorial
problem. Usually, a two-step approach consisting of a global routing
followed by a detailed routing is used. The first considers the connec-
tion between different regions of the chip, while the second focuses on
obtaining a definite layout for the wire connections.
These are only some examples of steps where algorithms have become
indispensable for the design of ICs. As we can see, EDA tools have become
a necessary component of digital circuit design. From logic synthesis to
physical design, all of the steps involved have an important algorithmic load
and much effort has been invested in such crucial synthesis tools.
1.3 Routing
Routing is one of the multiple steps that take place in the physical design
process. In recent years many algorithmic techniques have been explored
to address the complex problem of determining how the components of cir-
cuits should be interconnected. As the number of transistors per chip grows,
the increasing complexity of the design becomes a challenge for the routing
stage.
The main aim of the routing problem is to find a valid interconnection of
terminals that honors a set of design rules. When routing, two kinds of con-
straints appear: performance constraints and design-rule constraints. The
objective of the performance constraints is to make connections meet the
performance specifications provided by the chip designers. Design rules, on
the other hand, are a set of additional constraints imposed by a given tech-
nology node that will have to be honored if we want the chip to be correctly
manufactured. They impose restrictions on, for example, the minimum width
of the wires or the wire-to-wire spacing. Another example of design rules is
related to the layer models, which can be either reserved or unreserved. In
the first case each layer is allowed only one routing direction, whereas the
placement of wires with any direction is permitted in the other one. Most of
the routers, however, use the reserved model because it has lower complexity
and is much easier for implementation; as we will see, manufacturability has
a great impact on many of the decisions taken during the physical design
flow.
There are many kinds of routing algorithms, and developing a complete
taxonomy of the router zoo would prove to be a very complex task. However,
one interesting division exists between sequential and concurrent routers.
Sequential routing
In this schema we select a specific net order and then route nets se-
quentially according to that order. The quality of the solution greatly
depends on the ordering given that an already routed net might block
the routing of subsequent nets. Often, a rip-up and re-route heuristic
is used to refine the solution. It basically consists in ripping up some
already connected nets and then re-routing the ripped-up connections.
It is usually performed iteratively until all nets are routed, a time limit
is exceeded or no gain is obtained. As we will see, our router uses a
similar heuristic to obtain better solutions once an initial legal routing
has been found. The main drawback of this method is that, because
of the dependence on net-ordering, if no feasible solution is obtained,
it is not clear whether it does not exist or the chosen ordering was not
good enough.
Concurrent routing
Concurrent global routing tries to establish all connections at the same
time. Therefore, whether or not a solution is found does not depend on
any net ordering. One of the most popular approaches is to model the
layout as a graph and then use 0-1 integer linear programming. However,
given that 0-1 integer linear programming is NP-complete, another approach is to solve the
continuous linear programming relaxation and then transform the fractional
solution into an integer solution through a rounding scheme such as
randomized rounding. In practice, such techniques are embedded into
larger global routing frameworks that use a hierarchical, divide-and-
conquer strategy. For routing multi-pin nets instead of two-pin nets,
those multi-pin nets are usually decomposed using minimum rectilinear
Steiner trees.
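As a minimal illustration of why sequential routing depends on the net ordering, the following Python sketch routes two-pin nets one at a time on a single-layer grid with a breadth-first (Lee-style) maze search. The grid size, the net names A and B and their pin positions are hypothetical values chosen for the example, and the sketch is not the router described in this thesis.

```python
from collections import deque


def maze_route(width, height, blocked, src, dst):
    """Return a shortest rectilinear path from src to dst that avoids
    blocked cells, or None if no such path exists (breadth-first search)."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        cell = queue.popleft()
        if cell == dst:
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and nxt not in blocked and nxt not in prev:
                prev[nxt] = cell
                queue.append(nxt)
    return None


def route_sequentially(width, height, nets, order):
    """Route the nets in the given order. Cells used by already routed
    nets, and the pins of all other nets, are obstacles for later nets."""
    pins = {p for a, b in nets.values() for p in (a, b)}
    used = set()
    routes = {}
    for name in order:
        src, dst = nets[name]
        blocked = (used | pins) - {src, dst}
        path = maze_route(width, height, blocked, src, dst)
        routes[name] = path
        if path is not None:
            used.update(path)
    return routes


if __name__ == "__main__":
    # Hypothetical instance: a 5x3 grid with two nets whose direct paths cross.
    nets = {"A": ((1, 1), (3, 1)), "B": ((2, 0), (2, 2))}
    print(route_sequentially(5, 3, nets, ["A", "B"]))  # both nets are routed
    print(route_sequentially(5, 3, nets, ["B", "A"]))  # net A is left unrouted
```

In this instance, routing B first fills the whole middle column of the grid, so A cannot be connected, whereas routing A first leaves room for B to detour. A failed sequential run therefore does not prove that the instance is unroutable.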
Another important classification of routing algorithms is whether they
aim at doing global or detailed routing. Global routing can be considered
the coarse case of routing, in which we define big routing regions and propose
paths for the signals, but without detailing the exact location of the wires.
That task is left to detailed routing, which works at a wire-segment level
and decides the final position of the wires. Our router, which will be presented
in detail in Chapter 2, is a detailed router with traits from both concurrent
and sequential routing. This means that it finds concrete routes for the wires
at a physical level, and that the algorithm uses concurrent routing to find an
initial solution, but later takes on a sequential routing strategy in order to
improve the initial solution with rip-up and re-route.
1.4 Design for Manufacturability
As we have seen in Section 1.1, the final result of the design and synthesis
processes are masks to be used during the photolithographic process, illustrated
in Figure 1.5. A light source emits monochromatic light that traverses
the mask in which the circuit patterns have been engraved. The die, covered
with a photosensitive layer, reacts to the exposure to light. The parts that
have been illuminated remain whereas all the others are eliminated after an
etching phase. Layers grow one above the other until all masks have been
applied. The patterns on the masks are the final product that the physical
design produces for a given circuit abstraction.
Figure 1.5: Lithographic process
In recent years, additional problems have added to the challenges
of circuit design itself. As the size of transistors decreases, the manufacturing
process has become increasingly complex. Previously, the standard
way of scaling down the size of technologies was to reduce the wavelength of
the source of light. However, for recent technology nodes in which the feature
size is smaller than the minimum wavelength (193 nm), this mismatch has become
a source of variability and error during the photolithographic process [12].
This gap, called the lithography gap, is illustrated in Figure 1.6. Technological
solutions such as extreme ultraviolet lithography (EUVL), electron
beam lithography and others have been constantly delayed, prompting
design and manufacturing engineers to find additional solutions for the problems
caused in the meantime.
Figure 1.6: Lithography gap. Source: [12]
One such solution was optical proximity correction (OPC), which consists
of pre-distorting the masks in such a way that, once exposed to the
light source, the imprinted result will be as close to the original we wanted
as possible. However, such methods rely on simulations and are very com-
putationally expensive, making them prohibitive.
The process is illustrated in Figure 1.7. On the first row we have the
case in which the mask has no optical corrections. After the projection
and the etching, the pattern that results is closer to an ellipse rather than
two rectangles. However, with the corrected mask, the result obtained after
exposure is closer to the original intended pattern.
Figure 1.7: Optical proximity correction example. Source: [13]
Additional litho-friendly layout techniques must be considered to deal
with all this increasing complexity. As we get into newer technology nodes,
the number of design rules increases enormously, resulting in more challenges
during the design stage. Manufacturability awareness during the design
phases of the circuit has become a must in order to produce good yields
at the end of the fabrication process.
For example, one of the techniques used during the lithography process
is double patterning (DP), which consists of dividing what normally would
be a single mask into two masks as shown in Figure 1.8. The masks would be
exposed one after the other during the manufacturing process.
By using this method, the pitch of the patterns on each mask can double, improving
lithography resolution. However, masks must be assigned to the components, which
is something that must be done during the design process. The basic rule
is that two components that are too close cannot share the same mask, and
the problem only gets more complex when moving to triple or multiple
mask patterning, the more general case, as feature size decreases. Even if
in the past the design and manufacturing stages might have been relatively
isolated, this is a perfect example of how manufacturability issues
are becoming more and more important during design, and considering the
delays in technology development this trend will only continue to grow in the
future.
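The basic mask-assignment rule for double patterning can be read as a graph 2-coloring problem: shapes are vertices, and an edge joins two shapes that are too close to share a mask. The following Python sketch is a minimal illustration under that assumption. The shape names and the conflict list are hypothetical, and real flows derive conflicts from spacing rules that are not modeled here.

```python
from collections import deque


def assign_masks(shapes, conflicts):
    """Assign mask 0 or 1 to every shape so that conflicting shapes get
    different masks. Return None when no such assignment exists."""
    neighbours = {s: [] for s in shapes}
    for a, b in conflicts:
        neighbours[a].append(b)
        neighbours[b].append(a)

    mask = {}
    for start in shapes:
        if start in mask:
            continue
        # Breadth-first traversal of one connected component,
        # alternating masks along every conflict edge.
        mask[start] = 0
        queue = deque([start])
        while queue:
            s = queue.popleft()
            for t in neighbours[s]:
                if t not in mask:
                    mask[t] = 1 - mask[s]
                    queue.append(t)
                elif mask[t] == mask[s]:
                    return None  # odd cycle of conflicts: two masks are not enough
    return mask


if __name__ == "__main__":
    # Hypothetical layout: three parallel wires, neighbours too close to each other.
    print(assign_masks(["w1", "w2", "w3"], [("w1", "w2"), ("w2", "w3")]))
    # Three mutually conflicting shapes cannot be split over two masks.
    print(assign_masks(["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")]))
```

With two masks the check runs in linear time. With three or more masks the corresponding coloring problem is NP-complete in general, which is consistent with the remark that multiple patterning is harder.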
Figure 1.8: Double patterning example. Source: [12]
One of the strategies that has been increasing in popularity in recent
years is the regularity of the designs. In regular designs, we want to use
repetitive layout patterns and blocks, which require simplified masks and al-
low for a better process control [14]. This can be achieved by using simplified
design models and restrictions, such as gridded models, which a priori would
restrict the possible designs, but on the other hand allow for new families of
algorithms to be used during the physical design phases. Additionally, OPC
and DP are not effective on complex layouts with arbitrary shapes, but are
most useful when using regular layouts.
1.5 Boolean satisfiability
After a brief introduction to the world of circuit manufacturing and design,
it is time to introduce another family of key concepts in this thesis: those
related to the Boolean satisfiability (SAT) problem. Let us define an interpretation
of a propositional logic formula as an assignment of truth values to its
variables. The Boolean satisfiability problem can be stated as follows: given
a propositional logic formula, is there any interpretation that satisfies it?
This problem is one of the most extensively studied in computer science
literature, given the theoretical importance it was shown to have in the field
of computational complexity. However, in recent years, the programs dedicated
to solving the SAT problem have become an important tool in the
computer scientists’ repertoire when dealing with combinatorial optimization
problems, especially those that can easily be expressed in propositional
logic. This has been thanks to advances and continuous research on how to
build more efficient SAT-solvers and efforts on understanding the problem’s
characteristics. One illustrative example of such interest is the yearly SAT
competitions, in which solvers from all around the world
compete to solve problems, with several categories and tracks [15].
But why are advances in SAT so interesting? How can the SAT problem
help in problems apparently as unrelated as industrial planning, scheduling
of football leagues or the routing of standard cells? It is done by reducing
those problems to SAT, as shown in Figure 1.9. Consider a black-box
SAT-solver that receives a formula F in conjunctive normal form (CNF) as input and returns “YES”,
together with a model that satisfies F, if F is satisfiable, or “NO” otherwise. Reducing
a problem to SAT consists of encoding our problem into a formula that we
can give as an input to a SAT-solver in such a way that we can, in return,
construct a solution to our problem from the answer the SAT-solver has
provided. We move our original problem to the Boolean domain in such a
way that, if the formula is satisfiable, we can translate the formula model
to our original domain, obtaining a valid solution to our problem, and if it
is unsatisfiable we know there is no solution for our problem. This is the
approach used by the router we are extending in this work to route standard
cells, and we will try to extend it by modifying both the encoding of the
problem into a formula and the internals of the SAT-solver, which will not
be a black box anymore.
Figure 1.9: Solving problems using SAT
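As a minimal sketch of the reduction scheme described above, the following Python fragment encodes a toy problem (graph coloring, chosen only for illustration and not taken from the router), hands the CNF to a black-box solver and translates the model back. The brute-force solver and all function names are assumptions made for this example; a real flow would call a solver such as MiniSat instead.

```python
from itertools import product

def brute_force_sat(num_vars, clauses):
    """Stand-in for a black-box SAT-solver: returns a model or None."""
    for bits in product([False, True], repeat=num_vars):
        model = {v + 1: bits[v] for v in range(num_vars)}
        # A clause holds if at least one of its literals is true in the model.
        if all(any(model[abs(l)] == (l > 0) for l in c) for c in clauses):
            return model
    return None

def encode_coloring(edges, num_nodes, num_colors):
    """Encode: variable var(v, c) is true iff node v receives color c."""
    var = lambda v, c: v * num_colors + c + 1
    clauses = []
    for v in range(num_nodes):
        clauses.append([var(v, c) for c in range(num_colors)])  # at least one
        for c1 in range(num_colors):
            for c2 in range(c1 + 1, num_colors):
                clauses.append([-var(v, c1), -var(v, c2)])      # at most one
    for u, v in edges:
        for c in range(num_colors):
            clauses.append([-var(u, c), -var(v, c)])            # neighbors differ
    return clauses, var

def solve_coloring(edges, num_nodes, num_colors):
    """Reduce to SAT, solve, and decode the model into the original domain."""
    clauses, var = encode_coloring(edges, num_nodes, num_colors)
    model = brute_force_sat(num_nodes * num_colors, clauses)
    if model is None:
        return None  # unsatisfiable formula: the original problem has no solution
    return [next(c for c in range(num_colors) if model[var(v, c)])
            for v in range(num_nodes)]
```

For a triangle, two colors yield no solution while three colors yield a valid assignment, mirroring the “NO” and “YES” answers of the black box.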
1.6 MiniSat
The SAT-solver we will be using in the implementation of the project is
MiniSat. It is a small, conflict-driven, clause-learning SAT-solver with good
documentation and very understandable, extensible code [16]. We will now
present the basics of how MiniSat works. The core of the solver is the DPLL
algorithm. We begin with an empty assignment. We pick some variable,
decide a value for it and propagate that decision, which forces the values of
other variables. If propagation leads to a conflict, then our last decision
must be wrong and we go back to that point, giving the variable the opposite
value. We proceed this way until we find a model for the formula or exhaust
the search space. Modern SAT-solvers are enhanced with several additional
systems and heuristics that work in harmony when all of them are present:
resets (usually called restarts in the SAT literature), clause learning and
conflict analysis.
Conflict Analysis
In the most basic SAT solvers, when a conflict is detected after deciding
on a variable, this last decision is always the one that is undone. However,
more powerful techniques that analyze the propagations leading to the
conflict can force backtracking to earlier points of the search tree. This
allows us to discover propagations that we could have done earlier, and
thus we backtrack to those levels in order to do so.
Clause Learning
During conflict analysis, intermediate lemmas are produced. These lemmas
are independent of the point of the search at which we are, and explain
why there was a conflict. We can store them in a learnt clauses database,
which is also considered during propagation, preventing similar conflicts
later in the search.
Resets
Using the reset system means that we will fix a budget on the number
of conflicts of the DPLL search. Once the number of conflicts has been
reached, the whole assignment is deleted and the search begins all over,
but keeping the learnt lemmas. In this way we will diversify the search
and explore different parts of the search tree.
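The basic decide-and-propagate procedure described at the start of this section can be sketched as follows. This is a plain DPLL illustration with chronological backtracking only, written under the assumption that clauses are lists of signed integers; it does not include conflict analysis, clause learning or resets, and it is not MiniSat's implementation.

```python
def unit_propagate(clauses, assignment):
    """Assign literals forced by unit clauses. Returns False on a conflict."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned, satisfied = [], False
            for lit in clause:
                value = assignment.get(abs(lit))
                if value is None:
                    unassigned.append(lit)
                elif value == (lit > 0):
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return False              # every literal is false: conflict
            if len(unassigned) == 1:      # unit clause: the literal is forced
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0
                changed = True
    return True

def dpll(clauses, num_vars, assignment=None):
    """Return a model (dict var -> bool) or None if the formula is unsatisfiable."""
    assignment = dict(assignment or {})
    if not unit_propagate(clauses, assignment):
        return None
    free = [v for v in range(1, num_vars + 1) if v not in assignment]
    if not free:
        return assignment                 # all variables assigned, no conflict
    var = free[0]                         # naive decision heuristic
    for value in (True, False):           # try a value, then the opposite one
        result = dpll(clauses, num_vars, {**assignment, var: value})
        if result is not None:
            return result
    return None
```

The naive choice of `free[0]` is exactly the part that an activity-based heuristic replaces in a modern solver.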
The task of choosing which variable to decide on is crucial for good
performance in a SAT-solver. Every variable in MiniSat has an activity
value, which acts as a priority and is initially set to 0. A max-heap on the
activity of all the variables of the problem is used to determine which one
to decide on next, following the rationale that deciding on the most active
variables can help untangle the most crucial parts of the formula. The
activity of a variable depends exclusively on its role in conflicts: every
time a variable appears in conflict analysis, its activity is bumped.
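The activity-driven selection just described can be sketched with a max-heap as follows. This is only an illustration: the class and method names are hypothetical, Python's `heapq` min-heap is used with negated activities, and stale heap entries are skipped lazily rather than updated in place as an indexed heap would do.

```python
import heapq

class ActivityOrder:
    """Picks the unassigned variable with the highest activity."""

    def __init__(self, num_vars):
        # Every variable starts with activity 0.
        self.activity = {v: 0.0 for v in range(1, num_vars + 1)}
        self.heap = [(0.0, v) for v in range(1, num_vars + 1)]
        heapq.heapify(self.heap)

    def bump(self, var, inc=1.0):
        """Called when var takes part in conflict analysis."""
        self.activity[var] += inc
        heapq.heappush(self.heap, (-self.activity[var], var))

    def reinsert(self, var):
        """Called when var becomes unassigned again after backtracking."""
        heapq.heappush(self.heap, (-self.activity[var], var))

    def pick(self, assigned):
        """Pop entries until an up-to-date, unassigned variable is found."""
        while self.heap:
            neg_activity, var = heapq.heappop(self.heap)
            if -neg_activity != self.activity[var]:
                continue                  # stale entry from before a bump
            if var not in assigned:
                return var
        return None
```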
A basic overview of the algorithm that MiniSat uses is presented in
Algorithm 1. We begin by initializing a reset count to 0 and entering a loop
that encloses the whole search. As described above, the reset system fixes a
budget on the number of conflicts of the DPLL search; once the budget is
exhausted, the whole assignment is deleted and the search begins all over,
keeping the learnt lemmas. Every iteration of the loop thus represents one
reset during the search.
The first thing to be done in a solver iteration is to determine the conflict
budget for the round. Many strategies exist for the math_series function,
which returns the budget given the round we are in, but MiniSat uses the
Luby sequence by default. Next, we call the SAT search function with this
budget and wait for an answer. If the result is SAT, we get the model and
we are done. If it is UNSAT, we are also done. Otherwise, we clean the
learnt clauses database if there are too many unused lemmas and proceed to
the next iteration.
Algorithm 1 SAT solver
  reset_count ← 0
  while true do
    reset_model()
    n_conflicts ← math_series(reset_count)
    result ← search(n_conflicts)
    if result = SAT then
      print_model()
      return
    else if result = UNSAT then
      print_unsat()
      return
    else
      reset_count ← reset_count + 1
      learnt_clauses_cleanup()
    end if
  end while
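To make the role of the budget function concrete, the sketch below computes the Luby sequence (1, 1, 2, 1, 1, 2, 4, ...) and derives a conflict budget from it. The function names and the base budget of 100 conflicts are illustrative assumptions, not values taken from this text.

```python
def luby(i):
    """Return the i-th element (1-based) of the Luby sequence."""
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    if (1 << k) - 1 == i:
        return 1 << (k - 1)           # end of a block: 2^(k-1)
    return luby(i - (1 << (k - 1)) + 1)  # otherwise repeat the earlier prefix

def conflict_budget(reset_count, base=100):
    """Budget for a round; reset_count starts at 0 as in Algorithm 1.

    The multiplier `base` is a hypothetical choice for this example.
    """
    return base * luby(reset_count + 1)
```

Most rounds get a small budget, with occasional longer rounds, which is what lets the search diversify while still allowing deep exploration from time to time.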
The inner search loop is shown in Algorithm 2. We begin in an undefined
state and enter the main loop. The loop first checks whether there has been
a conflict after the propagations of the last iteration. If this is the
case, conflict analysis identifies an earlier level at which we could have
propagated some information and updates the activity of the variables
involved in the conflict. A lemma explaining the conflict is also added to
the learnt clauses database in order to avoid falling into the same conflict
again in the future. If the suggested backjump level is 0, then the conflict
brings us to a level where no decision has been taken and we know the
formula is unsatisfiable. If we have already had too many conflicts, then we
are outside our budget and we must restart the search. Otherwise, we jump
back to the level determined by the conflict analysis and reset the state to
undefined.
Algorithm 2 SAT search
  state ← undef
  while true do
    if state = conflict then
      level ← conflict_analysis()
      if level = 0 then
        return UNSAT
      end if
      if run_out_of_conflict_budget() then
        return UNDEF
      end if
      backjump(level)
      state ← undef
    end if
    if state = SAT then
      return SAT
    end if
    next_var ← pick_decision_variable()
    add_to_model(next_var)
    state ← propagate(next_var)
  end while
If there was no conflict, we check whether the formula is already satisfied,
and return if that is the case. Past this point, we must decide on a new
variable and propagate. To do so we have a function that picks the next
variable by taking the variable with the most activity out of a max-heap. It
pops variables from the heap until an unassigned variable is found, which
becomes next_var. Then we assign a value to this variable and propagate it,
finishing the current iteration.
In this way, by alternating an outer loop that controls the resets and learnt
lemmas and an inner loop that performs the main search, the solver keeps
iterating until a SAT/UNSAT decision is reached or some other hard limit,
such as execution time, is hit. As explained before, MiniSat is a
competitive SAT-solver which, given the small source code size, clean struc-
ture and additional documentation explaining the implemented algorithms,
is especially suitable to be extended. The SAT competition has a track
dedicated to “MiniSat hacks”, showing the popularity of the solver in the
SAT-solving community.
1.7 Conclusions
In this chapter we have provided background and motivation for our project.
First we have introduced what VLSI and EDA are, and why they are essential
in the fabrication of integrated chips. After an overview of some
characteristic problems in the field in which algorithmics plays a key role,
we have taken a closer look at the routing problem, providing some basic
considerations for tackling it. Next we have introduced the drive for
regularity in the design process as a measure of design for
manufacturability, a trend that has been rising in recent years due to the
fabrication challenges of integrated chips and that motivates the
development of our router. Finally, insight into how the router works has
been given by introducing the Boolean satisfiability problem and how it is
used to solve computationally challenging problems that can be expressed as
a set of relations among binary variables. We have also presented the basic
algorithms and techniques behind a popular SAT-solver, MiniSat, which we
will be extending in this project. More detail on how the router works will
be given in the following chapter.
Chapter 2
The Router
As explained in the abstract, the aim of this project is to develop divide-and-
conquer strategies on a previously existing framework to route standard cells.
This router uses a technology-independent and parametrizable approach that
can be adapted to different fabrics and rules. It uses a Boolean formulation of
the problem to find a legal detailed routing of a cell represented by a gridded
layout. However, as cells become larger, approaches such as the one this
project explores become mandatory to keep SAT formulas tractable. This
chapter provides basic insight into how the router tool works, together with
the vocabulary that will be used extensively later.
2.1 Previous Approaches
As explained in Section 1.1, routing and manufacturability-aware design have
attracted a lot of attention in recent years. This routing tool addresses
both issues by considering geometrical regularity in the routing process. It
is not the first time that a Boolean formulation of the problem has been
presented. The approach in [17] uses a formulation in which every wire
segment is represented by a Boolean variable. However, the complexity of the
problem restricted its applicability to small cells. This formulation is
compared to our router in the router’s original article [1]. The encoding of
[17], called dense, has the smallest number of variables, but does not allow
for the additional optimizations enabled by the encoding proposed in [1].
Additionally, algorithms based on using SAT on regular layout fabrics
had already been proposed in [18]. The variables used in this approach
are not wire segments, but entire paths: a maze router generates sets of
routes for every pair of terminals, and then the Boolean formulation ensures
that a set of routes connecting all terminals is chosen. The constraints are
essentially conflicts between entire routes. Detecting these conflicts is a
technology-dependent process, which ties the router to a concrete fabric and
a specific set of design rules.
As will be shown later, our router uses a Boolean encoding that is closer
in spirit to the one presented in [17], but at the same time this thesis proposes
to combine it with the approach in [18] in order to obtain further speedup
on our original router.
2.2 The Routing Problem
The router we are working on proposes an algorithmic approach for a generic
problem of cell routing, which has the following characteristics.
• It should be independent from the layout templates and the intercon-
nect resources, so that it can be configured with the resources available
at any technology generation.
• Parametrizable attributes should be allowed for every wire segment, for
example different wire widths, assignment to masks, etc.
• It should allow the router to select the best I/O pin locations.
• It should be independent from the set of design rules.
• In the case of unroutable cells, externally connected pins should be
allowed.
• A set of recommended design rules to improve yield should be specifi-
able.
• Wirelength should be a parameter for optimization.
To do so, the router tool uses an encoding scheme for SAT-based for-
mulas that makes large cells tractable by applying windowing heuristics, as
explained in Section 2.4. A formalism to specify gridded design rules and
multiple-patterning constraints is provided. The router also uses heuristics
for quality improvement (wirelength and recommended design rules) and al-
lows the connection of external pins in case of unroutability.
Graphs are used to represent the gridded routing problem. Every net has
a set of terminals that must be connected. Each terminal is represented by
a set of vertices. Edges represent wire segments that can be used to connect
pairs of vertices. The routing problem is defined as follows.
Find a set of edges that define routes connecting the terminals of each
net. The routes must be disjoint (cannot have common vertices) and satisfy
a set of design rules.
It is important to realize that the number of possible solutions is finite.
The problem can be reduced to a SAT formula in which a variable is associated
with every edge, representing the presence or absence of a given signal in
that position. To find a solution of maximum quality, the router uses two
steps.
1. Finding an initial legal solution that honors the design rules.
2. Improving the solution by iteratively re-routing nets and using quality
terms in the cost function.
2.3 Routing Problem Representation
The routing region is represented by a 3D undirected grid graph G(V,E) as
depicted in Figure 2.1(a).
The vertices have associated integer coordinates in
{1, ..., W} × {1, ..., L} × {1, ..., H}, where W, L and H represent width,
length and height. The edges of the graph connect grid points. Notice that
the physical grid being represented need not be uniformly spaced like the
logical grid, as can be seen in Figure 2.1(b).
Figure 2.1: (a) Grid model for routing. (b) Physical and logical grid
Every vertex v is denoted by its coordinates v = (x(v), y(v), z(v)). In our
context, z(v) represents the layer of the layout, thus z(v) ∈ {pd, m1, m2} as
shown in Figure 2.1(a). Every edge is denoted by its endpoints, as in
e(v, u). We also define a net n ⊂ V as a set of grid points that must be
connected. A subnet is a pair of terminals of the same net.
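The grid graph G(V, E) can be sketched in a few lines of Python. The fragment below assumes, purely for illustration, that every pair of neighboring grid points is joined by an edge; the actual router may offer fewer edges depending on the layer and the available interconnect resources. The function name is hypothetical.

```python
from itertools import product

def build_grid(W, L, H):
    """Return (vertices, edges) of a W x L x H grid with 1-based coordinates."""
    vertices = list(product(range(1, W + 1), range(1, L + 1), range(1, H + 1)))
    edges = []
    for (x, y, z) in vertices:
        # Connect each vertex to its successor along x, y and z (the layer axis).
        for dx, dy, dz in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            u = (x + dx, y + dy, z + dz)
            if u[0] <= W and u[1] <= L and u[2] <= H:
                edges.append(((x, y, z), u))
    return vertices, edges
```

With three layers (H = 3, standing for pd, m1 and m2), edges along the z axis play the role of connections between layers.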
A Viewer program is provided in order to see what a grid looks like. Given
the description of a grid, it shows a 3D representation of it using OpenGL,
allowing interaction with the model including zoom and rotation. Figure 2.2
represents an instance of a real routing problem. Each color represents a
different net or signal. We can see in the lowest layer some terminals that
need to be connected. On the top and bottom of the cell we can see the VDD
(voltage, red) and VSS (ground, grey) lines crossing the second layer. On
the bottom left side of the screen, a complete list of the signals that appear
on the cell is provided.
Figure 2.2: Routing grid problem instance
2.4 Encoding to SAT
Now that we have some terminology, we can give a broad overview of how
the router uses SAT to solve the routing problem. The router encodes the
routing problem as a CNF formula that can be given as input to a SAT
solver. First, some variables to model the problem are needed.
• ρ(e): A variable that represents when edge e is occupied by a wire.
• ρ(e, n): A set of variables that represent the associated net in case e is
occupied by a wire.
• ρ(e, n, s): A set of variables that represent the subnets associated to
every wire.
Given these variables, the Boolean formula F that represents the problem
is as follows.
F ≡ C ∧R ∧DR
We now briefly describe each of the components of F.
C, Consistency constraints
These clauses ensure the consistency of the formula. For example, they make
sure that if an edge is associated with a net, that edge is occupied by a
wire, or that if an edge is associated with some subnet of a net, it is also
associated with that net.
R, Routability constraints
This clause set represents the routing constraints for the grid. For
example, we must impose that each edge is assigned to at most one net
and that two adjacent wires are assigned to the same net.
DR, Design-rule constraints
Finally, these clauses represent constraints imposed by the user-defined
set of design rules. Such design rules might impose, for example, that
no adjacent vias can be connected to different nets. Extra clauses
modeling wire attributes are included among these constraints.
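Two of the examples given above, the consistency implications and the rule that each edge is assigned to at most one net, can be written as CNF clauses as sketched below. The variable numbering scheme, the `VarPool` helper and the function names are assumptions made for this example and do not reflect how the router stores its variables.

```python
class VarPool:
    """Maps keys such as (e,), (e, n) or (e, n, s) to positive integers."""

    def __init__(self):
        self.ids = {}

    def var(self, *key):
        return self.ids.setdefault(key, len(self.ids) + 1)

def consistency_clauses(pool, edges, nets):
    """nets maps each net to its list of subnets."""
    clauses = []
    for e in edges:
        for n, subnets in nets.items():
            # rho(e, n) => rho(e), written as (not rho(e, n)) or rho(e)
            clauses.append([-pool.var(e, n), pool.var(e)])
            for s in subnets:
                # rho(e, n, s) => rho(e, n)
                clauses.append([-pool.var(e, n, s), pool.var(e, n)])
    return clauses

def at_most_one_net(pool, edges, nets):
    """Pairwise encoding: no edge carries two different nets."""
    names = list(nets)
    clauses = []
    for e in edges:
        for i in range(len(names)):
            for j in range(i + 1, len(names)):
                clauses.append([-pool.var(e, names[i]), -pool.var(e, names[j])])
    return clauses
```

Every clause produced here has exactly two literals, which is consistent with the short clauses that make the formula friendly to unit propagation.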
An important idea, which is also encoded in the form of routability
constraints, is windowing. Empirically, it has been shown that the route of
a two-terminal subnet rarely extends beyond the bounding box determined by
the two terminals. Thus, clauses that force the variables of the subnet
outside this region to false can also be added. This may cause no solution
to be found even if one exists, but it greatly improves the tractability of
the problem. The halo parameter indicates how far outside the bounding box a
subnet is allowed to expand.
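The windowing idea can be sketched as follows. The fragment assumes, for illustration only, that the window is computed on the x and y coordinates, applies to all layers and is clipped to the grid; the function names and the way variables are looked up are hypothetical.

```python
def window(t1, t2, halo, W, L):
    """Bounding box of two terminals, grown by halo and clipped to the grid."""
    x_lo = max(1, min(t1[0], t2[0]) - halo)
    x_hi = min(W, max(t1[0], t2[0]) + halo)
    y_lo = max(1, min(t1[1], t2[1]) - halo)
    y_hi = min(L, max(t1[1], t2[1]) + halo)
    return x_lo, x_hi, y_lo, y_hi

def windowing_clauses(edges, box, var_of):
    """Unit clauses falsifying the subnet variable of every edge outside box.

    var_of maps an edge to the integer of its rho(e, n, s) variable for the
    subnet being windowed.
    """
    x_lo, x_hi, y_lo, y_hi = box
    inside = lambda v: x_lo <= v[0] <= x_hi and y_lo <= v[1] <= y_hi
    return [[-var_of[e]] for e in edges
            if not (inside(e[0]) and inside(e[1]))]
```

A larger halo produces fewer unit clauses and therefore a larger search space, which is the trade-off between completeness and tractability mentioned above.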
The router allows for any discrete set of attributes to be binary-encoded
and incorporated to the formula. For example, wires might have two different
widths (thin and thick) or could be assigned to different masks to comply
with some patterning lithography rules. Additional variables of the form
ρ(e, x), which represent the presence of attribute x on edge e, are then
added, along with the clauses needed to deal with such attributes.
The formula F thus generated is given as input to the MiniSat SAT-solver,
which returns a satisfying model if one exists. Given this model, a solution
for the original routing problem can be obtained: this is the main goal of
the router. In Figure 2.3 we can see a solution to the routing problem in
Figure 2.2.
Figure 2.3: First obtained solution for 2.2
As stated before, among all valid solutions, some have better quality than
others. For example, cells with smaller wirelength are preferred. The
solution proposed by the SAT solver in Figure 2.3 has many redundant wires.
The router improves the solution with a heuristic method built on a mixed
integer linear programming engine, Gurobi. However, the full ILP model
becomes intractable when dealing with large cells. Large neighborhood search
is therefore used in combination with ILP to reduce the complexity of the
problem.
The algorithm consists of ripping up and re-routing nets, starting from the
basic solution obtained using the SAT solver, until no significant
improvement is observed. In practice, this takes two rounds of re-routing
for each net. This
strategy admits variants such as ripping and re-routing more than one net
simultaneously; additionally, other aspects such as the ordering of the nets
could be considered to search for even better local minima. Figure 2.4 shows
an optimized version of the first obtained solution.
Figure 2.4: Optimized solution for 2.2
2.5 Results
The router tool was used to synthesize the Nangate 45nm Open Cell Library,
which contains 127 cells. All layouts were checked for design rule correctness
and can be found in
https://www.cs.upc.edu/~jpetit/CellRouting
The encoding scheme of the router is compared to another SAT formulation
of the routing problem presented in [17]. As we can see in Table 2.1, the
router’s sparse encoding outperforms the dense encoding used in the previous
work on all but the smallest cell. This is explained by two facts: the
application of the windowing heuristic and the structure of the clauses. In
the sparse encoding, 98.5% of the clauses have 2 or 3 literals, whereas in
the dense encoding only 40% of the clauses fall into this category. The
whole library was routed in 45 minutes using the sparse encoding.
                      Sparse (w=6)        Dense [17]
Cell         Area    Size     CPU      Size        CPU
OAI221_X1       5     158     0.6        96        0.1
HA_X1           9     295     1.5       239       29.4
FA_X1          15     857    27.9       486     2959.0
DFFS_X1        21    1429    31.4       903      205.4
SDFF_X1        25    2755    84.9      1380     1424.0
SDFFRS_X2      33    4302   142.1      2679   40 hours
Table 2.1: Results for SAT solving (Size in 10^3 literals, CPU in seconds)
However, we must take into account that the router has been applied to
cells of a limited size. What happens when it has to deal with bigger, more
complex cells? Given that the tool is based on a SAT-solver, and SAT is a
hard problem, as soon as the complexity of the problem scales, it becomes
intractable. As explained before, this project aims to find a way for such
hard cells to be routed by using problem-specific information to help the
router find paths for the signals, as we will see in the following chapters.
2.6 Conclusions
This chapter has explained the basics of how the router tool works. First,
it has defined precisely what the routing problem for a cell is and how the
router internally represents the routing grids. Figures illustrating an
example of the routing flow have been provided. A basic description of the
structure of the SAT formula has also been presented, together with some of
the original results for the router tool. The next chapters will focus on
how to interact with the SAT-solver to find the initial routings faster.
Chapter 3
Enhanced SAT formulation
Encoding the routing problem into a Boolean formula and feeding it to a
SAT-solver is simpler than designing a dedicated algorithm. Once a good
encoding has been found, there is no need to worry about the search: we
just treat the SAT-solver as a black box and wait at the other end for a
solution to our problem. This abstraction proves useful for many problems
but, when pushing for performance, it may not be enough. What if we could
tweak the SAT-solver search to make it more routing-friendly? Intuitively,
if one were to perform the search by hand, the shortest paths would be the
first ones to be tested. We also know that a solution in which some paths
have a lot of corners is not likely to be routable. Can we help the
SAT-solver (MiniSat in our implementation) make decisions that will find a
model faster? This is what we propose to do in the present work.
In this chapter we explain the strategy that we have implemented in order
to speed up the SAT search using problem-specific information. In the first
section we describe our auxiliary variables and how we encode routes for
electric signals in the Boolean formula. Later, in Section 3.2 and Section 3.3,
we discuss how we enhance MiniSat to use these extra constructions. Finally,
in Section 3.4, experiments show how our method works, together with details
on how we assessed the validity of the approach, and in the last section we
draw the conclusions of this chapter.
3.1 Highway Basics
As we have previously explained, we want to help the SAT-solving phase
solve the routing instances of our problem. To do so we propose to extend
the Boolean formulation of the problem by adding new variables that encode
entire paths. As discussed in Section 2.1, past approaches to encoding
routing into SAT used either low-level metal segment variables or variables
for entire paths; we propose using both. In our case, the number of path
variables will be small and they will only be used as a guide. The main
variables of our model remain the ones used in the original router.
Recall that the original encoding has the following three variable types.
ρ(e, n, s)
A variable indicating if the subnet s of net n is traversing the edge e.
ρ(e, n)
A variable indicating if any subnet of net n is traversing the edge e.
ρ(e)
A variable indicating if any subnet of any net is traversing the edge e.
As explained before, activating ρ(e, n, s) propagates the activation of
ρ(e, n) and ρ(e) because of the consistency rules, and possibly triggers
other propagations because of the other clauses.
Now, what we want to do is add variables that each represent an entire
signal path (from now on called highways). For instance, suppose we decide
that some path would be a good highway and we want a variable that points
directly to it, so that the solver can activate the whole path with a single
decision. That path, which is a set of edges of some subnet s of net n, can
be decomposed into a set of variables of the form ρ(e, n, s), the ones we
had in the original problem. We create an additional auxiliary variable h
encoding the path and, for each variable of the form ρ(e, n, s) in the path,
we add the constraint
h⇒ ρ(e, n, s)
That is, activating the highway variable implies that all of the
segment-level variables that represent the path are activated. The highway
variable is just a shorthand for activating the whole path. Notice that the
converse implication is not needed, since we want decisions on highway
variables to affect the low-level variables but not vice versa.
Additionally we impose that, among the k highways included in the
formulation, at least one is activated, that is,
h1 ∨ h2 ∨ . . . ∨ hk
This clause is needed because otherwise MiniSat’s simplifier sets all
highway variables to false and they become useless. By imposing this extra
condition we enforce that at least one highway is picked. This means that
the extension can produce false negatives: when no highway in the formula
can be activated, the formula is unsatisfiable even if the formulation
without highways is satisfiable. However, this scenario is very unlikely and
would only happen when experimenting with very reduced highway sets.
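The highway clauses described above can be generated as in the following sketch. It assumes that each highway is given as the list of (e, n, s) keys of its segment variables and that fresh variable numbers start at `first_free_var`; these conventions and the function name are hypothetical and serve only to illustrate the encoding.

```python
def highway_clauses(highways, segment_var, first_free_var):
    """Return (highway_vars, clauses) for a list of highway paths.

    segment_var maps an (e, n, s) key to the integer of rho(e, n, s).
    """
    clauses = []
    highway_vars = []
    next_var = first_free_var
    for path in highways:
        h = next_var                      # fresh auxiliary variable for this path
        next_var += 1
        highway_vars.append(h)
        for key in path:
            # h => rho(e, n, s), written as (not h) or rho(e, n, s)
            clauses.append([-h, segment_var[key]])
    if highway_vars:
        clauses.append(list(highway_vars))  # at least one highway is activated
    return highway_vars, clauses
```

Only implications from h to the segment variables are emitted, so deciding a segment variable never forces a highway variable.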
Just adding these extra variables is not enough to help the SAT-solver find
a routing faster. They are more abstract and high-level than the segment
variables, given that changes in the highway variables affect the segment
variables but not so much the other way around. Highway variables encode
structures that rarely appear in conflicts during the SAT search, which are
normally resolved at the low-level segment level. Given that MiniSat chooses
decision variables according to their appearances in conflicts, highway
variables rarely appear as decision variables if no further modifications
are made. To ensure that they are used, we modify the internals of MiniSat
so that it can use the highway variables to its advantage.
In order to ensure MiniSat uses our extra variables we add two mecha-
nisms. On one hand, we allow the highway variables to begin with initial
activity, granting us a first phase of the search in which they are thoroughly
used. On the other, we prepare a mechanism to allow highway variables to
take priority over low-level variables, but at the same time letting MiniSat
decide which nets and subnets are the most interesting at every point of the
execution.
3.2 Initial Priority
We would like the solver to use the highways at the beginning of the search,
with the intuitive idea that this way we can set up a good direction in which
to explore later instead of following not-so-promising search paths. In a
normal solver execution all variables have zero priority at the beginning of
the search. We allow the highways to begin with a predefined priority, such
that the first decisions of the solver are taken on highway variables. Later,
these initial priorities become small in comparison with the priorities of
the low-level segment variables of the router; the second part of our
strategy, presented in Section 3.3, begins acting at this point.
The initial priority depends on an ordering of the subnets (called inter-
subnet ordering) and an ordering of the highways inside the subnet (called
intra-subnet ordering), both defined by the router. Depending on these
orderings we prioritize some highways over others. The inter-subnet ordering
is generated according to one of the following criteria, producing three
alternative algorithms.
1. Number of columns a subnet spans
If the two components of a subnet are very far apart we want to route
them at the beginning. If we route other subnets first they will occupy
space in between and potentially render the subnet unroutable, there-
fore we want to route first those subnets in which both components are
the furthest apart.
The procedure is described in Algorithm 3. It consists in going over
all nets and their subnets. For each one of them we obtain one of the
nodes in each of the two zones that we want to connect. Given that all
nodes of a zone are in the same column we do not care which one we
get. We follow by calculating the number of columns the subnet spans
by simply taking the difference of their column positions. We add a
tuple containing net, subnet and span to our set and we finally order
the elements in the set according to the column span distance of their
components, which is the third component of each one of the elements.
This ordering is the one we wanted to compute.
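The span ordering described above can be sketched in Python as follows. The input layout (a dictionary from nets to lists of subnet tuples) and the `column_of` callback are assumptions made for the example, and sorting in decreasing span is our reading of the statement that the subnets whose components are furthest apart should be routed first.

```python
def span_ordering(nets, column_of):
    """nets maps a net to a list of (subnet, src_node, dst_node) tuples.

    column_of(node) returns the column of a node. Returns (net, subnet, span)
    triples ordered by the third field, widest span first (an assumption).
    """
    V = []
    for n, subnets in nets.items():
        for s, src, dst in subnets:
            # All nodes of a zone share a column, so any representative works.
            span = abs(column_of(src) - column_of(dst))
            V.append((n, s, span))
    V.sort(key=lambda triple: triple[2], reverse=True)
    return V
```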
Algorithm 3 Span Ordering
  V ← ∅
  for each net n ∈ nets do
    for each subnet s ∈ subnets(n) do
      src ← node_from_source(s)
      dst ← node_from_target(s)
      span ← |column(src) − column(dst)|
      V ← V ∪ {(n, s, span)}
    end for
  end for
  Order elements in V by the third field
2. Congestion of the columns in the span of a subnet
The congestion of a column is a lower bound on the number of nets
that must cross that column of the cell. Intuitively, given a column,
for every subnet that has some component to its left and some component
to its right, some wire of the net will need to cross the column.
In particular we count the number of subnet spans crossed by that
column, given that having three subnets of some net crossing a column
is not the same congestion as having only one.
In this ordering, subnets crossing congested columns should have priority,
given that they are in a critical area of the cell. The procedure is
described in Algorithm 4. The algorithm consists basically of two parts.
During the first one we compute some metrics, mainly how many subnets cross
each column (subnets_per_col) and, for each subnet, which columns it crosses
(columns_per_sub). To do so we go across all nets and subnets in a similar
fashion to Algorithm 3. For each subnet we consider each column it crosses,
add it to the corresponding entry in the columns_per_sub vector and increase
the subnet count for the column in subnets_per_col.
The second part of the algorithm looks for the maximum congestion for every
subnet, by considering all the columns it spans. When the max is known we
insert a new element in our set V consisting of the net, the subnet and the
max congestion. The last step of the algorithm consists in ordering the
elements of the set V by the third field, which is where we have stored the
max during the previous phase.
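The two phases of the congestion ordering can be sketched in Python as follows. As assumptions made for this example, the span of a subnet includes the columns of both of its endpoints, subnets are keyed by the pair (net, subnet), and the result is sorted with the most congested subnets first.

```python
from collections import defaultdict

def congestion_ordering(nets, column_of):
    """nets maps a net to a list of (subnet, src_node, dst_node) tuples."""
    subnets_per_col = defaultdict(int)    # column -> number of crossing subnets
    columns_per_sub = {}                  # (net, subnet) -> columns it crosses
    # Phase 1: count how many subnet spans cross each column.
    for n, subnets in nets.items():
        for s, src, dst in subnets:
            lo, hi = sorted((column_of(src), column_of(dst)))
            cols = range(lo, hi + 1)      # inclusive span (an assumption)
            columns_per_sub[(n, s)] = cols
            for c in cols:
                subnets_per_col[c] += 1
    # Phase 2: each subnet is ranked by the worst column it crosses.
    V = [(n, s, max(subnets_per_col[c] for c in cols))
         for (n, s), cols in columns_per_sub.items()]
    V.sort(key=lambda triple: triple[2], reverse=True)
    return V
```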
Algorithm 4 Congestion Ordering
  V ← ∅
  subnets_per_col ← vector of ints of size equal to the number of columns
  columns_per_sub ← vector of sets of size equal to the number of subnets
  Initialize values in subnets_per_col to 0
  Initialize sets in columns_per_sub as empty sets
  for each net n ∈ nets do
    for each subnet s ∈ subnets(n) do
      src ← node_from_source(s)
      dst ← node_from_target(s)
      for each column c ∈ span(column(src), column(dst)) do
        subnets_per_col[c] ← subnets_per_col[c] + 1
        columns_per_sub[s] ← columns_per_sub[s] ∪ {c}
      end for
    end for
  end for
  for each net n ∈ nets do
    for each subnet s ∈ subnets(n) do
      max ← 0
      for each column c ∈ columns_per_sub[s] do
        if max < subnets_per_col[c] then
          max ← subnets_per_col[c]
        end if
      end for
      V ← V ∪ {(n, s, max)}
    end for
  end for
  Order elements in V by the third field
3. Subnets that cross minimal congestion cuts
In this approach we also use the concept of congestion. The idea is to
find a column with relatively low congestion, route all the subnets that
cross it, and then route the subnets to its left, and then the subnets to
its right. In this way we intend to do a small divide and conquer, by
first finding a routing of the central part and later looking for a routing
for the sets of subnets on both sides, which are disjoint and local.
The process is described in detail in Algorithm 5, with its recursive
procedure in Algorithm 6. In Algorithm 5 we prepare some initializations,
similar to the ones done in previous algorithms. We also have a set in which
we store the triplets of net, subnet and an additional integer, which in
this case is the order assigned by divide_ordering_procedure. We also need a
vector (subnets_in_col) that contains, for every column, all subnets that
cross it along with their span. This vector is filled in the nested for
loops of the algorithm, which traverse all nets, subnets and columns. Later,
we call divide_ordering_procedure, which fills V with triplets containing
net, subnet and order, and the only thing left to do is to order the
elements of the set.
Algorithm 5 Divide Ordering - Initialization
V ← ∅
subnets_in_col ← vector of sets of size equal to the number of columns
Initialize sets in subnets_in_col as empty sets
for each net n ∈ nets do
    for each subnet s ∈ subnets(n) do
        src ← node_from_source(s)
        dst ← node_from_target(s)
        for each column c ∈ span(column(src), column(dst)) do
            subnets_in_col[c] ← subnets_in_col[c] ∪ (n, s, span)
        end for
    end for
end for
divide_ordering_procedure(V, subnets_in_col, 0, num_of_columns, 1)
Order elements in V by the third field
The procedure, as shown in Algorithm 6, is recursive and takes
5 parameters. The first one, V, is used for the output. subnets_in_col
contains the information computed before the call. l and
r are columns of the cell that delimit the region handled by the current
divide and conquer step, and cont is an input/output counter holding the
order to assign to the next subnet; it is increased every time
we add a new element to V. In the initial call, we have l = 0, r =
number of columns and cont = 1.
The recursive function has two cases, depending on the area of the cell
we have to handle. If it is big enough (r − l greater than a constant cte)
we fall in the recursive case. In it, we take the center column and,
in the window of columns center − ε to center + ε, we find the least
congested one, index. Unless otherwise stated we have used cte = 10 and ε = 3.
We want to route first all the subnets that cross this column, so
we take them, sort them by span (the third field of the elements in
the subnets_in_col vector) and add them in this order to V.
Finally, we call the procedure recursively for both sides of the index column.
The only thing left to do if we fall in the base case, that is, for small
spans of the cell, is to order the subnets contained within it which have
not already been assigned an order. To do so we create an auxiliary
set and include in it all net and subnet pairs present in the zone, along
with the span of their subnet. We then proceed to order the elements
of the set according to their span and finally we add them following
that order to the final set.
Intuitively, if we had a cell that allows for only a middle partition, we
would prioritize first the subnets crossing a low-congested column around
the middle, then the subnets on the left, and finally the ones on the right.
Within each of the three groups, we use the span of the subnets to break ties,
longer spans first. All subnets have been assigned an order by the
end of the procedure.
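The divide ordering described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the router's implementation: the function name divide_ordering, the (net, subnet) tuples and the spans dictionary are hypothetical, regions are treated as half-open column intervals [l, r), and subnets that were already placed at an outer level are skipped when they cross an inner cut.

```python
from typing import Dict, List, Tuple

Subnet = Tuple[str, int]  # (net name, subnet index): hypothetical identifiers


def divide_ordering(spans: Dict[Subnet, Tuple[int, int]], num_columns: int,
                    cte: int = 10, eps: int = 3) -> List[Subnet]:
    """Return the subnets in priority order, highest priority first."""
    # crossing[c] lists the subnets whose column span includes column c.
    crossing: List[List[Subnet]] = [[] for _ in range(num_columns)]
    for sub, (lo, hi) in spans.items():
        for c in range(lo, hi + 1):
            crossing[c].append(sub)

    def width(sub: Subnet) -> int:
        lo, hi = spans[sub]
        return hi - lo

    order: List[Subnet] = []
    placed = set()

    def add(subs) -> None:
        # Longer spans first; subnets that already have an order are skipped.
        for sub in sorted(subs, key=lambda s: (-width(s), s)):
            if sub not in placed:
                placed.add(sub)
                order.append(sub)

    def recurse(l: int, r: int) -> None:
        if r - l > cte:
            center = (l + r) // 2
            window = range(max(l, center - eps), min(r - 1, center + eps) + 1)
            # Least congested column of the window around the center.
            index = min(window, key=lambda c: len(crossing[c]))
            add(crossing[index])
            recurse(l, index)
            recurse(index + 1, r)
        else:
            # Base case: every subnet touching the region [l, r).
            add(s for s, (lo, hi) in spans.items() if lo < r and hi >= l)

    recurse(0, num_columns)
    return order
```

The position of a subnet in the returned list plays the role of the order field stored in V.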
As for the intra-subnet ordering, which is the order among the highways
of the same subnet, we have chosen to give more priority to the shortest
paths. The initial priority of the highway variables is determined by using
the simple relation
$$\frac{n}{i + j}$$
where n is the total number of highways, i is the position of the highway
in its intra-subnet ordering and j is the position of the subnet in the
ordering among subnets. By using this expression we are giving priority to both the
chosen subnets and inside them the shortest paths in an interleaving way.
The shortest highway of the subnet with highest priority is the first one to
be used. Then, the second shortest of that subnet and the shortest from the
second subnet are the next ones. Thus the process continues interleaving the
best highways of the subnets with more priority, all the way to the end.
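As a small worked illustration of this relation, the following Python sketch computes the initial priorities for a list of subnets. The function name and the highway labels are hypothetical, and positions i and j are assumed to start at 1, which the text does not state explicitly.

```python
from typing import Dict, List


def initial_priorities(highways: List[List[str]]) -> Dict[str, float]:
    """highways[j - 1] lists the highways of the j-th subnet, shortest first."""
    n = sum(len(per_subnet) for per_subnet in highways)  # total highways
    priority: Dict[str, float] = {}
    for j, per_subnet in enumerate(highways, start=1):
        for i, highway in enumerate(per_subnet, start=1):
            priority[highway] = n / (i + j)
    return priority


# Two subnets with two highways each (labels are made up for the example).
example = initial_priorities([["s1_h1", "s1_h2"], ["s2_h1", "s2_h2"]])
```

With n = 4, the shortest highway of the first subnet gets 4/2, the second highway of the first subnet and the shortest one of the second subnet both get 4/3, and the last one gets 4/4, which reproduces the interleaving described above.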
Algorithm 6 Divide Ordering - Recursive Procedure
procedure divide_ordering_procedure(V, subnets_in_col, l, r, cont)
    if r − l > cte then
        center ← (r + l)/2
        index ← center − ε
        min ← size(subnets_in_col[index])
        for i from center − ε to center + ε do
            if size(subnets_in_col[i]) < min then
                min ← size(subnets_in_col[i])
                index ← i
            end if
        end for
        Order elements in subnets_in_col[index] by the third field
        for i from 0 to size(subnets_in_col[index]) − 1 do
            n, s ← net and subnet of the i-th element in subnets_in_col[index]
            V ← V ∪ (n, s, cont)
            cont ← cont + 1
        end for
        divide_ordering_procedure(V, subnets_in_col, l, index, cont)
        divide_ordering_procedure(V, subnets_in_col, index + 1, r, cont)
    else
        V′ ← empty vector of tuples {int, int, int}
        for each net n ∈ nets do
            for each subnet s ∈ subnets(n) do
                if subnet in span(l, r) and not in V then
                    V′ ← V′ ∪ (n, s, span)
                end if
            end for
        end for
        Order elements in V′ by the third field
        for v ∈ V′ do
            V ← V ∪ (v.net, v.subnet, cont)
            cont ← cont + 1
        end for
    end if
end procedure
3.3 Enhanced Search
The initial priority scheme presented in the previous section works at the
beginning of the search. When conflicts begin to arise during the SAT search,
the priority of the low-level segment variables begins to take over and we end
up not using the highway variables as much as we would like, as we will see
in the experiments in Section 3.4.
To prevent this we impose that, every time the solver is about
to decide on a variable belonging to a given subnet, if one of the highway
variables corresponding to that subnet can be activated, it is picked instead.
Similarly, if the solver decides on a variable belonging to a given net and
some highway of that net can be activated, it is chosen instead of the original
variable.
Using this technique we ensure that the original variables of the problem
are only used after all the highways corresponding to a net/subnet with
priority have been exploited. On one side we maintain the heuristics of the
solver to decide which part of the problem should be treated next, but on
the other side we provide it with potentially better tools to do so.
To implement this new heuristic we have modified MiniSat’s next decision
variable function. Recall that MiniSat has a max-heap in which all variables
are ordered depending on their priority. Initially, it just removed the top
priority variables until some undecided variable was found. We extend from
this point as shown in Algorithm 7.
The algorithm works as follows. First, we define next to be an undefined
variable. Then we examine the variable at the top of the priority heap. We
compare its assigned value to the undefined value: if it is different then we
cannot decide on this variable, because it is already true or false. Thus, we remove
it from the heap and iterate again to get a new variable. We leave the loop
as soon as we obtain the first variable that has not already been assigned.
Notice that the first unassigned variable we find is not removed from the
heap.
The normal algorithm would end here by removing next from the heap
and using this variable as the next decision, but we want to use highways as
much as possible. Thus, we obtain the net and subnet to which this variable
is related, as long as it belongs to the ρ(e, n, s) or ρ(e, n) variable sets. If it
belongs to some net but not to some subnet, then we want to leverage the
Algorithm 7 Pick Decision Variable
HEAP ← max-heap of variables ordered by priority
next ← var_Undef
while next = var_Undef or value(next) ≠ l_Undef do
    next ← HEAP.top()
    if value(next) ≠ l_Undef then
        HEAP.pop()
    end if
end while
n ← get_net(next)
s ← get_subnet(next)
H ← ∅
if n ≠ none then
    if s = none then
        for each subnet i ∈ subnets(n) do
            H ← H ∪ highways(n, i)
        end for
    else
        H ← H ∪ highways(n, s)
    end if
end if
for each highway h ∈ H do
    if h is undefined then
        activity(h) ← cte × activity(next)
        HEAP.update(h)
    end if
end for
next ← HEAP.top()
HEAP.pop()
return next
highways belonging to every subnet of the net, so we add them all to the H
set. Otherwise, if the variable is tied to a subnet, we have more concrete
information on which subnet should take priority. In this
case we only include in the set the highways that connect the subnet that
MiniSat wanted to prioritize.
Finally we set a fake priority to the highways we have selected in the
previous step. To do so we take the activity of next, the variable we
obtained before, which is the top priority in the heap, and assign that
activity times a constant cte (1.1 unless stated otherwise) to each
undefined highway in H. Then we update the heap with these highway
activities and pick a new variable by removing the variable with the
highest priority from the heap: if there were some available highways
it is one of them, otherwise we pick next again and proceed with the search.
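The selection logic can be sketched in Python as follows. This is a simplified illustration, not MiniSat code: the heap is replaced by a plain maximum over a dictionary of activities, and the names pick_decision_variable, activity, value and highways_of are hypothetical. The highways_of mapping is assumed to already contain, for each variable, the highways of its subnet (or of every subnet of its net).

```python
from typing import Dict, List, Optional


def pick_decision_variable(activity: Dict[str, float],
                           value: Dict[str, Optional[bool]],
                           highways_of: Dict[str, List[str]],
                           cte: float = 1.1) -> Optional[str]:
    """Return the next decision variable, preferring related highways."""
    # Stand-in for the heap: among unassigned variables, highest activity wins.
    free = [v for v in activity if value.get(v) is None]
    if not free:
        return None
    nxt = max(free, key=lambda v: activity[v])
    # Give every unassigned highway related to nxt a slightly higher activity.
    boosted = cte * activity[nxt]
    for h in highways_of.get(nxt, []):
        if value.get(h) is None:
            activity[h] = boosted
    # A boosted highway is now on top if one was available, otherwise nxt is.
    return max(free, key=lambda v: activity[v])
```

Because the boost is only slightly above the activity of the variable the solver wanted, the solver keeps choosing which part of the problem to work on, and the highways only take over within that part.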
We considered other approaches to prioritize highway variables during
the search. For instance, a different idea was to modify the heap such that it
would not only consider the max-heap on priority, but also always prioritize
highway variables over all the others (considering activity for the relative
ordering of highway and non-highway variables). This approach was very
fast for cells that could be solved easily by using highways, normally the ones
with more space allowing more regular paths, but it proved to be disastrous
for complex cells. Always prioritizing highway variables over the low-level
segment ones meant that the latter would not be explored until all
highways had been set to either true or false, thus enforcing a lot of
exploration on highway variables but never allowing the low-level ones to
participate in the search.
3.4 Experiments
Several criteria have been used to assess how well these schemes perform.
When working with this kind of search it is very important to have a deep
insight into how it is behaving, how we would like it to behave, and
how we can push it to act accordingly. To this end we use a grid
viewer that lets us see the progress of the execution: at any point of
the search we can output the current state of the problem in a grid
format, which is later processed and shown by a viewer application. When
the search is finished, we can see all such grids one after another as if they
were frames of a film, which gives an idea of how the search has behaved.
We have extended and modified this functionality of the original router to
accommodate the use of the highways. In order to visually emphasize them,
we give extra thickness to the wire segments that belong to highways.
3.4.1 Highway Usage Experiments
Another interesting measure to assess the behavior of the search can be
obtained by instrumenting the search and extracting statistics. One such
statistic is the number of decisions taken on normal variables and
on highway variables at each decision level of the search. In Figure 3.1 we
can see this plot for a search which does not use highway variables on a
flip-flop cell of the NANGATE library, DFFS_X2, which we will use as an
example throughout this section.
Figure 3.1: Number of decisions vs Decision level - No highways
On the x-axis we have the decision level, whereas on the y-axis we have
the number of decisions that have been taken on that level: the blue, dark
line for low-level segment variables, the red, light line for highway variables
(always 0 in this case). Each time the solver decides on a variable, a new
decision level is opened and propagations are performed. Decision level 100
means that, on the current assignment, 100 of its literals have been assigned
by decision, and the others by propagation. When the solver backtracks,
it goes back to some previous decision level. Because decisions accumulate
after each backtrack, multiple decisions are taken
at every decision level and, by looking at this plot, we can assess
whether the highway variables we want to give the solver as a helper
are actually used.
Here we have several interesting observations. During the first decision
levels few decisions are taken, but suddenly there is a steep rise. This indicates
that there has been a lot of backtracking at this depth of the search,
approximately from level 30 to 80. We also see a long tail at the end of the
search, with hundreds of decision levels in which close to no decisions are
taken. This corresponds to the final part of the search, during which some
decisions on design rules take place; these cause few conflicts or,
in case of backtrack, take the search back to a very early stage thanks to
MiniSat's conflict analysis.
Let us now observe what happens when we add highways, but we only
include the interaction scheme dealing with the initial priorities, as depicted
in Figure 3.2. We do see some initial activity on the highway variables,
as expected, given that we have arranged an initial activity for each
one of them according to the schemes presented in previous sections. However,
notice how outside these first decision levels the use of highway variables is
minimal due to other variables getting more priority.
This inactivity on deeper levels of the tree is what motivated us to develop
the interaction scheme presented in Section 3.3, which is the strategy that
brings the whole highway-empowered search together. Figure 3.3 shows the
result of using the enhanced search. In this graphic we also have a black line,
which represents the sum of both types of decisions for highway and low-level
variables. Notice how, in this case, highway variables are clearly used during
the search, and their use is evenly distributed among all decision levels.
An interesting observation is that we might want the highway variables
to have a higher percentage of decisions in this plot. This would correspond
to a strategy similar to the one we briefly introduced in which highway
variables always had precedence over normal variables. However, when using
such a scheme we are imposing our criteria too much, whereas the use of
highways should be more of a companion to the search.
Figure 3.2: Number of decisions vs Decision level - Initial Activity
Figure 3.4 shows the statistic in this last case. Notice how during the first
levels of the search we have a lot of highway decisions, but when no more
highway variables can be decided upon we switch to low-level variables. This
approach proved to give bad results when the cells were congested, needing
complex highways. If the highways do not provide a very good solution in the
beginning, a lot of search is later performed using the low-level variables, but
it will probably not lead to a good solution. Moreover, by using this scheme we
lose a lot of information on how the search is going because we never let the
solver take the initiative on where to decide during the first decision levels.
It is a delicate issue to find the right balance in which highways help, and
do not hinder, the search. Even with the best of our algorithms there are some
cases in which the highway-enhanced search makes bad decisions and might
end up taking more time than the original search.
3.4.2 Routing Time
In order to assess the validity of our schemes we have done an additional
test. If we feed the algorithm with very good highways, we expect to find
a solution fast. But, if we feed it with the solution itself as highways, we
would want an immediate solution. If the search does not behave this way,
we might want to design a different search scheme. Moreover, we want to see
what happens if we include the solution as highways, but we also add other
highways generated by the algorithms that will be presented in later chapters.
Does the search degrade because of the presence of additional highways?
Figure 3.3: Number of decisions vs Decision level - Initial Activity +
Enhanced Search
Figure 3.5 shows the results of this experiment with DFFRS_X2, the cell
which usually takes the most time to be routed. The row at the bottom,
labeled “original”, represents the original routing time, close to 250 seconds.
The next rows upwards show the routing time when using the algorithms with
highways, with the following highway distribution: “0” has only the
solution as highways, whereas every other row “n” has the original solution
as highways plus n additional highways per subnet generated by the 2-corner
algorithm, which will be presented in the next chapter. Finally, the top row shows
the mean routing time considering all the highway variations.
Figure 3.4: Number of decisions vs Decision level - Highway Priority
Notice how the routing time for the “0” row is close to zero: if
the solution is the only set of highways, the cell is routed right away because
the initial activity ensures that highways are picked at the beginning
of the search. However, what happens if we have some less-optimal highways
thrown into the mix? We will have to move to the second phase of the search,
but this proves not to be very problematic. The search is still conducted
extremely fast, reaching a solution in a fraction of the original routing time.
This validates that our search strategy is adequate when optimal
highways are generated; we would have to look for a different one otherwise.
In the next chapter we will focus on the other big challenge of our project:
how to generate good highways for our algorithm to use.
Let us close the chapter with one final experimental result. Among all
the initial activity ordering schemes presented in Section 3.2, which, if any,
is the best one? We have conducted an experiment by routing the whole NANGATE
cell library and the results can be observed in Figure 3.6. The y-axis
represents SAT-search time, in seconds. The first column, original, is the
routing time of the original tool. The other three show the mean time of several
experiments routing the whole NANGATE library using different sets of
highways. Our intuition was that by having a good beginning of the search
we could influence the solver, but notice how in the end the results are rather
similar. None of the schemes is computationally intensive, so all of them can
be considered effective for our purposes, and given the similarity of their
results we do not think much more effort should be spent in this direction.
Figure 3.5: Route with known solution - DFFRS_X2
3.5 Conclusions
In this chapter we have presented the basics of the system of highways,
which we use to exploit problem-domain information during the search of a
SAT-solver (MiniSat in our implementation). The two mechanisms we use
inside the solver have been presented, one giving initial priority to highway
variables according to some interesting criteria, the other taking decisions on
highways over the signals the solver has focused on. We have explained the
methodology we have used to assess the functionality of our approach and
provided some experiments that support the design decisions we have taken
in order to balance the delicate equilibrium between the solver doing its
search and our inclusion of additional information. In the following chapters
we will focus on the other big challenge of our project: how to build useful
highways for our SAT search.
Figure 3.6: NANGATE Routing time by initial activity scheme
Chapter 4
Generating Highways Using
Partial Routings
In the previous chapter we have presented how to modify our SAT-solver to
take advantage of domain-specific information in the form of highways. In
this chapter we propose two methods that work in a similar direction: both
use the router itself to generate the highways. That is, we use
the router to route our original circuit with some modifications, eliminating
some of the signals in one case, or routing just some parts of the grid in the
other. Both methods are presented in the following sections, and finally
we assess their performance.
This approach has two very important characteristics. The first one
is that it is potentially time-consuming, as we perform real SAT-based
routing to obtain highways for a later SAT-based routing. Thus, there
is a risk of having excessively long runtimes if we use the SAT-solver itself as
a preprocess. The second one is that the highways we obtain will probably
be of good quality, not only because they have been generated by the router
itself, but specifically because we know they comply with the design
rules of the router, which is something that would not be guaranteed by other
highway generation methods. Additionally, with the Cell-dividing algorithm
we specifically aim at groups of standard cells.
4.1 Signal-Removal Algorithm
The first method we will present in this chapter is the Signal-removal algo-
rithm. The idea behind this approach is to remove some of the signals of
the cell and use the router on it, so that we can obtain meaningful highways
while avoiding the complexity of the problem. In order to implement these
strategies we have used Python, as the computational cost of the algorithms
we will now present lies in the execution of MiniSat.
The algorithm is described in Algorithm 8. The first thing we do is to call
the router in order to obtain information on the subnet distribution of the
grid, which is something that is decided during the routing process. Then
we use a method analogous to the one presented in Section 3.2 to
obtain the columns each subnet spans and the subnets that
cross each column. We use this congestion information to decide which
nets we want to keep.
Our aim at this point is to create a congestion ranking among the nets
of the cell. Each net has a vector with as many positions as columns in the
cell. For each column the net spans, the corresponding entry in the vector
is the number of subnets that must cross the column; all other entries are set to
0. We sort each vector in decreasing order, and then sort all the vectors
among themselves in lexicographically decreasing order. This results in our
congestion rank. We are giving more priority to the maximum congestion
among the spanned columns, and then to the number of spanned columns. See
the following congestion vectors and their congestion ranking.
$$C_a = [4, 4, 3, 2, 0, 0, 0, 0, 0] \tag{4.1}$$
$$C_b = [4, 3, 3, 2, 0, 0, 0, 0, 0] \tag{4.2}$$
$$C_c = [3, 3, 3, 3, 3, 0, 0, 0, 0] \tag{4.3}$$
$$C_a > C_b > C_c \tag{4.4}$$
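The ranking can be reproduced with a few lines of Python. This is a minimal sketch, assuming each net comes with its per-column congestion vector; the function name congestion_rank and the net labels are hypothetical. It relies on the fact that Python compares lists lexicographically.

```python
from typing import Dict, List


def congestion_rank(vectors: Dict[str, List[int]]) -> List[str]:
    """Return net names ordered from most to least congested."""
    # Sort each vector in decreasing order, then compare lexicographically.
    keyed = {net: sorted(vec, reverse=True) for net, vec in vectors.items()}
    return sorted(keyed, key=lambda net: keyed[net], reverse=True)


# Per-column vectors (before sorting) matching the three vectors above.
ranking = congestion_rank({
    "c": [0, 0, 3, 3, 3, 3, 3, 0, 0],
    "a": [0, 4, 3, 4, 2, 0, 0, 0, 0],
    "b": [3, 4, 3, 2, 0, 0, 0, 0, 0],
})
```

Here ranking lists the nets as a, b, c, matching the order in (4.4).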
Now we must prepare the new grid. We take a copy of the original one
and remove the k most congested nets by simply going through every
position of the grid and marking it as “occupied” if it was
used by one of the eliminated nets, in order to tell the router that those
positions cannot be occupied by other nets. We can see a comparison of an
original grid and the same grid after removing the top three congested signals in
Algorithm 8 Signal-removal algorithm
G ← original grid
H ← ∅
subnets ← Router.obtain_subnets()
calculate columns_per_subnet, subnets_per_column
C ← initialize_congestion_vectors()
for each net n ∈ nets() do
    for each subnet s ∈ subnets(n) do
        for each column c ∈ column_span(s) do
            C[n][c] ← subnets_per_column[c]
        end for
    end for
    sort C[n] in decreasing order
end for
sort C in lexicographically decreasing order
F ← G
for i ∈ [0..k − 1] do
    n ← net_id(C, i)
    F.remove_net(n)
end for
R ← Router.route(F)
for each net n ∈ nets() do
    for each subnet s ∈ subnets(n) do
        H ← H ∪ recover_path(R, s)
    end for
end for
Figure 4.1. After routing the grid with removed signals, we recover the
paths of the signals used in the solution and instruct the router to
route the original grid, using these paths as highways.
Figure 4.1: Grid after removing top 3 congested signals and original grid
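The grid preparation step can be illustrated with a short Python sketch. The grid representation used here (a list of rows whose cells hold a net name or an empty string) and the names remove_nets and OCCUPIED are assumptions made for the example, not the router's actual data structures.

```python
from typing import List, Set

OCCUPIED = "occupied"  # hypothetical marker for blocked positions


def remove_nets(grid: List[List[str]], removed: Set[str]) -> List[List[str]]:
    """Return a copy of the grid with the positions of removed nets blocked."""
    # The original grid is left untouched, since it is routed again later.
    return [[OCCUPIED if cell in removed else cell for cell in row]
            for row in grid]
```

The positions of the removed nets stay unavailable to the remaining nets, so the paths recovered from the partial routing do not collide with the terminals of the nets that were removed.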
One of the main drawbacks of this method is that, by removing signals,
there is no guarantee that complexity will be decreased. In particular in
the case of unsatisfiable cells, removing signals might cause them to fall
in the long-runtime area in which it is difficult to determine if the cell is
routable, whereas originally the unsatisfiability could be determined in a few
propagations.
4.2 Cell-Dividing Algorithm
The Cell-dividing highway generation could be used in any situation, but
aims specifically at the routing of groups of standard cells. The idea is that
we will divide our target grid into multiple small grids, each of which we will
route separately. Later we will use their solutions as highways and route the
whole cell. The additional information provided by the already known paths
should reduce the routing time. This section extends the work I presented
as my final degree project [19], which already explored the idea of dividing
big cells and routing separate pieces of them, trying to use their information
later to route the whole cell.
Let us take for instance the following example, routing two AND2_X2
cells. Our algorithm decides to cut them at the middle column, generating two
instances of the routing problem, one for the left part and one for the right.
Figure 4.2 shows how the results of the partial routing look on the complete
grid. Figure 4.3 shows the result of routing the cell using the highways
provided by the partial routings. The thick wires are the ones provided by
the highways; notice how some of the original paths appear again in the final
solution, whereas some other routes are not used.
Figure 4.2: Partial routings of AND2 X2 2
The reason why this method is interesting for groups of cells is that it helps
alleviate difficulty due to the size of the problem more than difficulty due to
its complexity. If we route separately parts that have lots of connections
between them, the information we have gathered by isolating parts of
the problem turns out to be rather useless. The ideal situation in which
to use this approach is when this isolation can be done in a natural way,
which happens when the cell has components that share close to no signals
among them, providing meaningful places at which to divide our problem.
Figure 4.3: Complete routing of AND2 X2 2 using partial routing highways
As mentioned before, the approach is similar to the one I used in my final
degree project [19]. However, in that case the highways were always fixed
and, in the final routing stage, the router would not be able to discard the
highways, thus resulting in a very high percentage of unsatisfiable routings.
In our current approach we are not forcing the use of the results of the
partial routings, but only suggesting them to the SAT-solver as possibly
good options. Using the highway system has allowed us to bring together
the power of the partial routings and the flexibility of the enhanced SAT
search.
The process of the Cell-dividing algorithm is presented in Algorithm 9.
We begin with the original grid G and an empty set of highways H. First,
we decide where the borders of the divisions should be. This can be done
by checking the congestion of the columns and deciding that the cuts should
be placed where such congestion is minimum (meaning few or no signals are
crossing the boundary).
Deciding where the borders of the divisions will be gives us a series of
regions that need to be routed. For each of them we create a new temporary
grid and copy the region to this new grid. Special care is taken to
adjust all parameters of the grid, such as signals, counts, etc. Then we use
the router to route the temporary grid and we extract the highways of the
solution by tracing the path connecting every subnet in it. The only thing left
Algorithm 9 Cell-dividing algorithm
G ← original grid
H ← ∅
V ← divide_grid(G)
for each region reg ∈ V do
    TG ← copy_grid_part(G, reg)
    perform TG grid adjustments
    R ← Router.route(TG)
    for each net n ∈ nets() do
        for each subnet s ∈ subnets(n) do
            H ← H ∪ recover_path(R, s)
        end for
    end for
end for
to do for the router should be to route the area around the division columns,
or signals shared by several parts if it is the case.
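One possible way to choose the division borders from column congestion is sketched below in Python. The thesis does not detail the divide_grid policy, so everything here is an assumption made for illustration: the number of parts is given, each cut is searched in a window of ±eps columns around an evenly spaced position, and regions are returned as half-open column intervals.

```python
from typing import List, Tuple


def divide_grid(congestion: List[int], parts: int,
                eps: int = 3) -> List[Tuple[int, int]]:
    """Split columns [0, len(congestion)) at low-congestion columns."""
    n = len(congestion)
    cuts: List[int] = []
    for k in range(1, parts):
        ideal = k * n // parts  # evenly spaced target position
        lo = max(cuts[-1] + 1 if cuts else 1, ideal - eps)
        hi = min(n - 1, ideal + eps)
        if lo > hi:
            continue  # no column left for this cut, so it is skipped
        # Least congested column in the window around the target.
        cuts.append(min(range(lo, hi + 1), key=lambda c: congestion[c]))
    bounds = [0] + cuts + [n]
    return list(zip(bounds[:-1], bounds[1:]))
```

For a 12-column grid with congestion [2, 3, 3, 1, 0, 2, 3, 3, 2, 0, 1, 3] and two parts, the cut falls on column 4, the first column in the window that no subnet crosses.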
4.3 Experiments
In this section we present experiments for the highway generation methods of
this chapter. In the first subsection we discuss the
results obtained by routing the cells with some of the signals removed in order
to obtain highways for the final routing, whereas in the second one we
present the results of dividing the cell to route its parts, and later using them
as highways in the last routing stage.
4.3.1 Signal-Removal Experiments
In order to assess the performance of this algorithm we have routed the
entire Nangate library and compared the time with the routing time of the
original tool. The Nangate library is the same benchmark used by the tool
before the extensions introduced by this thesis. It includes a wide variety
of standard cells, ranging from small NAND gates to scan flip-flops with set
and reset. The experiments have been conducted using an Intel Core i5 at
2.6 GHz with 16 GB of RAM. The results are presented in Figure 4.4. The y-axis
shows the routing time of the whole library, represented by each column.
The first column is the routing time using the original tool, while all the
others correspond to using the Signal-removal algorithm removing the top
k congested signals to obtain highways, where k is the number at the foot
of the column. The composition of the columns is as follows. The lowest
section (orange) represents the SAT time of the final routings. The next
section (yellow) represents the extra time overhead due to formula generation
and I/O of the process. The next section (blue) represents the SAT search
time devoted to partial routings, whereas the last section (green) represents
formula generation and I/O of the extra routings.
Figure 4.4: Running time routing flipflops in Nangate using the
Signal-removal algorithm
Let us first focus on the comparison between the original execution and
the Signal-removal algorithm removing only the top congested signal. We can
see that the SAT solving time is reduced to a little more than half
of the original time when routing using those highways. However, the overhead
induced by the SAT search of the partial routings is comparable to the entire
original routing time. This is caused by cells that originally were clearly
unsatisfiable and become satisfiable by removing one signal, but for which finding a
satisfying model requires a lot of time. We must additionally consider the
extra time of the partial routings beyond their SAT-search
time; however, it might be reduced by using an incremental scheme during
the formula generation.
As more congested signals are removed, the partial SAT search
time dwindles until becoming close to negligible. At this point we are
very easily solving cells in which the most congested signals are gone, thus
generating sets of highways for the other signals relatively fast. Notice that
all SAT search times are reduced when compared to the original, with a
speedup of up to 2x. The best versions are the ones removing several signals
and generating highways with the ones left, but they still do not give much gain
considering the cost of producing the highways.
We did not further explore the possibilities of this algorithm, given that
small tweaks on a number of parameters involved in the algorithms, such
as partial routing optimization rounds and modified congestion orderings,
did not seem to provide interesting speedups over the configuration we have
presented.
4.3.2 Cell-Dividing Experiments
As explained before, the main interest of this algorithm is to route groups
of standard cells. In order to study the performance of the algorithm we
have used the cells of the Concatlib library, which result from
concatenating the cells of Nangate with themselves up to 4 times. The
experiment we conducted consists of routing all the flipflops and their
Concatlib versions, using the original algorithm and the Cell-dividing
algorithm. The experiments have been conducted using an Intel Xeon X5670 at
2933 MHz with up to 64 GB of RAM.
The results can be observed in Figure 4.5. The y-axis represents time in
seconds, and each column represents an algorithm. As for the composition of
the columns, the base (orange) represents SAT solving time, the next section
(yellow) represents formula generation and other router running times, and
the last section (blue) represents the overhead devoted to the routing of parts
of the original cell.
The algorithms in the graphic are the following. First, the original
algorithm represents using the router without any of the modifications
related to the highway system. The two remaining columns represent using
the Cell-dividing algorithm to route all of the flipflops and their
concatenations, using 0 or 1 rounds of optimization in the partial results
respectively. The difference is relevant because a round of optimization
takes more time in the partial routings section, but might give better
highways to the final overall routing.
Figure 4.5: Running time routing flipflops in Concatlib using the cell-divide
algorithm
As we can see, the original algorithm is the one that takes the most time,
with around 92,000 s of SAT solving and 20,000 s devoted to other running
times. This last part remains constant through all experiments and depends
partially on the I/O of the system, but it is interesting to see its
relative weight in the whole running time. By using the Cell-dividing
algorithm we manage to reduce the SAT-solving time to half of the original
one, at the cost of a small overhead on partial routings. By adding a round
of optimization on these partial routings we increase the partial routing
time a little, but the SAT-solving time drops again, this time to a third of
the original time. Considering overall time, this last version manages to
route all the flipflops in half of the original time.
Figure 4.6: Running time vs difficulty of Concatlib cells
The graphic shown in Figure 4.6 gives more insight into how the cells
react to our algorithm and into the distribution of the running times of the
cells in the Concatlib library. Each dot on the graphic represents the
routing of a cell in Concatlib. The y-axis represents the routing time of a
cell, and the x-axis represents the estimated difficulty of routing the
cell, using a measure based on size and congestion, presented in more detail
in Section 5.4. The red dots represent routing using the original routing
algorithm, whereas the blue dots represent the routing using the
Cell-dividing algorithm without optimization rounds on the partial routings.
The time limit is set at 14,400 seconds.
We can observe how in general the blue dots are below the red dots,
meaning they take less time. There are very big differences in the running
times of particular cells, which can be seen when a dot of one color is in a
high position but its counterpart is in a very different position. For
example, consider the two red points that take 14,400 seconds using the
original algorithm, meaning they ran out of time. Their blue counterparts
are in the lower part of the graphic, meaning that just by routing those
cells using our algorithm we save thousands of seconds in running time. The
reverse effect occurs in the cell that scores the highest difficulty, which
is unroutable. In this case, our cell-division algorithm reaches the time
limit and cannot decide on the satisfiability of the instance, whereas the
original algorithm detects unsatisfiability in a very low run time.
4.4 Conclusions
This chapter has presented two strategies for the generation of highways.
Both rely on using the router itself to route restricted versions of the original
problem, either by reducing the number of signals to route in the first case,
or by routing only portions of the original cell in the second case. These
methods should provide highways complying with the design rules at the
expense of the highway generation cost introduced by the use of the router
itself.
We have also shown the results of the experiments when using both
methods. In the first case we tried the algorithm on the Nangate cells,
producing not very promising results given the high highway generation
costs. The second method, aimed specifically at the routing of groups of
standard cells, provides interesting results when routing the most
significant cells of the Concatlib library, managing to obtain speedups of
2-3x.
Chapter 5
Generating Highways Without
Partial Routings
In the previous chapter we have presented a possible strategy for the
generation of highways. In this chapter we present two additional strategies
we implemented to generate sets of highways. The basic idea is to look at
the circuit and create highways that are likely to be useful for our SAT
search, through two methods: generating highways with up to two corners,
presented in Sections 5.1 and 5.2, and greedily routing the grid to obtain
highway candidates, presented in Section 5.3.
Whereas generating the highways was very costly in the methods introduced
in the previous chapter, the ones presented in this chapter deliver highway
sets in very little time. However, they are not guaranteed to comply with
the design rules, nor are they produced by the router itself. Even though
we risk finding redundant and uninteresting highways, a careful analysis of
the highways that are actually used by the router in already solved
instances, in the first method, and the use of greedy routings, in the
second method, will prove to give good results, as we will see towards the
end of this chapter.
5.1 2-corner Highway Generation
The first set of highways that we introduce is the one containing 2-corner
highways. When looking at the solutions provided by the original router,
it is easy to see that a lot of the paths among components are paths with
at most two corners. These paths include straight lines, simple corners and
z-shapes as shown in Figure 5.1. We ignore corners introduced because of
the changes of metal layers, which inevitably arise in this kind of 3D design.
In fact, the only paths that need more than 2 corners are the ones that take
strange shapes in order to avoid conflicts with other paths that are already
present in a given solution.
Figure 5.1: Shapes of paths with at most 2 corners
5.1.1 Motivation
In order to validate our intuition we do a statistical analysis of the paths
obtained when routing the whole Nangate library using the original tool.
We classify each path according to the number of corners it has. Figure 5.2
shows the result of this exploration. Each column corresponds to a number of
corners, and its height is the number of paths that have that many corners.
We can see that most of the paths of the routed cell library have 2 or fewer
corners: in fact, most of them have either 0 or 2 corners.
If we could pre-generate all these paths and feed them to the solver,
we would be able to directly interact with them as explained in the previous
chapter. Generating paths with more corners would be more complex, and would
have to consider more possible bad interactions with the design rules.
Moreover, the possible number of paths of more than 2 corners would increase
greatly, and we would like to keep the number of highways reasonable, as we
have seen in the previous chapter.
Figure 5.2: Distribution of paths by number of corners
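The classification behind this kind of histogram can be sketched as follows.
This is only a minimal illustration, not the code of the original tool: it
assumes that a path is given as a list of (x, y) grid points on a single
layer, so that layer changes show up at most as repeated points, which are
skipped. The names count_corners and corner_histogram are hypothetical.

```python
from collections import Counter


def count_corners(path):
    """Count direction changes along a rectilinear path of (x, y) points."""
    corners = 0
    prev_dir = None
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        # Sign of the step in each axis, e.g. (1, 0) for "moving right".
        d = ((x1 > x0) - (x1 < x0), (y1 > y0) - (y1 < y0))
        if d == (0, 0):
            continue  # repeated point (assumed to be a layer change)
        if prev_dir is not None and d != prev_dir:
            corners += 1
        prev_dir = d
    return corners


def corner_histogram(paths):
    """Map each corner count to the number of paths that have it."""
    return Counter(count_corners(p) for p in paths)
```

For instance, a straight line, a simple corner and a z-shape are classified
as having 0, 1 and 2 corners respectively.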
5.1.2 All-Paths Generation Algorithm
We generate all the paths of up to 2 corners for every pair of components
that the router needs to connect. In order to connect a multi-component
net, the router divides it into pairs of components which must be connected,
called subnets. That is, we do not generate these paths for every possible
pair of components of every net, but only for the subnets that the router has
calculated beforehand. For instance, Figure 5.3 shows a distribution of the
components of a net into several subnets.
Given two components (called src, dst) for which we want to generate
highways, we proceed as described in Algorithm 10. We must generate high-
ways for every pair of pins, one from each component. Now, given the two pins,
we have two situations. In the easy one they are on the same column or row:
then according to our constraints there is only one possible highway, a direct
connection. In the other case we can proceed to generate the other highways
by doing a vertical and a horizontal sweep from one pin to the other.
Figure 5.3: Possible subnet distribution of a multi-component net
Algorithm 10 2-corner generation for a subnet
H_{src,dst} ← ∅
for each pin p ∈ pins(src) do
    for each pin q ∈ pins(dst) do
        if same_column(p, q) or same_row(p, q) then
            H_{src,dst} ← H_{src,dst} ∪ direct_highway(p, q)
        else
            H_{src,dst} ← H_{src,dst} ∪ vertical_sweep(p, q)
            H_{src,dst} ← H_{src,dst} ∪ horizontal_sweep(p, q)
        end if
    end for
end for
The sweep process is illustrated in Figure 5.4 and consists of the following.
Let us take the vertical case, as the horizontal is analogous. We generate all
2-corner paths from D to B that leave D by its right side. If the path has
exactly two corners, the first one must bend upwards and the second one must
bend to the right again. This means that the possible paths are given by the
possible bend columns. Thus, with a simple sweep we can generate all paths,
and as an extra we get one of the two 1-corner paths in the last position of
the sweep. A similar process sweeping with the horizontal lines gives the
other half of the paths we are interested in.
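The two sweeps can be sketched in a few lines. The sketch below is an
illustration under simplifying assumptions, not the implementation of the
tool: pins are (column, row) pairs on a single layer, a highway is a list of
waypoints, and design rules and obstacles are ignored. All function names
are hypothetical, chosen to mirror the names used in Algorithm 10.

```python
def _dedupe(points):
    """Drop consecutive repeated waypoints (a degenerate bend)."""
    out = [points[0]]
    for pt in points[1:]:
        if pt != out[-1]:
            out.append(pt)
    return out


def vertical_sweep(p, q):
    """Leave p horizontally and bend at every column up to q's column."""
    (px, py), (qx, qy) = p, q
    step = 1 if qx > px else -1
    # The last bend column (c == qx) yields one of the two 1-corner paths.
    return [_dedupe([p, (c, py), (c, qy), q])
            for c in range(px + step, qx + step, step)]


def horizontal_sweep(p, q):
    """Leave p vertically and bend at every row up to q's row."""
    (px, py), (qx, qy) = p, q
    step = 1 if qy > py else -1
    return [_dedupe([p, (px, r), (qx, r), q])
            for r in range(py + step, qy + step, step)]


def two_corner_highways(src_pins, dst_pins):
    """All highways with at most two corners between two components."""
    highways = []
    for p in src_pins:
        for q in dst_pins:
            if p[0] == q[0] or p[1] == q[1]:
                highways.append([p, q])  # same column or row: direct link
            else:
                highways.extend(vertical_sweep(p, q))
                highways.extend(horizontal_sweep(p, q))
    return highways
```

With pins at (0, 0) and (3, 2), this produces three highways from the
vertical sweep and two from the horizontal sweep; the last highway of each
sweep is a 1-corner path.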
The overall algorithm for the generation of the 2-corner highways is pre-
sented in Algorithm 11. We enumerate all subnets, which are pairs of com-
ponents we want to connect, and accumulate all highways.
Figure 5.4: Paths from D to B, generated by a vertical sweep (top row) and
a horizontal sweep (bottom row)
Algorithm 11 2-corner generation
H ← ∅
for each net n ∈ nets do
    if n is neither vdd nor gnd then
        for each subnet s ∈ subnets(n) do
            H ← H ∪ 2corner_highways(s)
        end for
    end if
end for
Notice that highways are produced for all signals except for voltage (vdd)
and ground (gnd). The reason for excluding them from the highway generation
is that, experimentally, we have found that they tend to distract the solver
instead of helping the search. Intuitively, what happens is that the solver
tends to use the highways greedily. In the case of vdd and gnd there are
a lot of subnets that could potentially share metal, but since the highway
generation does not take into account the interaction of highways at a
multi-pin level, the opportunity for this coordination is not present in our
model. The result is that a lot of redundant metal appears, and overall the
connection of the vdd and gnd nets is not very good. We have found that
leaving them out and focusing on producing highways for the rest of the
subnets brings better results in combination with our interaction schemes.
Additionally, the generation of highways for vdd and gnd presented several
further challenges. The router had the tendency to generate subnets tying
the vdd/gnd lines to the components inside the cell, instead of subnets
pairing components inside the cell itself. Thus the set of highways
connecting the vdd/gnd terminals would not appear using the highway
generation scheme we have presented until now. Extra tweaks were introduced
to account for this, generating special sets of highways for vdd and gnd.
Moreover, we considered giving the router a fairly generous amount of
vdd/gnd highways, with shared metal segments among them, and forcing the
router to use them to route the vdd/gnd nets. This, however, resulted in a
very high number of unsatisfiable cells, as the requirement of routing vdd
and gnd solely with highways was too strong. In combination with the other
negative results, this convinced us to leave them out of the highway
generation for the time being.
5.2 2-corner Highway Selection
Up until this point we have generated all possible 2-corner highways for
each subnet. However, when both components are far apart, this generates
hundreds of highways per subnet. In this section we explore how, given
all the highways we generate, to select which ones we should include in our
formulation and which ones we should discard.
5.2.1 Motivation
Because of the way we generate the highways, most of the time a lot of the
highways are very similar to one another, so we may be able to ignore some.
As we have already seen in Section 3.4, in the presence of one essentially
good set of highways, adding extra highways to the mix degrades the
performance of the search.
We have limited the amount of highways for each subnet to the top k
best ones. This way we have highways for every subnet but overall we do not
have an excessive number. We conducted a small experiment in order to
understand how the highways we generated were used when we included all of
them in the formulation. The first criterion we proposed to sort the
highways and get the top k was taking the shortest highways. We gave the
ordered highways to the router and, after obtaining the solution to routing
the whole Nangate library, we observed, for every subnet that used highways,
the position of the highway it used inside the ordering of highways for the
subnet.
Figure 5.5 shows the results of the experiment. The graphic shows the
absolute position of the used highways in the vector of generated highways
for their subnet, after ordering them according to their length. The x-axis
corresponds to the position, whereas the y-axis corresponds to the number of
highways that fall in that position. From this graphic we can observe that,
even if we limit the number of highways to a small k per subnet, we are still
maintaining most of the highways that the router is using. We must look for
a number k that achieves a balance between ensuring that the highways we
want to use are present but that we are not adding excessive highways.
Figure 5.5: Distribution of used highways according to their position among
those generated for their subnet
However, considering only the shortest highways as the best seems a little
bit naive. After observing the highway sets that were generated we noticed
that many of the highways collided. That meant that activating some high-
way would render other highways for other subnets useless, which is some-
thing we would like to avoid if we want the highways we have generated to
fit in harmony into the same solution. We extended the ordering to take the
conflicts among highways into account, as explained in the rest of this
section.
In short, we have seen in the early experiments that having too many
highways degrades the search, so we must cap the number of highways that
we add to the formula for each subnet. On one hand we would like to keep
the shortest highways, but on the other we would also like our highways to
be non-conflicting with the other highways we are generating.
5.2.2 Selection Algorithm
We propose an algorithm that balances both criteria. The first step is to
create a conflict graph among the generated highways, with a vertex repre-
senting every highway, and an edge between a pair of vertices if and only if
the highways they represent are conflicting. We begin with an empty edge set.
In order to efficiently create the graph, the bounding box of all pairs of com-
ponents that need to be connected has been pre-calculated. For every pair
of subnets we check if their bounding boxes intersect. If they don’t, then
there can be no possible highway conflict, given that the 2-corner highways
may never leave the bounding box (case in Figure 5.6, left grid). Otherwise,
the bounding boxes intersect and we have to check every possible pair of
highways to see if they are conflicting. If they are, a new edge is added to
our graph (case in Figure 5.6, right grid).
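The construction of the conflict graph with the bounding-box filter can be
sketched as follows. This is a simplified illustration: it assumes that
highways are lists of (x, y) waypoints, that two highways conflict when they
share a grid point (the real notion of conflict would depend on the design
rules), and that the bounding box of a subnet can be taken from the
waypoints of its highways. All names are hypothetical.

```python
from itertools import combinations


def cells(highway):
    """Grid points covered by a rectilinear highway given as waypoints."""
    covered = set()
    for (x0, y0), (x1, y1) in zip(highway, highway[1:]):
        dx = (x1 > x0) - (x1 < x0)
        dy = (y1 > y0) - (y1 < y0)
        x, y = x0, y0
        covered.add((x, y))
        while (x, y) != (x1, y1):
            x, y = x + dx, y + dy
            covered.add((x, y))
    return covered


def bbox(highways):
    """Bounding box (xmin, ymin, xmax, ymax) of a list of highways."""
    xs, ys = zip(*[pt for h in highways for pt in h])
    return min(xs), min(ys), max(xs), max(ys)


def boxes_intersect(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def build_conflict_graph(highways_by_subnet):
    """Adjacency sets keyed by (subnet, index of the highway in its list)."""
    adj = {(s, i): set()
           for s, hs in highways_by_subnet.items() for i in range(len(hs))}
    boxes = {s: bbox(hs) for s, hs in highways_by_subnet.items() if hs}
    covered = {(s, i): cells(h)
               for s, hs in highways_by_subnet.items()
               for i, h in enumerate(hs)}
    for s, t in combinations(boxes, 2):
        if not boxes_intersect(boxes[s], boxes[t]):
            continue  # disjoint boxes: no pair of highways can conflict
        for i in range(len(highways_by_subnet[s])):
            for j in range(len(highways_by_subnet[t])):
                if covered[(s, i)] & covered[(t, j)]:
                    adj[(s, i)].add((t, j))
                    adj[(t, j)].add((s, i))
    return adj
```

Only pairs of subnets whose boxes intersect pay for the pairwise check of
their highways, which is the saving the bounding boxes are meant to provide.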
Figure 5.6: Non-intersecting and intersecting bounding boxes
Now we use this conflict graph to know the relation among the conflicting
highways in our problem and combine it with their length metric to keep the
best highways for each subnet. The next step in the process is to fix a desired
number of highways for each subnet. This number is k for every subnet except
those for which fewer than k highways have been generated, in which case all
highways will be included in our formulation.
The selection algorithm now proceeds as follows. We want to remove
nodes from the graph until there are at most k nodes for each subnet. When
removing the nodes, we want to remove the ones that represent conflicting
and long highways. In order to do so we use a scoreboard data structure.
We define the score of each highway as C + λL^α, where C is the number of
conflicts it has with other highways (that is, the number of neighbors in the
graph) and L is the length of the highway. When λ = 0 we are not considering
the length of the highways, and by adjusting the values of λ and α we can
tweak their relative weight compared to the conflicts between highways.
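As a small worked example of the score, the following sketch evaluates it
for two made-up highways. The function name highway_score and the sample
values are illustrative assumptions only.

```python
def highway_score(conflicts, length, lam, alpha):
    """Score C + lam * L**alpha; higher scores are discarded first."""
    return conflicts + lam * length ** alpha


# A short highway with two conflicts and a long conflict-free one.
short_conflicting = (2, 3)   # (C, L)
long_free = (0, 10)

# With lam = 0 only conflicts count, so the short highway scores worse.
only_conflicts = [highway_score(c, l, 0, 2)
                  for c, l in (short_conflicting, long_free)]

# With lam = 4 and alpha = 2 the length dominates and the order flips.
length_weighted = [highway_score(c, l, 4, 2)
                   for c, l in (short_conflicting, long_free)]
```

Here only_conflicts is [2, 0] and length_weighted is [38, 400], which shows
how λ and α shift the balance between conflicts and length.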
The data structure consists of the components shown in Figure 5.7. The
figure represents a scoreboard with some nodes, each one representing one of
the highways of the graph. On one hand we have a vector of buckets, one
for every possible score, and each bucket contains a vector with all highways
with such score. On the other hand we have the shorthands vector that
contains, for each highway, a pointer to the bucket that contains it and a
pointer to the position in the vector of the bucket. Additionally we keep a
unique pointer to the highest scored bucket that has highways in it.
Figure 5.7: Scoreboard data structure overview
The operation that we perform repeatedly on the data structure is the
deletion of the highway (node in the graph) with the highest score. Our
resulting highway set, H, begins empty and, when only the desired number
of highways for a given subnet remains in the structure, we include them
all in the set. The process is described in Algorithm 12. Initially, every
highway is inserted in the bucket corresponding to its score, and the initial
shorthands vector is created. Now we access the top scored bucket and delete
the first highway: we remove it from the bucket and mark it as removed in
the shorthands vector. We must downgrade every conflicting highway by
one bucket, given that we have discarded one node with which they are no
longer conflicting, updating all components of the data structure
accordingly. When there are no more highways of that score, we check if
there are highways in the bucket for the score minus one, and delete them
using the same procedure. As explained, when we have reached the target
number of highways for a given subnet we remove all of them from the data
structure and consider all of them valid for our formulation, but without
downgrading their neighbors, given that these highways will be in the final
selection and must still count as a conflict for all the other highways
present in the graph. We finish the procedure when we have emptied all the
buckets, meaning we have removed all nodes from the graph, either because a
node had a high score and we discarded it, or because we had filled the
quota of highways for its subnet and we accepted it as a highway.
Algorithm 12 Highway selection - Scoreboard
H ← ∅
highest_bucket_with_nodes ← max_score
while highest_bucket_with_nodes ≥ 0 do
    while size(buckets[highest_bucket_with_nodes]) > 0 do
        n ← get_node(buckets[highest_bucket_with_nodes])
        delete_from_scoreboard(n)
        for each node m ∈ neighbors(n) do
            if m in scoreboard then
                downgrade_one_bucket(m)
            end if
        end for
        s ← subnet(n)
        if deleted_enough_highways(s) then
            for each node r ∈ remaining_nodes_in_scoreboard(s) do
                H ← H ∪ associated_highway(r)
                delete_from_scoreboard(r)
            end for
        end if
    end while
    highest_bucket_with_nodes ← highest_bucket_with_nodes − 1
end while
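The selection procedure can be sketched in Python as follows. This is a
simplified illustration and not the data structure of the tool: buckets are
sets indexed by score instead of vectors with a shorthands table, ties inside
a bucket are broken by taking the smallest node name so that the result is
deterministic, and subnets that start with at most k highways are accepted up
front. Scores are assumed to be non-negative integers. The function and
parameter names are hypothetical.

```python
def select_highways(adj, subnet_of, score, k):
    """Keep at most k highways per subnet, discarding high scores first.

    adj: node -> set of conflicting nodes; subnet_of: node -> subnet;
    score: node -> initial integer score; k: quota per subnet.
    """
    score = dict(score)                  # working copy, gets downgraded
    buckets = {}                         # score -> nodes with that score
    for node, sc in score.items():
        buckets.setdefault(sc, set()).add(node)
    remaining = {}                       # subnet -> nodes still on the board
    for node, s in subnet_of.items():
        remaining.setdefault(s, set()).add(node)
    selected = set()

    def accept_all(s):
        # Accepted highways leave the board WITHOUT downgrading neighbors.
        for node in remaining[s]:
            buckets[score[node]].discard(node)
            selected.add(node)
        remaining[s] = set()

    for s in list(remaining):
        if len(remaining[s]) <= k:
            accept_all(s)

    top = max(score.values(), default=-1)
    while top >= 0:
        bucket = buckets.get(top, set())
        while bucket:
            node = min(bucket)           # deterministic tie-break
            bucket.remove(node)
            s = subnet_of[node]
            remaining[s].discard(node)
            for m in adj.get(node, ()):
                if m in remaining[subnet_of[m]]:
                    # One conflict less: move m one bucket down.
                    buckets[score[m]].discard(m)
                    score[m] -= 1
                    buckets.setdefault(score[m], set()).add(m)
            if len(remaining[s]) <= k:
                accept_all(s)
        top -= 1
    return selected
```

Because downgrades only move nodes to lower buckets, a single descending
pass over the scores is enough to empty the scoreboard.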
Using this data structure it is very simple to get the top element, as well
as to update the positions of the other elements when needed. The maximum
score depends on the length and parameters of the algorithm, as well as on
the number of conflicts, which is not very high in absolute numbers. By
using this algorithm we can choose a predetermined number of highways for
every subnet, normally determined by the parameter k (unless we have fewer
than k highways for the subnet), and we can obtain them focusing more on
their length or on their conflicts by tuning the α/λ parameters. Summing up,
the 2-corner generation method proposes to create all highways with up to
2 corners between the components of each subnet and then pick the best k for
every subnet, according to the α/λ parameters. In the experiments section we
will see the results we obtain by modifying these input parameters on our
Nangate data set.
5.3 Greedy Highway Generation
In this section we will present the greedy highway generation algorithm,
which explores a different direction. With the previous method we have seen
how to generate highways that are good in terms of length and conflict-
avoidance. However, all of them are very similar and regular, and we would
like to have more complex highways that imitate the behavior of paths that
must avoid conflicts in order to connect their two components.
The main idea of the greedy highway generation is that we greedily route
the cell several times, keeping the generated paths as highways to include in
our formulation. The routing algorithm is shown in Algorithm 13. It is a
recursive function that receives a grid representing a partially solved routing
problem, a vector of integers representing the subnets and an index for the
vector. The procedure tries to route the subnet at position index of the
vector, s, bearing in mind that all the subnets with a lower index have
already been routed. In order to do so we run a modified Dijkstra search to
try to find the path between the src and dst regions of s that adds the
lowest number of wire segments.
The modified Dijkstra takes into account that we do not need to connect
src with dst directly, but any node connected to src with any node connected
to dst. For instance, take a look at Figure 5.8 and assume all pins belong
to the same net. We want to connect A and C, and the blue routes between A-B
and C-D are already present. Notice that the shortest path would be the
yellow path following the bottom line. However, since we already have
pre-existing wires, we only need to add the two red metal segments if we
want to connect A with C using the least possible wire.
Algorithm 13 Greedy Routing
procedure greedy_routing(G : grid, subnets : vector of Integer, index : Integer)
    s ← subnets[index]
    modified_dijkstra(src(s), dst(s))
    if path found then
        add path to G
        if index + 1 < size(subnets) then
            greedy_routing(G, subnets, index + 1)
        end if
    end if
end procedure
Figure 5.8: Shortest path using previously computed routes
In order to do this search we proceed as follows. First, we mark all the
nodes connected to the dst region by beginning in one of them, marking it
and successively marking all neighbors that are connected using the same
signal as s. Now we mark all the nodes connected to the src region and
assign them distance 0. We proceed by expanding the src zone until we
reach a node in the dst zone. All edges leaving the nodes in the src zone
add weight 0 if they are of the same net as s, 1 if they are empty and legal,
and infinite if they are occupied by other signals or illegal. Following the
typical Dijkstra procedure, we keep a priority queue to find the unvisited
node that is closest to our visited zone. We keep on expanding and assigning
distances until we reach the first node marked as dst. We know it is the one
at the shortest distance from the src region, so we trace back the path in
order to find the minimal set of wires that we need to add to connect our
components.
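A minimal sketch of this search is shown below. It makes several simplifying
assumptions that are not part of the original tool: the grid is a dictionary
from (x, y) cells to their owner (None for an empty cell, a net name
otherwise), cells missing from the dictionary are outside the grid, the cost
is counted per newly occupied cell instead of per wire segment, and design
rules are ignored. The function names are hypothetical.

```python
import heapq


def neighbors4(cell):
    x, y = cell
    return ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1))


def connected_region(grid, start, net):
    """Cells reachable from start through cells that already carry net."""
    region, stack = {start}, [start]
    while stack:
        for nb in neighbors4(stack.pop()):
            if nb not in region and grid.get(nb) == net:
                region.add(nb)
                stack.append(nb)
    return region


def modified_dijkstra(grid, src, dst, net):
    """Empty cells to add to join src and dst, or None if unroutable."""
    dst_zone = connected_region(grid, dst, net)
    src_zone = connected_region(grid, src, net)
    dist = {c: 0 for c in src_zone}      # the whole src zone is at distance 0
    parent = {}
    heap = [(0, c) for c in src_zone]
    heapq.heapify(heap)
    while heap:
        d, cell = heapq.heappop(heap)
        if d > dist[cell]:
            continue                     # stale queue entry
        if cell in dst_zone:
            new_cells = []
            while cell in parent:        # trace back to the src zone
                if grid[cell] is None:
                    new_cells.append(cell)
                cell = parent[cell]
            return new_cells[::-1]
        for nb in neighbors4(cell):
            if nb not in grid:
                continue
            owner = grid[nb]
            if owner == net:
                w = 0                    # reuse metal of the same net
            elif owner is None:
                w = 1                    # new wire is needed
            else:
                continue                 # occupied by another signal
            if d + w < dist.get(nb, float("inf")):
                dist[nb] = d + w
                parent[nb] = cell
                heapq.heappush(heap, (d + w, nb))
    return None
```

In a situation like the one of Figure 5.8, the search prefers a short bridge
between two existing wires of the same net over a longer path through empty
cells.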
If a path has been found we fix it in the grid, which also acts as output.
If we have already routed all subnets, then we return. Otherwise, we call the
same function with index+1. Notice that if a path has not been found, then
we simply return and the grid contains all the subnets we have routed up to
the point at which we cannot route the next subnet.
The general flow of the algorithm keeps on doing routings but changing
the routing order of the subnets. The ordering is critical, as can be observed
in a small example of Figure 5.9. In both cases we are routing the blue and
the red nets. In the top case we first greedily routed the blue net and later
the red one, in the bottom case we first routed the red net and later the
blue one. Because of the greediness of the algorithm, in the bottom case
we chose a red route that broke the short connection for blue. In a normal
scenario we would only be interested in the top grid, since it has the lowest
total number of wire segments. However, we are interested in the long,
conflict-avoiding
path as well, because it might well be that the red connection used in the
top case is unavailable because of other nets interfering, and then we can use
the other set of non-conflicting highways that we have obtained by picking
what initially seemed like a bad ordering.
Figure 5.9: Same grid routed with two different orderings
We can see the top level of the greedy generation algorithm in Algorithm
14. We begin with an empty set of highways and an initial ordering of
the subnets. We do the routing with the initial ordering as presented in
Algorithm 13. When the routing finishes we retrieve from the grid a number
of highways, one for every subnet that has been routed before reaching an
unroutable subnet or routing the whole grid. Then we clean the grid from
highways and move the first subnet in the ordering to the last position of the
subnet ordering, thus preparing for another iteration that will generate new,
different highways. We iterate until we have explored all possible orderings
given by the operation of moving the first element of the ordering to the last,
which is linear in the number of subnets.
Algorithm 14 Greedy Highway Generation
H ← ∅
subnets ← initial_subnet_ordering()
for i ∈ [0, size(subnets)) do
    greedy_routing(G, subnets, 0)
    H ← H ∪ extract_highways(G)
    G ← remove_highways(G)
    subnets ← move_first_to_last(subnets)
end for
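The top-level loop can be sketched as follows. To keep the example
self-contained, the router is abstracted behind two hypothetical callbacks:
make_grid() returns a fresh empty grid, and route_one(grid, subnet) routes
one subnet on the grid and returns its path, or None if the subnet cannot be
routed. The recursion of Algorithm 13 is written as an equivalent loop that
stops at the first unroutable subnet.

```python
from collections import deque


def rotations(subnets):
    """Yield the orderings obtained by moving the first subnet to the end."""
    order = deque(subnets)
    for _ in range(len(order)):
        yield list(order)
        order.rotate(-1)  # first element goes to the last position


def greedy_highways(subnets, make_grid, route_one):
    """Collect the paths found by greedy routing under every rotation."""
    highways = set()
    for order in rotations(subnets):
        grid = make_grid()               # start every iteration from scratch
        for s in order:
            path = route_one(grid, s)
            if path is None:
                break                    # keep what was routed so far
            highways.add((s, tuple(path)))
    return highways
```

With n subnets this performs exactly n greedy routings, one per rotation of
the initial ordering.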
At the end of the process we have generated a rather varied set of
highways, containing both short and conflict-avoiding highways for our
interaction algorithms to use. Notice that if we managed to route the grid
during this phase it would not necessarily be a valid routing, given that
even if our greedy routings take into account the most basic design rules,
they do not necessarily comply with the complex ones.
5.4 Experiments
In this section we present experiments comparing and assessing the
performance of the highway generation methods we have presented in this
chapter. The experiments have been conducted using 64-bit Linux on an Intel
Xeon X5670 at 2933 MHz with 64 GB of RAM. The data set we have used for
these experiments is the Nangate 45nm cell library.
5.4.1 Runtime Distribution
We have not found a way to accurately characterize the running time of the
tool from metrics of the input problem. However, as shown in Figure 5.10, it
can be seen that there is a correlation between cell size, mean congestion and
execution time. If the cell is big it tends to need more time to find a routing,
and the same happens when the mean number of nets that must cross a
column is large. Let us define a routing difficulty measure as the number of
columns of a cell, accounting for its size, times the mean lower bound of the
number of nets that must cross each column, accounting for the congestion.
Each dot in the graphic represents the routing of a cell in our library by the
original tool. On the x-axis of the picture we have the number of columns
times the mean congestion of the columns, whereas in the y-axis we have the
time taken to route the cell in the original tool (seconds).
Figure 5.10: Running time distribution according to routing difficulty using
the original routing algorithm
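The difficulty measure can be sketched as follows. How the per-column lower
bound is obtained is an assumption of this sketch: we count, for every
column, the nets whose pins span that column, given the pin columns of each
net. The function names and the input format are hypothetical.

```python
def column_lower_bounds(pin_columns_by_net, num_columns):
    """For each column, count the nets whose pin span covers it (assumed)."""
    bounds = [0] * num_columns
    for cols in pin_columns_by_net.values():
        for c in range(min(cols), max(cols) + 1):
            bounds[c] += 1
    return bounds


def routing_difficulty(pin_columns_by_net, num_columns):
    """Number of columns times the mean congestion lower bound."""
    bounds = column_lower_bounds(pin_columns_by_net, num_columns)
    mean_congestion = sum(bounds) / num_columns
    return num_columns * mean_congestion
```

For a cell with 4 columns and nets with pins in columns [0, 3], [1, 2] and
[2], the bounds are [1, 2, 3, 1] and the difficulty is 4 × 1.75 = 7. Note
that, when the mean is taken over all the columns of the cell, the product
equals the sum of the per-column bounds.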
As we can read in the graphic, most of the cells take a very small time
to be routed. However, as routing difficulty increases, we see there is a
greater tendency towards longer routing times. Notice how some cells which
have very high routing difficulty have very low run times: those are clearly
unsatisfiable cells, and the SAT-solver can determine so very fast. It is in
the cells in the sat-unsat border that executions take a long time. We have
focused on reducing the running times of the cells that are in the sat-unsat
border, which are mainly the flip-flop standard cells of the Nangate library.
5.4.2 Highway Selection Parameter Tuning
Back to the algorithms we have presented, we used the whole Nangate library
as a test bench. Given the number of parameters in the algorithms, one first
interesting experiment was to see the running times when varying k and α/λ
in the 2-corner algorithm, and to do so we can take a look at Figure 5.11.
Each column represents the routing of the whole standard cell library. The z-
axis represents the running time, which ranges from 150 seconds in the lowest
column to 1180 seconds in the highest. The y-axis, with rows numbered from
0 to 17, represents the number of highways per subnet, except in row 17
which contains the mean of the running times using from 1 to 15 highways.
The x-axis represents several combinations of α/λ values: rows 0 to 7 have
α = 1 and λ = [0, 1, 2, 4, 6, 8, 10], whereas rows 8 to 14 have α = 2 and
λ = [0, 1, 2, 4, 6, 8, 10]. Notice how the row corresponding to 0 highways per
subnet (y-axis) represents the original routing time, and naturally it has the
same routing time for all values of the x-axis.
In general, the running times decreased with the use of
highways. There are a few exceptional runs in which we obtained worse times
than with the original tool, up to double the time in the worst case (from around
600s to 1180s). Time was reduced in the others, down to a quarter of
the original in the best case (from around 600s to 150s). Looking
at the graphic, we can see that the columns with the lowest runtime tend to
be the ones with a minimum number of highways per subnet and those that
have α = 2, which means penalizing long paths quadratically in their length
during the highway selection phase. In subsequent experiments we have taken
α = 2 and λ = 4 as parameters for our algorithm, given that they proved to be
the best in these circumstances.
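As a rough illustration of how α and λ could enter the highway selection, the following Python sketch ranks candidate highways by a cost of the form length^α + λ · conflicts and keeps the k cheapest. This cost form is an assumption made for illustration only: the text above states that α = 2 penalizes path length quadratically, and treating λ as a weight on conflicts is a guess. The names Candidate, cost and select_highways, as well as the conflict count, are hypothetical and are not part of the router.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A candidate highway for one subnet (hypothetical representation)."""
    name: str
    length: int     # wire length in grid units
    conflicts: int  # assumed measure: how many other candidates it overlaps


def cost(c: Candidate, alpha: int, lam: float) -> float:
    # Assumed cost form: length is penalized polynomially (alpha),
    # conflicts are penalized linearly with weight lam.
    return c.length ** alpha + lam * c.conflicts


def select_highways(cands: List[Candidate], k: int,
                    alpha: int = 2, lam: float = 4) -> List[Candidate]:
    """Return the k candidates with the lowest assumed cost."""
    return sorted(cands, key=lambda c: cost(c, alpha, lam))[:k]


# Toy data, invented for the example.
toy = [Candidate("A", 4, 0), Candidate("B", 3, 3), Candidate("C", 6, 0)]
chosen = [c.name for c in select_highways(toy, k=2)]
```

With α = 2 and λ = 4 the short but conflicting candidate B costs 21 and ranks behind A (cost 16), whereas with α = 1 and λ = 0 only length matters and B ranks first.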
We now focus on how to determine the best k, the number of highways
we want to include for every subnet. This decision greatly affects
the routing time of each cell. Take for example the routing of a
representative cell of the library, a flip-flop with scan and set, SDFFS_X2.
Its routing times are plotted in Figure 5.12.
Figure 5.11: Running times for several α/λ/k
The bottom red line, “0”, represents the original routing time. Each
other line up to 14 represents the routing time using that number
of highways per subnet. Finally, the top green line, “mean”, represents the
mean time of routing the cell using highways. It can be seen that the time
to route a single cell varies greatly with k, but also chaotically: there
is no clear trend, other than that the time seems to decrease as more
highways are added. The same graph for the routing of all the flip-flops of the
library, which account for most of the routing time, is shown in
Figure 5.13.
In this case there seems to be a more uniform tendency. Mainly, having
a low number of highways per subnet (from 1 to 5) seems to hinder the
potential gains in runtime that are obtained using more of them. Comparing
the original with the mean time we observe a speedup of over 2, which is even
better in certain instances. This is the same order of speedup that we already
saw in the experiments of Chapter 3, using any of the search initializations
presented in that chapter.
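The “mean” line and the speedup quoted above can be computed as in the following sketch. The function name and the dictionary layout (k mapped to seconds, with k = 0 standing for the original router) are assumptions, and the numbers in the usage note are made-up placeholders, not measurements from these experiments.

```python
from statistics import mean
from typing import Dict


def mean_speedup(times_by_k: Dict[int, float]) -> float:
    """Speedup of the mean highway run over the original run.

    times_by_k maps the number of highways per subnet to a running time
    in seconds; key 0 is assumed to hold the time of the original router.
    """
    baseline = times_by_k[0]
    with_highways = [t for k, t in times_by_k.items() if k > 0]
    return baseline / mean(with_highways)


def best_k(times_by_k: Dict[int, float]) -> int:
    """Number of highways per subnet with the lowest running time."""
    return min((k for k in times_by_k if k > 0), key=lambda k: times_by_k[k])
```

For instance, with the placeholder timings {0: 600.0, 1: 400.0, 2: 200.0}, the mean highway time is 300 seconds, so mean_speedup returns 2.0 and best_k returns 2.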
Figure 5.12: Running times routing SDFFS_X2 with different k
5.4.3 Greedy Generation
Let us now look at the algorithm with the greedy generation. This algorithm
has no parameters that we can tune, so there is no need to perform
the parametric analysis we have seen with the 2-corner algorithm. Figure 5.14
gives a general picture, combined with the previous results.
This figure, which is a revisited version of the last figure in Chapter 3,
can now be explained more in depth. The first column represents the original
routing time of the whole Nangate library. The next column represents the
routing time using the greedy highway generation algorithm. The following
three columns represent the routing time using the 2-corner algorithm and the
three different priority initialization schemes that we have already presented.
In the case of the last three columns, it is the mean time of routing using
from 1 to 14 highways per subnet.
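The greedy generation compared in Figure 5.14 can be pictured roughly as follows: several net orderings are tried, each net is routed in turn on the bare grid without design rules, and every path found becomes a candidate highway. The sketch below is a simplified stand-in under several assumptions: two-terminal nets, a single routing layer, breadth-first search as the greedy router, and random shuffles as orderings. The names bfs_path and greedy_highways are hypothetical and do not come from the router.

```python
import random
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

Point = Tuple[int, int]


def bfs_path(src: Point, dst: Point, width: int, height: int,
             blocked: Set[Point]) -> Optional[List[Point]]:
    """Shortest grid path from src to dst avoiding blocked cells, or None."""
    prev: Dict[Point, Optional[Point]] = {src: None}
    queue = deque([src])
    while queue:
        cur = queue.popleft()
        if cur == dst:
            path = []
            node: Optional[Point] = cur
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        x, y = cur
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nxt[0] < width and 0 <= nxt[1] < height
            if inside and nxt not in blocked and nxt not in prev:
                prev[nxt] = cur
                queue.append(nxt)
    return None


def greedy_highways(nets: Dict[str, Tuple[Point, Point]], width: int,
                    height: int, orderings: int = 4,
                    seed: int = 0) -> Dict[str, Set[Tuple[Point, ...]]]:
    """Collect, per net, the paths found by greedy routing under several orderings."""
    rng = random.Random(seed)
    names = sorted(nets)
    found: Dict[str, Set[Tuple[Point, ...]]] = {n: set() for n in names}
    for _ in range(orderings):
        order = names[:]
        rng.shuffle(order)
        used: Set[Point] = set()  # cells taken by nets routed earlier in this ordering
        for n in order:
            src, dst = nets[n]
            # Terminals of the other nets are never available to this net.
            others = {p for m in names if m != n for p in nets[m]}
            path = bfs_path(src, dst, width, height, used | others)
            if path is not None:
                found[n].add(tuple(path))
                used.update(path)
    return found
```

Nets that cannot be routed under a given ordering simply contribute no highway for that ordering, which is one reason to try several orderings.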
Figure 5.13: Running times routing all flip-flops with different k
When running our algorithms on the Concatlib dataset, which we presented
in the previous chapter, we saw very different results, from cells halving
their running time to others doubling it. The results did not
seem very consistent. As things stand, the cell-divide algorithm
is therefore the best option when routing groups of cells such as the ones present in the
Concatlib dataset, whereas the highway generation methods presented in this
section are preferable when dealing with single standard cells.
5.5 Conclusions
In this chapter we have presented two additional strategies for the generation
of highways. The first strategy, the 2-corner generation algorithm, relies on
generating all paths with up to 2 corners among the components we want
to connect. Next we have shown how the algorithm picks a given number
of such highways per subnet, by considering the length of such highways
and their conflicts. The second strategy consists of generating different net
orderings and using them to perform greedy routings, without design rules,
from which we extract highways to use in our SAT search. We
ran several experiments to show the impact of the parameters in the case of
the first algorithm. All routing times are comparable, with
an average speedup of 2 in the case of the Nangate library.
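To make the notion of “paths with up to 2 corners” concrete, the following sketch enumerates them between two grid points. It assumes a single-layer rectangular grid without obstacles and represents a path by its waypoints (the two endpoints plus the corners); the names simplify and two_corner_paths are hypothetical, and the actual generation in the router may differ.

```python
from typing import List, Set, Tuple

Point = Tuple[int, int]
Path = Tuple[Point, ...]  # waypoints: endpoints plus corners


def simplify(points: List[Point]) -> Path:
    """Drop repeated points and waypoints lying inside a straight run."""
    out: List[Point] = []
    for p in points:
        if out and out[-1] == p:
            continue
        if len(out) >= 2:
            (ax, ay), (bx, by) = out[-2], out[-1]
            # Three waypoints on one axis-aligned line: the middle one is no corner.
            if ax == bx == p[0] or ay == by == p[1]:
                out[-1] = p
                continue
        out.append(p)
    return tuple(out)


def two_corner_paths(a: Point, b: Point, width: int, height: int) -> Set[Path]:
    """All rectilinear paths from a to b with at most two corners."""
    (x1, y1), (x2, y2) = a, b
    paths: Set[Path] = set()
    # Horizontal, vertical, horizontal: pivot on every column of the grid.
    for x in range(width):
        paths.add(simplify([a, (x, y1), (x, y2), b]))
    # Vertical, horizontal, vertical: pivot on every row of the grid.
    for y in range(height):
        paths.add(simplify([a, (x1, y), (x2, y), b]))
    return {p for p in paths if len(p) >= 2}


def corners(path: Path) -> int:
    """Number of corners of a simplified path."""
    return len(path) - 2
```

Straight and L-shaped connections appear as the degenerate cases in which the pivot column or row coincides with one of the endpoints, so the same two loops cover paths with 0, 1 and 2 corners.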
Figure 5.14: Nangate routing time by algorithm
Chapter 6
Conclusions
This chapter summarizes the conclusions that have been drawn during the
development of this project, most of which have already been mentioned in
other parts of the report. Additionally, several lines of future work are given.
6.1 Final Conclusions
In this thesis we have presented the routing problem for standard cells and
a framework that solves it using Boolean satisfiability methods. The aim
of this work is to extend this framework in order to route cells that were
previously considered intractable due to the excessively long running times
of the SAT-solving involved.
The proposed method uses knowledge of the problem to guide
the SAT-solver during the search. Our approach consists in extending the
original formulation by adding new variables that imply entire paths of metal
in our original framework. We have explained how to help the SAT-solver
take advantage of this kind of variable by endowing them with initial activity
and giving them priority over the low-level metal segments. We have
then proposed several methods to generate the metal paths, called highways,
to be introduced in the formulation. The first method relies on using the
router itself to route restricted versions of the problem, either removing signals
or only routing parts of the original problem, to obtain highways to be
used in a final routing stage. The second method generates highways
by looking at the grid and determining which highways will be better than
others, considering criteria such as length and conflict avoidance.
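The idea of a variable that implies an entire path of metal can be written as a set of binary clauses: for a highway variable h and each segment variable s on its path, the clause (¬h ∨ s). The sketch below produces these clauses and prints them in DIMACS CNF form. The variable numbering and the function names are hypothetical, and the sketch covers only the implication itself, not the activity and priority settings described above.

```python
from typing import List


def highway_clauses(highway_var: int, segment_vars: List[int]) -> List[List[int]]:
    """Clauses encoding: highway_var implies every segment variable on its path."""
    return [[-highway_var, s] for s in segment_vars]


def to_dimacs(clauses: List[List[int]], num_vars: int) -> str:
    """Render clauses in DIMACS CNF format."""
    lines = [f"p cnf {num_vars} {len(clauses)}"]
    lines += [" ".join(map(str, c)) + " 0" for c in clauses]
    return "\n".join(lines)


# Example: variables 1..3 are metal segments, variable 4 is a highway over them.
clauses = highway_clauses(4, [1, 2, 3])
text = to_dimacs(clauses, num_vars=4)
```

Because the implication goes in one direction only, setting the highway variable to true forces its segments by unit propagation, while leaving it false places no constraint on the individual segments.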
We have presented several experiments using all of the methods to show
the validity of the algorithms. Their results have already been discussed
in detail in their respective sections. As we have seen, most of them
yield an interesting speedup (around 2-3x) by using the router to generate
highways for standard cell groups and algorithmically generating them for
single standard cells. However, our endeavor during the whole project has
been to obtain an even bigger speedup, which has proven to be a very tough
challenge. All of the methods presented in this thesis still leave room
for several additional tweaks and variations, but we do not expect those to
yield a significant improvement over the speedup they are already giving.
In the end, the problem we are facing is a really tough one, not
only because of the hardness of the problem itself, but also because of the very
diverse running times of the SAT-solver. Even though we have provided a
measure that more or less captures the notion of “routing difficulty”, it is
very hard to predict the running time of a cell before routing it.
Moreover, small variations in parameters and in the behavior of the search
give very different running times, and the number of parameters is enormous.
Although the original idea for the algorithms is very clear, how to tweak the
SAT-solver to take advantage of the highways and how to generate them has
proven to be a very delicate matter.
6.2 Future Work
Several lines of future work do not extend directly from the work developed
in the thesis, but use the same framework and regular layout properties
and address the same problem and its extensions. We have been exploring
the idea of using highways to reduce the running time of the router, obtaining
good but not decisive results, and we now feel it is time to move on to face
other related challenges. The idea is to work on several of them towards
the completion of a PhD degree on the topic of algorithms for the synthesis of
regular nanoelectronic circuits.
The first line of future work is to use the cell router to reduce the usage of
metal wires on upper metal layers. The proposal is to route external signals
through metal layers that are normally reserved for internal connections of
the cells. This could be achieved by exploiting the regularity of the designs
and assigning special regions at the boundaries of the cells reserved for these
signals, taking them into account during the place and route steps in the
synthesis design flow. In doing so we would save routing wire outside of the
standard cells, alleviating the congestion in the routing of upper metal levels.
Another item would be to improve the router by adding electro-migration
awareness to the routing stage. Electro-migration refers to the gradual migration
of the ions in the conducting wire. As the size of ICs decreases,
electro-migration becomes a reliability problem, causing metal
lines to break due to fatigue [20]. Considering these effects during the layout
design stage is becoming necessary in order to increase the reliability of circuits
as we move into new technology nodes. In particular, it is during the
routing stage that the problem can be modeled and taken into consideration,
by exploiting the regularity of the designs
when deciding where and which wires to use to interconnect the components
of the chip.
Finally, we also plan to extend the framework to allow the design of logic
gates on the fly. The normal flow of physical design tries to map the
desired circuit to a set of gates offered by a standard cell library. However,
it would be interesting to be able to generate such cells on the fly in order to
implement frequent gates that do not appear in our library. To do so
we will benefit from all the work developed during the first stages of the thesis,
incorporating it into a full chip design flow. The problem presents interesting
challenges, including looking for frequent patterns and deciding which
ones, if any, should have their own cells generated on the fly, taking
into consideration many parameters such as cell generation difficulty, impact
on area/timing/power consumption, possible reliability issues and others.
Bibliography
[1] J. Cortadella, J. Petit, S. Gomez, and F. Moll. A Boolean rule-based
approach for manufacturability-aware cell routing. IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems, 33(3):409–422,
2014.
[2] Gordon E. Moore. Cramming more components onto integrated circuits.
Electronics (magazine), 1965.
[3] Ralph H.J.M. Otten. Efficient floor plan optimization. In Proceedings
of International Conference on Computer Design, pages 499–503, 1983.
[4] Yao-Wen Chang and Kwang-Ting Cheng, editors. Electronic Design Au-
tomation: Synthesis, Verification, and Test. Morgan Kaufmann, 2009.
[5] Brian W. Kernighan and Shen Lin. An efficient heuristic procedure for
partitioning graphs. Bell System Technical Journal, 49(2):291–307, 1970.
[6] Sao-Jie Chen and Chung-Kuan Cheng. Tutorial on VLSI partitioning.
VLSI Design, 2000.
[7] A. E. Dunlop and Brian W. Kernighan. A procedure for placement of
standard-cell VLSI circuits. IEEE Transactions on Computer-Aided Design
of Integrated Circuits and Systems, 4(1):92–98, 1985.
[8] K. M. Hall. An r-dimensional quadratic placement algorithm. Manage-
ment Science, 17(3):219–229, 1970.
[9] C. Sechen and A. Sangiovanni-Vincentelli. The TimberWolf placement
and routing package. IEEE Journal of Solid-State Circuits, 20(2):510–
522, 1985.
[10] James P. Cohoon and W.D. Paris. Genetic placement. IEEE Trans-
actions on Computer-Aided Design of Integrated Circuits and Systems,
6(6):956–964, November 1987.
[11] Meng-Kai Hsu. Routability-driven analytical placement for mixed-size
circuit designs. IEEE/ACM International Conference on Computer-
Aided Design (ICCAD), pages 80–84, 2011.
[12] D.Z. Pan, Bei Yu, and Jhih-Rong Gao. Design for manufacturing
with emerging nanolithography. IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems, 32(10):1453–1472, October
2013.
[13] http://videoprocessing.ucsd.edu/~stanleychan/research/Lithography.html.
[14] Marc Pons Solé. Layout Regularity for Design and Manufacturability.
PhD thesis, Polytechnic University of Catalonia, July 2012.
[15] The international SAT competitions web page.
http://www.satcompetition.org.
[16] Niklas Eén and Niklas Sörensson. The minisat page. http://minisat.se.
[17] Brian Taylor. Automated layout of regular fabric bricks. Master’s thesis,
Carnegie Mellon University, Pittsburgh, PA, USA, December 2005.
[18] Nikolai Ryzhenko and Steven Burns. Physical synthesis onto a layout
fabric with regular diffusion and polysilicon geometries. In Proceedings of
the 48th Design Automation Conference, DAC ’11, pages 83–88. ACM,
2011.
[19] Alexandre Vidal Obiols. A divide-and-conquer approach for cell routing
using litho-friendly layouts. Bachelor thesis, Polytechnic University of
Catalonia, Barcelona, June 2013.
[20] Jens Lienig. Electromigration and its impact on physical design in future
technologies. In Proceedings of the 2013 ACM International Symposium
on Physical Design, ISPD ’13, pages 33–40.
ACM, 2013.
