Vertex Ordering for a Partitioning-based Fitting Algorithm for an EPLD Device by Gao, Tongjun
Portland State University
PDXScholar
Dissertations and Theses Dissertations and Theses
11-5-1993
Vertex Ordering for a Partitioning-based Fitting Algorithm for an
EPLD Device
Tongjun Gao
Portland State University
Let us know how access to this document benefits you.
Follow this and additional works at: https://pdxscholar.library.pdx.edu/open_access_etds
Part of the Electrical and Computer Engineering Commons
This Thesis is brought to you for free and open access. It has been accepted for inclusion in Dissertations and Theses by an authorized administrator of
PDXScholar. For more information, please contact pdxscholar@pdx.edu.
Recommended Citation
Gao, Tongjun, "Vertex Ordering for a Partitioning-based Fitting Algorithm for an EPLD Device" (1993). Dissertations and Theses.
Paper 4585.
10.15760/etd.6469
AN ABSTRACT OF THE THESIS OF Tongjun Gao for the Master of Science in 
Electrical and Computer Engineering presented November 5,1993. 
Title: Vertex Ordering For A Partitioning-Based Fitting Algorithm For An EPLD Device 
APPROVED BY THE MEMBERS OF THE THESIS COMMITTEE: 
Malgorzata Chrzan 
 
Maria E. Balogh 
As the Application-Specific Integrated Circuit(ASIC) technology develops to the 
trend of high density and modulization, the ASIC device market has been dominated 
gradually by the more complex Erasable Programmable Logic Devices (EPLDs) and 
the Field Programmable Gate Array(FPGAs) instead of the ordinally Programmable 
Logic Devices(PLDs). Meanwhile, the design automation system for such programm-
able devices has also moved from schematic entry design to high level hardware 
description language entry design. Usually, the whole design automation process 
2 
consists of three phrases, the high level hardware description language compiler, the 
logic synthesis stage and the layout synthesis stage. Though the layout synthesis stage 
contains placement and routing, for some highly restricted connection architecture dev-
ices, placement and routing have to be considered together as a fitting problem. This 
thesis concentrated on the utilization of the Heuristic methods, which can be described 
as vertex ordering and global vertices number estimation, on an Architecture-Driven 
Partitioning fitting algorithm. The test results showed that the heuristic algorithm can 
beat the comparable algorithm in several fields. These prove the correctness of our 
heuristic methods and they can be used to guide the future work on the fitting problem 
of other similar programmable devices. 
VERTEX ORDERING FOR A PARTITIONING-BASED FITTING ALGORITHM 
FOR AN EPLD DEVICE 
by 
TONGJUNGAO 
A thesis submitted in partial fulfillment of the 
requirements for the degree of 
MASTER OF SCIENCE 
in 
ELECTRICAL AND COMPUTER ENGINEERING 
Portland State University 
1993 
TO THE OFFICE OF GRADUATE STUDIES: 
The members of the Committee approve the thesis of Tongjun Gao presented 
November 5, 1993. 
Malgorzata Chrzan~wska-Jeske, Chair 
Mare/. PerKowSKI 
Maria E. Balogh 
APPROVED: 
Rolf Schaumann, Chair, Department of Electrical Engineering 
jce Provost for Graduate Studies and Research 
ACKNOWLEDGEMENTS 
I am very thankful to my parents, my uncle and my sister. They gave me strong 
support and encouragement during the period of my graduate study in the United States. 
Their love created my energy to work and study hard. 
I would like to thank Dr. Malgorzata Chrzanowska-Jeske, my advisor, Chair of the 
Committee, for her brilliant guidance and patient encouragement throughout the research 
and the preparation of the thesis. 
I would like to thank other Committee members: Dr. Marek A. Perkowski and Dr. 
Maria E. Balogh for their assistance on the thesis. 
Thanks to my friends and the faculty and staff of the Electrical Engineering 
Department for their friendly support and encouragement in many ways. 
TABLE OF CONTENTS 
ACKNOWLEDGEMENTS ................................................................................................ iii 
LIST OFT ABLES .............................................................................................................. vi 
LIST OF FIGURES ............................................................................................................ vii 
CHAPTER PAGE 
I INTRODUCTION ..................................................................................... . 1 
II EPLD DEVICES ....................................................................................... . 4 
II.1 Restricted-Connectivity EPLD Devices . .................. .. ............... 4 
II.2 Cypress EPLD ................................ ................ ............................ 5 
II.2.1 CY7C361 Chip Architecture........................................... 5 
II.2.2 Partitioning Property ....... ................................................ 6 
III PABFIT- PARTITIONING-BASED FITTER ........................................... 12 
111.1 Fitting Problem Formulation .................................................... 12 
III.2 The PABFIT Algorit:llm ............................................................ 13 
111.3 Drawbacks of the P ABFIT Algorit:llm ...................................... 15 
IV MATHEMATICAL FORMULATION OF THE VERTEX ORDERING . 19 
IV.1 Searching Order of P 3 Assignment.......................................... 19 
IV .1.1 Outstanding Vertices ............ .. ........... ........... ........... ...... 19 
IV .1.2 Partitioning of the Input Vertices .......... ....... ...... ........... 21 
IV .1.3 Properties of Toggle Group Vertices . ............. ............... 24 
IV.1.4 Properties of Chain Group Vertices............................... 27 
IV .1.5 Proposed Searching Order . ... .. .. .. .. ... .. .. .. .. .. ... .. .. .. .. ... .. .. .. 29 
IV .2 The Number of Vertices Assigned to P 3 ••••...•••••••••.•••. •••••.••••••• 31 
V IMPLEMENTATION OF THE VERTEX ORDERING ALGORITHM... 33 
V.l PABFIT.h with Vertex Ordering ............................................... 33 
V .2 Choosing the Vertex Order .. .. .. .. .... ... .. .. .. .. ... .... .. .. ..... .. .. .. .. ... .. .... 36 
v 
V.2.1 Outstanding Vertices ....................................................... 36 
V .2.2 Toggle Group Vertices .... .... ... .. ...... ... ............... ............... 41 
V.2.3 Vertex Ordering Implementation.................................... 47 
V .3 Choosing the Number of Global Vertices .................................. 51 
VI RESULTS OF THE NEW VERSION OFPABFIT ................................... 53 
VII COMPLEXITY ANALYSIS ...................................................................... 62 
VIII CONCLUSION AND FUTURE WORK ................................................... 66 
IX APPLICATION.......................................................................................... 67 
REFERENCES ................................................................................................................... 71 
LIST OFT ABLES 
TABLE PAGE 
I Improvement due to High_In_Degree Heuristic ..... ............... .. .. ..... ............... 18 
II Simple Examples Where PABFIT.h and P ABFIT are equivalent . .. .. ........... .. 55 
III Examples Where PABFIT.h is better than PABFIT ....................................... 57 
IV C_IN Chain Examples Where P ABFIT.h is better than PABFIT .................. 58 
V Toggle Group Examples Where PABFIT.h is better than P ABFIT ............... 59 
VI No Feasible Solution Examples .................... .... .. ....... ......... ........ ..... .. ............. 61 
LIST OF FIGURES 
FIGURE PAGE 
1 Interconnection Matrix of CY7C361 .............................................................. 9 
2 Partitioning Properties of the Adjacency Matrix ............................................ 10 
3 Aowchart of the PABAT Algorithm ............................................................. 14 
4 Degree Vector ................................................................................................. 17 
5 Vertex Degree ................................................................................................. 20 
6 Partitioning of the Input Vertices .......... ............................................ ............. 23 
7 Netlist of "dees2_ba.fit" File ........................................................................... 25 
8 "Split-local-reset-group" Problem .................................................................. 26 
9 Estimation of the Vertices in the Toggle Group ............................................. 27 
10 Properties of the Chain Group Vertices .... ...... .. .. ... .. .... ........... ........................ 29 
11 Olain Look Up Order ....... .... .. .. ....... .. .. .. ... .. .... ....... .... .... ... ............. ................. 30 
12 Assignment of Six Global Macrocells ............................................................ 32 
13 Assignment of Eight Global Macrocells ................................ ........................ 32 
14 Algorithm Comparison ................................................................................... 37 
15 Arrays for the Outstanding Vertex Search...................................................... 38 
16 Choosing the Outstanding Vertices ... .... ....... .... .. ... .. .... ..... .. ...... ..... .. ............... 39 
17 Data Structure A .................................... ... ...................................................... 42 
18 Date Structure B ............................................................................................. 43 
19 Vertex Ordering .. ... .. .. . . .. ... .. . . .. . . .. ... .. . . .. .. ... .. .. .. .. .. ... .. .. .. .. ... .... .. .. .. ... .. .. .. .. ... .... .. 50 
20 Searching Space .. ..... .. .. .. ..... .. .. .. .. ... .. .. .. ......... ......... .. ...... ....... ......... ................. 65 
21 Architecture of MAX5000 EPLD ...... .. ........................................................... 68 
22 Architecture of CY7C376 EPLD .................................................................... 69 
CHAPTER I 
INTRODUCTION 
Recently, as the Application-Specific Integrated Circuit(ASIC) technology 
develops towards high density architecture design, the programmable logic 
devices(PLDs) have gradually been substituted by the more complex Erasable Pro-
grammable Logic Devices (EPLDs). Furthermore, the Field Programmable Gate Arrays 
(FPGAs), which combine the design characteristics of both high density architecture and 
modular function blocks, have also emerged as the new generation device in the current 
ASIC market. Meanwhile, the design automation system has also moved from schematic 
entry design to high level hardware description language entry design. 
Currently, the IEEE standard VHSIC Hardware Description Language(VHDL) has 
been used in many design automation system [1]. The VHDL description of the entire 
design, which is entered by the designer, is translated to a device programming file. Such 
a device programming file is the internal format expressions that can separately describe 
the sequential and combinational logic of the entry design circuit [2]. Then the 
compiler's logic synthesis step divides the whole circuit into pieces of small logic, each 
of which can be realized as a single module. The result of the logic synthesis [3] can be 
represented by a netlist that contains all the modules of the design circuit and the connec-
tions between these modules. Each small module of the design circuit can be mapped 
into a single macrocell of the selected device [4], [5], [6]. Finally, the layout synthesis 
step assigns each module of the netlist to a physical macrocell of the device and uses the 
proper available wire segments from the routing resources of the device to realize the 
2 
connections between the macrocells [7], [8], [9], [10], [11], [12]. 
For some complex programming logic devices, the connections between the 
macrocells are so restricted that the placement of modules have to take the routing res-
triction into account during the whole layout synthesis' process. Therefore, the layout 
synthesis problem on those highly restricted connection architecture EPLDs/FPGAs 
needs to be formulated as a fitting problem, which combines the placement and routing 
together [13], [14], [15]. [16]. The sub-system for solving the fitting problem of the 
whole design automation system is called fitter. 
This thesis concentrates on the fitting problem of one of the Application-Specific 
State Machine Devices(ASSMDS), the CY7C361 EPLD, which was introduced by 
Cypress Semiconductor. The fitting problem can be presented as mapping the netlist 
obtained from high-level synthesis stage into the chip's physical resources [13], [14]. 
The netlist, which represents the realization of the design circuit, is represented as a 
directed cyclic graph. The physical resource of the device architecture, which is the reali-
zation of the internal macrocells and the connections between them, can also be 
represented as a directed cyclic graph. The fitting problem was formulated as the labeled 
graph isomorphism between the netlist graph and the sub-graph of the resource's graph 
[15]. However, due to the strongly limited connectivity of the new EPLD/FPGA devices, 
the fitting problem has to be generalized as a graph monomorphism problem [17] with 
some additional mapping constraints [13], [14]. 
A vertex shifting arrangement mapping algorithm has been developed be Cypress 
Semiconductor to solve the CY7C361 fitting problem [16]. This approach tried to map 
the symbolic vertices of the netlist into the physical locations of the device by shifting 
the netlist vertex order. Due to the local approach of this algorithm, the search time was 
excessive. By considering the device architecture, an Architecture-driven partitioning 
fitting algorithm (PABFIT) has been developed to improve the speed but at the same time 
3 
to maintain the exactness of the fitting problem. (13], (14], [15]. However, PABFIT still 
experienced the same problem for some large circuit designs. In this thesis we will 
present a new vertex ordering heuristic which will be used in the original Architecture-
driven partitioning fitting algorithm (PABFIT) to improve the algorithm performance. It 
is shown on the attached examples that the partitioning-based fitting algorithm with 
heuristic vertex ordering (PABFIT.h) can find a feasible solution much faster than the 
original P ABFIT algorithm. 
Chapter II presents EPLD devices, concentrating on the CY7C361 architecture and 
its partitioning properties. Chapter III describes PABFIT -the Partitioning Based fitter. 
Chapter IV and Chapter V presents the mathematic formulation on the new Heuristic and 
the Algorithm of the Heuristic Methods. Chapter VI give the evaluation of the results. 
Chapter VII presents the complexity analysis. Chapter VIII summarizes our work and 
address the future work. 
CHAPTER II 
EPLD DEVICES 
PLDs use a number of different programming technologies. The ultraviolet eras-
able PLDs, commonly called EPLDs, are based on the CMOS technology. These EPLD 
devices have the reprogrammable ability. Though this technology won't normally affect 
the use of the devices, it can affect how easily the devices can be programmed. CMOS 
technology can be easily tested and used to design high density device. 
ll.l RESTRICTED-CONNECfiVITY EPLD DEVICES 
New EPLD devices have recently been developed by using different architecture 
from the previous PLDs in order to gain the goal of complexity. One developing trends is 
modulization. The device is designed with configurable outputs, which are enhanced with 
special circuitry and are called output macrocells. The device has a number of identical 
output macrocells. The control fuses for the macrocells allow each macrocell to be 
configured in one of several basic configurations. The macrocells can be programmed as 
AND, OR gates, Multiplexors, D flip-flops, or other more complex gates depending on 
the chip architecture. 
Another developing trend is segmented architecture. As PLDs increase in complex-
ity, the size of the programmable array quickly becomes unmanageable. For this reason, 
some manufacturers have developed device architecture that are segmented into smaller 
arrays with limited interconnections. The device's programmable array splits into several 
5 
identical parts that have limited interconnects. This segmentation means that designs 
being implemented in such device must be partitioned and allocated to the parts based on 
the amount of the required interconnection. The segmentation of the interconnection 
array results in a highly restricted connectivity nature among the macrocells. 
The CY7C361 chip, an Application Specific State Machine Device, introduced by 
Cypress Semiconductor, is one of these new architecture EPLD/FPGA. It combines the 
characteristics of both modulation and segmentation. 
II.2 CYPRESS EPLD 
II.2.1 CY7C361 Chip Architecture 
The CY7C361 chip contains 32 internal macrocells [18]. These 32 physical macro-
cells are positioned in one column, and numbered from 1 to 32 from the top to the bottom 
of the column. These 32 macrocells are also called state macrocells. 
By using C_IN chain inputs, all state macrocells can be configured as shift regis-
ters. This C_IN chain input is a very short hard-wired connection between the adjacent 
macrocells and can be regarded as a condition reset to a configurable macrocell. 
The excitation functions and the reset functions of the flip-flops in the state macro-
cells are designed as unique high speed gates called Condition Decodes (CDEC). CDEC 
gate is an AND of two fan-in unlimited gates, AND and NAND. These CDEC gates take 
in primary inputs and feed back the state macrocell outputs to other CDEC gate. There-
fore, one CDEC gate can take the primary inputs from the external pins and the feed back 
argument signals from another internal CDEC gate. 
The CY7C361 architecture contains 32 state macrocells, 8 local resets and 1 global 
reset. The whole architecture is partitioned into two groups of 16 state macrocells with 
6 
their associated reset signals and CDEC planes. Each of these 16-macrocell groups is 
partitioned again into two groups of 8 state macrocells. Each group of 8 macrocells con-
sists of two groups of 4 macrocells with a separate local reset CDEC. Each group of 4 
macrocells is considered as one physical local reset group (LRG). In this 4-macrocell 
LRG, the outputs of the first two macrocells are available to all macrocells in the sub-
group of 8 macrocells, output of the third one to all macrocells in subgroup of 16 macro-
cells, and output of the fourth one is global to all 32 macrocells. The outputs from all 
macrocells in each of the 8-cell groups are available at the inputs of all the macrocells 
belonging to that group. According to the output availability, the macrocells of 
CY7C361 can be classified into three different groups: (1) global, (2) intermediate, and 
(3) local [13], [14], [16]. The description of these three groups will be presented later. 
A state macrocell can have input signals from Condition, Global Reset, and Local 
Reset. These condition and reset logic are realized by the CDEC gates as well. In addi-
tion, C_IN chain input is also available to every state macrocell as a separate condition 
excitation signal. For each physical Local Reset Group of four cells, there exits one 
separate Local Reset signal, but there is only one Global Reset signal for all state cells. 
In CY7C361 architecture, the 32 internal macrocells are divided into 8 local reset 
groups(LRGs), each of which contains 4 cells. These 8 physical local reset groups are 
named LRG 1, LRG 2 , .•••• LRGg. Each physical local reset group LRG; is associated with 
exactly one physical local reset plr;. Microcells triggered by physical local resets are 
called TOGGLE cells. All TOGGLE cells in the same physical LRG; must be triggered 
by the same local reset plr;. In each physical local reset group, there exists two local, one 
intermediate, and one global macrocells [13], [14], [16]. 
7 
II.2.2 Partitioning Property 
As we have mentioned before, the fitting algorithm tries to assign a netlist of the 
circuit, which is obtained from the logic synthesis stage, to the physical resource of the 
chip. According to the architecture of the CY7C361 chip, we know that the internal 
macrocells of the device can be partitioned into three groups, global, intermediate and 
local, and the macrocells in one group can have their own distinctive properties, the con-
nectivity and the connection constraints, which are different from those in the other 
groups. Since the netlist should have the same partitioning properties as the routing 
domain of the physical device, a two-level partitioning fitting algorithm has been 
developed to solve the fitting problem of the CY7C361 chip. Before describing the fitting 
algorithm, we will first explain the architecture constrains of the CY7C361 and its 
corresponding partitioning properties. 
The physical resource of the CY7C361 chip, the 32 macrocells, the condition C_IN 
chain reset signals, the global and local reset signals, and the possible connections 
between them are represented by the directed physical graph G P = ( VP , EP ), denoted 
Physical Connectivity Graph. It consists of the vertex set VP ={ v; I v; represents macro-
cell } and the edge set Ep ={ e; I e; represents the possible connection between V; and 
Vi} [13], [14], [18]. 
The Physical Connectivity Graph GP ( VP , EP ) can also be represented by an 
Interconnection Matrix or Adjacency Matrix, which is shown in Figure 1. The first 32 
rows and 32 columns, P 1, P 2 , ..... P 32, represent the state macrocells. The first 32 rows 
in the Adjacency Matrix represent the outputs of the macrocells, and the first 32 columns 
represent the inputs to these macrocells. If the physical connection exists between the 
output of the row cell and the input to the column cell of the resources' graph, it is indi-
cated by the symbol "I". If no physical connection exists, just keep the empty space in 
that position. A chain of down arrows is listed on the left hand side of the graph to 
8 
indicate the C_IN chain connection. C_IN chains have to be mapped in the specified 
order to the adjacent macrocells Pi and Pi+1· The row/column 33, indicated by P33, 
gives the outputs and inputs of the global reset, and the row 34 to row 41/column 34 to 
column 41, indicated by P 34- P 41, gives the available inputs/outputs of the local resets 
[13], [14]. 
The Interconnection Matrix given in Figure 1 is ordered according to the real phy-
sical location of the macrocells on the chip. To observe the special connectivity sym-
metry of this architecture, the Interconnection Matrix is reorganized and is presented in 
Figure 2 [13], [14]. The global reset is not included in the Adjacency Matrix given in 
Figure 2 because the output of the global reset can be available to the inputs of all the 
state macrocells which means no restrictions on its output connectivity. Additionally, the 
connections of the local resets exhibit the same partitioning property as the Adjacency 
Matrix of the macrocells shown in Figure 2. Therefore, local reset vertices given by 
rows/columns p34-p41 of the matrix in Figure 1 are not included. 
Before describing the partitioning properties of the adjacency matrix in Figure 2, 
the classification of the macrocells will be described. 
According to the connection availability of the output of a state macrocell p;e 
{p 1 , ... , p 32} to the input of state macrocell PjE {p 1 , ... , p 32} (i-#j), macrocell Pi 
can be defined as one of the following types: 
Type 1: 
Type II: 
Global macrocell: the macrocell Pi is considered global if its output is 
available at all other 32 macrocells. 
Intermediate macrocell: the macrocell p; is considered intermediate if its 
output is available at 16 other macrocells, belonging to the same 16-
macrocell block. 
Chain 1234..... 3233 34 41 
.-----------. 
~~ : :: : :: :: PART 1 : : : PART~ 
p3 I I I I I I I I I II I I I I I I I I I I 
p4 I II I II I I I Ill I I I Ill I II II I I I II I I I I I I I I I I II I 
p5 I I I I I I I I I I I 
p6 I II I II II I I I 
p7 I I I I I I I I I Ill I I I I I I I I I 
p8 I II I II I I I II I I I I Ill I II II I I I I I I I I I I I I I I I I I I 
p9 I Ill I I I I I I 
p10 I I I I I I I I I I 
pll I II I II II I Ill I I I I I I I I I 
p12 I II I II I I I Ill I I I Ill I II II I I I II I I I I I I I I I I II I 
p13 I I I I I I I I 
p14 I II I I I I I 
p15 I I I I I I I I I I I I I I I I 
p16 I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I 
p17 II I II II I 
p18 II I I I II I 
p19 II I I I II I I I I I I I I I 
p20 I II I II II I Ill I I I Ill I II II I I I II I I I I 
p21 II I I I II I 
p22 II I II II I 
p23 II I II II I I I I I I I I I 
p24 I II I II II I Ill I I I Ill I II II I I I II I I I I 
p25 I I II I I I I 
p26 I I II I I I I 
p27 II I II II I I I II I I I I 
p28 I II I I I I I I Ill I I I Ill I II II I I I II I I I II I 
p29 I I II I I I I 
p30 I I II I I I I 
p31 I I I I I II I I I I I I I I I 
p32 I II I II II I Ill I I I Ill I II II I I I II I I I I 
p33 I II I II I I I II I I I I Ill I II II I I I II I I I I 
B~11 I I I I I I I I I I I pART 2 
~37 I I I I 
E3~ II I 11 II I 
g4? I I II I I I I 
lrg1 lrg2 lrg3 lrg4 lrg5 lrg6 lrg7 lrg8 
t t t t t t t t 
p34 p35 p36 p37 p38 p39 p40 p41 
Figure 1. Interconnection Matrix of CY7 C361. 
I I 
I I 
I I I I 
I I I I I II I 
II 
I I 
I II I 
I II I Ill I 
I I 
II 
I II I 
I I I I Ill I 
I I 
I I 
I II I 
I I II II I I 
I I 
I I 
I I I I 
I I I I Ill I 
I I I I I II I 
9 
1 2 ..... 32 
. . . . . . . . . .Ll..l. .. LLI...l...J. .. IIIII II II 111111 I 
p1 
p2 
p5 
p6 
p9~ 
p10~ 
p13~ 
p14~ 
p3 . 
p7 
p11 
p15 
Partition P11 
local V. 
I 
Partition P 131 
Partition P132 
I > Partition Pl Partition P12 
local V. 
intermediate V. 
intermediate V. 
p17_ ........................... . 
p18_ 
p21_ 
p22_ 
p25_ 
p26_ Partition P2 
p29_ 
p30_ 
p19_ 
p23_ 
p27_ 
Partition P21 
local V. 
Partition P22 
local V. 
Partition P231 intermediate v. 
Partition P232 intermediate V. p31_ 
p4 ~--------------------~------------------~ 
p8 
p12__j Partition P31 
p16 
p20 
p24 
P28~ Partition P32 
p32 
global Vertices 
global Vertices 
Partition P3 
Figure 2. Partitioning Properties of the Adjacency Matrix. 
10 
Type Ill: 
11 
Local macrocell: the macrocell p; is considered local if its output is avail-
able at 8 other macrocells, belonging the same 8-macrocell block. 
Therefore, the CY7C361 chip contains 8 global, 8 intermediate and 16 local 
macrocells, which can be seen in the Adjacency Matrix in Figure 2. As it was shown 
before, the first level partitioning separates the macrocells into 3 groups: P 1, P 2 and P 3 . 
The partitions P 1 and P 2 contain the intermediate and the local macrocells. The global 
macrocells are in partition P 3 and have their outputs available at all other macrocells in 
P 3 and at all macrocells in P 1 and P 2 . All macrocells in partition P 1 have outputs 
available only at macrocells in the same partition P 1 and in P 31· Analogously, the 
macrocells in partition P 2 have outputs available only at macrocells in P 2 and in P 32 
[16]. 
The partitioning of P 1 and P 2 is identical. Therefore, only the partitioning of P 1 is 
explained in the following. The second level partitioning separates P 1 into 
P 11, P 12, P 13 . The partitions P 11 and P 12 contain only the local macrocells. The inter-
mediate macrocells are in partition P 13 and have outputs available at all other macrocells 
in P 13 as well as at all macrocells in P 11 or P 12. The local macrocells of partition P 11 
have outputs available only at macrocells in the partition P 11 and in P 131• Analogously, 
the macrocells in partition P 12 have outputs available at macrocells in P 12 and in P 132· 
It can be observed in Figure 2 that the availability of the outputs of the state macro-
cells at the inputs to the CDECs of the local reset signals exhibits the same partitioning 
properties as the state macrocell connections. The state macrocell which is global in 
respect to the other macrocells is also global in respect to the local reset connections. The 
same is true for intermeditate and local macrocells. 
Based on the above description of the chip architecture, three groups of constraints 
were distinguished: state macrocell connectivity constraints, global reset constraints, and 
local reset constraints [13], [14], [16]. 
CHAPTER III 
PABFIT- PARTITIONING-BASED FITTER 
lll.1 FITTING PROBLEM FORMULATION 
We are going to solve the problem of mapping the netlist obtained from the logic 
synthesis stage into the physical resources of the CY7C361 device, from Cypress. The 
netlist of the design is represented by the directed Symbolic Graph Gs = ( Vs, Es ). The 
vertices v;e Vs represent the macrocells, and the connections between the macrocells are 
represented by the edges e;e Es. The local resets constraints are not considered in the 
partitioning phase. The formulation of the fitting problem was stated as follows: 
Problem Formulation: Given a directed physical connectivity graph G P =( VP , Ep ) and a 
directed symbolic graph G s =( Vs , Es ), determine if the symbolic graph G s is 
monomorphic to the physical connectivity graph GP, and if so, find the mapping 
(monomorphism rr ) of the symbolic vertices v Sj E Vs to the physical vertices v Sj E Vp' so 
that no global and local reset restriction are violated [16]. 
The graph monomorphism problem [17] is of high complexity. To reduce this 
complexity we take advantage of the partitioning properties of the physical connectivity 
graph Gp [13], [14]. A symbolic graph Gs must exhibit the same partitioning properties 
as the physical connectivity graph Gp. A two-level partitioning is performed on the sym-
bolic graph Gs, subject to the first- and second-level partitioning properties of the physi-
cal connectivity graph G P. The final assignment of symbolic vertices to the physical loca-
tions (state macrocells and reset cells) is performed in the Physical Placement stage [13], 
13 
[16]. 
III.2 THE PABFIT ALGORITHM 
The partitioning-based fitting algorithm(PABFI1) is divided into two parts: The 
First- & Second-level Partitioning and the Physical Placement. The Partitioning part 
consists of two stages: P3 Assignment, First Partition; P13,P23 Assignment and Second 
Partition. The flowchart of the PABFIT is shown in Figure 3. 
In the P3 Assignment stage, the set of vertices, from the symbolic graph Gs, is 
chosen randomly to be assigned as the global vertices. The chosen vertices are placed in 
the block P 3 . The remaining vertices are processed by the First Partition and divided into 
two blocks, block P 1 and block P 2. There are no connections between blocks P 1 and 
P2. So the first stage of the Partitioning part of the PABFIT separated the input vertices 
into partition P 1, partition P 2 and partition P 3 [13], [14], [15]. 
In the second stage of the Partitioning part, the partitions P 1 and P 2 themselves 
are partitioned further in a similar way. The P 13 - P 23 Assignment selects the Intermedi-
ate macrocells from the block P 1 or P 2 and places them into the blocks P 13 or P 23 
respectively. The Second Partition deals with the remaining vertices in blocks P 1 or P 2 
and tries to divide blocks P 1 or P 2 into two sub-blocks P 11 and P 12 or P 21 and P 22 
respectively. The vertices in subpartition P 11 and P 21 and the vertices in subpartition 
P 12 and P 22 have no output connections to each other. 
After the First- & Second-level Partitioning, each vertex Vsj e Vs is assigned to a 
single partition P a. Then the Physical Placement performs a one-to-one mapping of each 
vertex vsj to a single physical macrocell Pj of the device. If no connectivity and reset res-
trictions are violated, PABFIT has found one feasible solution to the fitting problem [13], 
[14], [15]. 
Choose a 
P3-Assignment 
yes 
no no 
Choose a 
P 13, P23-Assignment 
yes 
no feasible feasible 
solution exists solution 
Figure 3. Flowchart of the P ABFIT Algorithm. 
(1) first-level 
partitioning 
P3-Assignment 
choosing 
global macrocells 
(2) second-level 
partitioning 
p13,p23-Assignment 
choosing 
intermediate 
macrocells 
(3) Physical 
Placement 
14 
15 
111.3 DRAWBACKS OF TilE PABFIT ALGORITIIM 
In this work, we will concentrate on the first step of the P ABFIT: the P3 Assign-
ment, because the P3 Assignment has an important effect on the executing time of the 
algorithm. Therefore, we will use heuristic approach to control the assignment process. 
The P3 Assignment is the assignment of vertices Vs; e V8 to the partition P 3 • The out-
puts of the macrocells mapped to the vertices assigned to P 3 are available at the inputs 
to all other state macrocells and reset cells, because the physical macrocells in P 3 are 
global macrocells. All other vertices that are not assigned to P 3 can be assigned only to 
intermediate or local macrocells (partitions P 1 , P 2 and their subpartitions). After the 
P 3 Assignment, the output edges of vertices Vs; e V8 that are assigned to P 3 are 
removed. The disconnected components Gc; of the symbolic graph G s are assigned to 
different partitions during the First Partitioning stage. By properly choosing the vertices 
assigned to P 3 , we should be able to find a solution, if one exists, much faster, and 
without going through many possible P 3 assignments. As a result, a good P 3 Assign-
ment can speed up the search for a feasible solution and reduce the CPU time. 
Basically, there are two aspects of the P 3 Assignment which can influence the 
speed of the whole algorithm. 
• If we know that the minimum number of global vertices is k ( k ~ 0 ), then the pos-
sible number of the global vertices (number given by m) can be assigned from k to 
8. So the order of how many vertices (number given by m) out of all the input ver-
tices are assigned to P 3 can be changed. It can start with m = 0 and increase to 
m = 8 step by step, or it might start with m = 7, continue to m = 8, and than back to 
m = 0, until all possible cases are explored. 
• The order in which m vertices out of all possible vertices are assigned to P 3 can 
also be changed. For example, it can start with assigning vertex 
16 
v1 , v2, and v3 (m=3) to P3, or it might start with v2, v3, and v4. Then it con-
tinues assignment until all possible combinations of m vertices are explored. 
Therefore, heuristics can be applied to control the P 3 Assignment in order to speed 
up the search for a feasible solution [19]. The heuristic approach depends highly on cer-
tain common properties of the input vertices. We have tested PABFIT on a large number 
of examples to find out whether the vertices assigned into P 3 in the final feasible solution 
have some common properties. According to such common properties, we can develop 
new heuristic approaches such as: 
• Choose the proper order of how many vertices ( number given by m ) out of all the 
input vertices are assigned to P 3. 
• Chose the proper order of how these m different vertices are assigned to P 3. 
Assume that all possible P3 assignments determined by PABFIT form a searching 
space. The developed heuristic changes the starting position of the search and the search-
ing order inside the searching space. No possible P 3 assignments are excluded, and 
PABFIT can still explore all possible P 3 Assignments. However, with good heuristic, it 
is more likely that a good P 3 assignment can be found sonner and therefore a feasible 
solution can be also identified [19], [20]. 
The original PABFIT [16] uses only one heuristic indicator in the P3 Assignment 
stage. That indicator is based on a high In-Degree of some vertices of the symbolic 
graph compared to the other vertices. Therefore, the first step is to determine the In-
Degree of all the vertices of the symbolic graph and store all the values in a so-called 
degree vector. Then, the degree vector is ordered according to the increasing degree and 
the difference !l between adjacent elements of the degree vector is calculated. The max-
imum difference is denoted by !lmax and the element of the degree vector where !lmax 
occurs is denoted by i ~ . Figure 4 here shows an example of the degree vector, 
Amax, and i~. 
Degree 
Degree Vector=[ 1,1, 1 ,2,2,3,4,5, 12,12] 
12 
11 
10 
9 
8 
7 
6 
5 
4 
3 
2 
1 
1 2 
11 max=7 
3 4 5 6 7 
Figure 4. High In-Degree Vector. 
A 
8 9 
~ 
i /1 max 
10 
Element of 
Degree Vector 
17 
If Amax > 5 , the algorithm starts assigning m = Amax vertices to the partition P 3 
and increases the value of m step by step after all possible combinations of have been 
explored. The variable m is increased until it reaches m = 8, then the PABFIT starts again 
at m =0 and counts up to m = Amax. If Amax > 8, P ABFIT starts with m =8. 
18 
The comparison between the P ABFIT without heuristic and with High_In_Degree 
heuristic, on a set of examples, is presented in Table I. 
TABLE I 
IMPROVEMENT DUE TO HIGH_IN_DEGREE HEURISTIC 
name max 
9 
14 
Diff_max 1-M_start 
6 
13 
6 
8 
136.6 
0.1 
226.9 
5h 
For all the examples where P ABFIT applies High_ln_Degree heuristics the CPU 
time is decreased compared to the non-heuristic PABFIT, since the heuristic depends 
highly on certain properties of the input netlist. For some examples with such properties, 
heuristic would speed up the search, but for others without such properties, the CPU time 
was even increased. Especially for some examples with a large searching space, PABFIT 
would keep on running for a long long time without finding out the feasible solution. 
Therefore, more general heuristics should be developed to improve P ABFIT performance 
on average set of examples. 
CHAPTER IV 
MATHEMATICAL FORMULATION OF THE VERTEX ORDERING 
It is necessary to analyze the architecture of the chip and the general properties of 
the input netlist to develop some new approach which can be applied in PABFIT. First, 
let's analyze the characteristics of the netlist which will affect the searching order of the 
P 3 assignment. Then we will explain the heuristic methods which will choose the 
number of vertices assigned to the P 3 assignment, or the number of possible global 
macrocells. 
IV.l SEARCHING ORDER OF P3 ASSIGNMENT 
IV.l.l Outstanding Vertices 
Before introducing the term outstanding vertices, let's explain the total degree and 
the multiple degree of the vertices on examples shown in Figure 5. If vertex V 1 is related 
to vertex V 3 through vertex Vs, then vertex Vs provides an opportunity for vertex V 1 to 
be related to vertex V 3 . Let's represent it as V 1 ~ Vs ~ V 3, obvious! y, it is the same as 
V1 ~ Vs ~ V4, V1 ~ Vs ~ V5, V2 ~ Vs ~ V3, V2 ~ Vs ~ V4, and V2 ~ Vs ~ V5. 
Therefore, we can say that vertex Vs provides a total of 6 opportunities for one set of ver-
tices ( V 1 , V 2 ) to be related to another set of vertices ( V 3 , V 4 , V 5 ). Actually, vertex 
Vs is connected with 5 other vertices V 1, V 2, V 3, V 4, V 5. 
The Total Degree of one certain vertex indicates how many other vertices are adja-
20 
cent to this certain vertex, or indicates how many other vertices are output vertices or 
input vertices of this certain vertex. It is a sum of In-Degree and Out-Degree of the ver-
tex. Therefore, the Total Degree of vertex V8 determines how many other vertices can be 
connected to this vertex V8 • It shows vertex Vs' connectivity. 
The Multiple Degree of one certain vertex indicates the possible ability (or latent 
capacity or potentiality) of this vertex to be adjacent to other vertices. The Multiple 
Degree of vertex V8 determines how many opportunities the vertex V8 can provide for the 
other vertices, which are connected to V8 , to be related to one another. It shows vertex 
Vs' ability to become a connection center. It is calculated by multiplying the vertex' In-
Degree and Out-Degree. 
V2 
V3 
---.Qv4 
V5 
In-Degree(Vs)=2 
Out-Degree(Vs)=3 
Total-Degree(V s)=2+ 3=5 
Multiple-Degree(V s )=2x3=6 
Figure 5. Vertex Degree. 
According to the internal structure of the CY7C361 EPLD, which is described in 
detail in Chapter 2, the maximum In-Degree for all vertices is equal seventeen, the max-
imum Out-Degree for global vertices is equal thirty-two, for intermediate sixteen and for 
local eight. So, global vertices must have higher connectivity and stronger ability to 
become connection centers, and must have higher Multiple-Degrees and Total-Degrees. 
Therefore, vertices with high Multiple-Degrees and Total-Degrees can be regarded as the 
most possible candidates for global macrocells and they should be arranged near the top 
21 
of the vertex searching queue. We call these most possible global vertices candidates 
outstanding vertices. 
Therefore, if we sort all the candidate vertices in the input netlist based on their 
total degree and multiple degree, it is likely that we can find some outstanding vertices 
which have higher Total Degrees and higher Multiple Degrees than the other vertices. 
We can anticipate that the probability of finding feasible solutions with these high degree 
vertices being assigned to global macrocells is very high. 
Therefore, we can change the P 3 assignment vertex searching order by putting all 
the outstanding vertices at the beginning of the vertex searching queue. Since the P 3 
assignment function poss_ab_recO in PABFIT chooses the vertices into the P 3 assign-
ment from the top of the searching queue, the outstanding vertices will be assigned to P 3 
assignment at first. It is likely that these outstanding vertices are the real global vertices 
and they can be mapped into the global macrocells without violating any connection con-
straints. This will lead to a feasible solution very fast. 
IV.1.2 Partitioning of the Input Vertices 
Each state macrocell of the CY7C361 can be excited by CDEC gate outputs and 
C_IN chain signals. According to the input excitation signals, there are three different 
programmable configurations for each macrocell: (1) START, (2) TERMINATE and (3) 
TOGGLE. 
• (1) START: When the CDEC expression is 11 1, 11 the output of the START cell gate goes 
to II 1 II exactly for one clock cycle. 
• (2) TERMINATE: When the C_IN signal is "1, 11 the output of the TERMINATE cell 
goes to 11 111 in the next clock cycle, and it goes back to 110" when the CDEC expression 
22 
becomes "1." 
• (3) TOGGLE: When the C_IN signal or the CDEC expression becomes "1," the output 
of the TOGGLE cell will toggle for the next state. 
Each TOGGLE type macrocell has a programmable input connection from a local 
and global reset signal. The global and local resets are used to set the TOGGLE cells to a 
defined state, usually set to 1 by default. However, such a classification is based on the 
input excitation signals to the macrocell. Since the fitting algorithm has been developed 
based on the connection constraints between the macrocells of the physical chip, it is 
much more important to classify the macrocells based on their connection types than to 
base on their input excitation signals. Therefore, if we take the connection type between 
the macrocells into account, we can again classify the macrocells into three different 
groups: (1) Chain Cell Group, (2) Toggle Cell Group and (3) Single Cell Group. 
• (1) Chain Cell Group: Each macrocell in this group belong to one and only one chain. 
In the chain, macrocells are connected in a line by the C_IN chain connections between 
every two neighbors in a chain, except the first and the last one. 
• (2) Toggle Cell Group: All the macrocells in this group are TOGGLE type macrocells, 
and they can be triggered by the different local reset signals. They are divided into 
several sub-groups. In each of the separated group, all the macrocells are triggered by 
the same local reset signal. 
• (3) Single Cell Group: All the macrocells in this group are Non-TOGGLE type macro-
cells without C_IN chain connections. 
23 
Since the input vertices have the same connection properties as the macrocells in 
the physical chip, we can also partition the input vertices into three different vertices 
groups: (1) Chain Group, (2) Toggle Group and (3) Single Group, similar to the connec-
tion characteristic classification of the physical macrocells. This is shown in Figure 6. 
Input 
Vertices Toggle 
Group 
Chain 
Group 
Single 
Group 
• Toggle Vertices Sub-Group 1 Triggered by LR 1 
• Toggle Vertices Sub-Group2 Triggered by LR2 
•........ 
• Vertices in Chain one 
• Vertices in Chain two 
•......... 
( • Single Vertices 
Figure 6. Partitioning of the Input Vertices. 
In the P 3 assignment stage, if the vertex assigned to P 3 is from one of the chain 
from the Chain Group, some other vertices in the same chain have to be considered as P 3 
vertices not to violate the C_IN chain connection constraints. If the P 3 vertex is from a 
toggle sub-group of the Toggle Group, some of the vertices in the same toggle sub-group 
may also be considered as P 3 vertices if the Local Reset Constraints are not violated. If 
it is from the Single Group, we just need to check its normal State Macrocell Connec-
tivity Constraints. This is much simpler than the vertices from the first two groups. 
24 
Therefore, in the following sections, we only analyze the properties of vertices from the 
Toggle Group and the Chain Group. 
IV.1.3 Properties of Toggle Group Vertices 
The problems associated with the assignment of vertices from the Toggle Group 
will be explained on the example of "dess2_ba.fit" file, shown in Figure 7. This file 
represents a netlist which was synthesised by the logic synthesis stage. The Symbolic 
Vertices Number indicates the vertices index of the input netlist. The Type indicates the 
type of vertices based on the input excitation signal classification. Type 1 -> START, 
Type 2 ->TERMINATE, Type 3 ->TOGGLE, Type 4 ->LOCAL RESET and Type 5 -> 
GLOBAL RESET. Here, we only list the Type 3 TOGGLE vertices. 
The function poss_ab_rec() assigns the vertices into P 3 assignment according to 
the top_down algorithm. In this example, the toggle vertices in several Toggle Sub-
groups are at the beginning of the natural entry order of the netlist. Therefore, such tog-
gle vertices will appear or tend to appear at the beginning of the vertex searching queue. 
Therefore, it is easy for the function poss_ab_rec() to choose a highly probable P 3 
assignment, which contains all toggle group vertices from the same Toggle Sub-Group, 
at the beginning of the search process. The determined P 3 assignment violates the local 
reset constraints and we call this problem of the P 3 assignment an "split-local-reset-
group" problem, shown in Figure 8. Since almost all the toggle group vertices are at the 
beginning of the natural entry order of the netlist, such problem assignment will tend to 
happen at the beginning , or quite near the beginning of the search process. 
As shown in Figure 7, the example netlist contains two toggle sub-groups. One 
toggle sub-group consists of 12 Toggle vertices, which all are triggered by the symbolic 
local reset lrl. Another one consists of 4 Toggle vertices, which are all triggered by the 
Symbolic Vertices No. 
1 
2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 
24 
25 
26 
27 
28 
29 
30 
Type 
(Only list Toggle Cells' type 3) 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
) 
Figure 7. Netlist of "dess2_ba.fit" File. 
Reset by Symbolic 
Vertex No. 31 or 
Reset by lr1 
12 Toggle Vertices 
in this Group 
Reset by Symbolic 
Vertex No. 32 or 
Reset by lr2 
4 Toggle Vertices 
in this Group 
25 
26 
V2 V3 V4 V5 V6 V7 Vll V12 
-- ---- ---- ---- ---- ----
LRGl LRG2 LRG3 LRG4 LRG5 LRG6 LRG7 LRG8 
t t t t t t t t 
lrl lrl lrl lrl lrl lrl lrl lrl 
Fi~ "Split-local-reset-group" Problem. 
symbolic local reset lr2. In Figure 8, all the vertices assigned to P 3 partition are from the 
first toggle sub-group, and they are all triggered by the symbolic local reset lrl. But in 
the CY7C361 chip, we get only 8 local reset groups(LRGs). Although the symbolic local 
reset lrl can be split into several LRGs, it can not dominate all eight LRGs because the 
symbolic local reset lr2 has to dominate at least one LRG. 
Although we can use a greedy approach such as examining all the vertices in each 
possible P 3 assignment, it is very time consuming and ineffective. 
Therefore, if we can find some outstanding vertices in the toggle cell groups, we 
can put them at the beginning of the searching queue because these outstanding vertices 
from the Toggle Group are likely to become the physical global macrocells . Then we 
put the remaining toggle vertices at the end of the searching queue. Since the remaining 
toggle vertices are near the end of the searching queue, the poss_ab_rec() function is not 
likely to choose the set of vertices which can cause the "split-local-reset-group" problem 
assignment at the beginning of the searching space. Although such a problem will hap-
pen (or tend to happen) at the end of the searching space, it is possible that a feasible 
solution will be found before the program accesses such a "bad" assignment. Therefore, 
we do not have to add any checking technique in the program to take out such a "bad" 
assignment. So, the program will not be slowed down. 
Based on the analysis of the real feasible solution files found by the PABFIT, we 
27 
found the toggle vertices from the same toggle sub-group tend to be mapped into the 
macrocells in the same local reset group or in the adjacent local reset groups of the physi-
cal chip. Therefore, when assigning these toggle vertices into the P 3 Assignment, we 
should take into account the relationship of the vertices in the same toggle sub-group and 
estimate a good splitting of these vertices among different LRG groups to achieve the 
best benefit. These estimations are shown in Figure 9. 
• toggle sub-group length <= 4 
* ,. ,. ,. 
if one toggle vertex is put into P3, it is probable 
that the other toggle vertices in the same sub-
group may not be put into P3; 
• 5 <= toggle sub-group length <= 11 
,·,.I., ·I.,. I., ·I ,. ,. I., 
if one toggle vertex is put into P3, it is probable that another toggle vertex 
in the same toggle sub-group may be put into P3; 
• toggle sub-group length >= 12 
if one toggle vertex is put into P3, it is probable that (toggle sub-group 
length /4) toggle vertices in the same toggle sub-group may be put into 
P3 Assignment. 
Figure 9. Estimation of the Vertices in the Toggle Group. 
IV.1.4 Properties of Chain Group Vertices 
Actually, the toggle vertices estimation can be considered as a special case of the 
vertex ordering. The estimation of the toggle group vertices deals with arranging only 
28 
toggle group vertices into the proper location of the vertex searching queue to achieve 
the goal of speed optimization. When taking into account the outstanding vertices with 
much higher Total-Degree and Multiple-Degree, we regard the outstanding vertices as 
the most possible P 3 assignment candidate vertices and regard the vertices coming from 
the toggle group estimation as the least possible P 3 assignment candidate vertices. In 
order to achieve the best benefit, we shift the estimated Toggle Group vertices after the 
outstanding vertices in the vertex searching queue. 
For all the vertices in a chain, if one vertex is chosen into P 3 assignment, every 
fourth vertex after such chosen vertex should be put into P 3 . Vice versa , if one is not 
chosen into P 3, every fourth vertex after such unchosen vertex can not be put into P 3 . 
Since there are at most 4 vertices in one local reset group(LRG), there exists at least one 
global vertex in a chain longer than 4 (or a chain with more than 4 vertices). This is 
shown in Figure 10. We call the chain with more than 4 vertices the long Chain, the 
chain with only 2 or 3 vertices the short Chain. So if there is a long chain in an input net-
list, every reasonable P 3 assignment should contain at least one vertex from this long 
chain. Therefore, we can put all the long chains at the top of the searching queue and 
trace all the vertices in each P 3 assignment in order to make sure that each P 3 assign-
ment contains at least one vertex from the long chain. If the P 3 assignment searching 
pointer has already moved off the long chains range, stop the searching because no rea-
sonable P 3 assignment can lead to a feasible solution without choosing at least one ver-
tex from the long chains. 
If there are some outstanding vertices in a long chain, we should consider the so 
called chain look up order. The chain look up order indicates how the P 3 assignment 
tries to pick up the vertices from the long chain under the strong C_IN chain connection 
constraints. There are only 4 combinations of look up orders for any long chains, which 
is shown in Figure 11. Obviously, as shown in Figure 10, if we try to assign the outstand-
29 
ing vertex into the P 3 assignment at first, we should set the chain look up order as 
[3,1,2,4] in order to make the algorithm achieve the goal of speed optimization. 
LRGl 
~ 
* 
* 
* 
* 
for all combinations of a 4-vertices chain 
at least one global vertex exists 
LRGl 
I t I I I 
I* I I I 
l 
is chosen 
into P3 
LRG2 
I t I 
I* I 
l 
has to be chosen 
into P3 
Figure 10. Properties of the Chain Group Vertices. 
IV.1.5 Proposed Searching Order 
From the above analysis, we know that if there exist long chains, the searching 
space for the example with vertices in long chains is much smaller than the example with 
vertices only in the other groups and a reasonable P 3 assignment should contain at least 
one vertex from the long chain. Therefore, long chains should be put at the top of the ver-
tex searching queue. 
The outstanding vertices with much higher Total-Degree and Multiple-Degree are 
the most possible P 3 assignment vertices candidates, therefore, we can put the outstand-
ing vertices just after the long chains in the vertices searching queue. For the short 
chains with outstanding vertices, we can just break up the short chains and choose the 
outstanding vertices. 
It is possible that there are some toggle vertices in the outstanding vertices group. 
In order to achieve the best results, some of the vertices from the same toggle sub-group 
Order 1: 
Order 2: 
Order 3: 
Order4: 
V1 V2 V3 V4 V5 V6 V7 V8 
I * 
* I 
G 
* I 
I • I I 
L________ 
* I I I Outstanding Vertex V7 
I *I I 
l____ 
I * [:J 
Order 3 will 
chose V7 
t t 
Vertex V 4, V8 are assigned into 
P3 by chain look up order 4. 
natural chain look up order [1,2,3,4] 
heuristic chain look up order [3, 1,2,4] 
Figure 11 Chain Look Up Order. 
30 
related to the outstanding toggle vertices may be arranged just after the outstanding ver-
tices group. 
In order to avoid the "split-local-reset-group" problem, the remaining Toggle 
Group vertices may be put at the end of the searching queue. Therefore, the Single Group 
vertices and the broken short chain vertices may be arranged just before the remaining 
Toggle Group vertices. 
We can draw a conclusion that the best searching order should be: 
1. [ Long chain vertices ] 
2. [ Outstanding vertices 
31 
3. [ Estimated Toggle Group vertices ] 
4. [ Single Group vertices 
& broken short chain vertices ] 
5. [Remaining Toggle Group vertices] 
IV.2 THE NUMBER OF VERTICES ASSIGNED TO P3 
There are 32 state macrocells in a CY7C361 chip and 8 of those 32 macrocells are 
global macrocells. Therefore, for a certain input netlist file, if the total number of vertices 
is S ( O<S ~32 ), then the minimum number of global vertices K must be 
K = S-32+8 = S -24. If S ~24, then K =0. So og ~8. The total number of vertices 
chosen by P 3 assignment is remarked as m. 
The original P 3 assignment in PABFIT will first choose m =K global vertices, then 
increase m step by step until m =8. This order can be remarked as [ K ~ 8 ]. In the 
heuristic P ABFIT.h algorithm, new heuristic methods are applied in P 3 assignment stage 
to achieve the goal of speed optimization. So, for a certain input file, if the outstanding 
vertices number is P , ( P <8 ) and if the possible global vertices number is from K to 8, 
( og <P ), we can anticipate that there should be P global vertices in this input netlist 
file. Instead of allowing the P 3 assignment to do the search by choosing [ K ~ 8 ] glo-
bal vertices, we can force the P 3 assignment to do the search by choosing [ P ~ 8 ] glo-
bal vertices at first, then choosing [ K ~ P ] global vertices. It is likely that a feasible 
solution will be found by just assigning P global vertices into the P 3 assignment. 
For the example files with more than 30 symbolic state vertices, we should con-
sider some other possibilities. It will be explained in example "dess2_ba.fit" input file, 
32 
which contains 30 symbolic vertices. The number of the vertices that can be put into P3 
assignment is from 6 to 8. If we put 6 vertices toP 3, the meaning is shown below. 
No global 
macrocells 
here 
LRG1 LRG2 
Six global macrocells 
I t 1 t 
LRG3 LRG4 LRG5 LRG6 
Figure 12 Assignment of Six Global Macrocells 
t t 
LRG7 LRG8 
But actually, a global macrocell is more flexible than a local macrocell or inter-
mediate macrocell because it has more output possibilities. So, we can assign 8 vertices 
to P 3 instead of 6. 
eight global macrocells 
1 t 1 1 f 1 t 1 
LRG1 LRG2 LRG3 LRG4 LRG5 LRG6 LRG7 LRG8 
Figure 13. Assignment of Eight Global Macrocells. 
By applying this heuristic method to the files with more than 30 state vertices, we 
make the P 3 assignment do the search by choosing [8->6] global vertices instead of 
choosing [6->8] global vertices. Since the vertices mapped into the global macrocells 
will be more flexible than those mapped into the other macrocells, this method can make 
the algorithm find a feasible solution easier and faster. 
CHAfYfER V 
IMPLEMENTATION OF THE VERTEX ORDERING ALGORITHM 
V.l PABFIT.H WITH VERTEX ORDERING 
Based on the analysis of the general properties of the input netlist, we now propose 
the algorithm with vertex ordering heuristic, which will be described below. Instead of 
doing the P 3 assignment search randomly, we first examine all the vertices in the netlist 
file and rearrange the vertex order according to the general properties of the input netlist. 
The goal of the vertex ordering is to form a proper vertices searching queue for the P 3 
assignment to achieve speed optimization. The detail description of the vertex examina-
tion and vertex ordering is presented below: 
( 1) Examine all the vertices in the netlist file and partition them into three different 
groups, Chain Group, Toggle Group and Single Group, according to the vertices' 
connection characteristic. The Chain Group may consist of several different chains. 
Each vertex in the Chain Group belongs to only one of these chains. The Toggle 
Group may consist of several toggle sub-groups, in each of which all the vertices are 
triggered by the same symbolic local reset. Any non Toggle Type, non C_IN chain 
connection vertices belong to the Single Group. 
(2) Sort all the vertices in the input file according to their Total-Degree and Multiple-
Degree in order to distinguish some outstanding vertices with much higher Total-
Degree and Multiple-Degree. Put all these outstanding vertices into a separate 
34 
outstanding vertices group, which is independent of the three groups in (1). 
(3) Check the Chain Group vertices and divide the chains into Long chains and Short 
chains. If there exists some outstanding vertices in the short chain, break up the short 
chains and leave the outstanding C_IN chain connection vertices in the outstanding 
vertices group, but move the other short chain vertices into the Single Group. 
( 4) Check the Toggle Group vertices. H there exists some outstanding vertices in the 
Toggle Group, consider these toggle vertices in the same toggle sub-group as the 
outstanding toggle vertices, and estimate how many toggle vertices may have a 
stronger relationship with such outstanding toggle vertices. This estimation is based 
on the value of the Total-Degree and Multiple-Degree and can be stated as: "higher 
degree, stronger relation." Those estimated toggle vertices can also be put into a 
separate estimated toggle vertices group. 
(5) Based on step (1)- step (5), form the proper vertices searching queue for P 3 assign-
ment search as: 
1. [ Long chain vertices ] 
2. [ Outstanding vertices ] 
3. [Estimated Toggle Group vertices] 
4. [ Short Chain vertices, Single Group vertices 
or broken up short chain vertices] 
5. [Remaining Toggle Group vertices] 
from top down. 
35 
(6) Based on the new order, we choose the vertices for P3 assignment, starting with long 
chain vertices; followed by the outstanding vertices; estimated Toggle Group Ver-
tices; short chain vertices, Single Group vertices or broken up short chain vertices; 
and the remaining Toggle Group vertices at the end. This assignment order is based 
on the principle, first pick up the vertices which have to be in P 3, otherwise no feasi-
ble solution exits, then the ones with high assignment probability, next, all the other 
ordinary vertices. The assignment queue is created as follow: 
1. m1_ vertices from the long chain, which have to be in P 3; 
2. m0 _ vertices from the outstanding vertices group with high probabilities; 
3. m, _vertices from the other 3 groups. 
A number of vertices chosen for P3 assignment ism= mz + m0 + m, where m~l· 
After the vertex ordering stage, we have to decide on the number of vertices 
assigned to P3, which is represented as "m." So "m"= total number of vertices assigned to 
P 3. Based on the properties of the input netlist, we estimate the best choice for m to start 
with. 
1. The upper bound on m is 8, that is the total number of global macrocells on 
CY7C361 chip. 
2. The lower bound on m is 0 or S-32+8, where S is the total number of vertices in the 
input netlist. If S~24, lower bound is 0, otherwise, lower bound is S-32+8. The 
lower bound is set by the netlist limitation. 
3. A number of long chain vertices which have to be in P 3 , is denoted by mz. 
36 
4. Based on Rule 2 and Rule 3, the proper lower bound mproper is the maximum 
number of [0, S-32+8, m1]. 
Based on the above shown limitations, a starting number of Vs; chosen to P 3 Parti-
tion is m = mpropen then m is increased by one until it reaches 8. If there exists a number 
of outstanding vertices in the netlist file, we use the symbol "P" to indicate the number of 
the outstanding vertices. If mproper <P <8, we can use the heuristic method here to assign 
the m =P number of Vs; to P 3 Partition first, then increase m by one until it reaches 8, if 
the solution is not found, m will be decreased by one form P to mproper· 
Therefore, two heuristic strategies are added to the original P ABFIT to achieve the 
speed optimization. The comparison of the flowchart of the original P ABFIT and that of 
the new heuristic PABFIT.h is shown in Figure 14. 
V.2 CHOOSING THE VERTEX ORDER 
V.2.1 Outstanding Vertices 
To order all the vertices in the input netlist based on their Total-Degree and 
Multiple-Degree and to identify "outstanding vertices" with high Total-Degree and 
Multiple-Degree, we first set up several data arrays to store the original index number of 
the vertices in the input netlist and their corresponding Total-Degree and Multiple-
Degree as well as other necessary information. This is shown in the following list. The 
relationship of these arrays is shown in Figure 15. 
To distinguish the outstanding vertices, we first sort the array Multiple-
Degree[Snodes] and the array Total-Degree[Snodes] from the highest value to the lowest 
value based on the Multiple-Degree and Total-Degree, respectively. And then re-arrange 
their corresponding input vertices' natural entry order in the array Multiple-Degree-Order 
37 
Original PABFIT 
Check input file 
~ 
P3 Assignment 
~ 
First Partitioning 
~ 
P13, P23 Assignment 
~ 
Second Partitioning 
+ 
Physical Placement 
Figure 14. Algorithm Comparison. 
If there are S input symbolic vertices, allocate 
S integer memory cells for each array. 
Multiple-Degree-Order ~ (1) 
Multiple-Degree (2) 
Total-Degree-Order ----. 
(3) 
Total-Degree (4) 
(1) store the natural entry order of the input vertices 
(2) store the multiple degree of the corresponding input vertices 
(3) store the natural entry order of the input vertices 
(4) store the total degree of the corresponding input vertices 
Snodes: 
integer number used for total number of input vertices 
Multiple _Degree[Snodes]: 
multiple degree of the input vertices 
Multiple _Degree_ Order[Snodes]: 
Natural entry order of the input vertices for the corresponding multiple degree 
Total_Degree[Snodes]: 
total degree of the input vertices 
Total_Degree_Order[Snodes]: 
Natural entry order of the input vertices for the corresponding total degree 
Figure 15. Arrays for the Outstanding Vertex Search. 
38 
[Snodes] and the array Total-Degree-Order[Snodes], according to the vertex degree. 
example: input vertices=28; possible global vertices number is 4 to 8; Since 
there are 4 vertices in one LRG at most, there will be at least 7 LRG 
occupied. So the number of global vertices should be around 7; 
let the rough number of global vertices = 7; 
index multi-deg diff first total-deg 
38 28 7 18 
firnt ~ 36 26 vertices 16 
~ertices 
30 20 14 
28 18 12 
25 15 12 ---ii ____ -4-- -\ 3 10 
3 12 4 © 8 
the --J-y- ---io---- -0-- - Big Diff - - -i-- _______ .__ 7 
8-th - - - - - - - - - - - - - ---- -
vertex 22 10 4 5 
• compare the multiple degree of the first 7 vertices (v7 ,v6,v9,v12,vl,v5,v3) 
with the 8-th vertex (V2) in order to find the vertices with big multiple-
degree differences. (Big Difference>= 5) 
• Find v7,v6,v9,vl2,vl have big differences in multiple degree; 
• check the first 7 vertices in Total-Degree[Snodes] in order to check whether 
v7,v6,v9,v12,vl exist in the array and find that v7,v9,v6,v12 exist; 
• So v7,v9,v6,v12 are regraded as the most possible candidates for global vertices. 
Figure 16. Choosing the Outstanding Vertices. 
39 
Second, roughly detennine the number of global vertices. Since at most there are 4 
vertices in one LRG, so there are at least k LRGs, where k =total number of input ver-
tices + 4. Therefore, the global vertices number is probably around k. Since the index 
order in the array Multiple-Degree-Order[Snodes] and the index order in the array Total-
40 
Degree-Order[Snodes] are arranged according to the vertex degree, the global vertices 
can be identified from the first k vertices of both arrays because the global vertices should 
have higher Multiple-Degree and Total-Degree. 
Third, compare the multiple degree of the first k vertices in the Multiple-
Degree[Snodes] with the ( k+ 1 ) -th vertex. If there exits a large difference between 
them, (for instance, difference ~5 ), mark the vertex with large difference out of these 
first k vertices in the Multiple-Degree-Order[Snodes] queue. 
Last, check the first k vertices in the Total-Degree-Order[Snodes] queue. H the 
marked vertex in the Multiple-Degree-Order[Snodes] queue does not appear in the 
Total-Degree-Order[Snodes] queue, clear the mark; therefore, such finally marked ver-
tices are considered outstanding vertices. We use the symbol t to indicate the number of 
the finally marked vertices. 
Figure 16 shows an example how the outstanding vertices are identified. 
Here is the pseudo code for distinguishing the outstanding vertices. 
I**** distinguish the outstanding vertices *******I 
int Snodes; I* total number of input vertices *I 
int Multiple _Degree[Snodes]: 
I* multiple degree of the input vertices *I 
int Multiple _Degree_ Order[Snodes]: 
I* symbolic index order of the input vertices for the 
corresponding multiple degree *I 
int Total_Degree[Snodes]: 
I* total degree of the input vertices *I 
int Total_ Degree_ Order[Snodes]: 
I* symbolic index order of the input vertices for the 
corresponding total degree *I 
( 1) Sort Multiple-Degree[] from the highest value to the lowest value; 
(2) Re-arrange Multiple-Degree-Order[] so that the vertices pointed by 
Multiple-Degree-Order[] has the same decreasing multiple degree 
order as shown by Multiple-Degree[]; 
(3) Sort Total-Degree[] and re-arrange Total-Degree-Order[] similarly 
to ( 1) and (2); 
mark num=O; I* number ofverticesfirst marked *I 
check_num=Snodesl4; I* the number of vertices being checked *I 
for (i=O; i<check num; i++) 
{ -
} 
if ((Multiple_ Degree[ i)-Multiple _Degree[ check_ num] )>=Big_Difference) 
mark_num++; 
t=O; I* number of vertices finally marked *I 
for (i=O; i<mark num; i++) 
{ -
for (j=O; j<mark num; j++) 
{ -
} 
if(Total_Degree_Order[j]==Multiple_Degree_Order[i]) 
{ 
judge the vertex i as a outstanding vertex; 
t++; 
} 
} 
V.2.2 Toggle Group Vertices 
41 
Although the global vertex candidate from the Toggle Group vertices dominate 
only one section of the vertex searching queue, the process to choose these candidate ver-
tices is more complicated than for other groups of vertices. 
First, we identify all the local reset groups in the Symbolic Graph representing the 
input netlist. If a local reset is found, examine all the toggle vertices triggered by this 
local reset and store their symbolic index order in a double pointer data structure A, 
which is shown in Figure 17. The data structure A is organized as follow, if there are n 
local resets, the double pointer **toggle-grp points ton different single pointers *toggle-
grp[i] (O~i ~ -1 ) and each single pointer *toggle-grp[i] points to maximum 32 integer 
memory cells. If the i-th local reset triggers k; ( 0~;~32, Os ~ -1 ) toggle vertices, we 
use the first k; out of the maximum 32 integer memory cells to store the symbolic index 
order of the toggle vertices. 
*toggle-grp[O] -.;. 
*toggle-grp(l] -.;. 
**toggle-grp .... , ~ -.;. 
*toggle-grp[n-1] ~ 
) 
n local reset=> 
n signal pointers 
*toggle-grp[i] ~ 
32 integer memory cells 
) 
if the i-th local reset triggers Ki 
toggle vertices, then use the first 
Ki integer memory cells out of the 
maximum 32 integer memory cells 
to store the symbolic index order 
of the vertices. 
Ki toggle vertices' 
symbolic index 
order 
Figure 17. Data Structure A. 
42 
**toggle-cell-group 
[ 
index-total ______,... 
total-degree ---;. 
*toggle-cell-group[O] ~ index-multiple~ 
multiple-degre~ 
[ 
index-total ---... 
total-degree ---;. 
*toggle-cell-group[l] ~ index-multiple~ 
multiple-degre~ 
*toggle-cell-group[i] 
[
index-total ~ 
total-degree ~ 
~ index-multiple-. 
multiple-degre~ 
*toggle-cell-group[n-1] 
Sub Data Structure j store the symbolic index order of the 
toggle vertices 
index-total tore 
of the corresponding 
total-degree )II j I f j ......... I j f toggle vertices 
store the symbolic 
index-multiple ~I f I I ......... f l / indexorderofthe 
toggle vertices 
multiple-degree ______,..., I I I ......... I I ' 
1 
· 
1 store the mu tip e 
If i-th local reset triggers Ki toggle vertices, 
allocate 4xKi integer memory cells for one 
Sub Data Structure. 
Figure 18. Data Structure B. 
degree of the 
corresponding toggle 
vertices 
43 
44 
Then we set up another double pointer data structure B. The double pointer ** 
toggle-cell-group points to n different single pointers *toggle-cell-group[i] (O~i ~n -1), 
and each single pointer points to a sub data structure of 4 fixed pointers. These 4 fixed 
pointers in each sub-data structure *toggle-grp[i] point to four groups of integer memory 
cells, which store the multiple degree, total degree and their corresponding symbolic 
index orders for k; toggle vertices triggered by the local reset i, shown in Figure 18. 
Actually, each sub-data structure of data structure B stores the information of all the ver-
tices in one toggle sub-group of the Toggle Group. In other words, one sub data structure 
of data structure B represents one toggle sub-group. 
After having established these two data structures A and B, we call the procedure 
"Calculate Total-Degree" and "Calculate Multiple-Degree" to calculate Total Degree and 
Multiple Degree for each toggle vertex in data structure A and transfer the corresponding 
information into data structure B. Sort the data structure B from the highest value to 
lowest value for the total-degree and multiple-degree respectively and re-arrange their 
corresponding toggle vertex's symbolic index order according to the value of the degree. 
Finally, search all the toggle vertices in each sub data structure of data structure B 
in order to mark the outstanding vertices found. Since each sub data structure of data 
structure B represents one toggle sub-group, that is the same as to mark all the outstand-
ing toggle vertices in each toggle sub-group. Some toggle sub-group may contain out-
standing vertices, some may not Then we can choose the candidate global vertices from 
the toggle group according to the following rules. 
(1) If there are at most 3 toggle vertices in one toggle sub group, ignore this toggle sub-
group no matter whether outstanding vertex exist in this group. 
(2) If there are more than 3 but less than 8 toggle vertices in one toggle sub-group, 
ignore this toggle sub-group only when there is no outstanding vertex in this sub-
group. If there exists one outstanding vertex, search all the toggle vertices in the tog-
45 
gle sub-group except the outstanding vertex and put the one with the highest total-
degree and multiple-degree in the estimated Toggle Group vertex. 
(3) If there are more than 8 toggle vertices in this toggle sub-group, the number of the 
estimated toggle vertices should be E, where E is equal to the total number of ver-
tices in the toggle sub-group divided by 4. The process to estimate the Toggle Group 
vertices is similar to (2). 
(4) If there are more than 2 outstanding vertices in one toggle sub-group with less than 8 
toggle vertices, we only leave one outstanding vertex in the toggle sub-group, move 
all the other outstanding vertices with less total-degree or multiple degree to the 
Single Group. 
(5) If there are more than 3 outstanding vertices in one toggle sub-group with more than 
8 toggle vertices, only two outstanding vertices are left in the toggle sub-group, 
move all the other outstanding vertices with less total-degree or multiple degree to 
the Single Group. 
Since the outstanding vertices are placed right after the long chain vertices in the 
searching queue, they are quite near the top of the vertex searching queue. In order to 
reduce the probability of "Split-local-reset" problem at the beginning of the P 3 assign-
ment search, we try to maintain some balance and only leave a reasonable number of out-
standing vertices in the outstanding vertices group. This rejudgement of the outstanding 
vertices is combined with the estimation of the toggle vertices and can be regarded as one 
part of the estimation. 
The pseudo code of the estimation of the Toggle Group vertices is listed below. 
!***** estimation o[Toggle Group vertices ********************I 
int num _sym _lr; I* the number of symbolic local reset *I 
int num_gl; !* the number of symbolic global reset *I 
int t=O; I* the index number for toggle sub-group *I 
for (i=nodes+num_sym_lr+num_g/-1; i>=O; i--) 
{ if(Type[i1==4) 
I* Type 4 indicate the local reset *I 
{ 1=0; 
for (j=O; j<MAX SIZE; j++) 
{ if(Symbolic Graph[i1fj1==1) 
I* Symbolic Graph is the input netlist *I 
{ toggle_g"rpft1fl1=j; 
I++; 
}} 
num _toggle[ t1=1; 
I* store the number of toggle vertices in one 
toggle sub-group *I 
t++;} 
if(t==num sym lr) 
break;} - -
for (i=O; i<num sym lr; i++) 
{for (j=O; j<niiin_toggle[i1; j++) 
{ k=toggle_grp[i1fj1; 
toggle _cell_group[i1->index _total[j1=k; 
toggle_cell_group[i1->index_multiple[j1=k; 
toggle_cell_group[i1->total_degree[j1=Gb_In_Degree[k1+Gb_Out_Degree[k1; 
toggle_ cell_group[ i1->multiple _ degree[j1= 
(Gb_In_Degree[k1) * (Gb_Out_Degree[k1);} 
I* the In-degree and Out-degree have been stored in 
Gb_In_Degree[1 and Gb_Out_Degree[1 *I} 
initial toggle_mark[1=0 
I* indicate one Sub-Toggle Group has not been checked *I 
for (i=O; i< check num; i++) 
{ if(Original_Order[Total_Degree_Order[i11==3 && 
nod[Total_ Degree_ Order[ i 1-> Type==3 ) 
I* indicate the outstanding toggle vertices *I 
{for (k=O; k<num sym lr; k++) 
{ if (Total_Degree_Order[i1== 
toggle_ ce ll_group [ k 1->index_ multiple[O]) & & 
toggle_ mark[k]==O) 
I* find the outstanding toggle vertices in its Sub-Toggle-Group *I 
{ toggle_mark[k1=-1; 
if (num_toggle[k1>8) 
{for ( j=1; j<=num_toggle[k114-1; j++) 
{ Original_Order[toggle_cell_group[k]-> 
index multiple[j]1=2; 
!*estimate the toggle vertices*! 
}}}}}} 
46 
47 
V.2.3 Vertex Ordering Implementation 
After determining the outstanding vertices and examining the toggle vertices, we 
determine the vertex order. First, we define an array Original-Order[i] (Og~ -1) and 
initialize it with 110. II The index number i of this array indicates the natural entry order of 
the vertices in the input netlist. Then we define another array Real-Search-Order[i], 
(Og~-1). The index number i of this array is a natural number from lower bound to 
upper bound. 
After having set up these two arrays, we examine each vertex i (O~i ~ -1 ) and 
then classify it according to the following rules by marking its corresponding Original-
Order[i] with a proper distinguished value w ( 1~w~). 
Rules: 
(1) If the vertex belong to a long chain, marked value w = 4; 
(2) If the vertex is an outstanding vertex, marked value w = 3; 
(3) If the vertex is an toggle vertex related to an outstanding toggle vertex, marked 
value w = 2; 
(4) If the vertex belongs to a short chain without any outstanding vertices, marked 
value w = 1; 
(5) If the vertex is from a broken up short chain, marked value w = 1. 
We do not have to mark the remaining Toggle Group vertices with w = 0 because 
the array Original-Order[] has already been initialized with value 110. II Therefore, the ini-
tialization default to set all the remaining toggle vertices with marked value w = 0. 
48 
According to the result of the classification, we rearrange the original symbolic 
index order of the input vertices in the array Real-Search-Order[] from the highest 
marked value w = 4 to the lowest marked value w = 0. This process is shown in Figure 
19. 
The pseudo code for Hierarchic Classification and Vertex Ordering is listed below. 
I* Vertex Ordering *I 
for (i=O; i<Snodes; i++) 
{ if(vertex i is a finally marked outstanding vertex) 
original_order[ i)=3; 
if (vertex i is a estimated toggle vertex related to 
a outstanding vertex) 
original_order[i)=2;} 
for (i=O; i<Snodes; i++) 
I* re-judge the long chain with outstanding vertices *I 
{ if(original_order[i)==3 && nod[i)->chain_length>=4) 
I* distinguish long chain vertex in outstanding vertex *I 
{if (nod[i)->num in chain==]) 
for (j=O; j<nod[T}->chain_Iength; j++) 
original_ order[ i + j) =4; 
else if(nod[i)->num_in_chain==nod[i)->chain_Iength) 
for (j=O; j<nod[i]->chain_Iength; j++) 
original_order[i-j]=4; 
else { 
} 
for (j=O; j<nod[i]->chain_num; j++) 
original_order[i-j]=4; 
for (j=l; j<(nod[i]->chain_Iength-nod[i]->chain_num+l); j++) 
original_order[ i + j] =4; 
} 
I* break up the short chains *I 
else if(original_order[l]==3 && nod[i]->chain_length==2) 
{if (nod[i)->num in chain==]) 
original_ordei}t+l]=l; 
else 
original_order[i-1]=1;} 
else if(original_order[i]==3 && nod[i]->chain_Iength==3) 
{ 
if (nod[i]->num in chain==]) 
{ original_order[i+lJ=l; 
original_order[i+2}=1; } 
else if(nod[i]->num_in_chain==nod[i]->chain_length) 
{ original order[i-1]=1; 
original~order[ i+ 1)=1; } 
else 
{ original order[i-1]=1; 
original~order[i-2]=1; } 
} 
else {if (original_order[i]!=3 && nod[i]->chain_length!=l) 
{ if(nod[i]->chain_length>3) 
} 
original_order[i)=4; 
I* pick up the long chains *I 
else 
original_order[i]=l;} 
if(original_order[i]!=3 && nod[i]->chain_length==l && nod[i]->type/=3) 
original_order[i]=l;} 
for (i=O; i<Snodes; i++) 
I* vertex ordering in Real_ Search_ Order[] *I 
{ 
switch( original order[i]) { 
case 4: long-chain num++: 
break; -
case 3: outsd_grp _num++; 
break; 
case 2: estimated_tgl_num++; 
break; 
case 1: single_grp_num++; 
break; 
case 0: toggle_grp_num++; 
break; } } 
h=O; 
j=h+long_ chain_ num; 
k=j+outsd _grp _num; 
l=k+estimated tgl num; 
m=l+single _grp-num; 
for (i=O; i<Snodes; i++) 
{switch( original order[i]) { 
case 4: Real Search Order[h]=i; 
h++; -
break; 
case 3: Real Search Order[j]=i; 
j++;- -
break; 
case 2: Real Search Order[k]=i; 
k++,-:- -
break; 
case 1: Real Search Order[k)=i; 
k++,-:- -
break; 
case 0: Real_Search_Order[m]=i; 
m++; 
break; } } 
49 
Original-Order[] Real-Search-Order[] 
store the symbolic index 
order of the vertices 
index marked value 
1 
2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
3 
1 
0 
4 
4 
4 
4 
4 
3 
1 
1 
25 2 
26 2 
27 0 
28 0 
29 3 
30 1 
31 2 
32 1 
' ' ' 
~~~ 
\ ', ~~~ ~~~ 
\ ' ~ ~ 
\ ..>.~ ~ ~ 
,,~, ~ ... ~ 
)"'\~ ~""'~ ~~~ 
\ \ ,' ',,~ ~~~ 
\ "' ~ .... , ,' 
~1' \ ~ ~ -
\ ).' ~'' ~""' 
~ ,'(' \ .... , ~ ~" 
\ ,(' , ' 
~" \ ~~ ' 
~ \ v ' 
,(~\ ~ 
~ ~ \ \ 
\ \ 
- - - \ - , - - - - - -> 
\ \ 
\ \ 
\ \ ' ' ~ ' ' ' ' ' 
\ \ 
\ \ 
\ 
I 
I 
' \ \ \ 
\ \ \ \ I 
\ ', \ \ I 
' \ \ ~ :1 
' \ \ 1\ " 
\ \\ I V A 
\ \ I "\ /' 
\ \1 " "' \ i~ ... , ~ A,>",,, 
"I '" \ I "I"' ,1\ 
" I" ~ \ ~I ,"I" ,'{ , .. 
... ... )I' ~ 
" " I \ \ 
" " I I \ \ 
" " I I h 
" I \ " " I / I I 
I \ 
' I ,' \~~ 
'I.. I ~"\ 
I ' I ~" \.A 
', '1 ~ v' .... .... ~ ... 
I ' I 'V " \ 
I 'I "" ' " \ I> ' / ,.. .... ,.. \ 
," .............. ........ ~ 
" I " ' ' ,' ," ', ~ 
I 
" .... " .... " .... 
," ~ 
4 
5 
6 
7 
8 
1 
9 
29 
. 
25 
26 
31 . 
2 
10 
11 
30 
32 
3 
27 
28 
. 
Figure 19. Vertex Ordering. 
~ 
• 
• 
• 
• 
symbolic index order 
of long chain vertices 
with marked value 
w=4; 
symbolic index order 
of outstanding vertices 
with marked value 
w=3; 
symbolic index order 
of estimated toggle 
vertices with marked 
value w=2; 
symbolic index order 
of Single Group vertices, 
short chain vertices 
with marked value 
w=l; 
symbolic index order 
of remaining toggle 
vertices with marked 
value W=O . 
50 
51 
V.3 CHOOSING 11-IE NUMBER OF GLOBAL VERTICES 
The determination of the number of global vertices is much simpler than the vertex 
ordering. We already know that the lower bound of the possible global vertices is 0 or S-
32+8, the upper bound is 8. For example, with long chains, there are at least m1 vertices 
from the long chains, which have to be in any reasonable P 3 assignment. Therefore, the 
proper lower bound mproper is the maximum number of [0, S-32+8, m d. Usually, we can 
start with mproper and increase m one by one until it reaches 8. However, if there exist 
outstanding vertices, to achieve speed optimization, we start with m =P, where P is the 
number of outstanding vertices, next increase m one by one until it reaches 8. If the solu-
tion is not found, m will be decreased one by one from P to mproper. For those examples 
with more than 30 total input vertices, we can start with m =8 and decrease m one by one 
until it reaches 6 in order to make good use of the flexibility of the connectivity of global 
vertices. Besides this, we can also combine the high degree vector heuristic method as a 
special case in the determination of the number of global vertices. 
The pseudo code for this part is shown below. 
I***** Determination of the number of global vertices *******I 
int ab _que[] I* queue of possible number of global vertices *I 
int start_ ab I* the first member of ab _que[] *I 
if( number of symbolic nodes>= 30) 
I* large number input vertices example *I 
{ ab que[]<= [ 8,7,6 ]; 
if( number of outstanding vertices> 0) search with real search order; 
else search with original order; 
return;} 
if (number of symbolic nodes is ( 24 < Snodes< 30)) 
I* Here decide the start ab value *I 
start ab = Snodes - 24; 
else start ab = 1; 
if ( number of symbolic nodes < = 24 and 
the input vertices queue has no long chains ) 
start ab = 0 
if (start ab <the number of long chains in the input vertices queue) 
start_ ab = the number of long chains 
if ( start_ab <high degree vector number And 
outstanding vertices number < high degree vector number) 
{ ab que[]<= [high degree vector number, 
- high degree vector number + 1, 
8, 
high degree vector number- 1, 
start ab 1 
if ( outstanding vertices number > 0 ) 
search with real search order; 
else search with original order; 
return} 
if( start_ab >high degree vector number And 
start ab < outstanding vertices number ) 
{ ab que[]<= [outstanding vertices number, 
- outstanding vertices number + 1, 
8, 
outstanding vertices number -1, 
start ab 1 
if ( outstanding vertices number > 0 ) 
search with real search order; 
else search with original order; 
return;} 
if ( start_ ab > outstanding vertices number ) 
{ ab que[] <= [start ab, 
- start_ab + [, 
8] 
if ( outstanding vertices number > 0 ) 
search with real search order; 
else search with original order; 
return;} 
52 
CHAPTER VI 
RESULTS OF THE NEW VERSION OF P ABFIT 
The PABFIT.h program that was presented in Chapter V has been tested on several 
industrial examples. These examples are real life test examples obtained from Cypress 
Semiconductor Inc. The results of PABFIT.h are compared with the results of PABFIT 
so as to verify the speed benefit achieved by the PABFIT.h program which utilizes the 
vertex ordering and global vertices number estimation. 
Tables II, III, IV, V, and VI show the examples on which the PABFIT.h and PAB-
FIT were tested. Actually, Tables II - VI can be represented as one Table. However, for 
convenience, we keep these five separated Tables on different comparison. Table I deals 
with simple low connectivity examples. Table II presents remarkable high connectivity 
examples. Chain connection examples are grouped in Table III. Table IV presents toggle 
group examples. No solution examples are grouped in Table V. Tables II, III, and VI use 
the same format. The column "Time_old" and "Time_new" indicate the CPU executing 
time generated by the PABFIT and PABFIT.h, respectively. All the test results with both 
PABFIT.h and PABFIT were obtained from a Gateway 2000 PC with Intel486 and DOS 
5.1 environment. The measure for the CPU time is seconds. If there is an "NC" in 
"Time_old," it shows that PABFIT could not verify in a reasonable time whether a feasi-
ble solution exits or not. However, as it is shown, a new version of the program 
PABFIT.h was able to find a solution. 
A "feasible solution" of the fitting problem is a feasible placement of each element 
of the netlist and the symbolic local/global resets to the state macrocell and physical 
54 
local/global resets of the chip. If there is no feasible solution, it means that the netlist can 
not be mapped on the CY7C361 chip, or in another word, the VHDL design does not fit 
into the CY7C361 physical resource. The column "FIT" on Tables II, III, IV, V, and VI 
show the results of the PABFIT.h and PABFIT. The answer "yes" means that both 
PABFIT.h and PABFIT were able to find a state/reset-cell feasible placement; "no" 
means that both PABFIT.h and PABFIT showed that the netlist of the given examples 
can't be fitted on the device. 
Tables II, III, IV, V, and VI also contain information about the characteristics of 
the input netlist file, or the Symbolic Graph Gs of the corresponding input netlist. This 
information can be used to evaluate the complexity of the algorithm for a given example. 
The number of vertices and the number of edges of the symbolic graph G s are given in 
the column "vert" and "edge," respectively. Inside the column "vert," the second and 
third sub-column also gives information about the number of global and local resets 
included in the netlist The whole column represents state macrocells/global resets/local 
resets. Information about the connectivity of G s can be obtained from the ratio 
edge/vertices. If the ratio is large, Gs is highly connected and the partitioning properties 
of Gs restrict the solution space of the fitting problem very effectively. Furthermore, ver-
tex ordering and global vertices number estimation are used very effectively such that the 
starting search point is very closed to the feasible solution, if one exists. H the ratio is 
low, the solution space is only weakly restricted by the connectivity. In this situation, if 
the netlist does not contain long chains, the solution space of the algorithm may be large. 
However, since the complexity of the input netlist is considerable low, it should be easy 
to find a solution to the fitting problem in this case. The chain constraints, especially the 
long chain constraints can restrict the solution space of the fitting problem very effec-
tively. They can influence the partitioning of the Symbolic Graph Gs very strongly, 
because chain vertices have to be mapped to adjacent macrocells on the chip and at least 
55 
one vertex from a long chain should be presented in the global vertices assignment. 
Therefore, vertex ordering is applied to shift the long chain vertices to the top of the ver-
tex searching queue for the global vertices assignment stage. Some restrictions can also 
be added into the PABFIT.h in order to make sure each global vertices assignment con-
tains at least one vertex from the long chain. This will highly restrict the solution space 
and lead to a solution very quickly. The number of vertices of Gs that have an input chain 
form other vertices is given in the column "C_IN" in Table II, V, and VI. 
TABLE II 
SIMPLE EXAMPLES WHERE PABFIT.H AND PABFIT ARE EQUIVALENT 
FILENAME vert edge max conn C_IN Time_Old (s) Time_New (s) Fit 1 
busa_ba 16/0/0 10 3 8 0.054 0.109 yes I 
busa_ba1 16/0/0 12 3 10 0.16 0.16 yes 1 
cntr_ba 9!011 45 9 0 0.32 0.109 yes, 
counterl 13/0/0 12 2 1 0.00 0.054 yes 1 
cadman_b 25/0/l 45 11 0 62.02 62.91 yes' 
cadman_ba 25/0/1 45 11 0 72.14 72.08 yes 
demo2 9!0/0 9 3 4 0.054 0.054 yes 
epeell 24/0/0 35 8 13 183.62 183.40 yes 
epeecon_ba 16/1/0 34 6 8 17.03 18.18 yes 
micro_ba 9/l/1 20 9 0 0.054 0.00 yes 
examp33 14/0/2 47 9 0 0.43 0.65 yes 
mlt_fsml 21/0/0 22 3 5 0.109 0.109 yes 
reword 15/1/1 23 4 5 0.109 0.054 yes 
stepper 16/0/0 19 6 8 3.40 3.57 yes 
tsr2_bug 18/0/0 19 2 2 0.109 0.109 yes 
- - - -- - -- ·-·-
Table II shows the examples with relative low edge/vertices ratio. Here, both 
PABFIT.h and PABFIT demonstrate that all the examples in Table IT can be fitted into 
the CY7C361 chip. Comparing the CPU executing time of PABFIT.h and that of PAB-
FIT, we found that PABFIT.h can not be much faster than PABFIT. The executing time 
of both programs are quite close to each other. Actually, this situation is reasonable 
56 
because the solution space for the examples with low connectivity is small and the ver-
tices with particular properties, which are regarded as the most possible global vertices 
assignment candidates, are not obvious among all vertices form the input netlist file. 
Besides, since PABFIT.h has to spend some CPU time to form the proper vertex order 
and choose the proper starting global vertices number, the executing time of PABFIT.h 
may be a little bit longer than that of PABFIT. 
From the results in Table III, we found that the PABFIT.h can find a solution faster 
than the PABFIT. Especially for the example "dees_ba.fit" and "e019_ba.fit," the 
improvement is great. 
In the example "dees_ba.fit," there are 30 state vertices, including 2 TOGGLE 
Sub-Groups. One with 12 vertices, another with 4 vertices. The PABFIT.h found 1 out-
standing vertex in the 12-vertex TOGGLE Sub-Group, and it estimated that there should 
be another vertex in this TOGGLE Sub-Group, which is related to the found outstanding 
vertex and should be shifted to the proper section of the vertices searching queue. So 
P ABFIT.h recognized the vertex with the second maximum Multiple and Total degree as 
the estimated Toggle vertex. This is described in Chapter.IV and Chapter.V. PABFIT.h 
also found another outstanding vertex in the TOGGLE Sub-Group with 4 vertices. 
According to the rules we described in Chapter.lli, it ignores the other toggle vertices. 
The other outstanding vertices are found in the single vertices group. By examining the 
result of "dees_ba.fit," we found six global vertices, which were used in the solution, 
among the first 10 location in the Real-Search-Order queue, which is formed by the ver-
tex ordering as the new vertices searching queue. The other 2 real global vertices are in 
the middle of the Real-Search-Order queue. Since there are 30 state vertices, PABFIT.h 
chose 8 as the starting global vertices number, that is, the same number of the real global 
vertices as the Real-Search-Order queue shows. Therefore, it took PABFIT.h only 57.85 
seconds to find a feasible solution. This result is much better than that from PABFIT 
57 
since ••dees_ba.fit" is listed as NP (Not completed) examples for PABFIT. 
In the example II eO 19 _ba.fit, II there are 30 state vertices and 7 short chains, the 
PABFIT.h found 3 outstanding vertices in three 2-vertex short chains, respectively. By 
applying the chain-break-up technique, the vertex ordering procedure leaves the found 
outstanding chain vertices at the top of the vertex searching queue, but moves the other 
vertices from the broken up chain to the bottom. The other outstanding vertices were 
found in a single vertices group. After examining the Real-Search-Order queue, we 
found that the 8 vertices chosen to a found solution as global vertices are exactly at the 
first 8 locations of the Real-Search-Order queue. Therefore, PABFIT.h chose 8 as the 
starting global vertices number and just picked up the first 8 vertices in the new vertex 
searching queue as the real global vertices. It took P ABFIT.h 98.02 seconds to figure out 
a feasible solution and it is much better than what PABFIT did, which took 5 hours. 
Besides these two examples, all the other examples in Table III show that 
PABFIT.h is faster than P ABFIT in the CPU executing time for these type of the netlist 
files. 
TABLE Ill 
EXAMPLES WHERE PABFTI.H IS BEITER THAN PABFIT 
FILENAME vert edge max conn C_IN Time_ Old (s) Time_New (s) Fit 
tgen_ba1 25/l/1 60 14 0 3.46 2.52 yes 
warpt_ba 32/1/0 64 5 2 28.90 15.38 yes 
e019_ba 32/0/2 85 13 8 11176.81 98.02 yes 
dees2_bal 30/1!2 100 13 15 NP 57.85 yes 
Table IV shows the fitting results of the C_IN chain examples where PABFIT.h is 
faster than PABFIT. We can see that Table IV is a little bit different from Table II and 
Table Ill. Instead of giving the total number of how many C_IN chain vertices in one test 
58 
example, Table IV lists the numbers of how many vertices in each of the sub-chain 
groups in the test example. Here we used "Chain_num" to indicate this column. For 
example, "4/5/6" means that there are 3 different chain groups and the first chain contains 
3 vertices, the second 5 vertices and the third 6 vertices. For all the examples containing 
C_IN chains, whether they can be fitted into the device or not, PABFIT.h can find a feasi-
ble solution much faster than PABFIT. This proves the correctness of our Heuristic 
PABFIT.h algorithm. The original PABFIT only calculate how many vertices in a C_IN 
chain should be put into P 3 Assignment and it chooses those vertices according to their 
natural entry order. But our vertex ordering PABFIT.h, first estimates which of the ver-
tices in the whole chain is the most possible P 3 assignment candidate or the so-called 
outstanding vertex. Then it reforms the chain look up order in order to pick up the out-
standing C_IN chain vertices and its relative C_IN chain vertices first. Here, the distance 
between the relative C_IN chain vertices and the outstanding C_IN chain vertex must be 
4 or the multiple of 4. Otherwise, they are not related to the outstanding vertices anymore 
and must be excluded from the P 3 assignment. 
TABLE IV 
C_IN CHAIN EXAMPLES WHERE PABFIT.H IS BETTER THAN PABFIT 
FILENAME vert edge max conn Chain_num Time_ Old (s) Time_New (s) Fit 
chain_cad 25/0/1 45 9 4/ 340.3 330.0 yes 
chain_d 30/1/2 100 7 2/5/2/3 34.6 30.0 yes 
chain_t2 16/1/0 33 4 2(1/4/2/3 21.7 11.6 yes 
chain_tr 18/0/0 44 6 2(1/5 51.7 29.6 yes 
chain_w2 19/1/1 37 4 3!2/217 11.2 8.3 yes 
chain_g 25/1/1 60 7 17/ 18.2 14.8 no 
chain_t 16/l/0 33 4 2!1/2/2/6 93.1 8.1 no 
chain_w 19/l/1 43 5 3(2/5/2!2 67.1 36.5 no 
-- ~-------- '------------------ ------ ____ L__ 
59 
TABLE V 
TOGGLE GROUP EXAMPLES WHERE PABFIT.H IS BETTER THAN PABFIT 
FlLENAME vert edge max conn C_IN tgl_num Time_Old (s) Time_New (s) Fit 
tgl_e 14/0/2 60 7 0 3{3 70.0 57.5 yes 
tgl_kk 20/1{3 57 6 0 4{313 612.9 11.0 yes 
tgl_ne 25!0!2 68 5 0 7/8 57.3 31.5 yes 
tgl_ll 25/0{3 53 5 10 4{3/1 36.3 5.7 yes 
tgl_nw 28/0/2 77 5 12 7/4 468.0 343.5 yes 
tgl_2 30/1/2 99 7 9 12/4 146.8 39.3 yes 
tgl_eb 30/0/2 89 5 15 11{3 7529.7 183.2 yes 
Table V shows the fitting results of the toggle group examples where PABFIT.h is 
faster than PABFIT. Instead of presenting the number of C_IN chain vertices and the 
number of C_IN chain groups, Table V presents the numbers of toggle vertices in each of 
the sub-toggle groups. Here we use "tgl_num" to represent this column. Similar to Table 
IV, the item "6/4" means that there exist two different toggle sub-groups and the first tog-
gle sub-group contains 6 vertices and the second 4 vertices. For all these toggle group 
examples, PABFIT.h can find a feasible solution much faster than PABFIT. PABFIT 
does not care whether a chosen vertex is in the toggle group or not, it just pick it up into 
the P 3 assignment according to the natural entry order. Therefore, a random toggle ver-
tex may be put into the P 3 assignment separately without considering the connection 
constraints of all the other toggle vertices in the same toggle group. This confuses the 
fitter and causes the fitter to spend a long time checking the connection constraints. 
P ABFIT.h takes the whole group of toggle vertices into account. It chooses some out-
standing vertices from the same toggle group into the P 3 assignment and moves all the 
other unchosen toggle vertices from the same toggle group to the end of the searching 
queue. The symbolic connection constraints of the chosen vertices can easily match the 
physical connection constraints of the state macrocells and local/global reset cells. There-
60 
fore, a feasible solution can be found by P ABFIT.h much faster than PABFIT. 
Table IV shows the example that both P ABFIT.h and P ABFIT prove no feasible 
solution exists. Such designs can not be fitted into the CY7C361 physical device. There-
fore, both PABFIT.h and PABFIT have to examine all the possible cases one by one 
without exception to prove that there is no feasible solution. The executing times 
obtained from both programs for these examples are almost identical. Since "Data_8.fit" 
and "Examp ll.fit" are simple files with small solution space, the executing times 
obtained from both PABFIT.h and P ABFIT for these two examples are almost the same 
and they are also very short. "mbarn_ba.fit" contains three long chains, the first with 5 
vertices, the second and the third all with 7 vertices, respectively. It also contains 4 2-
vertex short chains. Since at least one vertex from the long chain should be placed the 
global vertices assignment, the starting global vertices number is 3. Although PABFIT.h 
kept the long chain at the top of the vertices searching queue and used some additional 
constraints to assure that there are at least three vertex from the long chains respectively 
in every global vertices assignment, it did not gain much in speed because there are too 
many long chain vertices. As the possible global vertices number increased from 3 to 8, 
P ABFIT.h did not exclude many assignment cases even though it utilized the long chain 
constraints, but unfortunately it had to spend time checking the possible assignment. 
"epeel_res.fit" is an example with high connectivity and contains only 2-vertex short 
chains. So both PABFIT.h and PABFIT have to go step by step to examine each possible 
assignment case. Therefore, for the examples with no feasible solution, the executing 
time is very difficult to cut down. 
The PABFIT algorithm does not rely on the properties of the input files. It 
explores all possible cases of the searching space in the random way. Therefore, for the 
simple file with small searching space, it will generate the result sooner or later, but for 
the complex or large searching space files, it will keep on running for long, long time 
61 
without producing a feasible answer. 
The PABFIT.h algorithm judges the properties of the input files at first. It forms 
the proper vertices searching queue and chooses the proper starting global vertices 
number before the real searching. Therefore, it is possible for this new algorithm to gen-
erate the feasible result faster than the original algorithm. 
TABLE VI 
NO FEASIBLE SOLUTION EXAMPLES 
FILENAME vert edge max conn C_IN Time_Old (s) Time_New (s) Fit 
data8 10/1/4 78 15 0 0.98 1.31 no 
exampll 13/1/2 70 9 8 0.109 0.16 no 
mbarn_ba 31/0/0 28 4 22 129.94 130.05 no 
epee1_res 27/0/0 53 13 13 5656.31 5728.35 no 
CHAPTER VII 
COMPLEXITY ANALYSIS 
As we have presented in Chapter Ill, the new heuristic methods are based on 
choosing the proper order of how many vertices (number given by m) out of all the input 
vertices are assigned a to P 3 assignment and are also based on choosing the proper order 
of how these m different vertices are assigned to P 3 . Let n be the number of total ver-
tices and let m be the number of vertices assigned to P 3 assignment. Here, 0 ~ m :::;; n. 
Therefore, the number of possible P 3 assignments can be calculated according to the fol-
lowing equation. 
Number of possible P3 assignments= }: (~ 
m=O 
If the i -th P 3 assignment tries to assign m = k; vertices into P 3, here 
0 ::;; m = ki < n, there are [ ~J different solution methods to assign these ki vertices into 
the P 3 assignment. 
Assume that there exists L long chains in the input netlist file and each long chain 
L 
contains li C_IN chain connection vertices, here L~l, 4~/;<n, L = Lm:::;; n. In the worst 
i=O 
case, there must be at least one vertex from each long chain in the P 3 assignment. There-
fore, the number of possible solutions for the i -th P 3 assignment on the long chain 
exan1ple can be calculated by the following equation. If denote Np; as the number of 
63 
possible solutions for the i -th P 3 assignment, then, 
[
n-Lml L [/~ 
Npi_long chains= ki- L J !] lj 
[ 1 12 • · • IL (n- Lm)! 
Npi long chains= (k; -L)!(n-Lm -K; +L)! 
n! _ [nl 
Np;_[ong chains < k; ! (n -k; )! - k~ 
So the solution space of the long chain example is much smaller than that of the 
regular example. 
By using heuristic, we assign the outstanding vertices in P 3 assignment first. 
Denote the number of outstanding vertices as P, 0~ 5k;. The number of possible i -th 
P 3 assignment on the outstanding vertices case can be calculated by the following equa-
tion. 
[n -Pl [n 1 (n -Pl Np;_outstanding vertices= k;-1 + k~ - k;-1 
Obviously, ~i::::8 < ~J. If the PABFIT.h algorithm can find a feasible solution 
within the first [zi::::8 assignments, it will not go on checking the remaining possible 
assignments. This will achieve the performance speed. 
For the Toggle Group vertices, we moved the remaining Toggle Group vertices at 
64 
the end of the searching queue after the estimation of the Toggle Group vertices. Assume 
that there are T Toggle Group vertices. In the worst case, none of them can be assigned to 
P 3 assignment. Therefore, the number of possible solutions for the i -th P 3 assignment 
can be calculated by the following equation. 
Np;_roggkGroup = (;J- [z;=B + ~;=B 
Since there are no Toggle Group Vertices in the first (;J - [zi=B assignments, the possi-
bility to encounter the "Split-local-reset" problem within these assignments is zero. 
Therefore, in this case, we search the non-Toggle type vertices first for a proper P 3 
assignment to get the feasible solution. 
Obvious! y, if there is no long chains, whether we use heuristic or not, the number 
of the possible P 3 assignment and the number of the different solution methods on cer-
tain P 3 assignments will not be changed. The whole heuristic algorithm only changes the 
order of vertices assignment. 
If we use a circle to represent the whole searching space, the dots inside the circle 
represent the possible solution. The PABFIT algorithm is trying to search the whole 
searching space step by step without any exception. But the heuristic PABFIT tries to 
guess which section of the whole searching space is the section where the possible solu-
tion exists. Then searches this section first. See Figure 20. From the previous theory 
description and result analysis, we can prove that the heuristic method can speed up the 
whole PABFIT algorithm. That means, a possible solution can be found by PABFIT.h 
faster than PABFIT. 
65 
Searching Space 
* * 
Blind Searching Heuristic-Oriented Searching 
Figure 20. Searching Space. 
CHAPTER VIII 
CONCLUSION AND FUTURE WORK 
In this thesis, we developed a vertex search approach to improve the time complex-
ity of the PABFIT algorithm. These new heuristic methods are developed based on the 
analysis of the general properties of the chip architecture. 
We have introduced a heuristic approach to influence the search order and the 
algorithm still remains exact. 
The test files for the PABFTI.h are the real industry design examples from Cypress 
Semiconductor Inc. The test results of P ABFIT.h shows this program can find a feasible 
solution much faster than the original PABFIT, especially for the example with high con-
nectivity and containing long chains. 
These Architecture-Driven fitting approach and Architecture-Driven speed optimi-
zation heuristic methods with some modification can also be applied to the layout prob-
lem of the other CPLD devices (Complex PLD) with highly restricted connections. 
The PABFIT.h can be separated into two-level partitioning stages, the global ver-
tices assignment & the first-level partitioning, and the intermediate vertices assignment 
& the second-level partitioning. The un-related assignment combinations of the global 
vertices assignment can be processed simultaneously by parallel computing. So do the 
assignment combinations of the intermediate vertices assignment. Therefore, the future 
work can be concentrated on utilizing the parallel computing technique on this program. 
CHAPTER IX 
APPLICATION 
In this Thesis, we developed an architecture-based partitioning fitting algorithm 
with vertex ordering heuristic targeted on an EPLD device, the CY7C361 EPLD. The 
design ideas of this algorithm can also be applied on the layout problem of the other 
EPLD or CPLD devices with highly restricted segmentation connection architecture, 
similar to CY7C361 EPLD. 
The MAX5000 192-macrocell EPLD, provided by Altera Corporation, is a highly 
restricted segmentation connection device. The 192 macrocells in MAX5000 are divided 
into 12 Logic Array Blocks (LABs), 16 pre LAB. There are 384 expander product terms, 
32 per LAB, to be used and shared by the macrocells within each LAB. Each LAB is 
interconnected with a Programmable Interconnect Array (PIA), allowing all signals to be 
routed throughout the chip. 
Each LAB contains 16 macrocells. In LABs A, F, G and L, 8 macrocells are con-
nected to 1/0 pins and 8 are buried, while in LABs B, C, D, E, H, I, J and K, 4 macrocells 
are connected to 1/0 pins and 12 are buried. The buried macrocells are connected to PIA. 
The architecture of MAX5000 device is shown in Figure 21. 
From Figure 21, we can know that MAX5000 EPLD has the partitioning properties 
in its chip architecture. Therefore, the arrangement of the partitions of the 192 macrocells 
can be used as an orientation in designing the MAX5000 EPLD layout algorithm. We can 
divide the vertices in the netlist into different sets according to the partitioning properties 
of the MAX5000 device and map such sets of vertices into the groups of macrocells in 
1 (A6) 
2 (AS) 
41 (K6) 
42 (J6) 
,,~~ ~ I II ~ INPUT INPUT INPUT 
INPUT 
(C6) 84 
(C7) 83 
(l7)44 
(J7) 43 
(C5) 
(A4) 
(B4) 
(A3) 
(A2) 
(83) 
10 (A1) 
11 (82) 
12 (C2) 
13 (81) 
14 (C1) 
15 (02) 
16 (01) 
17 (E3) 
20 (F2) 
21 (F3) 
MACROCELL 21 - 32 
LA80 
22 (G3) ~ MACROt;I=.LL 4!1 J 
23 (G1) ut.r<>nr:"'' o;n 
25(F1)~ 
26(H1)~ 
27 (H2) 
28 (J1) 
29 (K1) 
30 (J2) 
31 (L1) 
32 (K2) 
33 (K3) 
34 (L2) 
35 (L3) 
36 (K4) 
37 (l4) 
38 (J5) 
p 
I 
A 
3, 24, 45, 66 (85, G2, K7, E10) C>- Vee 
18, 19, 39, 40. 60, 61, 81,82 (E1, E2, K5, L5, G10, G11, A7, 87) C>- GNO 
~ 
L-J\.~ 
.~-v 
MACROCELL 117 - 12B 
k=. I ::[}: r LA81 
~ MACROCELL 133 - 144 
I ;L£ J LABJ I 
~ MACROCELL 149 - 160 
I;LJ I LA8K I 
-yl MACROCELL 164 t---0 
46 (LS) 
47 (LB) 
48 (K8) 
49 (L9) 
50 (l10) 
51 (K9) 
52 (L11) 
53 (K10) 
54 (J10) 
55 (K11) 
56 (J11) 
57 (H10) 
58 (H11) 
59 (F10) 
62 (G9) 
63 (F9) 
64 (F11) 
65 (E11) 
67 (E9) 
68 (011) 
69 (010) 
70 (C11) 
71 (811) 
72 (C10) 
73 (A11) 
74 (810) 
75 (89) 
76 (A10) 
n (A9l 
78 (88) 
79 (AB) 
80 (86) 
() • PERTAIN TO 84-PIN PGA PACKAGE 
Figure 21. Architecture of MAX5000 EPLD. 
68 
72 
16 
72 
16 
72 
16 
72 
16 
~ 
16 
72 
16 
~ 
16 
~ 
16 
64 
PIM 
CLOCK 
INPUTS 
72 
72 
16 
72 
64 
Figure 22. Architecture of CY7C376 EPLD. 
69 
proper LABs. Obviously, the LABs with more buried macrocells should have more con-
nections to PIA than the LABs with less buried macrocells. So the ordering of the macro-
cells with more connections to PIA should be an important factor in the layout problem. 
Therefore, the design ideas for CY7C361 fitting algorithm can be partially applied or be 
applied with some modification on the MAX5000 layout problem. 
The CY7C376 256-macrocell Flash PlD, provided by Cypress Semiconductor, is 
an EPLD device with a similar Architecture to Altern's MAX5000 EPLD. The 256 
macrocells in the CY7C376 are divided between sixteen logic blocks. Each logic block 
includes 16 macrocells, a 72 x 86 product term array, and an intelligent product term 
allocator. The logic blocks in the CY7C376 architecture are connected with an fast and 
predictable routing resource: the Programmable Interconnect Matrix (PIM). Unlike 
70 
MAX5000, each logic block in CY7C376 has only one buried macrocell along with each 
1/0 macrocell. In other word, in each logic block, there are eight macrocells that are con-
nected to I/0 cells, eight macrocells that are internally fed back to PIM. Architecture of 
CY7C376 is shown is Figure 22. 
Obviously, from Figure 22, the CY7C376 EPLD also has the partitioning proper-
ties in its design architecture. Therefore, its layout problem can use the partitioning 
fitting concepts, similar to the CY7C361 fitting problem. However, since the logic blocks 
in CY7C376 are identical to one another, the ordering of such logic blocks are the order-
ing of macrocells mapped into such logic blocks may not be an important factor on its 
layout problem. 
Conclusionly, the design ideas of the vertex ordering PABFIT.h algorithm can be 
applied on the layout problem of the other EPLD or CPLD devices with similar architec-
ture as CY7C361 EPLD. 
REFERENCES 
[1] Anderson, R.E., Coppola, A., Perkowski, M.A., "VHDL Synthesis of Concurrent state 
machines to a programmable Logic Device", VHDL International User's Forum, Scotts-
dale, Arizona, May 3-6, 1992. 
[2] U. Schlichtmann, F.Brglez, M.Hennarm, "Characterization of Boolern Function for 
Rapid Matching in FPGA Technology approaching", Proc. of ACMIIEEE DAC, pp. 
374-379, 1992. 
[3] Murgai, R., Shenoy, N., Brayton, R.K., Sangiovarmi-Vincentelli, A., "Improved Logic 
Synthesis Algorithms for Table Look Up Architectures", ICCAD-91, pp. 564-567, Santa 
Clara, CA, November 1991. 
[4] D. Filo, J.C. Yang, F. Mailhot, G.D. Micheli, "Technology Mapping for a Two-Output 
RAM-based Field-Programmable Gate Array" European Design Automation Conj. pp. 
534-538, February 1991. 
[5] Robert J. Francis, Jonathan Rose, Zvonko Vranesic, "Chortle: Fast Technology Mapping 
for Lookup Table-Based FPGAs", Proc. 28th ACM/IEEE Design Automation Conf., pp. 
227-233, San Francisco, CA, June 1991. 
[6] Robert J. Francis, Jonathan Rose, Kevin Chung, "Chortle: A Technology Mapping Pro-
gram for Lookup Table-Based Field Programmable Gate Arrays", Proc. 27th ACMIIEEE 
Design Automation Conf. pp. 613-619, 1990. 
[7] E. A. Walkup, S. Hauck, G. Borriello, C. Ebeling, "Routing-Directed Placement for the 
Triptych FPGA," 1992 ACM First International Workshop on FPGAs pp. 33-38, 1992. 
[8] Sha, L., Dutton, R., "An Analytical Algorithm for Placement of Arbitarity Sized Rec-
tangular Blocks", Proc. 22st pp 602-608, 1985. 
[9] Burstein, M., Pelavin, R., "Hierarchical Wire Routing", IEEE Trans. on CAD, Special 
Issue on Automatic Wire Routing, Ed. E.S. Kuh CAD-2" October 1983, pp.223-224. 
[10] Leiserson, C.E., Printer, R.Y., "Optimal Placement for River Routing", Proc.1981 CMU 
Conj. VLSI Systems and Computations, pp 126-142, 1981. 
[11] Sangiovanni-Vincentelli, A. Santomauro, M.,Read, J., "A New Gridless Channel Router: 
Yet Another Channel Router the Second(YACR-11)", Proc.ICCAD, 1984 
72 
[12] Sechen, C., Sangiovanni-Vincentelli, "The Timber-Wolf Placement and Routing Pack-
age'\ IEEE Journal of Solid-State Circuits, Vol. SC-20, 2 April, 1985. 
[13] M. Olnanowska-Jeske, S. Goller, I. Schafer, "An Architecture-Driven Approach for the 
Fitting Problem in an Application- Specific EPLD", IEEE International Symposium on 
Circuits and Systems, Proceedings Volume 3 of 4 VLSI and Parallel Processing" p.1782, 
1993 
[14] M. Chrzanowska-Jeske, S. Goller, "Partitioning Approach to Find an Exact Solution to 
the Fitting Problem in an Application-Specific EPLD Device" EURO-DAC CCH Ham-
burg, Germany Sept. 20-24, 1993. 
[15] Perkowski, M.A., Chrzanowska-Jeske, M., Coppola, A., Pierzchala, E., "An Exact Algo-
rithm for the Technology Fitting Problem in the Application Specific State Machine 
Device," /SCAS, 1992. 
[16] Steffen Goller. "Partitioning_Based Approach to The Fitting Problem In Special Archi-
tecture EPLDs", Portland State University, 1992. 
[17] Yap, H.P., "Some Topics in Graph Theory", Cambridge University Press, Cambridge 
1986, p.88. 
[18] Cypress BIMOS CMOS DATA Book, 1992. 
[19] Nam-Sung Woo, "A Heuristic Method for FPGA Technology Mapping Based on the 
Edge Visibility", Proc. 28th pp. 248-251, San Francisico, CA, June 1991. 
[20] R.I. Francis, I. Rose, Z. Vranesic, "Technology Mapping of Lookup Table Based FPGAs 
for Performance", Proc. ICCAD pp. 568-571, Nov. 1991. 
[21] W.T. Tutte "Graph Theory", Addison-Wesley Publishing Company pp. 261, 1984 
