Abstract
Introduction
The traditional VLSI IC design flow below the RTL level consists of logic synthesis, technology mapping, and layout synthesis. These steps are frequently performed iteratively to achieve timing closure. In deep-sub-micron (DSM) technology, interconnect delays strongly influence circuit performance, so it is very difficult for physical design to converge when logic optimization is performed without considering layout.
Besides the convergence problem, layout synthesis algorithms are highly complex and often require many hours of computation on large industrial designs. In the last decade, numerous incremental improvements to physical synthesis have been explored, such as incremental remapping, local re-routing, white space and buffer insertion, and wire and transistor sizing. Correspondingly, there is a growing interest in approaches that integrate logic and layout synthesis in an attempt to solve the interconnect problem earlier in the design flow. Recently, many efforts were devoted to exploring various regular circuit and layout structures [9, 10, 12], as they are more predictable and have advantages from a manufacturing viewpoint.
Starting with the early research of Akers [1] , regular structures have become an attractive alternative to the traditional design styles. Several variations of the regular structures have been proposed [2, 3, 4, 5, 6, 7, 10, 11, 14] . They provide different tradeoffs between the complexity and applicability of the synthesis methods and the efficiency of the resulting implementation.
This paper explores a specialized type of decision diagrams [4, 5, 15], called Pseudo-Symmetric Kronecker Functional Decision Diagrams (PSKFDDs), as a vehicle for achieving efficient regular implementations. The goal is to transform Boolean functions into PSKFDDs, which are then mapped into regular circuits composed of Shannon and Davio gates. The problems of congestion and long interconnect are eliminated because connections between gates are local, mostly neighbor-to-neighbor, and distributed evenly among the gates. Since gates are placed using a regular pattern, the length and thus the delay of local interconnects can be easily predicted before the final layout is generated. Because the majority of connections are short, the need for additional buffers is reduced, and the total area of the final circuit is also reduced. It has been shown that CMOS technology is well suited for regular implementations. Application of similar decision diagrams to generate regular layout for wave pipelining has been studied [7].
The main contributions of this paper are two new efficient algorithms for generating regular layout using PSKFDDs constructed for Boolean functions. The first algorithm is based on the extended set of generalized variable-pair symmetries [8, 13]. Using the set of 15 generalized symmetries allows us to extend the work of [5], resulting in improved regular designs for many benchmarks. The second algorithm performs heuristic PSKFDD synthesis while combining efficient variable expansion/selection with a number of look-ahead strategies.
The rest of the paper is organized as follows. Section 2 gives the definitions used in the paper. Section 3 discusses generalized symmetries. Section 4 presents our adaptation of the longest paths computation. Section 5 describes two synthesis algorithms. Section 6 lists experimental results. Section 7 concludes the paper.
Definitions
Given a Boolean function $F: B^n \to B$, where $B = \{0,1\}$, the negative (positive) cofactor of $F$ with respect to (w.r.t.) variable $x$ is the Boolean function $F_0$ ($F_1$) derived by substituting the value 0 (1) for $x$ in $F$. We denote by $F_2$ the exclusive sum (EXOR) of the negative and positive cofactors:
$F_2 = F_0 \oplus F_1$ (1)
Three canonical expansions of $F$ are defined as follows:
Shannon expansion (S): $F = \bar{x} F_0 \oplus x F_1$
Positive Davio expansion (pD): $F = F_0 \oplus x F_2$ (2)
Negative Davio expansion (nD): $F = F_1 \oplus \bar{x} F_2$
Fig. 1. Join-Vertex operation rules for left-to-right propagation.
Cofactors w.r.t. two or more variables are defined by repeated cofactoring w.r.t. each variable in the set.
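The definitions above can be checked mechanically. The following is a minimal sketch, assuming Boolean functions are given as ordinary Python callables over 0/1 arguments; the helper names `cofactor` and `check_expansions` are illustrative and do not come from the paper. It verifies that the Shannon, Positive Davio, and Negative Davio expansions all reproduce the original function.

```python
from itertools import product

def cofactor(f, i, value):
    """Cofactor of f w.r.t. variable i: the function obtained by fixing x_i = value."""
    def g(*xs):
        ys = list(xs)
        ys[i] = value
        return f(*ys)
    return g

def check_expansions(f, n, i):
    """Return True if S, pD and nD expansions w.r.t. x_i all equal f (n variables)."""
    f0 = cofactor(f, i, 0)          # negative cofactor F0
    f1 = cofactor(f, i, 1)          # positive cofactor F1
    for xs in product((0, 1), repeat=n):
        x = xs[i]
        a, b = f0(*xs), f1(*xs)
        f2 = a ^ b                  # F2 = F0 xor F1
        shannon = ((1 - x) & a) ^ (x & b)
        pos_davio = a ^ (x & f2)
        neg_davio = b ^ ((1 - x) & f2)
        if not (f(*xs) == shannon == pos_davio == neg_davio):
            return False
    return True

def majority(a, b, c):
    """Illustrative test function (not taken from the paper)."""
    return (a & b) | (a & c) | (b & c)

ok = all(check_expansions(majority, 3, i) for i in range(3))
```

Cofactors w.r.t. several variables follow by composing `cofactor` calls, for example `cofactor(cofactor(f, 0, 0), 1, 1)` for the cofactor with the first variable at 0 and the second at 1.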
Regular structures discussed in this paper are called lattices¹. A lattice is a set of regularly placed gates locally interconnected to form a grid. Each gate has a control signal propagating from left to right and two data signals propagating from bottom to top. Lattice synthesis is performed from top to bottom.
The Join-Vertex operation was introduced in [3] as a way of dealing with the incompatibility of cofactors of adjacent nodes (cofactors a1 and b0 in Fig. 1). The idea of this operation is to multiplex the cofactors using the control variable x of the given level in such a way that nodes A and B share one of the cofactors but preserve their original functions. Fig. 1 lists the Join-Vertex operation rules for Shannon, Positive Davio, and Negative Davio gates. In this paper, we consider only Kronecker diagrams, which have the same type of gates throughout a level. In the case of the Davio expansions (Fig. 1), to preserve the function of node B, it is necessary to balance the negative cofactor of this node by adding a remainder to the positive cofactor. In this case, Join-Vertex is not a local operation and leads to the propagation of a remainder from left to right. Fig. 2 shows similar rules, in which the wave of remainders is propagated from right to left. The heuristic synthesis algorithm described in this paper achieves additional flexibility by using both sets of propagation rules. The generated function representation is called a Pseudo-Symmetric Kronecker Functional Decision Diagram (PSKFDD).
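For the Shannon case, the multiplexing idea can be illustrated with a small exhaustive check. This is a sketch under an assumption: since Fig. 1 is not reproduced here, the shared node is taken to be the multiplexer that selects b0 when x = 0 and a1 when x = 1, which is one arrangement that preserves both functions. The names `mux` and `join_preserves_functions` are illustrative. The Davio rules, which propagate remainders, are not covered.

```python
from itertools import product

def mux(x, lo, hi):
    """Shannon gate: outputs lo when x == 0 and hi when x == 1."""
    return hi if x else lo

def join_preserves_functions():
    """Check, for all values of x and of the four cofactors, that sharing one
    multiplexed child leaves the functions of nodes A and B unchanged."""
    for x, a0, a1, b0, b1 in product((0, 1), repeat=5):
        shared = mux(x, b0, a1)        # assumed form of the joined vertex
        a_joined = mux(x, a0, shared)  # A keeps a0, uses shared as positive child
        b_joined = mux(x, shared, b1)  # B uses shared as negative child, keeps b1
        if a_joined != mux(x, a0, a1) or b_joined != mux(x, b0, b1):
            return False
    return True

joined_ok = join_preserves_functions()
```

The check succeeds because node A reads the shared child only when x = 1, where it equals a1, and node B reads it only when x = 0, where it equals b0. The price is that x appears again in the shared node, so the control variable is repeated on a lower level.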
Generalized Variable Pair Symmetries
A Boolean function F has a classical non-equivalence (equivalence) two-variable symmetry [8] iff replacing the first variable by (the complement of) the second and the second variable by (the complement of) the first yields the same function.
¹ The term is not related to the set-theoretic concept of a lattice.
Given the four cofactors of F w.r.t. a pair of variables, $F_{00}$, $F_{01}$, $F_{10}$, $F_{11}$, it is possible to give another definition of classical symmetries. F has a classical two-variable non-equivalence symmetry iff $F_{01} \oplus F_{10} = 0$ and a classical two-variable equivalence symmetry iff $F_{00} \oplus F_{11} = 0$. If the variable pair exhibiting the symmetry is ordered above other variables in the reduced decision diagram, then the upper part of the diagram looks as shown in Fig. 3.
It is possible to generalize [13] the concept of classical symmetries by defining other conditions when at most three out of the four cofactors are non-constant. This gives rise to four new symmetries called constant-cofactor symmetries. Another way of extending the concept of classical symmetries is by considering the Davio cofactors, that is, the EXORs of cofactors from the set {$F_{00}$, $F_{01}$, $F_{10}$, $F_{11}$}. For example, if three of the two-variable cofactors satisfy $F_{00} \oplus F_{10} \oplus F_{11} = 0$, this can be considered a new kind of symmetry called a Kronecker symmetry. Table 1 lists 15 types of symmetry derived using the set of four two-variable cofactors. Notice that every formula in the column "Property" can be equal to constant 0 or constant 1. This leads to two subtypes of each of the 15 symmetries. When the expression is equal to constant 0, the symmetry is a non-skew symmetry; when it is equal to constant 1, the symmetry is a skew symmetry. Experimental results [9] show that, in MCNC benchmarks, non-skew symmetries are more common than skew ones. Generalized symmetries of a given function can be computed using several methods; however, discussion of these algorithms is beyond the intended scope of the paper.
Table 1. Classification of generalized symmetries.
[Table 1 body not recoverable from the source; surviving entries: row group "Three/four-cofactor (Kronecker) symmetries", symmetry K4.]
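The symmetry conditions can be tested by brute force on small functions. The sketch below rests on one reading of the text, namely that the 15 symmetry types correspond to the 15 non-empty subsets of the four two-variable cofactors whose EXOR is constant. Since the body of Table 1 is not available here, the code reports each symmetry by its cofactor subset rather than by a name such as C0 or K4. The function names and the truth-table representation are illustrative assumptions.

```python
from itertools import combinations, product

def cofactor_tables(f, n, i, j):
    """Truth tables of F00, F01, F10, F11 w.r.t. variables (x_i, x_j)."""
    tables = {}
    for vi, vj in product((0, 1), repeat=2):
        rows = []
        for xs in product((0, 1), repeat=n):
            ys = list(xs)
            ys[i], ys[j] = vi, vj      # fix the variable pair
            rows.append(f(*ys))
        tables[f"F{vi}{vj}"] = tuple(rows)
    return tables

def generalized_symmetries(f, n, i, j):
    """List (cofactor subset, subtype) pairs whose EXOR is constant.
    Subtype is 'non-skew' for constant 0 and 'skew' for constant 1."""
    tables = cofactor_tables(f, n, i, j)
    names = sorted(tables)
    found = []
    for size in range(1, 5):
        for subset in combinations(names, size):
            acc = [0] * (2 ** n)
            for name in subset:
                acc = [u ^ v for u, v in zip(acc, tables[name])]
            if all(v == 0 for v in acc):
                found.append((subset, "non-skew"))
            elif all(v == 1 for v in acc):
                found.append((subset, "skew"))
    return found

def majority(a, b, c):
    """Illustrative test function (not taken from the paper)."""
    return (a & b) | (a & c) | (b & c)

symmetries = generalized_symmetries(majority, 3, 0, 1)
```

For the three-input majority function and its first two variables, the result includes the subset (F01, F10) as non-skew, which is the classical non-equivalence symmetry, and (F00, F11) as skew.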
The Reduction Types
The study of generalized symmetries is motivated by the fact that functions with these symmetries can be represented by a reduced decision diagram with at most three nodes on the third level. Fig. 4 shows the upper part of the decision diagram for constant-cofactor symmetries (C0-C3) and single-variable symmetries (T3-T6). Figs. 3 and 4 illustrate the usefulness of the first 10 symmetries in Table 1 for the generation of regular layout.
The symmetries K0-K4 also lead to a reduction in the number of non-constant nodes on the third level if the Positive and Negative Davio expansions are used instead of the Shannon expansion. For example, suppose the expansions of the first two levels are Positive Davio and Shannon, respectively, and the variables have the four-cofactor symmetry K4. Notice that after the transformation of the diagram as shown in Fig. 5, the effect of these two expansions (pD followed by S) with symmetry K4 is similar to the effect of symmetry T4.
The above example shows that symmetry T4 (in fact, any of the first 8 symmetries) can be seen as a reduction type, which describes how the cofactors are combined on the third level of the decision diagram. The mapping of symmetries into reduction types for the nine possible expansion pairs is given in Table 2.
Table 2. Mapping of generalized symmetries into reduction types for different expansion pairs.
Expan.   Generalized Symmetries
Pair     C0  C1  C2  C3  NE  Eq  T3  T4  T5  T6  K0  K1  K2  K3  K4
S-S      C0  C1  C2  C3  NE  Eq  T3  T4  -   -   -   -   -   -   -
S-pD     C0  T3  C2  T4  -   -   C1  C3  -   -   -   Eq  -   NE  -
S-nD     T3  C0  T4  C2  -   -   C1  C3  -   -   Eq  -   NE  -   -
pD-S     C0  C1  -   -   -   -   T3  -   C2  C3  -   -   Eq  NE  T4
pD-pD    C0  T3  -   -   NE  -   C1  -   C2  T4  Eq  -   -   -   C3
pD-nD    T3  C0  -   -   -   NE  C1  -   T4  C2  -   Eq  -   -   C3
nD-S     -   -   C0  C1  -   -   -   T3  C2  C3  Eq  NE  -   -   T4
nD-pD    -   -   C0  T3  -   NE  -   C1  C2  T4  -   -   Eq  -   C3
nD-nD    -   -   T3  C0  NE  -   -   C1  T4  C2  -   -   -   Eq  C3
A dash in Table 2 means that, for the given expansion pair and generalized symmetry, a regular layout cannot be created.
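In an implementation, Table 2 is naturally a lookup table keyed by the expansion pair and the symmetry. The following sketch encodes the table entries as read from the text; the names `SYMMETRIES`, `ROWS`, and `reduction_type` are illustrative and are not taken from the paper's implementation.

```python
# Column order of Table 2.
SYMMETRIES = ["C0", "C1", "C2", "C3", "NE", "Eq", "T3", "T4", "T5", "T6",
              "K0", "K1", "K2", "K3", "K4"]

# One row per expansion pair; "-" marks combinations with no regular layout.
ROWS = {
    ("S", "S"):   "C0 C1 C2 C3 NE Eq T3 T4 -  -  -  -  -  -  -",
    ("S", "pD"):  "C0 T3 C2 T4 -  -  C1 C3 -  -  -  Eq -  NE -",
    ("S", "nD"):  "T3 C0 T4 C2 -  -  C1 C3 -  -  Eq -  NE -  -",
    ("pD", "S"):  "C0 C1 -  -  -  -  T3 -  C2 C3 -  -  Eq NE T4",
    ("pD", "pD"): "C0 T3 -  -  NE -  C1 -  C2 T4 Eq -  -  -  C3",
    ("pD", "nD"): "T3 C0 -  -  -  NE C1 -  T4 C2 -  Eq -  -  C3",
    ("nD", "S"):  "-  -  C0 C1 -  -  -  T3 C2 C3 Eq NE -  -  T4",
    ("nD", "pD"): "-  -  C0 T3 -  NE -  C1 C2 T4 -  -  Eq -  C3",
    ("nD", "nD"): "-  -  T3 C0 NE -  -  C1 T4 C2 -  -  -  Eq C3",
}

def reduction_type(first, second, symmetry):
    """Reduction type for an expansion pair and a symmetry, or None for a dash."""
    entry = ROWS[(first, second)].split()[SYMMETRIES.index(symmetry)]
    return None if entry == "-" else entry

example = reduction_type("pD", "S", "K4")
```

The example lookup reproduces the case discussed in the text: Positive Davio followed by Shannon with symmetry K4 behaves like reduction type T4.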
Longest Path Computation
In [9], the symmetry compatibility graph was introduced as a directed graph whose nodes represent variables of the function and whose edges represent variable-pair symmetries. If two variables have no symmetries, there is no edge between the corresponding pair of nodes; otherwise, the edge is labeled by the symmetries. Edges of the graph are directed because some symmetries are sensitive to the ordering of variables in the pairs.
Given a symmetry graph, it is possible to find a sequence of variables such that each pair of adjacent variables in the sequence has a generalized symmetry. This variable sequence is called a variable path (a variable chain in [9]). When a path of sufficient length is found, the expansions are assigned to each variable on the path in such a way that the corresponding reduction type (see Table 2) allows a planar layout to be created on each level, similar to how it is created in Fig. 5. The longest path is found by depth-first search (DFS) in the symmetry graph starting from every variable that has symmetries with other variables. Because for large graphs the enumeration of all candidate paths takes time exponential in the number of variables, the number of allowed backtracks is restricted. For many benchmarks, this method yields the exact solution in a short computation time.
Not every symmetry sequence results in an applicable sequence of reduction types for every expansion pair, and thus inclusion of variables into the path should be restricted by additional requirements. Also, it can be observed that certain reduction type sequences cannot be exploited to create the regular layout. These restrictions can be easily incorporated into the DFS algorithm.
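The search described above can be sketched as a depth-first enumeration of simple paths with a cap on backtracks. This is an illustration under stated assumptions, not the paper's implementation: the graph is a plain adjacency dictionary without symmetry labels, the backtrack budget is reset for every start variable, and the additional restrictions on reduction-type sequences are represented only by a hypothetical `can_extend` hook that accepts everything by default.

```python
def longest_variable_path(graph, max_backtracks=1000,
                          can_extend=lambda path, nxt: True):
    """Longest simple path found by DFS with a bounded number of backtracks.

    graph: dict mapping each variable to a list of successor variables.
    can_extend: hypothetical hook for extra restrictions (assumption).
    """
    best = []
    used = 0  # backtracks consumed for the current start variable

    def dfs(path, visited):
        nonlocal best, used
        if len(path) > len(best):
            best = list(path)
        for nxt in graph.get(path[-1], ()):
            if nxt in visited or not can_extend(path, nxt):
                continue
            path.append(nxt)
            visited.add(nxt)
            dfs(path, visited)
            path.pop()
            visited.discard(nxt)
            used += 1                  # returning from a branch is one backtrack
            if used > max_backtracks:
                return                 # budget spent: stop trying alternatives

    for start in graph:
        if graph[start]:               # start only where an outgoing edge exists
            used = 0                   # assumption: budget is per start variable
            dfs([start], {start})
    return best

# Illustrative graph; the variable names are hypothetical.
demo_graph = {"a": ["d", "b"], "b": ["c"], "c": ["d"], "d": []}
exact = longest_variable_path(demo_graph)
bounded = longest_variable_path(demo_graph, max_backtracks=0)
```

With a generous budget the search finds the path a, b, c, d. With no backtracks allowed, each start variable follows only its first branch, so the result is the shorter path b, c, d, which shows how the limit trades solution quality for runtime.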
Synthesis Algorithms
The implementation of PSKFDD synthesis is based on two approaches: systematic and heuristic. The systematic approach requires computation of generalized symmetries of the function w.r.t. all variable pairs and finding the longest path in the variable-pair symmetry graph. The symmetry sequence is computed for each level (except the first one) as a set of (1) an input variable to be used as the control variable for this level, (2) an expansion type, (3) the symmetry between the control variable of this level and the previous one in the order, and (4) the reduction type of this expansion and the previous one in the order.
If the function has no symmetries, or if the symmetry path does not include all variables, only the upper part of the diagram is synthesized using the approach based on the longest path. The rest of the diagram is constructed using a heuristic algorithm, which does not guarantee that the control variables are not repeated. Common to both systematic and heuristic synthesis algorithms is the iterative lattice construction, level by level from top to bottom.
The main difference between the heuristic synthesis and the systematic one is the need to determine the variable/expansion pair at each level. There are several criteria for selecting the variable and expansion to be used on the given level: (1) the number of Join-Vertex operations that must be performed on the given level; (2) the number of constant cofactors produced by expanding all the functions of the level using this variable; (3) the total number of variables in the supports of all cofactors produced by expanding all the functions of the level using this variable. First, variable/expansion pairs are evaluated according to criterion (1). If there is no tie, the best variable/expansion pair is returned without further computation. If there is a tie, the next level of the lattice is constructed for each variable and criteria (2) and (3) are used to find the best variable. This strategy corresponds to a look-ahead of depth 1.
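The selection rule can be summarized in a few lines. The sketch below makes several assumptions that the text does not state explicitly: fewer Join-Vertex operations are better, more constant cofactors are better, a smaller total support is better, and criterion (2) takes priority over criterion (3). The callables `join_count` and `lookahead` are hypothetical stand-ins for the actual lattice computations.

```python
def select_pair(candidates, join_count, lookahead):
    """Pick a (variable, expansion) pair for the current level.

    join_count(c) -> int: criterion (1), number of Join-Vertex operations.
    lookahead(c) -> (constants, support): criteria (2) and (3), evaluated
    only on ties because they require building the next level.
    """
    fewest = min(join_count(c) for c in candidates)
    tied = [c for c in candidates if join_count(c) == fewest]
    if len(tied) == 1:
        return tied[0]                 # no tie: skip the look-ahead entirely

    def key(c):
        constants, support = lookahead(c)
        return (-constants, support)   # assumed ordering of criteria (2), (3)

    return min(tied, key=key)

# Hypothetical scores for three candidate pairs.
joins = {("x", "S"): 1, ("y", "pD"): 1, ("z", "S"): 2}
ahead = {("x", "S"): (0, 5), ("y", "pD"): (1, 6)}
chosen = select_pair(list(joins), joins.get, ahead.get)
```

In this example the pairs (x, S) and (y, pD) tie on criterion (1), and the look-ahead prefers (y, pD) because it produces more constant cofactors.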
Experimental Results
The algorithms have been implemented and tested using MCNC benchmarks. The resulting regular circuits for each output were written into BLIF files and verified for correctness against the original functions using SIS. The reported runtimes are in seconds on a 933 MHz Pentium III computer. This runtime includes only logic synthesis.
Tables 4 and 5 compare the presented heuristic synthesis algorithm with the previous work, [5] and [6]. In both tables, column "Name" lists the name of the benchmark, and column "Outs" lists the total number of outputs followed, in parentheses, by the 1-based index of the output used as the single-output function. Column "Ins" gives the number of inputs in the multi-output benchmark function, followed, in parentheses, by the number of inputs in the support of the given single-output function. Results in [5] do not report the number of gates, so the comparison in Table 4 is in terms of logic levels and runtime. The shaded column shows the best result for each function. In Table 5, the comparison is in terms of both logic levels and nodes (gates). In all but a few cases, the proposed algorithm generated a more compact layout. The runtime for all the examples reported in Table 5 was close to one second.
Table 6 compares the results of [9] with both the systematic ("System") and the heuristic ("Heuris") algorithms presented in the paper. The comparison is in terms of logic levels and nodes (gates). The last two rows show the sum of the entries in each column and the ratio of the results to those in the column "DVS", taken to be 100%. For the examples in this table, the heuristic algorithm is on average better than the systematic one.
Table 7 shows results for multi-output MCNC benchmarks. Inside each section, the column "Nodes" gives the total number of nodes for all outputs. Column "Join" shows the number of outputs that require the application of the Join-Vertex operation. The number 0 in this column means that all the outputs of the given benchmark can be expanded without variable repetition. The number in parentheses in the column "Join" gives the number of outputs for which the layout could be computed in fewer levels than the predefined limit (100). Finally, the column "Time" gives the synthesis runtime for all outputs of each benchmark for which synthesis completed.
Conclusions
We proposed a methodology to synthesize logic functions into regular structures using decision diagrams. The resulting regular structures are easy to lay out, because the signal wires connect only adjacent cells, and control wires are local. The structures can be mapped into a library of pre-designed Davio and Shannon cells.
Among the two synthesis algorithms, the systematic one takes a global view of the problem by pre-computing the set of generalized symmetries for all variable pairs. The heuristic algorithm, on the contrary, uses a look-ahead strategy and thereby takes a greedy local view of the problem of ordering variables. The experimental results show that the heuristic algorithm is several times faster but sometimes loses quality compared to the systematic one. 
