Abstract
Introduction
A disjoint-support decomposition of a Boolean function $f : \{0,1\}^n \to \{0,1\}$ is a representation of the form $f(X, Y) = h(g(X), Y)$, where $X \cap Y = \emptyset$, $g : \{0,1\}^{|X|} \to \{0, 1, \ldots, k-1\}$ and $h : \{0, 1, \ldots, k-1\} \times \{0,1\}^{|Y|} \to \{0,1\}$. The k-valued function g can be encoded as $f(X, Y) = h(g_1(X), g_2(X), \ldots, g_{\lceil \log_2 k \rceil}(X), Y)$, giving a decomposition in which all functions are Boolean. Every set of variables X for which such a decomposition exists is called a bound set for f. This paper addresses two problems related to disjoint-support decomposition. First, we present a heuristic for finding a bound set that results in a disjoint-support decomposition with a good area/delay trade-off. Choosing a suitable bound set is important because disjoint-support decomposition does not necessarily simplify the function.
Second, we present a technique for transforming the original circuit implementing $f(X, Y)$ into a circuit implementing the decomposed representation $h(g(X), Y)$. Previous algorithms computed circuits for the decomposed representation from Binary Decision Diagrams (BDDs) of g and h by applying various BDD-to-circuit transformation techniques. The algorithm presented in this paper uses BDDs only for the analysis of the decomposition. The actual synthesis of the circuits for g and h is done by restricting the original circuit with respect to a given assignment of input variables. This guarantees that the sizes of the circuits of g and h are strictly smaller than the size of the original circuit.
Bound Set Selection
To find a suitable bound set X for f, we examine all linear intervals of variables of the BDD representing f. To check whether a given linear interval is a bound set, we use the INTERVALCUT algorithm [1]. INTERVALCUT is very fast because it does not require expensive BDD re-ordering.
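The bound-set test can be illustrated on a truth table. The sketch below is not INTERVALCUT: it is a brute-force stand-in that counts the distinct columns of the decomposition chart directly, and it assumes that the natural index order of the variables plays the role of the BDD variable order. The names `column_multiplicity`, `interval_bound_sets` and `example_f` are hypothetical.

```python
from itertools import product

def column_multiplicity(f, n, bound):
    """Count the distinct columns of the decomposition chart of f.

    f     : callable mapping a tuple of n bits to 0 or 1
    bound : list of variable indices forming the candidate bound set X
    """
    free = [v for v in range(n) if v not in bound]
    columns = set()
    for xs in product((0, 1), repeat=len(bound)):
        bits = [0] * n
        for v, b in zip(bound, xs):
            bits[v] = b
        column = []
        for ys in product((0, 1), repeat=len(free)):
            for v, b in zip(free, ys):
                bits[v] = b
            column.append(f(tuple(bits)))
        columns.add(tuple(column))
    return len(columns)

def interval_bound_sets(f, n):
    """Return (interval, k) for every interval of consecutive variables with k < |X|."""
    found = []
    for size in range(2, n):
        for start in range(n - size + 1):
            bound = list(range(start, start + size))
            k = column_multiplicity(f, n, bound)
            if k < len(bound):
                found.append((bound, k))
    return found

def example_f(bits):
    # Hypothetical example: f = (x0 xor x1 xor x2) and x3
    x0, x1, x2, x3 = bits
    return (x0 ^ x1 ^ x2) & x3

print(interval_bound_sets(example_f, 4))  # [([0, 1, 2], 2)]
```

The enumeration is exponential in the number of variables, so it is only useful for checking small examples by hand.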
If a bound set X with column multiplicity k < |X| is found, it is stored together with the following three parameters characterizing the associated decomposition:
1. the number of outputs having X as a bound set: $s(X)$;
2. the number of outputs of g: $c(X) = \lceil \log_2 k \rceil$;
3. the difference in sizes of the bound set X and the free set Y: $d(X) = \bigl|\,|X| - |Y|\,\bigr|$.
Let $\mathcal{X}$ be the set of bound sets computed by INTERVALCUT. The best candidate is selected from $\mathcal{X}$ as follows. First, a subset $\mathcal{X}_s$ of $\mathcal{X}$ containing all bound sets with the maximum $s(X)$ is chosen. Maximizing $s(X)$ increases the sharing of common logic among different outputs of the circuit. Next, a subset $\mathcal{X}_c$ of $\mathcal{X}_s$ containing all bound sets with the minimum $c(X)$ is selected. Minimizing $c(X)$ promotes the selection of bound sets with the smallest column multiplicity (more precisely, the smallest $\lceil \log_2 k \rceil$). Finally, a subset $\mathcal{X}_d$ of $\mathcal{X}_c$ containing the largest bound sets with the minimum $d(X)$ is obtained. Minimizing $d(X)$ balances the partitioning of logic between the functions g and h.
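The three filtering steps amount to a lexicographic selection, which can be sketched as follows. The `Candidate` record and `select_best` function are hypothetical names, the parameter values in the example are invented for illustration, and the sketch filters on the minimum d(X) only; the additional preference for the largest bound sets is not modeled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    variables: tuple  # the bound set X
    s: int            # number of outputs having X as a bound set
    c: int            # number of outputs of g
    d: int            # difference in sizes of X and Y

def select_best(candidates):
    """Keep maximum s, then minimum c, then minimum d."""
    max_s = max(cand.s for cand in candidates)
    by_s = [cand for cand in candidates if cand.s == max_s]
    min_c = min(cand.c for cand in by_s)
    by_c = [cand for cand in by_s if cand.c == min_c]
    min_d = min(cand.d for cand in by_c)
    return [cand for cand in by_c if cand.d == min_d]

# Invented example values, for illustration only.
pool = [
    Candidate(("x0", "x1", "x2"), s=2, c=1, d=1),
    Candidate(("x1", "x2"), s=2, c=1, d=3),
    Candidate(("x3", "x4"), s=1, c=1, d=0),
    Candidate(("x0", "x1"), s=2, c=2, d=0),
]
best = select_best(pool)
print(best[0].variables)  # ('x0', 'x1', 'x2')
```

Because each step filters the survivors of the previous one, a candidate with a perfectly balanced split (d = 0) can still lose to one that is shared by more outputs.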
Any element of $\mathcal{X}_d$ is considered to be a "best" bound set for f, i.e., one that produces a decomposition with the best area/delay trade-off. The original circuit implementing
Transformation Algorithm
Let X be a bound set for f and let $G_g$ and $G_h$ be BDDs representing the functions g and h in the decomposition $f(X, Y) = h(g(X), Y)$. These BDDs are computed by INTERVALCUT.
Constructing the circuit for h
Suppose A is an assignment of the variables of X leading to the 0-terminal node in $G_g$. Then $g(A) = 0$, and thus $f(A, Y) = h(g(A), Y) = h(0, Y)$. Therefore, a circuit implementing the co-factor $h(0, Y)$ can be obtained from the circuit implementing f by applying the assignment A to the inputs X and propagating the constants through the circuit using the usual reduction rules. Similarly, circuits implementing the co-factors $h(i, Y)$, $i \in \{1, 2, \ldots, k-1\}$, can be obtained by propagating an assignment of the variables of X leading to the i-terminal node of $G_g$. Recall that g is a function of type $g : \{0,1\}^{|X|} \to \{0, 1, \ldots, k-1\}$, so $G_g$ is a multi-terminal BDD with k terminal nodes.
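Restricting a circuit and propagating constants can be sketched on a small expression tree. The representation below (nested tuples with the operators `'not'`, `'and'`, `'or'`, strings for inputs, and the constants 0 and 1) and the function `restrict` are assumptions made for illustration; the paper does not prescribe a circuit data structure.

```python
def restrict(node, assignment):
    """Apply a partial input assignment and propagate constants."""
    if node == 0 or node == 1:
        return node
    if isinstance(node, str):
        return assignment.get(node, node)  # unassigned inputs stay symbolic
    op, *args = node
    args = [restrict(a, assignment) for a in args]
    if op == 'not':
        (a,) = args
        return 1 - a if a in (0, 1) else ('not', a)
    a, b = args
    if op == 'and':
        if a == 0 or b == 0:
            return 0
        if a == 1:
            return b
        if b == 1:
            return a
        return ('and', a, b)
    if op == 'or':
        if a == 1 or b == 1:
            return 1
        if a == 0:
            return b
        if b == 0:
            return a
        return ('or', a, b)
    raise ValueError(f"unknown operator: {op}")

# Hypothetical circuit: f = (x1 x2) y1 + (x1 x2)' y2, so g(X) = x1 x2.
G = ('and', 'x1', 'x2')
F = ('or', ('and', G, 'y1'), ('and', ('not', G), 'y2'))

print(restrict(F, {'x1': 0, 'x2': 0}))  # y2, the co-factor h(0, Y)
print(restrict(F, {'x1': 1, 'x2': 1}))  # y1, the co-factor h(1, Y)
```

Each reduction rule either removes a gate or leaves it unchanged, so the restricted circuit never has more gates than the circuit it was derived from.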
To maximize the sharing of common logic among the k circuits implementing the co-factors $h(i, Y)$, $i \in \{0, 1, \ldots, k-1\}$, the k assignments A are chosen so that they differ in as few bit positions as possible.
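One way to read this criterion is as a search for one representative assignment per value of g with a small total pairwise Hamming distance. The brute-force sketch below makes that assumption explicit; the objective (sum of pairwise distances), the function names, and the example g are all hypothetical, and the paper does not state how the assignments are actually chosen.

```python
from itertools import product

def hamming(a, b):
    """Number of bit positions in which two assignments differ."""
    return sum(x != y for x, y in zip(a, b))

def pick_assignments(g, num_vars):
    """Pick one assignment per value of g, minimizing total pairwise distance."""
    groups = {}
    for bits in product((0, 1), repeat=num_vars):
        groups.setdefault(g(bits), []).append(bits)
    values = sorted(groups)
    best, best_cost = None, None
    for choice in product(*(groups[v] for v in values)):
        cost = sum(hamming(a, b)
                   for idx, a in enumerate(choice)
                   for b in choice[idx + 1:])
        if best_cost is None or cost < best_cost:
            best, best_cost = choice, cost
    return dict(zip(values, best)), best_cost

# Hypothetical g with k = 2: the AND of three bound-set variables.
chosen, cost = pick_assignments(lambda x: x[0] & x[1] & x[2], 3)
print(chosen, cost)
```

For this g the only assignment with value 1 is (1, 1, 1), so the search returns a neighbour of it for value 0 and the two restricted circuits differ in a single input constant.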
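The function $h(g(X), Y)$ is obtained by combining the co-factors in a Shannon expansion as follows:

$$h(g(X), Y) = \sum_{i=0}^{k-1} g_1^{i_1}\, g_2^{i_2} \cdots g_r^{i_r}\; h(i, Y),$$

where $(i_1, i_2, \ldots, i_r)$ is the binary expansion of i, $r = \lceil \log_2 k \rceil$, and the term $g_j^{i_j}$ is defined by

$$g_j^{i_j} = \begin{cases} g_j(X) & \text{if } i_j = 1, \\ \overline{g_j(X)} & \text{if } i_j = 0, \end{cases}$$

for $j \in \{1, 2, \ldots, r\}$.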
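The expansion selects the co-factor h(i, Y) whose index i matches the code produced by the encoded functions: each product term is 1 exactly when every code bit g_j equals the corresponding bit i_j of i. A minimal sketch that checks this on a small example follows; the example functions, the most-significant-bit-first encoding, and the names `encode` and `shannon_combine` are assumptions.

```python
from itertools import product
from math import ceil, log2

def encode(i, r):
    """Binary expansion (i_1, ..., i_r) of i, most significant bit first."""
    return tuple((i >> (r - 1 - j)) & 1 for j in range(r))

def shannon_combine(code_bits, cofactors, y):
    """OR over i of [code_bits == binary expansion of i] AND h(i, y)."""
    r = len(code_bits)
    result = 0
    for i, h_i in enumerate(cofactors):
        selected = int(tuple(code_bits) == encode(i, r))
        result |= selected & h_i(y)
    return result

# Hypothetical example with k = 3: g counts the ones among two variables.
def g(x):
    return x[0] + x[1]

cofactors = [lambda y: 0, lambda y: y[0], lambda y: 1]  # h(0,Y), h(1,Y), h(2,Y)
r = ceil(log2(len(cofactors)))

def f(x, y):
    return cofactors[g(x)](y)

ok = all(shannon_combine(encode(g(x), r), cofactors, y) == f(x, y)
         for x in product((0, 1), repeat=2)
         for y in product((0, 1), repeat=1))
print(ok)  # True
```

When k is not a power of two, some codes are never produced by g; the sketch simply has no co-factor for them, which corresponds to treating them as don't-cares.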
Constructing the circuit for g
Suppose that B is an assignment of the variables of Y such that $h(i, B) \neq h(j, B)$ for some $i, j \in \{0, 1, \ldots, k-1\}$, $i \neq j$. Then $f(X, B) = h(g(X), B)$, where the co-factor $h(g(X), B)$ is neither constant 0 nor constant 1, i.e., it depends on $g(X)$.
Since h is a function of type $\{0, 1, \ldots, k-1\} \times \{0,1\}^{|Y|} \to \{0,1\}$, the co-factor $h(g(X), B)$ is a function of type $\{0, 1, \ldots, k-1\} \to \{0,1\}$. Note that, for k = 2, $h(g(X), B)$ is either the identity or the complement. Thus, at this step, the problem of constructing the
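The k-valued function $g(X)$ can be expressed as

$$g(X) = \sum_{i=0}^{k-1} i \cdot g^i(X),$$

where $g^i : \{0,1\}^{|X|} \to \{0,1\}$ are multiple-valued literals defined as:

$$g^i(X) = \begin{cases} 1 & \text{if } g(X) = i, \\ 0 & \text{otherwise.} \end{cases}$$

For a given encoding of the k values of $g(X)$, each of the functions $g_1(X), g_2(X), \ldots, g_r(X)$, $r = \lceil \log_2 k \rceil$, encoding $g(X)$, can be represented as a sum of some literals $g^i(X)$.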
Consider a decomposition chart of $h(g(X), Y)$ in which the columns represent the k values of $g(X)$ and the rows represent all combinations of the variables of Y. Any non-constant row of $h(g(X), Y)$ represents a sum of some literals $g^i(X)$, $i \in \{0, 1, \ldots, k-1\}$. In the best case, there exist rows in the decomposition chart corresponding directly to the encoded functions $g_1(X), g_2(X), \ldots, g_r(X)$. If $h(g(X), B) = g_j(X)$ for some assignment B of the variables of Y, then the circuit implementing $g_j(X)$ can be obtained from the circuit implementing f by applying the assignment B to the inputs Y and propagating the constants.
In the worst case, the literals $g^i(X)$, $i \in \{0, 1, \ldots, k-1\}$, need to be computed by ANDing selected rows of $h(g(X), Y)$. Afterward, the functions $g_1(X), g_2(X), \ldots, g_r(X)$ are obtained as a combination of the literals $g^i(X)$.
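The row-ANDing step can be sketched by treating each non-constant row of the decomposition chart as the set of values i for which h(i, B) = 1, so that ANDing rows corresponds to intersecting these sets. The chart contents, the function names, and the strategy of intersecting every row that contains i are assumptions made for illustration; the sketch reports `None` when plain ANDing of rows does not isolate a literal.

```python
from itertools import product

def chart_rows(h, k, num_free):
    """Map each assignment B of Y to {i : h(i, B) = 1}, skipping constant rows."""
    rows = {}
    for b in product((0, 1), repeat=num_free):
        ones = frozenset(i for i in range(k) if h(i, b))
        if 0 < len(ones) < k:
            rows[b] = ones
    return rows

def literal_from_rows(rows, i, k):
    """AND all rows containing i; return the rows used if only i survives."""
    used = [b for b, ones in rows.items() if i in ones]
    common = set(range(k))
    for b in used:
        common &= rows[b]
    return used if common == {i} else None

# Hypothetical chart for k = 3 and two free variables.
TABLE = {(0, 0): {0, 1}, (0, 1): {1, 2}, (1, 0): {2}, (1, 1): set()}

def h(i, b):
    return int(i in TABLE[b])

rows = chart_rows(h, 3, 2)
print(literal_from_rows(rows, 1, 3))  # [(0, 0), (0, 1)]
print(literal_from_rows(rows, 0, 3))  # None
```

In this example the literal for value 1 would be obtained by ANDing the circuits for f restricted to Y = (0, 0) and Y = (0, 1), while the literal for value 0 cannot be isolated from uncomplemented rows alone.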
Conclusion and Future Work
This paper makes two contributions: (1) a heuristic for finding a bound set X that results in a disjoint-support decomposition with a good area/delay trade-off; (2) an algorithm that transforms the original circuit into the decomposed circuit.
Our preliminary experimental results on the IWLS'02 benchmark set show that the proposed technique usually results in a smoother trade-off between area and delay than that of SIS. More experiments are needed for a thorough evaluation.
