An effective logic synthesis procedure based on parallel and serial decomposition of a Boolean function is presented in this paper. The decomposition, carried out as the very first step of the synthesis process, is based on an original representation of the function by a set of r-partitions over the set of minterms. Two different decomposition strategies, namely serial and parallel, are exploited by striking a balance between the two ideas. The presented procedure can be applied to completely or incompletely specified, single- or multiple-output functions and is suitable for different types of FPGAs including XILINX, ACTEL and ALGOTRONIX devices. The results of the benchmark experiments presented in the paper show that, in several cases, our method produces circuits of significantly reduced complexity compared to the solutions reported in the literature.
INTRODUCTION
The dominant direction of growth in the electronics industry nowadays is in the area of highly functional and universal programmable devices, in particular, user programmable circuits like PLDs and FPGAs. Several CAD systems developed in the 1980s and aimed at programmable logic have been very effective in significantly reducing the number of circuits required for implementing a given logic function using PLDs, but not so effective for FPGA-based implementations. This has been a primary reason for recent interest in FPGA-based logic synthesis.
There are different types of FPGA structures with different numbers of inputs and outputs, and the design strategies for those cells differ from one another. Most design methods are based on Multilevel Boolean Networks [4], functional decomposition [2], [6], [8] or Binary Decision Diagrams (BDDs) [9]. Recently published papers try to improve functional decomposition procedures to make them applicable to Look Up Table (LUT) structures [8], [27]. Their promising results seem to indicate that the concept of functional decomposition should be investigated more generally and in more detail.
The only disadvantage of the functional decomposition is its restriction to only one type of FPGA architecture, i.e. LUT structures.
The intention of this paper is to develop a general method of decomposition for different types of FPGAs, for which until now there is no common design procedure.
Therefore the proposed design technique relies exclusively on the functional capabilities of FPGA logic cells. In other words, the logic cell is treated as a universal cell capable of implementing any Boolean function with a fixed number of inputs and outputs. Such an assumption raises the possibility of developing a method applicable to a variety of FPGA structures. This assumption restricts the logic synthesis strategies mainly to functional decomposition methods, i.e. to the process of re-expressing a function of n variables as a function of functions of fewer variables. For example, a function F(X) is decomposable if it can be expressed as F = H(A, G(B)), where A and B are proper subsets of the set of input variables X, G and H are components of F, and H has fewer input variables than F.
T. LUBA AND H. SELVARAJ
Numerous decomposition algorithms have been developed. Ashenhurst, in his fundamental paper [2], stated the disjunctive decomposition theorem based on the notion of decomposition charts. Curtis extended Ashenhurst's results to multiple decomposition, in which F is expressed as F = H(A, G1(B), ..., Gk(B)) [6]. The use of decomposition charts for functional decomposition of logic networks is applicable only to restricted classes of functions. Therefore, a number of decomposition improvements were suggested. In the Roth-Karp scheme, a more compact representation of a function in the form of a cover of the on-set and a cover of the off-set has been used [23]. Later, in the early 70's, an attempt was made to apply orthogonal transform techniques to the design of digital circuits. The problem of constructing optimal decomposition schemes using spectral techniques was also considered [13]. Recently, the spectral approach has been improved and employed in the so-called "groupability" method intended for FPGAs [8].
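To make the form F = H(A, G(B)) concrete, the following minimal Python sketch counts the distinct columns of a decomposition chart for a completely specified, single-output function with disjoint sets A and B. For such functions, a decomposition with k functions G1, ..., Gk over B exists exactly when the chart has at most 2^k distinct columns. The helper names `column_multiplicity` and `needed_g_outputs` and the example function are illustrative assumptions, not taken from the paper.

```python
from itertools import product
from math import ceil, log2


def column_multiplicity(f, n, bound):
    """Count distinct columns of the decomposition chart of f.

    f     : callable taking a tuple of n bits and returning 0 or 1
    bound : indices of the bound set B (inputs of G)
    The remaining indices form the free set A (direct inputs of H).
    """
    free = [i for i in range(n) if i not in bound]
    columns = set()
    for b_bits in product((0, 1), repeat=len(bound)):
        column = []
        for a_bits in product((0, 1), repeat=len(free)):
            x = [0] * n
            for i, v in zip(bound, b_bits):
                x[i] = v
            for i, v in zip(free, a_bits):
                x[i] = v
            column.append(f(tuple(x)))
        columns.add(tuple(column))
    return len(columns)


def needed_g_outputs(multiplicity):
    """Number of G outputs needed to encode the distinct columns."""
    return ceil(log2(multiplicity))


# Hypothetical example: F = (x0 xor x1 xor x2) and x3.
def example_f(x):
    return (x[0] ^ x[1] ^ x[2]) & x[3]


mu_good = column_multiplicity(example_f, 4, [0, 1, 2])  # B = {x0, x1, x2}
mu_poor = column_multiplicity(example_f, 4, [0, 3])     # B = {x0, x3}
```

With B = {x0, x1, x2} the chart has two distinct columns, so a single-output G suffices and H needs only two inputs. With B = {x0, x3} there are three distinct columns, so G would need two outputs for two inputs and nothing is gained.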
In the early 80's, functional decomposition methods lost their importance because of the rapid development of synthesis techniques for multilevel logic. Algebraic division of sum-of-products expressions represented by the sets of cubes has been a basic operation in the procedures of substitution and kernel extraction used for decomposition of Boolean functions [4] .
Since the late 80's logic decomposition has been again attracting some attention as a technique used for design of PLAs. Devadas et al. proposed a Boolean decomposition of a PLA into two cascaded PLAs [7] . This procedure is however conceptually more similar to multiple-valued symbolic minimization than to the classical decomposition. Moreover, the method is confined to PLA synthesis and cannot be considered as a general functional decomposition approach.
Logic decomposition can play an important role in the design of FPGA-based circuits because their structure imposes constraints on the number of inputs only and the two-level minimization is not needed. However, the multilevel synthesis became so deeply rooted that earlier synthesis methods for FPGAs were based on the traditional, multilevel minimization approach.
This approach was used in MIS-PGA [20], Hydra [10] and Chortle [11]. In the ASYL system, multilevel synthesis was improved by using the idea of lexicographical order [25]. Functional decomposition has been sometimes used in FPGA design, but only as an auxiliary process, as in MIS-PGA and Hydra systems. So far, there is only one FPGA-based technology mapper, namely TRADE, that fully exploits the idea of functional decomposition [27]. Its promising results seem to indicate that the concept of functional decomposition should be investigated more generally and in more detail.
Following this trend, we propose an original decomposition method which improves functional decomposition by interleaving two different decomposition strategies and by applying an original calculus based on the representation of a function by a family of partitions over the set of cubes. Our decomposition procedure is universal, i.e., it can be applied to completely or incompletely specified, single- or multiple-output functions. Thus, it compares favorably with the earlier methods, which are often limited to either single-output or completely specified functions.
The paper is organized as follows. In Section 2, we introduce the basic notions and discuss the partition-based representation of a Boolean function. In Section 3, the theoretical fundamentals of the decomposition methods are given. Section [...]
Clearly, MCCs are not disjoint and therefore the output consistency relation is not an equivalence relation on M. Hence, it "partitions" M into non-disjoint subsets.
Nevertheless, to describe the output consistency relation, we use the same notation as for the indiscernibility relation, i.e. P_F is an output "partition" (index F is intended to distinguish between the input (IND) and output (CON) relation). A "partition" with non-disjoint blocks and such that no block is a subset of some other block is referred to as a rough partition (r-partition). The concept of an r-partition is a simple extension of that of an ordinary partition and typical operations on r-partitions are the same as used in the ordinary partition algebra [12]. In particular, the relation less than or equal to holds between two r-partitions Π1 and Π2 (Π1 ≤ Π2) [...]
P_F = (1,6 3 ;2,4 ;5,6 6,9 4,8,9; 7,10).
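As an illustration of the partition calculus, the sketch below represents an r-partition as a list of frozensets of minterm numbers. It assumes the usual partition-algebra reading of the ordering, namely that Π1 ≤ Π2 when every block of Π1 is contained in some block of Π2, and it forms the product from pairwise block intersections. The function names and the small example are hypothetical.

```python
def induced_partition(rows, variables):
    """P(V): group 1-based row numbers by the values of the chosen inputs."""
    groups = {}
    for number, row in enumerate(rows, start=1):
        key = tuple(row[v] for v in variables)
        groups.setdefault(key, set()).add(number)
    return [frozenset(block) for block in groups.values()]


def leq(p1, p2):
    """Assumed ordering: every block of p1 lies inside some block of p2."""
    return all(any(b1 <= b2 for b2 in p2) for b1 in p1)


def partition_product(p1, p2):
    """Non-empty pairwise intersections, keeping only maximal blocks."""
    meets = {b1 & b2 for b1 in p1 for b2 in p2 if b1 & b2}
    return [b for b in meets if not any(b < other for other in meets)]


# Four minterms over inputs (x0, x1), numbered 1..4.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
p_x0 = induced_partition(rows, [0])        # blocks {1,2} and {3,4}
p_x1 = induced_partition(rows, [1])        # blocks {1,3} and {2,4}
p_x0x1 = induced_partition(rows, [0, 1])   # four singleton blocks

# An r-partition may have overlapping blocks: minterm 2 is in both.
p_f = [frozenset({1, 2}), frozenset({2, 3, 4})]
```

Dropping blocks that are proper subsets of other blocks in `partition_product` keeps the result an r-partition in the sense defined above, where no block is a subset of another.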
LOGIC DECOMPOSITION
Along with the widely known and frequently used serial decomposition, this paper proposes to interleave a new strategy of decomposition called parallel decomposition.
Both strategies are illustrated in Fig. 1. The parallel decomposition leads to a structure in which two "independent" components operate in parallel (Fig. 1b), whereas the serial decomposition results in [...]
[Table: Truth Table]
In many situations, the parallel decomposition can be used as a subsidiary step to the more general, serial decomposition procedure.
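One ingredient of a parallel split is knowing which inputs each output actually depends on, since outputs with small or disjoint input sets can be assigned to separate components. The following minimal sketch computes this for a completely specified function stored as a dictionary from input tuples to output tuples. The helper `output_support` and the example function are hypothetical; the procedure described in this paper works on partitions and also covers incompletely specified functions.

```python
from itertools import product


def output_support(table, n_inputs, out_idx):
    """Return the set of input indices that output `out_idx` depends on.

    table maps every n_inputs-bit tuple to a tuple of output bits.
    Input i is in the support if flipping it changes the output
    for at least one assignment of the other inputs.
    """
    support = set()
    for i in range(n_inputs):
        for x in product((0, 1), repeat=n_inputs):
            if x[i] == 0:
                flipped = x[:i] + (1,) + x[i + 1:]
                if table[x][out_idx] != table[flipped][out_idx]:
                    support.add(i)
                    break
    return support


# Hypothetical two-output function of three inputs:
# y0 = x0 and x1, y1 = x0 xor x1 xor x2.
table = {
    x: (x[0] & x[1], x[0] ^ x[1] ^ x[2])
    for x in product((0, 1), repeat=3)
}
supports = [output_support(table, 3, k) for k in range(2)]
```

In this example y0 needs only two inputs and could be placed directly in a two-input cell, while y1 depends on all three inputs and would be passed on to a further decomposition step.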
Serial Decomposition
Let F(X) denote a Boolean function F with the set of input variables X. Let X = A ∪ B and C ⊆ A.
The problem of serial decomposition of F(X) is illustrated in Fig. 1c. [...]
[TABLE 2: Truth Table of Example 2; surviving column labels: x6, x7, x8, x9, y1, y2, y3, y4, y5, y6]
This construction of G is always possible because Π_G ≥ P(B ∪ C), i.e., m'_{B∪C} = m''_{B∪C} implies that m' and m'' are in the same block of Π_G. Clearly, this construction is not unique, i.e., many different functions G can be found for a given partition Π_G.
TABLE 3a
Results of Parallel Decomposition of Table 2 [columns: x1, x2, x3, x4, x7, y2, y4, y5]
For A = {x1, x3, x4}, B = {x2, x5}, C = ∅, we have
P(A) = (1,7 8,13 ;2,3 ;9,14,15 4,5 10; 6 11,12),
P(B) = (1,3,15; 2,13,14; 4,6,7,8,9,10,12; 5,11)
and [...]
[TABLE 4: Truth Table of Example 3]
[...] where G is a single-output function.
The main task is to find the subset of inputs B ∪ C for component G which, when serially connected with component H, will implement function F.
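A candidate input set for G can be tested by trying to merge the blocks of its partition. The sketch below is one possible reading, stated as an assumption: two blocks are treated as compatible when merging them never places minterms with conflicting outputs in the same block of P(A). Blocks are merged greedily, which is only a heuristic when don't-care outputs (written as None) are present. All function names and the example function are hypothetical.

```python
from itertools import product
from math import ceil, log2


def blocks_by(rows, variables):
    """Blocks of P(V): 1-based minterm numbers grouped by input values."""
    groups = {}
    for number, (x, _) in enumerate(rows, start=1):
        groups.setdefault(tuple(x[v] for v in variables), set()).add(number)
    return list(groups.values())


def consistent(rows, minterms):
    """True if the specified outputs of these minterms never conflict."""
    values = {rows[m - 1][1] for m in minterms} - {None}  # None = don't care
    return len(values) <= 1


def mergeable(rows, p_a, block_i, block_j):
    """Assumed compatibility test: no output conflict inside any block of P(A)."""
    merged = block_i | block_j
    return all(consistent(rows, merged & a) for a in p_a)


def greedy_pi_g(rows, free, bound):
    """Greedily merge compatible blocks of P(bound) into a candidate Pi_G."""
    p_a = blocks_by(rows, free)
    pi_g = []
    for block in blocks_by(rows, bound):
        for k, merged in enumerate(pi_g):
            if mergeable(rows, p_a, merged, block):
                pi_g[k] = merged | block
                break
        else:
            pi_g.append(set(block))
    return pi_g


# Hypothetical example: F = (x0 xor x1 xor x2) and x3, all 16 minterms.
rows = [(x, (x[0] ^ x[1] ^ x[2]) & x[3]) for x in product((0, 1), repeat=4)]
pi_g = greedy_pi_g(rows, free=[3], bound=[0, 1, 2])
g_outputs = ceil(log2(len(pi_g)))  # outputs needed to encode the blocks
```

Here the eight blocks of the bound-set partition collapse into two merged blocks, so G needs a single output. A choice of bound set is attractive when the number of G outputs is smaller than the number of G inputs.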
Consider the subset of input variables, B ∪ C, and the corresponding partition P(B ∪ C). The relation of compatibility of partition blocks is used to verify whether or not partition P(B ∪ C) defines a viable decomposition. Once a suitable set B ∪ C and the corresponding partition Π_G are found, the truth tables of functions G and H can be easily derived from P(A), Π_G, and P_F.
Example 4. For the function of Example 3, let A = {x3, x4}, B = {x1, x2, x5}, and C = ∅. Then, P(A) = ( [...]
Unlike the traditional approach of first performing logic minimization and then mapping the design using decomposition and other methods, it is proposed to start with the mapping process straight away using the different decomposition strategies discussed in the earlier sections. The effectiveness of such an approach has been demonstrated by Luba et al. [15], [17], [18].
[...] y0: {x0, x1}; y1: {x0, x1, x2}. Now, the algorithm decomposes the two functions separately using the same iteration procedure. As the function y0 depends only on two inputs, it can be directly implemented using a single given cell. The function y1 is decomposed separately and its solution is given in Fig. 3.
If it is found that a serial decomposition with Gin and [...] Fig. 5a.
Functions f0, f1 and f2 are: f0 = g'2 · x0; f1 = g'1 · x̄0; f2 = f0 ⊕ f1.
The three-input two-output function G' is decomposed separately using the same algorithm and its implementation is presented in Fig. 5b, where f3, f4, f5 and f6 are: f3 = x1 ⊕ x3; f4 = x1 · x2; f5 = f3 · x̄2; f6 = x4 + x3.
EXPERIMENTAL RESULTS
The described method has been implemented in a prototype decomposition program. The input to the program is a truth table and the output is a network of n-input m-output cells, each realising an n-variable function of m outputs. Although the numbers n and m can be fixed arbitrarily, in the present version we assumed 2 ≤ n ≤ 5 and 1 ≤ m ≤ 2, as this covers all the existing applications.
Results of such a decomposition for some known benchmarks are presented in Table 8. Table 9 shows the results of decomposition of the benchmark circuits into five-input two-output cells, which is in fact the decomposition aimed at the Xilinx Logic Blocks. The comparison of our results with the other published results shows that the proposed method does not suffer because of its universality but, in fact, provides better solutions in many cases.
The general method is based on supplementing the better known decomposition strategy called serial decomposition with parallel decomposition and on an original representation of Boolean functions by a set of r-partitions and the corresponding calculus.
Based on the presented decomposition algorithms, a prototype version of a logic synthesis system has been developed. Our results demonstrate that logic synthesis based on functional decomposition is usually more efficient than the conventional approach in which technology-independent minimization is followed by technology mapping. This observation, first formulated in our earlier paper [17] and supported by the results of other studies [8] , [27] , is especially true for designs involving PLD or FPGA components.
The presented decomposition procedures are very general. The user is given the possibility of working on minterms or cubes and can arbitrarily specify the maximum number of inputs and maximum number of outputs for all the components of a function to be decomposed.
Alternatively, the designer is given the option of interactively controlling the decomposition process by selecting, for each iteration of the decomposition, the component of the partially decomposed function to be dealt with and the type of decomposition (parallel or serial). In this way, the decomposition procedure allows the designer to examine several alternative solutions. In particular, it makes it possible to compare different implementation styles, e.g. among various FPGA structures, and select the one which is most suitable for a given project. As the conceptual layer of the method and its core are general, it is also possible to apply the method to decompose multiple-valued functions. This has led to the development of a PLA-based Synthesis System which finds its application especially in designs using PLAs with decoders [19]. The presented algorithms can therefore form a basis for the development of a general decomposition-based synthesis tool which would accept a set of design constraints and decompose a given function so as to meet those constraints.
