SUMMARY This paper presents a design method for three-level programmable logic arrays (PLAs), which have input decoders and two-input EXOR gates at the outputs. The PLA realizes an EXOR of two sum-of-products expressions (EX-SOP) for multiple-valued input two-valued output functions. We developed an output phase optimization method for EX-SOPs, where some outputs of the function are minimized in the complemented form, and presented techniques to minimize EX-SOPs for adders by using an extension of Dubrova-Miller-Muzio's AOXMIN algorithm. For large adders with two-valued inputs, the proposed algorithm produces solutions with half as many products as an AOXMIN-like algorithm and runs 250 times faster. We also proved that an n-bit adder with two-valued inputs requires at most $3 \cdot 2^{n-2} + 7n - 5$ products in an EX-SOP, while it is known that a sum-of-products expression (SOP) requires $6 \cdot 2^n - 4n - 5$ products.
Introduction
Programmable logic arrays (PLAs) with two-input EXOR gates at the outputs, also known as AND-OR-EXOR PLAs (Fig. 1) [28], are a powerful architecture for realizing many logic functions. The AND-OR-EXOR PLA realizes an EXOR of two sum-of-products expressions (EX-SOP). Minimization of the number of products in EX-SOPs is an important step in the optimization of AND-OR-EXOR PLAs, because the number of products is directly related to the cost of PLAs. EX-SOPs are promising because, for many practical logic functions, they often require many fewer products than sum-of-products expressions (SOPs) [7], [10], [11], [15], [16], [28].
AND-OR-EXOR three-level networks are suitable for implementing adders, which serve as building blocks for synthesizing many other arithmetic circuits [21]. For example, Texas Instruments' SN181 arithmetic circuit and SN283 four-bit adder have two-input EXOR gates at the outputs [31], and Monolithic Memories' ZHAL20X8A eight-bit counter realizes EX-SOPs [19]. The AND-OR-EXOR network is one of the simplest three-level architectures, since it contains only a single two-input EXOR gate. However, its logic capability is quite high. Because of this, various programmable logic devices (PLDs) with two-input EXOR gates at the outputs were developed. In particular, RICOH, Lattice, and AMD (MMI) produced series of such PLDs [19], [22], [23], and millions of complex PLDs (CPLDs) with output EXOR gates have been shipped [1], [2]. An AND-OR-EXOR three-level network is also suitable for efficient implementation of many random functions. For example, simplified EX-SOPs for six-variable pseudo-random functions require 25 percent fewer products and 40 percent fewer literals than simplified SOPs [5]. For an arbitrary function of six variables, minimum SOPs require up to 32 products [29], while minimum EX-SOPs require at most 15 products [5]. Minimization of EX-SOPs was considered in the past [13], [30], and a cut-and-try method was reported [22]. Design methods for adders using AND-OR-EXOR PLAs with input decoders of more than one bit were developed at IBM [32]. Exact minimization algorithms for EX-SOPs and upper bounds on the number of products in EX-SOPs have also been reported [4]-[6], [9]. AND-OR-EXOR networks where the output EXOR gates have unlimited fan-in have been considered as well [27]. During the last several years, significant progress in the heuristic minimization of EX-SOPs has been made and many interesting results have been reported [7], [10], [11], [15], [16], [28]. However, no efficient algorithm to design AND-OR-EXOR PLAs for adders has been developed.
Important contributions of the paper are as follows:
• We present a method to reduce the number of products in EX-SOPs by considering output phase optimization [26] , where some components of the function are implemented in the complemented form.
• We develop a heuristic method to minimize EX-SOPs for adders with two- and four-valued inputs by using an extension of the AOXMIN algorithm [10].
• We prove that an n-bit adder with two-valued inputs requires at most $3 \cdot 2^{n-2} + 7n - 5$ products in an EX-SOP.
A crucial step in AOXMIN is to partition the products of an SOP of the given function into two sets, which is done by a random method. We propose a partitioning method for adders. Our experimental results demonstrate that, for an n-bit adder with sufficiently large n, the proposed algorithm produces solutions with half as many products as the random partitioning method and runs 250 times faster.
The remainder of the paper is organized as follows: Section 2 reviews terminology. Section 3 considers output phase optimization techniques. Section 4 summarizes AOXMIN and describes its extensions. Section 5 presents design methods for adders. Section 6 derives an upper bound on the number of products in EX-SOPs for adders. Section 7 shows experimental results. Section 8 concludes the paper.
Definitions and Terminology
In this section, we review basic terminology related to multiple-valued functions [25] , [26] .
Definition 1:
A multiple-valued input two-valued output function, or function in short, is a mapping $f : P_1 \times P_2 \times \cdots \times P_n \to B$, where $P_i = \{0, 1, \ldots, p_i - 1\}$, $p_i \ge 2$, $B = \{0, 1\}$, and $X_i$ is a multiple-valued variable taking a value from $P_i$.
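Definition 2: Let $S_i \subseteq P_i$. Then $X_i^{S_i}$ is a literal of $X_i$, which equals 1 if $X_i \in S_i$ and 0 otherwise. A product $X_1^{S_1} X_2^{S_2} \cdots X_n^{S_n}$ is the AND of literals. A cube is a convenient representation of a product for computer manipulation.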
Definition 3:
A sum-of-products expression (SOP) is the OR of products. An SOP is represented by a cover, which is a set of cubes. An EX-SOP is the EXOR of two SOPs.
Definition 4:
Let $f_i(X_1, X_2, \ldots, X_n)$ $(i = 0, 1, \ldots, m - 1)$ be an n-input m-output function. The two-valued output function $F(X_1, X_2, \ldots, X_n, X_{n+1})$, where $X_{n+1}$ is an m-valued variable representing the outputs such that $F(X_1, X_2, \ldots, X_n, i) = f_i(X_1, X_2, \ldots, X_n)$, is the characteristic function for the multiple-output function [26].
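To make Definition 4 concrete, the following minimal sketch builds a characteristic function from a list of output functions. The helper name `characteristic_function` and the one-bit adder used as the example are illustrative assumptions, not part of the original text.

```python
from itertools import product

def characteristic_function(outputs):
    """Return F with F(x_1, ..., x_n, i) = f_i(x_1, ..., x_n)."""
    def F(*args):
        *xs, i = args          # the last argument selects the output
        return outputs[i](*xs)
    return F

# Hypothetical example: a 1-bit adder without carry input.
f0 = lambda x, y: x ^ y        # sum bit
f1 = lambda x, y: x & y        # carry bit
F = characteristic_function([f0, f1])

# ON-set of F, listed as (x, y, output index) triples.
on_set = [(x, y, i)
          for x, y in product((0, 1), repeat=2)
          for i in range(2)
          if F(x, y, i)]
print(on_set)
```

The last coordinate plays the role of the m-valued variable $X_{n+1}$, so a single two-valued function carries all m outputs.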
If $S_i \cap T_i = \emptyset$ for some i, then the intersection denotes a null cube.
Definition 7:
The disjoint sharp of two covers F and G, denoted by F # G, represents only those minterms of F which are not contained in G.
Definition 8:
The ON-set, OFF-set, and DC-set are the sets of cubes for which the function value is 1, 0, and unspecified, respectively.
In this paper, we often use the same symbol for a function and its cover; and unless otherwise specified, adder refers to adder without carry input, and adrn represents an n-bit adder.
Output Phase Optimization
In many cases, we can realize a function f in either the positive phase ($f$) or the negative phase ($\bar{f}$). For an m-output function, we can choose the output phases in $2^m$ ways. The choice of the output phases in the realization of a function influences the number of products in its minimized expressions. Reducing the number of products by choosing the output phases is called output phase optimization [26].
such that the number of products in G is minimal, is the output phase optimized SOP for $(f_0, f_1, \ldots, f_{m-1})$.
Similarly, we can define an output phase optimized EX-SOP. We handle the output phase optimization of EX-SOPs by using the output phase optimization techniques for SOPs. We use an output phase optimized SOP as the input of the EX-SOP minimizer. For a function with m outputs, an EX-SOP minimizer produces two SOPs each having m outputs. We optimize the output phases of the 2m-output SOP to obtain an output phase optimized EX-SOP.
Let the output phase for the function $f_i$ be $a_i \in \{0, 1\}$, where $a_i = 0$ indicates that $f_i$ is in the positive phase and $a_i = 1$ indicates that $f_i$ is in the negative phase. Let the output phases of the two SOPs of the EX-SOP for $f_i$ be $b_{i0}$ and $b_{i1}$. Then, the output phase of the EX-SOP for $f_i$ is $a_i \oplus b_{i0} \oplus b_{i1}$. When output phase optimization of the two m-output SOPs is impractical, we consider $a_i$ as the output phase of the EX-SOP for $f_i$. The output phase optimization technique for AND-OR-EXOR three-level PLAs is shown in Fig. 2. An output phase optimized EX-SOP can be realized in an AND-OR-EXOR PLA where the polarity of the outputs is programmable.
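The phase rule $a_i \oplus b_{i0} \oplus b_{i1}$ can be checked exhaustively for a single output. The sketch below is only an illustration: the function names are hypothetical, and `s0`, `s1` stand for the values of the two SOPs before their own phases are applied.

```python
from itertools import product

def exsop_phase(a, b0, b1):
    """Output phase of the EX-SOP for one output (0 = positive, 1 = negative)."""
    return a ^ b0 ^ b1

def check_phase_rule():
    """Check that the EXOR output equals f complemented exactly when the phase is 1."""
    for f, a, b0, b1, s0 in product((0, 1), repeat=5):
        s1 = (f ^ a) ^ s0                    # s0 XOR s1 realizes f in phase a
        exor_output = (s0 ^ b0) ^ (s1 ^ b1)  # each SOP realized in its own phase
        if exor_output != f ^ exsop_phase(a, b0, b1):
            return False
    return True

print(check_phase_rule())  # True
```

Because complementing either input of an EXOR gate complements its output, the three phase bits simply accumulate by EXOR, which is why one programmable output polarity per function suffices.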
Minimization Techniques
In this section we review AOXMIN [10] , which is a heuristic algorithm to simplify EX-SOPs. We then present an extension of AOXMIN. 
Overview of AOXMIN
Basic steps of AOXMIN are as follows:
1. Obtain a minimized cover F for the given function f and compute a cover R for $\bar{f}$.
2. Group the cubes of F into clusters of cubes. Two cubes are in the same cluster if they intersect or if they are connected through a chain of intersecting cubes.
Fig. 3 obtains a minimized cover for a function, where $F_k$, $D_k$, and $R_k$ represent the ON-set, DC-set, and OFF-set, respectively.
5. Iterate steps 3 and 4 a specified number of times, and take the best EX-SOP among all the EX-SOPs generated so far.
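The clustering in step 2 amounts to finding connected components of the "cubes intersect" relation. The following sketch shows one way to do this with a union-find structure. It assumes a cube is stored as a tuple of sets $(S_1, \ldots, S_n)$, one per variable; the function names and the example cover are hypothetical and are not taken from AOXMIN itself.

```python
def cubes_intersect(c, d):
    """Two cubes intersect iff S_i and T_i share a value for every variable i."""
    return all(s & t for s, t in zip(c, d))

def cluster_cubes(cover):
    """Group cubes that intersect directly or through a chain of intersections."""
    parent = list(range(len(cover)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in range(len(cover)):
        for j in range(i + 1, len(cover)):
            if cubes_intersect(cover[i], cover[j]):
                parent[find(i)] = find(j)

    clusters = {}
    for i, cube in enumerate(cover):
        clusters.setdefault(find(i), []).append(cube)
    return list(clusters.values())

# Hypothetical cover over three two-valued variables.
A = ({0}, {0}, {0, 1})
B = ({0}, {0, 1}, {1})
C = ({0, 1}, {1}, {1})
D = ({1}, {0}, {0})
clusters = cluster_cubes([A, B, C, D])
print(sorted(len(c) for c in clusters))
```

Here A and C do not intersect directly, but both intersect B, so the three cubes land in one cluster while D stays alone.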
In addition, AOXMIN simplifies the complement of the given function and uses an output phase optimization technique to obtain a better solution.
Extension of AOXMIN
The proposed heuristic method to simplify EX-SOPs, which is an extension of AOXMIN [10], has the following features:
• It can simplify EX-SOPs for functions with two- and four-valued variables, and can treat functions where different variables have different domains (two-valued or four-valued). On the other hand, AOXMIN simplifies only two-valued functions.
• It uses heuristic algorithms to partition the clusters of cubes for adders. In this regard, AOXMIN uses only a random partitioning method.
• During iterative improvement, it concurrently minimizes both SOPs of the EX-SOP to reduce the total number of products by increasing shared products between two SOPs. On the other hand, AOXMIN uses simultaneous minimization of both SOPs only once as part of its simplification technique for multiple-output functions.
• For multiple-output functions, it performs concurrent simplification of all the outputs. However, AOXMIN simplifies each output separately throughout the algorithm. A modified AOXMIN considers simplification of all the outputs simultaneously [11] .
• For the output phase optimization of EX-SOPs, it uses techniques for the output phase optimization of SOPs [26] . AOXMIN handles the output phase optimization problem in a different way.
• To find good solutions quickly, especially for adders, it selects from two different minimizers for SOPs. On the other hand, AOXMIN uses only Espresso [3] .
• The method makes efficient use of the given don't care conditions during grouping the cover into clusters of cubes and also during every minimization of the SOPs of the EX-SOP. AOXMIN does not use don't care conditions during these two operations.
The minimization of an SOP for a multiple-output function corresponds to the minimization of an SOP for its characteristic function [26] . Similarly, we can prove the following:
Theorem 1: The minimization of an EX-SOP for a multiple-output function corresponds to the minimization of an EX-SOP for its characteristic function. Now, the definition of the clusters of cubes can be extended as follows:
Definition 10: Let F and D be the covers for the ON-set and DC-set, respectively, of the characteristic function for a multiple-output function. Then, two cubes $c_i, c_j \in F$ are in the same cluster if (Fig. 4).
Make DoubleOut Cover($F_k$, $G_k$) in Fig. 4 receives n-input m-output covers $F_k$ and $G_k$, and returns an n-input 2m-output cover such that the covers corresponding to outputs $0, 1, \ldots, m - 1$ and $m, m + 1, \ldots, 2m - 1$ represent $F_k$ and $G_k$, respectively.
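The effect of this operation can be sketched as follows. This is not the routine of Fig. 4: the function name `make_double_out_cover`, the cube representation (an input part paired with a set of output indices), and the example covers are assumptions made for illustration.

```python
def make_double_out_cover(f_cover, g_cover, m):
    """Combine two n-input m-output covers into one n-input 2m-output cover.

    Each cube is (input_part, outputs), where outputs is a set of output
    indices. Cubes of f_cover keep outputs 0..m-1; cubes of g_cover are
    shifted to outputs m..2m-1.
    """
    doubled = [(inp, frozenset(outs)) for inp, outs in f_cover]
    doubled += [(inp, frozenset(o + m for o in outs)) for inp, outs in g_cover]
    return doubled

# Hypothetical 2-output covers; input parts are written as strings for brevity.
F_k = [("1-0", {0}), ("01-", {0, 1})]
G_k = [("1-0", {1}), ("--1", {0})]
for cube in make_double_out_cover(F_k, G_k, m=2):
    print(cube[0], sorted(cube[1]))
```

Presenting both SOPs to a multiple-output minimizer as one 2m-output cover lets a product that appears in both SOPs be counted once, which is how shared products reduce the total.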
In Fig. 4, Simplify Single and Simplify Double use either Simplify Local (Fig. 5) or Espresso-MV [25]. Simplify Local uses a single pass of Reduce, Expand, and Irredundant operations to obtain a simplified SOP [25]. It reduces the number of cubes by locally changing the shape of the cubes. Espresso-MV iterates these operations as long as the solution improves. Sections 5 and 7 explain how the choice of the two-level minimizers influences the quality of the solution and the execution time.
Design of Adders
In this section, we propose partitioning methods of the clusters of cubes for adders with one- and two-bit decoders, and discuss the choice of the two-level minimizers. Note that EX-SOPs for functions with two- and four-valued inputs correspond to AND-OR-EXOR PLAs with one- and two-bit decoders, respectively (Fig. 1).
During minimization of adders, we use Simplify Local for Simplify Single and Espresso-MV for Simplify Double in Fig. 4. We observe that if Espresso-MV is used for Simplify Single, then the resulting awkward shape of R assigned in Fig. 4 prevents us from obtaining a good solution in the next minimization by Simplify Double.
Adders with One-Bit Decoders
We found that an output phase optimized SOP for an n-bit (3 ≤ n ≤ 11) adder with two-valued inputs has 4n − 1 clusters of cubes. Figure 6 shows the distribution of these clusters, where an entry $c^k$ represents k clusters each having c cubes. It is interesting that the numbers of cubes in the clusters have a regular structure. To partition the clusters of cubes into two covers $F_A$ and $F_B$, we use the following method:
The above partitioning method is devised by considering outputs. Adders have pairs of clusters, where each pair belongs to a particular set of outputs. Roughly, the strategy is to put the clusters from such a pair into two different partitions. A similar method is also devised for adders with four-valued inputs.
Figure 7 shows a Karnaugh map for a six-variable function [20]. Its SOP requires 16 products, while its EX-SOP, $(p_1 \vee p_2) \oplus (p_3 \vee p_4 \vee p_5)$, requires five products as shown in Fig. 7. The EX-SOP is designed by using the method presented in this section.
Adders with Two-Bit Decoders
We obtained functions with four-valued inputs from their two-valued counterparts by pairing two variables using Espresso-MV [25]. Figure 8 shows the distribution of the clusters of output phase optimized SOPs for adders with two-bit decoders, where an entry $c^k$ represents k clusters each having c cubes. It shows that the output phase optimized SOP for an n-bit (4 ≤ n ≤ 11) adder with two-bit decoders has 2n clusters. Note that the numbers of cubes in the clusters for adders with two-bit decoders also have a regular structure. We use the following method to partition the clusters into two covers $F_A$ and $F_B$:
1. Sort the clusters in descending order of the number of cubes in them.
2. Starting from the beginning of the sorted list of the clusters, at first add a pair of clusters to $F_A$, then alternately add a cluster to $F_A$ and $F_B$.
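The two steps above can be sketched in a few lines. The text does not say which cover receives the first cluster after the initial pair, so the sketch exposes that choice as the hypothetical parameter `start_with` and defaults to the literal reading ($F_A$ first); the function name and the example cluster sizes are also assumptions.

```python
def partition_clusters(clusters, start_with="A"):
    """Partition clusters into (F_A, F_B) following the two steps above.

    clusters: list of clusters, each a list of cubes.
    start_with: which cover receives the first cluster after the initial
                pair ("A" or "B"); an assumption, since the text is ambiguous.
    """
    ordered = sorted(clusters, key=len, reverse=True)   # step 1
    f_a, f_b = list(ordered[:2]), []                    # step 2: first pair to F_A
    to_a = (start_with == "A")
    for cluster in ordered[2:]:                         # then alternate
        (f_a if to_a else f_b).append(cluster)
        to_a = not to_a
    return f_a, f_b

# Hypothetical clusters given only by their sizes (cube contents are placeholders).
clusters = [["c"] * k for k in (1, 5, 2, 4, 1, 3)]
f_a, f_b = partition_clusters(clusters)
print([len(c) for c in f_a], [len(c) for c in f_b])
```

The covers $F_A$ and $F_B$ themselves are obtained by flattening the returned cluster lists into sets of cubes.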
Number of Products in Adders
In this section we derive an upper bound on the number of products in an EX-SOP for an n-bit adder with two-valued inputs.
Let adrn be the n-bit adder without carry input as follows: For adrn, we have the following relations:
Let $t(\mathrm{SOP}, f)$ be the number of products in a minimum SOP for f. Let $t(\mathrm{EX\text{-}SOP}, f)$ be the number of products in a minimum EX-SOP for f.
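Lemma 1:
$t(\mathrm{SOP}, \bar{g}_{i-1} \oplus p_{i-1} g_{i-2}) = 5$.
$t(\mathrm{SOP}, \bar{p}_{i-1} \oplus g_{i-2}) = 6$.
$t(\mathrm{SOP}, \bar{p}_{i-1}) = 2$.
$t(\mathrm{SOP}, \bar{p}_{i-2} \vee \bar{c}_{i-3}) = 2 + t(\mathrm{SOP}, \bar{c}_{i-3})$.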

Lemma 2: $t(\mathrm{EX\text{-}SOP}, z_0) = 2$.
Lemma 3: $t(\mathrm{EX\text{-}SOP}, z_1) = 3$.
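The upper-bound direction of Lemmas 2 and 3 can be checked by exhibiting explicit EX-SOPs and comparing them with integer addition. The expressions below are one possible choice, assumed here for illustration and not necessarily the ones used in the original proofs; minimality (the lower-bound direction) is not checked.

```python
from itertools import product

def z0_exsop(x0, y0):
    """z_0 as an EXOR of two one-product SOPs: (x0) XOR (y0). Two products."""
    return x0 ^ y0

def z1_exsop(x1, y1, x0, y0):
    """z_1 as (x1*~y1 OR ~x1*y1) XOR (x0*y0). Three products in total."""
    sop_a = (x1 & (1 - y1)) | ((1 - x1) & y1)   # two products
    sop_b = x0 & y0                              # one product
    return sop_a ^ sop_b

def matches_addition():
    """Compare both expressions with the two low-order bits of x + y."""
    for x1, x0, y1, y0 in product((0, 1), repeat=4):
        total = (2 * x1 + x0) + (2 * y1 + y0)
        if z0_exsop(x0, y0) != total & 1:
            return False
        if z1_exsop(x1, y1, x0, y0) != (total >> 1) & 1:
            return False
    return True

print(matches_addition())  # True
```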
Lemma 4: $t(\mathrm{SOP}, \bar{c}_i) = 3 \cdot 2^i - 1$.
Proof: Note that t(S OP
,
Lemma 5: $t(\mathrm{EX\text{-}SOP}, z_i) \le 8 + t(\mathrm{SOP}, \bar{c}_{i-2})$.
Proof:
Since $t(\mathrm{SOP}, p_i \oplus g_{i-1}) = 6$ and $t(\mathrm{SOP}, \bar{p}_{i-1}) = 2$, we have the lemma.
Lemma 6:
Two functions $c_{i-1}$ and $z_{i-1}$ can be realized with an EX-SOP at the same time by using $15 + t(\mathrm{SOP}, \bar{c}_{i-3})$ products.
From Lemmas 1 to 5, we have this lemma.
Theorem 2:
An n-bit adder without carry input can be represented by an EX-SOP with at most $3 \cdot 2^{n-2} + 7n - 5$ products for n ≥ 3.
Proof: Let W be the number of products necessary in an EX-SOP. Then, we have 
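As a numerical sanity check of Theorem 2, the sketch below adds up the lemma bounds and compares the total with the closed form. The grouping is an assumption made for this check and is not text recovered from the original proof: $z_0$ and $z_1$ are counted by Lemmas 2 and 3, $z_2, \ldots, z_{n-2}$ by Lemma 5, and the pair $c_{n-1}$, $z_{n-1}$ by Lemma 6 with i = n, with Lemma 4 supplying $t(\mathrm{SOP}, \bar{c}_i)$. All function names are hypothetical.

```python
def sop_not_carry(i):
    """Lemma 4: products in a minimum SOP for the complemented carry c_i."""
    return 3 * 2**i - 1

def exsop_bound_from_lemmas(n):
    """Sum of the per-output bounds for an n-bit adder, n >= 3 (assumed grouping)."""
    if n < 3:
        raise ValueError("the bound is stated for n >= 3")
    total = 2 + 3                                          # Lemmas 2 and 3
    total += sum(8 + sop_not_carry(i - 2)                  # Lemma 5, i = 2..n-2
                 for i in range(2, n - 1))
    total += 15 + sop_not_carry(n - 3)                     # Lemma 6 with i = n
    return total

def closed_form(n):
    """Theorem 2: 3 * 2^(n-2) + 7n - 5."""
    return 3 * 2**(n - 2) + 7 * n - 5

for n in range(3, 12):
    assert exsop_bound_from_lemmas(n) == closed_form(n)
print(closed_form(3), closed_form(4))  # 22 35
```

Under this grouping the constant terms contribute $7n - 5 - 3$ and the two geometric parts contribute $3 \cdot 2^{n-3} + 3$ and $3 \cdot 2^{n-3}$, which together give the stated bound.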
Experimental Results
We implemented the proposed method to simplify EX-SOPs for adders in C by using Espresso-MV [25] routines on a 2.40 GHz Pentium 4 PC running Linux. For the experiments we prepared minimized SOPs and output phase optimized SOPs by using Espresso-MV with default options. We obtained adders with four-valued inputs from their two-valued counterparts by pairing two variables using Espresso-MV. Tables 1, 2 , and 3 summarize the experimental results, which are obtained by using: a) output phase optimized SOPs as the input for the EX-SOP minimizer; b) two different techniques to partition the clusters of cubes: partitioning method for adders from Sect. 5 and random partitioning method from AOXMIN [10] ; and c) Simplify Local for Simplify Single and Espresso-MV for Simplify Double in Fig. 4 .
In Table 1, the columns with headings 'SOP', 'OPO SOP', 'EX-SOP', and 'OPO EX-SOP' indicate the number of products in the corresponding expression, where 'OPO' is an abbreviation for 'output phase optimized'. The fifth column, with heading 'Time', indicates the CPU seconds spent by Espresso-MV [25] to minimize SOPs. The other columns with heading 'Time' indicate the CPU seconds spent by our program to simplify EX-SOPs; they do not include the time to prepare minimized SOPs or output phase optimized SOPs. Table 1 shows that, for an n-bit adder with two-valued inputs and sufficiently large n, the proposed partitioning method produces solutions with half as many products as the random partitioning method and runs about 250 times faster. We used adr6 to see how the choice of the two-level minimizers in Fig. 4 influences the quality of the solution and the execution time. By using random partitions and 1000 iterations, we found that when Espresso-MV is used for both Simplify Single and Simplify Double, the algorithm requires 567.15 seconds and produces a solution with 122 products; however, when we use Simplify Local for Simplify Single and Espresso-MV for Simplify Double, the algorithm produces a solution with 81 products and requires 541.44 seconds. We found similar tendencies for other adders. It should be noted that, in Table 1, the data in the 7th and 9th columns are the same for the last four rows. This is because of the memory overflow of Espresso-MV as outlined in Sect. 3. In spite of this, as Table 1 shows, adders based on EX-SOPs require far fewer gates than those based on SOPs. Table 2 shows the number of connections to the inputs of gates for adders with two-valued inputs. For large n, three-level AND-OR-EXOR PLAs achieve about 85 percent saving in the cost of connections. Table 3 shows that the proposed partitioning method also produces good solutions quickly for adders with four-valued inputs.
However, in most cases, these solutions can also be obtained by the random partitioning method with a reasonable increase in computation time. The experimental data also reveal that the minimization time for EX-SOPs with four-valued inputs is much smaller than that for the corresponding EX-SOPs with two-valued inputs, because the former require many fewer products than the latter. Note that an EX-SOP for an n-bit adder with two-bit decoders requires at most $(n^2 + n + 2)/2$ products [28].
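To give a feel for how the product counts quoted in this paper compare, the following sketch tabulates the three formulas: the SOP count $6 \cdot 2^n - 4n - 5$, the EX-SOP upper bound $3 \cdot 2^{n-2} + 7n - 5$ for two-valued inputs, and the bound $(n^2 + n + 2)/2$ for two-bit decoders. The function names are hypothetical, and the numbers are formula values only, not the measured results of Tables 1 to 3.

```python
def sop_products(n):
    """Products in an SOP for an n-bit adder with two-valued inputs."""
    return 6 * 2**n - 4 * n - 5

def exsop_upper_bound(n):
    """Upper bound for an EX-SOP with two-valued inputs (stated for n >= 3)."""
    return 3 * 2**(n - 2) + 7 * n - 5

def exsop_two_bit_decoder_bound(n):
    """Upper bound for an EX-SOP with two-bit decoders; n*n + n is always even."""
    return (n * n + n + 2) // 2

print(" n    SOP  EX-SOP  EX-SOP(2-bit)")
for n in range(3, 12):
    print(f"{n:2d} {sop_products(n):6d} {exsop_upper_bound(n):7d} "
          f"{exsop_two_bit_decoder_bound(n):14d}")
```

For n = 8, for instance, the formulas give 1499, 243, and 37 products, respectively.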
Conclusions and Comments
Adders are important because they form the basic building blocks of numerous digital systems, and EX-SOPs are promising because they often require many fewer products than SOPs. We presented partitioning methods that are effective in optimizing EX-SOPs for adders. Our experimental results show that the random partitioning method is unsuitable for designing adders when n is large, because it requires an excessive amount of CPU time to obtain a moderately optimized design. We found that the choice of two-level minimizers in an AOXMIN-like algorithm has a great influence on the number of products in EX-SOPs and that a powerful minimizer is not always a good choice. We proved that an n-bit adder with two-valued inputs requires at most $3 \cdot 2^{n-2} + 7n - 5$ products in an EX-SOP, while an SOP requires $6 \cdot 2^n - 4n - 5$ products. We obtained adders with four-valued inputs from their two-valued counterparts by pairing two variables using Espresso-MV code [25], which reduces the number of products in SOPs [26]. A different pairing algorithm targeting EX-SOPs may lead to better solutions. Investigations are underway for integrating the proposed AND-OR-EXOR design techniques with three-level OR-AND-OR synthesis methods [8] and for adapting the integrated design systems to synthesize logic circuits for commercial CPLDs that have a four-level OR-AND-OR-EXOR architecture [2]. Logic synthesis for such a four-level architecture is a challenging problem and very little has been published on the topic [27].
A limitation of the proposed method is its inability to handle large adders. However, in practical LSI applications, optimizing only small adders is sufficient for implementing large adders. The fan-in of the gates of an AND-OR-EXOR three-level realization for an n-bit adder increases with n. In an LSI realization, gates with large fan-in are difficult to fabricate and tend to be slow [24], [33]. Therefore, monolithic implementations of n-bit adders for large n are impractical. When n is large, fast adders are implemented by combining well-designed adders of smaller sizes [17], and the design strategies are primarily guided by the overall speed of the adders. Various schemes for such designs have been developed. One of them is the carry-skip adder, which uses $\sqrt{n/2}$-bit adders to implement an n-bit adder [17, p.117]. Therefore, a large carry-skip adder, such as one that adds 128-bit numbers, can be implemented by using only 8-bit adders. A 64-bit hybrid carry lookahead adder also uses 8-bit adders as its building blocks [18]. Another variant of the carry-skip scheme uses 2- to 6-bit adders for implementing a 128-bit adder [14]. Various module-based designs are used in practice [12], [17], [21].
