Abstract: Design methods for OR-AND-OR three-level networks are useful for exploiting the flexibility of logic blocks in many complex programmable logic devices. This paper presents TRIMIN, a fast heuristic algorithm for designing OR-AND-OR networks from sum-of-products expressions. Each output of the network realizes a sum-of-complex-terms expression, where a complex term (CT) is similar to a product-of-sums expression. TRIMIN's objective is to lower the number of gates in the network. It first generates a set of CTs by applying factorization techniques; then, it solves a set-covering problem by using a greedy algorithm to select a subset of the CTs. The effectiveness of TRIMIN is demonstrated through experimental results.
I. INTRODUCTION

MULTILEVEL synthesis usually produces the most cost-effective realization of logic functions [6], [11]. However, many complex programmable logic devices (CPLDs) are best suited for the implementation of three-level OR-AND-OR networks [3], [10], [34]. For most functions, three-level OR-AND-OR networks require fewer gates than two-level AND-OR networks; moreover, statistical analysis shows that a drastic reduction in the number of gates cannot be expected by using many levels of AND-OR networks [30]. In addition, delays in three-level networks are easy to predict and, under the assumption of unit delay per logic level, three-level networks are faster than multilevel ones. Therefore, area-efficient high-performance realizations of functions are achievable by using a three-level architecture.
Numerous exact and heuristic algorithms have been developed for designing two- and multilevel networks [4], [11]. Although three-level networks have received little special attention, design methods for several classes of them have been considered. Among them are three-level NAND networks with only true input signals [13], AND-AND-OR and AND-OR-OR networks [17], OR-AND-OR networks [30], XOR-AND-OR networks [8], [32], AND-OR-AND networks with only two-input AND gates at the outputs [12], [19], AND-OR-XOR and OR-AND-XOR networks [28], and AND-OR-XOR networks with only two-input XOR gates [16], [29]. Easily testable three-level networks have also been considered [7], [15], [24], [25]. We note that, in this paper, unless otherwise specified, gates have unlimited fanin and fanout, and networks have both true and complemented input signals.
Programmable logic devices (PLDs) permit immediate realization of digital circuits. Of the two major categories of PLDs, i.e., CPLDs and field-programmable gate arrays (FPGAs), CPLDs are well suited for high-speed applications and have highly predictable speed performance, which makes them ideal for the realization of a wide range of complex designs [5], [36]. Thus, computer-aided design issues related to CPLDs are important. A special kind of three-level programmable logic array (PLA), which we subsequently refer to as a ( )-PLA (Fig. 1) or a logic array block (LAB) [3] and which can efficiently realize OR-AND-OR networks (Fig. 2), is the building block of many CPLDs. The LABs are linked together via programmable interconnections. Each LAB consists of an AND-OR PLA [6] and some extra circuitry known as a logic expander [3], [10]. Since both true and complemented input signals are available, the logic expander realizes sum terms or a single-level network of OR gates that feed the AND-OR PLA. Thus, a LAB realizes an OR-AND-OR network.
A LAB or ( )-PLA has inputs, outputs, circuitry to realize product and sum terms, and some additional flexibilities and limitations that depend on the brand of the CPLD. Logic expanders can also be programmed to realize product terms; in that case, an ( )-PLA can realize product and sum terms, where . Examples of commercially available CPLD families that have logic expanders are Altera's MAX 9000 [3], Cypress' CY7C340 [10], and Xilinx's XPLA3 [34]. MAX EPM9560, a device of the MAX 9000 family, contains 560 macrocells that can be considered as 35 (36, 16, 64, 16)-PLAs, where each macrocell consists of a (36, 1, 4, 1)-PLA. Similarly, XCR3512XL, a device of the XPLA3 family, contains 512 macrocells that can be considered as 32 (36, 16, 40, 8)-PLAs. The fanin of the gates that these CPLDs can realize is so large that it can be regarded as unlimited; for example, OR gates with up to 36 inputs can be realized in the logic expanders of XCR3512XL. Therefore, to take advantage of this flexibility, we need design methods for OR-AND-OR networks whose gates have large fanin. While millions of CPLDs that can realize OR-AND-OR networks have been shipped [2], very little has been published on how to synthesize such circuits in them. Logic synthesis for these CPLDs has been done by using proprietary software.
0278-0070/03$17.00 © 2003 IEEE
This paper presents a fast heuristic algorithm, TRIMIN, for designing OR-AND-OR networks. Each output of the network realizes a sum-of-complex-terms expression (SCT), which is a disjunction of complex terms (CTs), where a CT represents a product-of-sums expression (POS) with some additional restrictions. CTs correspond to the OR-AND part of the network. TRIMIN accepts a near-minimum sum-of-products expression (SOP) as its input and can handle multiple-output functions.
A distinctive property of OR-AND-OR networks is that they often require far fewer gates than two-level AND-OR networks [30] . Moreover, for many functions, at least three-level realizations are required because their two-level ones are unacceptably expensive. A notable example is an -bit adder, whose AND-OR and OR-AND-OR networks require and at most gates, respectively [30] . We also note that some OR-AND-OR networks are easily testable [7] , [24] , [25] .
The problem of synthesizing economical OR-AND-OR networks is an old one [1] , [20] . In his pioneering work on OR-AND-OR networks, Abhyankar presented a set of theorems for designing such networks [1] . Meo proposed a technique based on the determination, elimination, and selection of the product-of-sum maximal implicants [20] . By using multilevel prime implicants and covering techniques, Lawler developed an exact algorithm for designing multilevel networks [18] , which also finds minimum cost OR-AND-OR networks. Muroga showed how to share the OR gates in an OR-AND-OR network [21, p. 249 ]. Sasao and Higashida developed a heuristic algorithm for synthesizing OR-AND-OR networks by using multiple-valued minimization techniques [27] ; a design method for two-level AND-OR networks with decoders [31] is the basis of the algorithm. Sasao also analyzed the complexity of OR-AND-OR networks [30] . In the context of three-level NAND networks, Vink [33] and Perkowski and Liu [22] presented exact algorithms for designing OR-AND-OR networks. The algorithm in [22] is based on permissible implicants and covering techniques, whereas that in [33] is an extension of Gimpel's algorithm [13] to minimize TANT networks. Perkowski et al. presented a heuristic algorithm for designing OR-AND-OR networks [23] , which is an extension of the method presented in [22] . A heuristic algorithm for generating OR-AND-OR networks for a special class of symmetric functions by using algebraic factorization is also reported [7] . We note that except [7] , [27] , and [30] , none of the algorithms for designing OR-AND-OR networks have reported benchmark results; the authors in [23] tested their algorithm on functions of up to 14 variables, and the exact minimization algorithm in [22] is practical for functions of up to ten variables. 
Because of their high complexities, the other OR-AND-OR synthesis methods that we have mentioned have limited applications; however, the extent of their practical significance cannot be verified without implementation.
The exact algorithms for minimizing OR-AND-OR networks are impractical for functions with many variables. The heuristic algorithm in [27] obtains good quality solutions, but often requires considerable computation. In the CPLD design environment, where a large number of OR-AND-OR networks must be minimized, a faster algorithm is desirable; moreover, in this environment, networks that are only just small enough to fit in the LABs are acceptable. The cost of a CPLD realization of an OR-AND-OR network is proportional to the number of gates in the network; in addition, the cost is independent of the fanin of the gates. Therefore, the objective of TRIMIN is to lower the number of gates and to reduce the computation time. To achieve this goal, TRIMIN works in two steps:
• [Generate CTs] First, it generates CTs from the product terms of the given SOP, where any product term may be used to generate more than one CT. The CTs are generated primarily by using factorization methods that are a subset of the transformations used in multilevel synthesis [11] , [14] . We note that the factorization techniques used in TRIMIN generate only OR-AND logical expressions, whereas the methods used in multilevel synthesis can generate logical expressions of any form.
• [Select CTs] Second, it solves a set-covering problem by using a greedy heuristic algorithm [9]. The purpose of the algorithm is to select a subset of the generated CTs, such that each product term of the given SOP is used to generate at least one of the selected CTs, while lowering the number of gates required to realize the SCT formed from the selected CTs.
The novelty of our contribution is in using factorization techniques for generating CTs and a greedy heuristic method for selecting CTs to design OR-AND-OR networks. We perform both of these steps without using the time-consuming operations on a cover [4], [11]. When necessary, TRIMIN can regulate the computation time and the quality of the solutions by controlling the generation of the CTs. Experimental results demonstrate that TRIMIN is highly effective.
The remainder of the paper is organized as follows. Section II introduces the terminology. Section III discusses how different types of CTs help lower the number of gates in an OR-AND-OR network. Section IV develops several methods for generating CTs from a given SOP. Section V shows a greedy algorithm for selecting a subset of the generated CTs. Section VI reports the experimental results. Section VII presents conclusions.
II. DEFINITIONS AND TERMINOLOGY
This section defines the basic terminology that is necessary to explain the material in the paper.
Definition 2: A CT may be: 1) a product term (product part of the CT); 2) a sum term or a conjunction of sum terms where the same sum term appears at most once (sum part of the CT); or 3) a conjunction of a product part and a sum part, provided that literals for the same variable are not present in both parts. The product terms required to form a CT are said to be covered or contained by the CT. Let be the set of product terms contained by the CT .
Example 3: The product terms , , , and are required to form the CT . Therefore, . TRIMIN repeatedly uses the covered product terms of CTs. The covered product terms of a CT are determined when it is generated and each CT keeps track of its covered product terms.
Definition 5: Let be the set of literals in the product part of the CT . Let be the set of sum terms of the CT . Let be the number of elements in the set . , and the common term of and is logical 1.
Definition 7: The cost of a network is the number of gates in it. The cost of an SCT is the number of gates required to realize it, provided only one OR gate is used to realize the sum term that appears more than once in it. The cost of an SOP or of a CT is the number of gates required to realize it. Let be the cost of the CT . The relative cost of is .
Example 6: In Example 2, the costs of , , , , and are 8, 4, 3, 2, and 1, respectively, and the relative costs of , , , and are 0.50, 0.75, 1.00, and 1.00, respectively. The relative cost of is 0.50, because six product terms are required to form it. (We will show how to form such a CT in Example 11.)
Definition 8: A CT is trivial if it is a product term; otherwise, it is nontrivial. A CT is called unisum if it has only one sum term, and it is called multisum if it has two or more sum terms.
III. PRELIMINARIES
In this section, we consider several types of CTs, which are the building blocks of SCTs, and discuss how they help lower the cost of an OR-AND-OR network.
The relative cost of a CT indicates how effective the CT is in lowering the cost of an OR-AND-OR network. If the relative cost of a CT is less than 1.00, its cost is lower than the number of product terms covered by it and vice versa. Therefore, CTs that have the relative cost less than 1.00 are often more useful than other CTs; they help lower the cost of an OR-AND-OR network because each such CT covers more than one product term for every gate required to realize it. Since the relative cost is the ratio of the cost of a CT and the number of product terms covered by it, a CT may cover many product terms but still have a higher relative cost and vice versa.
One of the simplest methods in designing an OR-AND-OR network is to rewrite a given SOP into an SCT. Let be an SOP. We can rewrite into an SCT . A realization of , which is an AND-OR network, requires six gates, whereas a realization of , which is an OR-AND-OR network, requires four gates. For this simple example, the realization for SCT requires two fewer gates than that for SOP. The reduction in the number of gates comes from the CT , which contains four product terms that require four AND gates to realize, whereas the CT requires only two gates (an OR gate and an AND gate) to realize.
The cost of a CT depends on the number of sum terms in it; however, it is independent of the number of literals in the sum terms. Since we are considering gates with unlimited fanin, only an OR gate is required to realize a sum term irrespective of the number of literals in it. Let and be CTs. The cost of or is three; however, contains six product terms, whereas contains nine product terms. The relative costs of and are 0.50 and 0.33, respectively, which indicate that is more effective than in lowering the cost of a network. Therefore, CTs with sum terms that have as many literals as possible are highly desirable.
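The arithmetic in the preceding paragraph can be checked with a minimal Python sketch. It assumes that a CT costs one OR gate per sum term plus one AND gate, and that the covered product terms are exactly those obtained by multiplying out the sum terms (which holds for algebraically factored CTs, but not necessarily for every CT). The function names and the sum-term sizes 2, 3 and 3, 3 are illustrative choices, not taken from the paper.

```python
from math import prod

def ct_cost(num_sum_terms):
    """Assumed gate count of a CT: one OR gate per sum term plus one AND gate."""
    return num_sum_terms + 1

def relative_cost(sum_term_sizes):
    """Cost divided by the number of product terms obtained by expansion.

    sum_term_sizes lists the number of literals in each sum term.
    """
    covered = prod(sum_term_sizes)  # product terms after multiplying out
    return ct_cost(len(sum_term_sizes)) / covered

# Two CTs with two sum terms each: both cost three gates.
six_terms = relative_cost([2, 3])   # covers 2 * 3 = 6 product terms
nine_terms = relative_cost([3, 3])  # covers 3 * 3 = 9 product terms
print(f"{six_terms:.2f} {nine_terms:.2f}")  # prints "0.50 0.33"
```

Because the fanin is unlimited, widening a sum term raises the number of covered product terms while the cost stays fixed, which is why the second CT has the lower relative cost.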
Sharing of OR gates corresponding to the sum terms of an SCT is an effective way to lower the cost of a network. Let be an SCT. Although the same sum term appears twice in the SCT, only one OR gate is required to realize it. Therefore, requires five gates to realize as an OR-AND-OR network. However, an AND-OR network for the function represented by requires eight gates. Similarly, any sum term can be realized by using only one OR gate irrespective of the number of times it appears in an SCT. Example 2 also shows the effectiveness of shared OR gates.
When the sum terms contain only two literals, sharing of OR gates corresponding to them is especially important to lower the cost of a network. Let be an SOP. We can rewrite it into an SCT , which is composed of two unisum CTs. In the realization, requires five gates, whereas requires four gates because only one shared OR gate is sufficient to realize the sum terms. If we cannot share the OR gate, the networks for and require the same number of gates. Therefore, shared OR gates are especially important when an SCT has unisum CTs that contain sum terms with only two literals.
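The effect of sharing OR gates on the gate count can be sketched as follows. The representation (a CT as a pair of a product part and a list of sum terms) and the concrete literals are hypothetical; the instance was chosen only so that its gate counts agree with the five-gate and four-gate figures quoted above for a single-output function.

```python
def sop_cost(num_product_terms):
    """AND-OR network: one AND gate per product term plus one output OR gate."""
    return num_product_terms + 1

def sct_cost(cts, share_or_gates=True):
    """OR-AND-OR network for one output.

    cts is a list of (product_part, sum_terms) pairs, where sum_terms is a
    list of frozensets of literals. Each CT needs one AND gate; each sum
    term needs one OR gate, counted once if sharing is enabled.
    """
    all_sums = [s for _, sums in cts for s in sums]
    or_gates = len(set(all_sums)) if share_or_gates else len(all_sums)
    return len(cts) + or_gates + 1  # +1 for the output OR gate

# Hypothetical SCT a(x + y) + b(x + y), i.e. the SOP ax + ay + bx + by.
xy = frozenset({"x", "y"})
sct = [(frozenset({"a"}), [xy]), (frozenset({"b"}), [xy])]

print(sop_cost(4))                         # 5 gates for the SOP
print(sct_cost(sct))                       # 4 gates with the OR gate shared
print(sct_cost(sct, share_or_gates=False)) # 5 gates without sharing
```

With two-literal sum terms, an unshared unisum CT replaces two AND gates by one OR gate and one AND gate, so the saving appears only once the OR gate is shared.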
Multisum CTs are often more effective in lowering the cost of a network, even if the number of literals in the sum terms is as few as two. Let be an SOP. It can be rewritten into an SCT . The networks corresponding to and require six and five gates, respectively. For this small example, we saved one gate by using an OR-AND-OR network instead of an AND-OR network. However, if the number of literals in the sum terms of multisum CTs is greater than two, the reduction in the cost of a network is often significant.
IV. GENERATION OF CTS
This section develops a systematic method for generating nontrivial CTs from an SOP. In this paper, "combining product terms or CTs" means "generating a new CT from the disjunction of product terms or CTs."
A. Algebraic Method
First, we present a method that combines a set of CTs into a new CT. The method, which we call the algebraic method, is a subset of the algebraic factorization techniques used in multilevel synthesis [11] , [14] .
Definition 9: The algebraic method forms a CT by combining a set of CTs, where is the common term of the CTs in , and ( ) is a literal. Example 7: By using the algebraic method, we can combine CTs , , , and , whose common term is , into CT . Selecting CTs from a set , such that the selected CTs form a new CT by using the algebraic method, is a nontrivial task, especially when both and are large. However, an efficient algorithm for the problem can be developed based on the following theorem.
Theorem 1: Let be a set of distinct CTs such that forms a CT by using the algebraic method for each value of ( ) and that s are equal for each value of ( ), where represents the common term of and . Then, forms a CT.
Proof: Let a subset of be the variables on which ( ) depends, and let be the common term of and for each value of ( ). Since forms a CT where is the common term, we must have and , such that , where and are literals of the variables and ( ), respectively. Therefore, is functionally equivalent to , which produces . On the other hand, no two elements in are the same because, according to the statement of the theorem, the CTs in are distinct. As a result, all of the elements in are also distinct. Thus, no two literals in the sum term are the same. Therefore, is a CT obtained by using the algebraic method.
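A restricted form of the algebraic method can be sketched in Python. The sketch handles only product terms (trivial CTs) rather than general CTs, and it models a product term as a set of literal names; both the function name and this representation are assumptions made for illustration. A set of distinct product terms is combined when, after the common term is removed, exactly one literal remains in each.

```python
def algebraic_combine(product_terms):
    """Combine product terms c*x1, c*x2, ... into the CT c*(x1 + x2 + ...).

    Returns (common_term, sum_term) as frozensets of literals, or None if
    the terms cannot be combined in this restricted form.
    """
    terms = [frozenset(t) for t in product_terms]
    if len(terms) < 2 or len(set(terms)) != len(terms):
        return None  # need at least two distinct product terms
    common = frozenset.intersection(*terms)
    rest = [t - common for t in terms]
    if any(len(r) != 1 for r in rest):
        return None  # each term must leave exactly one literal
    return common, frozenset().union(*rest)

# Hypothetical input: abx + aby + abz  ->  ab(x + y + z)
result = algebraic_combine([{"a", "b", "x"}, {"a", "b", "y"}, {"a", "b", "z"}])
print(result == (frozenset({"a", "b"}), frozenset({"x", "y", "z"})))  # True

# abx + cy has no single-literal remainders, so no CT is formed.
print(algebraic_combine([{"a", "b", "x"}, {"c", "y"}]))  # None
```

Since the input terms are distinct and share the same common term, the remaining literals are automatically distinct, which mirrors the argument used in the proof.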
Definition 10: Let a set of CTs be used to form a CT by using the algebraic method. Then, any CT formed from a subset of CTs in by using the same method is said to be a sub-CT of .
Example 8: In Example 7, we can also combine any two or three of the original CTs to generate sub-CTs of , such as , , etc.
A CT that has sum terms with as many literals as possible is desirable, because it can cover more product terms without increasing its cost (Section III). However, in Example 8, we have generated sub-CTs, whose sum terms do not have the largest possible number of literals. Then, why do we need sub-CTs? The following example shows how they help lower the cost of a network.
Example 9: The sum terms of the SCT contain as many literals as possible. If we generate two sub-CTs and for , the given SCT can be rewritten as , which can be used to generate the simplified SCT . The cost of the given SCT is eight, whereas seven is the cost of the simplified SCT obtained by generating sub-CTs.
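Enumerating sub-CTs of a unisum CT amounts to enumerating subsets of the literals in its sum term. The sketch below is one possible way to do this; it treats only proper subsets of size two or more as sub-CTs, which is an interpretation rather than something the text states. The default bounds (at most five combined CTs, at most 1,000 results) follow the limits the paper mentions for its algebraic algorithm, while the function name and data layout are hypothetical.

```python
from itertools import combinations, islice

def sub_cts(common, literals, max_size=5, limit=1000):
    """List sub-CTs common*(subset of literals) for a unisum CT.

    Subsets have at least two literals and fewer than all of them; the
    enumeration is capped by max_size and limit to bound the run time.
    """
    ordered = sorted(literals)  # fixed order for reproducible output

    def generate():
        largest = min(max_size, len(ordered) - 1)
        for size in range(2, largest + 1):
            for subset in combinations(ordered, size):
                yield common, frozenset(subset)

    return list(islice(generate(), limit))

# Hypothetical CT c(w + x + y + z): sub-CTs use two or three of the literals.
found = sub_cts(frozenset({"c"}), {"w", "x", "y", "z"})
print(len(found))  # 6 pairs + 4 triples = 10
```

The number of subsets grows exponentially with the size of the sum term, which is why some cap on the enumeration is needed in practice.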
Definition 11: Let be the number of literals in the sum term . Let , where ( ) is the ordered tuple of the sum terms of the CT , such that when . Two CTs and are said to have the same order if and .
Property 1: A set of CTs cannot be combined into a new CT by using the algebraic method if their orders are not the same.
Property 2: The algebraic method cannot be applied on a set of CTs if they do not have product parts.
Based on the above discussion, we developed an algorithm for generating CTs by using the algebraic method. The algorithm receives two sets of CTs. Let them be and . The CTs in have the same order; they are used to generate new CTs and, to do so, the algorithm finds subsets of the CTs in such that each subset forms a new CT. The CTs in are those already generated by any of the methods in Sections IV-A-C; they are used to prevent the same CT from being generated more than once. In addition, also serves as the accumulator for the newly generated CTs. The outline of the algorithm is as follows.
, but limit the total number of CTs generated from to 1,000. (Sub-CTs are generated by combining at most five CTs to save computation time.) Add all of the newly generated CTs at this step to .
Sometimes, the CTs generated by using Algorithm 1 can be used to generate more CTs by using the same algorithm. The process can be repeated until no more CTs can be generated. The following example shows how the algorithm can be used twice to combine a set of product terms into a multisum CT.
Example 10: Let , , , , and be the product terms, which are used as the input for Algorithm 1 to generate a set of CTs, three of which are , , and . When , , and are used as the input for Algorithm 1, they form the CT .
B. Boolean Method
In addition to the algebraic method, we use a Boolean method to generate CTs. The method generates only OR-AND logical expressions and is a subset of the Boolean factorization techniques used in multilevel synthesis [11] , [14] . In the Boolean method, a pair of CTs can be combined into a new CT according to the following three rules of Boolean algebra.
Rule A: , where is a common term, and , , and can be either a literal or a sum term. Depending on whether , , and are either a literal or a sum term, we can have six possible cases where the rule is applicable.
Example 11: By using Rule A, we can combine and into . The original CT pair requires five gates to realize, whereas the resulting CT requires only three gates. By using the same rule, we can form by combining and . Similarly, we can combine and into . In the last case, the new CT requires one more gate than the original CTs. However, we will show shortly that such CTs can also be useful.
Rule B: , where is a common term, and and are literals.
Rule C: , where is a common term, and , , and are literals.
We note that Rule B is a special case of Rule C.
We have argued in Section III that CTs with a relative cost less than 1.00 are usually more useful than other CTs, but the relative costs of CTs generated by using the Boolean method are sometimes more than 1.00, because the relative cost of a new CT is sometimes higher than the relative cost of the CTs from which it is generated. However, as the following two examples show, such CTs can help lower the cost of a network or lead to CTs whose relative cost is less than 1.00. This is accomplished by sharing the OR gates corresponding to the sum terms of the CTs in an SCT or by forming some intermediate CTs that can be combined with other CTs to generate new CTs whose relative cost is less than 1.00.
Example 12: Let be an SOP. By using the algebraic method and Rule A, we can combine , , and into . The relative cost of the new CT or any of the product terms from which it is generated is 1.00. The given SOP can be rewritten as an SCT . In spite of having CTs with relative costs not less than 1.00, a realization of the SCT requires one fewer gate than is required to realize the given SOP.
Example 13: By using Rule A, we can combine and into , and into , and and into . The relative cost of any of the new CTs is 1.50. However, we can combine them into , whose relative cost is 0.67, in spite of having intermediate CTs with a relative cost of 1.50.
C. Merging Method
A new CT can be formed by merging the sum terms of a pair of CTs, if they have the same order and only differ in one sum term. A CT generated by using this method, which we subsequently call the merging method, has a lower relative cost than the relative cost of any of the CTs from which it is generated. Some of the CTs generated by using the merging method cannot be generated by using the algebraic or Boolean methods alone. The following example shows how the method works.
Example 14: By merging the sum terms and , we can combine and into . The relative cost of or is 0.75; however, has a lower relative cost, which is 0.50.
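One plausible reading of the merging method is that the two differing sum terms are replaced by their union, which is valid by the distributive law: S(A) + S(B) = S(A + B). The sketch below implements that reading; it is an assumption, as are the data layout (a CT as a product part plus a frozenset of sum terms), the simplified notion of order (sorted sum-term sizes only), and the concrete literals. The instance was chosen so that the relative costs match the 0.75 and 0.50 figures of Example 14.

```python
def order(ct):
    """Assumed order of a CT: the sorted sizes of its sum terms."""
    _, sum_terms = ct
    return tuple(sorted(len(s) for s in sum_terms))

def merge(ct1, ct2):
    """Merge two CTs that have equal product parts, the same order, and
    differ in exactly one sum term; return None otherwise."""
    p1, s1 = ct1
    p2, s2 = ct2
    if p1 != p2 or order(ct1) != order(ct2):
        return None
    only1, only2 = s1 - s2, s2 - s1
    if len(only1) != 1 or len(only2) != 1:
        return None
    merged_sum = next(iter(only1)) | next(iter(only2))
    return p1, (s1 & s2) | {merged_sum}

# Hypothetical CTs (a + b)(c + d) and (a + b)(c + e), each covering 4 terms.
ab, cd, ce = frozenset("ab"), frozenset("cd"), frozenset("ce")
ct1 = (frozenset(), frozenset({ab, cd}))
ct2 = (frozenset(), frozenset({ab, ce}))
merged = merge(ct1, ct2)
print(merged == (frozenset(), frozenset({ab, frozenset("cde")})))  # True
```

Under the cost model used in this paper (one gate per sum term plus one AND gate), each input CT costs three gates for four product terms, and the merged CT (a + b)(c + d + e) costs three gates for six product terms.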
It is often necessary to apply more than one method or rule to combine a set of CTs into a new CT. The following example uses the merging method and all of the rules of the Boolean method to combine four product terms into a CT.
Example 15: Let , , , and . We can combine and into by using Rule B, and and into by using Rule C. By using the merging method, we can combine and into . By using Rule A, we can combine and into . The relative costs of , , , and are 1.50, 1.50, 1.00, and 0.75, respectively.
D. CTs for Multiple-Output Functions
The methods presented in Sections IV-A-C for generating CTs are applicable to a set of product terms without any consideration of how the individual product terms are shared by different outputs of a multiple-output function. A CT formed from a set of product terms can only be useful if they share the same set of outputs. Therefore, we partition the given product terms into a set of blocks, which we subsequently refer to as product blocks, such that the product terms which are shared by the same set of outputs become members of the same product block. Then, we generate CTs by combining the product terms that belong to each individual product block.
Example 16: Let the SOPs , , and generate the outputs , , and , respectively, where ( ) represents a product term. We can partition the product terms into three product blocks , , and . The product terms in , , and are used by the outputs in , , and , respectively. Let , , , , and . Sets of nontrivial CTs generated from the product terms in , , and are , and , respectively. Therefore, sets of CTs corresponding to the product terms in , , and are , , and , respectively.
In this paper, unless otherwise explicitly noted, an SOP for an -output function represents a set of SOPs for single-output functions; the same is also true for an SCT. To lower the cost of realizations of such expressions, gates are shared where possible.
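The partition into product blocks can be sketched as a grouping of product terms by the set of outputs that use them. The function name, the output names, and the product-term labels below are hypothetical and do not reproduce Example 16.

```python
from collections import defaultdict

def product_blocks(sops):
    """Group product terms by the exact set of outputs that share them.

    sops maps an output name to the product terms of its SOP. The result
    maps each frozenset of output names to the list of product terms that
    are used by exactly those outputs.
    """
    outputs_of = defaultdict(set)
    for output, terms in sops.items():
        for term in terms:
            outputs_of[term].add(output)

    blocks = defaultdict(list)
    for term, outputs in outputs_of.items():
        blocks[frozenset(outputs)].append(term)
    return dict(blocks)

# Hypothetical three-output function with product terms p1..p4.
sops = {
    "f0": ["p1", "p2", "p3"],
    "f1": ["p1", "p2", "p4"],
    "f2": ["p3", "p4"],
}
blocks = product_blocks(sops)
print(blocks[frozenset({"f0", "f1"})])  # ['p1', 'p2']
```

CT generation would then be run separately on each block, so that every generated CT can be shared by all outputs of its block.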
E. Algorithm for Generating CTs
By using the techniques discussed in Sections IV-A-D, we developed the following algorithm to generate a set of CTs from a given SOP.
Algorithm 2 (Generate CTs: All Methods):
1) Let be the given SOP. Partition the product terms of into a set of product blocks (Section IV-D). ( is the input, and and are the outputs of the algorithm, such that ( ) represents the set of CTs generated from the product terms in .)
2) For each ( ), place elements of into , set to 0, and execute steps 3 to 5.
The time complexity of Algorithm 2, which depends on Algorithm 1, is exponential. In step 3(b) of Algorithm 1, we generate only a subset of the total CTs to control the exponential growth in computation time.
V. SELECTION OF CTS
In this section, we present an algorithm for selecting a subset of the generated CTs; its objective is to reduce the number of gates in a realization of the SCT formed from the selected CTs.
The problem of selecting a subset of the CTs can be stated as follows. Let be the set of product terms of a given SOP. Let be the set of CTs that is generated from , such that ( ) covers a set of product terms, where ( ). The objective is to select a subset of , such that the SCT formed from the CTs in requires the fewest gates and each element of is covered by at least one CT in . The problem, which we refer to as the CT-selection problem, is similar to the set-covering problem, which is known to be NP-complete [9]. Therefore, a polynomial-time algorithm that obtains a minimum-cost solution to the problem is highly unlikely to exist. In this section, we present a greedy heuristic algorithm that produces an acceptable solution in polynomial time. The algorithm selects the best CTs one after another. Up to this point, we have primarily used the relative cost of a CT to determine its quality. However, once a CT is selected, the next best CT is chosen mostly based on the uncovered relative cost, which is defined as follows.
The greedy algorithm begins by selecting a CT that has the lowest uncovered relative cost, while selecting a CT that has the largest number of uncovered sum terms in case of a tie. The frequency of occurrences of the sum terms in the CTs and a random method are also used to break any further ties. The process of selecting CTs is repeated until all of the product terms are covered by the selected CTs. During the selection process, information gathered from the already selected CTs is used to compute the uncovered relative cost and the number of uncovered sum terms for the unselected CTs. Before the beginning of the repeated greedy selection, the algorithm selects trivial CTs that are not covered by any other CT. The concepts of relative costs and uncovered relative costs provide a realistic measure of the cost of CTs in the greedy selection algorithm; small changes in their definitions often increase the cost of the final solution.
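The greedy selection described above can be sketched in Python under stated assumptions. Here the uncovered relative cost of a CT is taken to be (one AND gate plus the number of its sum terms not yet realized by a selected CT) divided by the number of its product terms that are still uncovered; this formula, the tuple layout of a CT, and the example data are assumptions for illustration. Further ties are broken by list order rather than by sum-term frequency or randomly.

```python
def greedy_select(product_terms, cts):
    """Select CTs until every product term is covered.

    Each CT is a pair (sum_terms, covered): a frozenset of sum terms and a
    frozenset of the product terms it covers. Trivial CTs have no sum terms.
    """
    uncovered = set(product_terms)
    built_sums = set()  # sum terms whose OR gates already exist
    selected = []

    def take(ct):
        selected.append(ct)
        built_sums.update(ct[0])
        uncovered.difference_update(ct[1])

    # First, select trivial CTs that no other CT covers.
    for ct in cts:
        sums, covered = ct
        if not sums and not any(covered <= o[1] for o in cts if o is not ct):
            take(ct)

    while uncovered:
        best, best_key = None, None
        for ct in cts:
            sums, covered = ct
            gain = len(covered & uncovered)
            if gain == 0:
                continue
            new_sums = len(sums - built_sums)
            # Lowest assumed uncovered relative cost first; then prefer
            # the CT with more uncovered sum terms.
            key = ((1 + new_sums) / gain, -new_sums)
            if best_key is None or key < best_key:
                best, best_key = ct, key
        if best is None:
            raise ValueError("a product term is covered by no CT")
        take(best)
    return selected

# Hypothetical SOP ax + ay + az + bx + by + c with two nontrivial CTs.
terms = ["ax", "ay", "az", "bx", "by", "c"]
trivial = [(frozenset(), frozenset({t})) for t in terms]
ct_a = (frozenset({frozenset("xyz")}), frozenset({"ax", "ay", "az"}))
ct_b = (frozenset({frozenset("xy")}), frozenset({"bx", "by"}))
chosen = greedy_select(terms, trivial + [ct_a, ct_b])
print(len(chosen))  # 3: the trivial CT for c, then a(x+y+z), then b(x+y)
```

Each pass of the loop scans all CTs and covers at least one new product term, so the number of passes is bounded by the number of product terms and the sketch runs in polynomial time.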
Algorithm 3 runs in polynomial time in and , where is the set of product terms of the given SOP, and is the set of CTs generated from .
Step 2 of the algorithm requires constant time, because the set of trivial CTs that are not covered by any other CTs is stored in TRIMIN's data structure when CTs are generated. TRIMIN's data structure also stores for each CT for easy computation of their uncovered relative cost. The number of iterations of the loop between steps 3 and 5 is at most , and the loop body runs in time . Therefore, Algorithm 3 runs in time .
The following example shows how an SCT for each individual output of a multiple-output function is generated from a set of selected CTs.
Example 18: In Example 16, we generated sets of CTs. The remaining task is to select CTs from , such that each product term in is covered by at least one selected CT. We select , , and , because the SCTs formed from them have a lower cost. Since is generated from the product terms in and the product terms in are shared by the outputs in , is one of the CTs that form SCTs for and . Similar arguments can be made for and . Thus, we have SCTs , , and for the outputs , , and , respectively. The total cost of the SCTs is seven, whereas that of the original SOPs is 11.
Therefore, the TRIMIN algorithm can be summarized as follows.
Algorithm 4 (TRIMIN: Simplify SCT):
1) Generate CTs from the given SOP (Algorithm 2).
2) Select CTs that cover the product terms of the given SOP (Algorithm 3).
3) Generate an SCT from the selected CTs (Example 18).
By regulating the generation of CTs in Algorithms 1 and 2, TRIMIN can control the computation time and the quality of the solutions; this feature is useful in the CPLD design environment. Usually, the more CTs TRIMIN generates, the better solution it finds and the longer computation time it requires. To regulate the generation of CTs, TRIMIN changes the number of CTs it uses to form a new CT, the number of sub-CTs it generates for each CT, and the number of rules of the Boolean method it uses.
VI. EXPERIMENTAL RESULTS
We implemented TRIMIN in the C language and carried out experiments on a Sun Enterprise 420R Server, which has a SPECint95 rating of 19.7, by using the Berkeley and Microelectronics Center of North Carolina benchmark functions [4], [35]. We used Espresso-MV [26] to obtain near-minimum SOPs that served as the input for TRIMIN. The computation time that we reported for TRIMIN (Algorithm 4) excludes the time needed to prepare the initial SOPs. When TRIMIN needed less than 5 ms, we reported it as 0 s. For space reasons, tables for experimental results include only the benchmark functions for which TRIMIN gives an appreciable improvement over the starting SOPs.
Table I shows the total number of gates required by AND-OR and OR-AND-OR networks. We used TRIMIN to design OR-AND-OR networks by generating two sets of CTs and compared the total number of CTs generated and the total number of gates and the CPU seconds required to design them. The CTs in one set were generated by using only the algebraic method, while those in the other set were generated by using a combination of the algebraic, Boolean, and merging methods. The results show that the combined method is as good as or better than the algebraic method in lowering the cost of a network for all of the functions except b2. Although TRIMIN works in two steps, namely to generate CTs and select CTs, only an insignificant portion of the total computation time is required for selecting CTs. For example, when the combined method is used to generate CTs for apex2, TRIMIN required 0.02 s (about 0.6% of the total computation time) for selecting CTs.
We counted a gate even if its fanin is only one. For example, it appears that the expressions and require a total of four gates, but we counted them as six. This is because CPLDs require circuitry for an AND gate to realize any product term and use circuitry for an output irrespective of how many product terms or CTs are used to form it.
Table II compares the number of gates required to design OR-AND-OR networks by using TRIMIN with the number required by Sasao's algorithm [30]. It shows that, depending on the benchmark function, either algorithm can outperform the other in terms of the quality of the solution; however, TRIMIN is often an order of magnitude faster, even if the Espresso-MV time is included. We note that the performance of our computer is an order of magnitude better than that of the computer used to generate Sasao's results [30]. The longer computation time required by Sasao's algorithm is due to the fact that the algorithm is based on multiple-valued minimization techniques [31], where different variables can have different values, and such an algorithm has greater computational complexity than TRIMIN. To design the OR-AND-OR networks for Table II, TRIMIN used the algebraic, Boolean, and merging methods. When only the algebraic method was used, the networks for apex5 and ryy6 required 472 and 10 gates, respectively, but TRIMIN took only 0.93 and 0.04 s. Fig. 3 shows an OR-AND-OR network for ryy6.
Table III compares the number of gates and the computation time required for designing OR-AND-OR networks by using TRIMIN with those required for designing AND-OR-AND networks by using Dubrova and Ellervee's algorithm [12]. Although both types of networks in Table III have three levels, they are structurally different. All of the gates in an OR-AND-OR network considered by TRIMIN have unlimited fanin; however, the AND-OR-AND network considered in [12] realizes the logical AND of two SOPs, i.e., the AND gate near the output of the network has only two inputs.
The number of product terms required to form the two SOPs that realize the AND-OR-AND networks for the benchmark functions shown in Table III is reported in [12]. In addition to the AND gates needed to realize the product terms, the AND-OR-AND network also requires three additional gates (two OR and one AND) for each output. Under the assumption that a product term with only one literal also requires one AND gate, which is consistent with the assumption that we made for OR-AND-OR networks, we computed the number of gates that an AND-OR-AND network requires. The computation can be explained by using alu3 in Table III. Since the two SOPs that realize the AND-OR-AND network require 47 product terms [12], we need 47 AND gates to realize them; in addition, we need 16 OR gates and eight AND gates because the network has eight outputs; thus, 71 gates are necessary to realize alu3 in the AND-OR-AND form. Table III shows that the OR-AND-OR networks require fewer gates and a shorter computation time for the given set of benchmark functions. Since the computation times reported in [12] may include the time taken to prepare the input data, which must be in SOP form, the reader should compare the times required to design AND-OR-AND networks with the sum of the TRIMIN and Espresso-MV times shown in Table III.
Table IV compares the performance of TRIMIN with that of Jabir and Saul's algorithm [16] for designing AND-OR-XOR networks, which realize the logical XOR of two SOPs. The discussion regarding AND-OR-AND networks in the preceding paragraph also applies to AND-OR-XOR networks, except that the latter use a two-input XOR gate near the output. A simplified AND-XOR expression is used as the input for the AND-OR-XOR minimizer, which excluded the computation time of such expressions from its reported time. Table IV shows that OR-AND-OR networks often require fewer gates and a shorter computation time.
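The gate count for alu3 in the AND-OR-AND form can be checked with a short sketch. The function name and its parameters are illustrative and are not taken from [12] or from TRIMIN; the sketch only encodes the counting convention stated in the text, namely one AND gate per product term plus two OR gates and one AND gate per output.

```python
def and_or_and_gate_count(product_terms: int, outputs: int) -> int:
    """Gates in an AND-OR-AND network under the counting convention in the text.

    Every product term costs one AND gate (even a single-literal term), and
    each output adds two OR gates plus one two-input AND gate.
    """
    if product_terms < 0 or outputs < 0:
        raise ValueError("counts must be non-negative")
    and_gates_for_terms = product_terms   # one AND gate per product term
    or_gates = 2 * outputs                # one OR gate per SOP, two SOPs per output
    output_and_gates = outputs            # the two-input AND gate near each output
    return and_gates_for_terms + or_gates + output_and_gates


# alu3: 47 product terms and eight outputs, as stated in the text
print(and_or_and_gate_count(47, 8))  # 47 + 16 + 8 = 71
```

Since the text says the same discussion applies to AND-OR-XOR networks, with a two-input XOR gate taking the place of the output AND gate, the per-output overhead there is presumably also three gates.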
Although algorithms for designing XOR-AND-OR networks are available [8], [32], their comparison with OR-AND-OR networks is impractical because the total costs for realizing benchmark functions in the XOR-AND-OR form cannot be computed from the published data. A comparison of TRIMIN with exact minimization algorithms [18], [22], [33] for OR-AND-OR networks is also impossible because no benchmark results for the exact algorithms have been published.
We developed a technology mapping algorithm for Altera's MAX 9000 CPLDs [3], whose macrocells can be considered as (36,1,4,1)-PLAs (Section I). The algorithm generates a netlist in terms of MAX 9000 macrocells from an OR-AND-OR network that is obtained by using TRIMIN. Table V compares the number of macrocells required by the algorithm with the number required by the MAX+PLUS II software (version 9.64) [3]. Since the computation times for TRIMIN and MAX+PLUS II are similar, they are not listed in Table V. We note that our mapping algorithm takes an insignificant amount of time.
The algorithm for mapping an OR-AND-OR network into macrocells can be explained by using the network in Fig. 3 for ryy6: four of its OR gates at the input side require four macrocells to implement, and the rest of the network can be implemented in one macrocell; thus, five macrocells are required to implement the entire network (Table V). Similarly, three macrocells are required to implement the network in Fig. 2. A MAX 9000 macrocell [3] has circuitry to implement up to five product terms; when a function requires more than five product terms, we use a tree of macrocells.
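The macrocell count described above can be approximated with a simplified model. This is a sketch under assumptions that the text does not confirm: every input-side OR gate occupies one macrocell, and in a tree each macrocell accepts up to five inputs, where an input is either a product term or the output of another macrocell. The function names are hypothetical, and the actual mapping algorithm may pack the network differently.

```python
import math

TERMS_PER_MACROCELL = 5  # "up to five product terms" per macrocell, from the text


def macrocells_for_output(product_terms: int) -> int:
    """Macrocells needed for the AND-OR part of one output.

    Assumption: in a tree, each macrocell accepts up to five inputs, and an
    input is either a product term or the output of another macrocell.
    """
    if product_terms < 1:
        raise ValueError("an output needs at least one product term")
    if product_terms <= TERMS_PER_MACROCELL:
        return 1
    # Every macrocell after the first spends one input on the output of
    # another macrocell, so it contributes only four new product terms.
    return math.ceil((product_terms - 1) / (TERMS_PER_MACROCELL - 1))


def macrocell_count(input_or_gates: int, terms_per_output: list) -> int:
    """Total macrocells; assumes one macrocell per input-side OR gate."""
    if input_or_gates < 0:
        raise ValueError("the number of OR gates must be non-negative")
    return input_or_gates + sum(macrocells_for_output(p) for p in terms_per_output)


# ryy6-like case: four input-side OR gates and one output whose remaining
# logic fits in one macrocell (the term count 5 is a hypothetical value).
print(macrocell_count(4, [5]))  # 4 + 1 = 5
```

Under this model, an output with six to nine product terms needs two macrocells and one with ten needs three; these figures follow from the assumed tree structure rather than from the published results.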
To calculate the number of macrocells that the MAX+PLUS II required, we used the following data from its report file: logic cells used ( ), expanders used ( ), and expanders not available ( ). The number of macrocells required is if ; otherwise, it is . This is evident from the fact that MAX+PLUS II used expanders, whereas it had expanders available from the already used macrocells; thus, when , it did not need any additional macrocells, but when it needed additional macrocells. We note that the MAX+PLUS II performs multilevel logic synthesis and takes advantage of the XOR gates that are available in the macrocells [3] , whereas our mapping algorithm disregards XOR gates and considers macrocells as (36,1,4,1)-PLAs. In spite of that, the mapping algorithm based on OR-AND-OR networks generated significantly better results for many benchmark functions.
VII. CONCLUSION AND COMMENTS
In many CPLD design environments, fast algorithms for synthesizing OR-AND-OR networks are useful; with this in mind, we developed TRIMIN. Our experimental results demonstrate that TRIMIN is fast and that it generates OR-AND-OR networks which often require far fewer gates than AND-OR networks. A comparison of Sasao's algorithm and TRIMIN shows that the quality of the solutions and the computation time can differ significantly when different approaches are used to design OR-AND-OR networks. We note that there are functions for which TRIMIN is unable to obtain highly optimized solutions, but requires a short computation time; in the CPLD design environment, such solutions are often useful because it is only necessary to find networks that are small enough to fit in the LABs. When the quality of the solution is more important than the computation time, TRIMIN can be used in conjunction with other algorithms that explore a larger search space but may require much longer computation time.
TRIMIN is complementary to the multilevel synthesis algorithms; their objectives are different. TRIMIN optimizes the number of gates in three-level OR-AND-OR networks whose gates can have unlimited fanin, while multilevel synthesis algorithms usually optimize the number of literals in logical expressions whose realizations can have unlimited levels. Three-level networks generated by TRIMIN can often be area-efficiently mapped in CPLDs with logic expanders. Our experimental results for Altera's MAX 9000 CPLDs demonstrate that the OR-AND-OR synthesis produces networks that can be mapped in fewer macrocells than the networks produced by multilevel synthesis techniques for many benchmark functions.
Several aspects of TRIMIN, as well as OR-AND-OR synthesis methods in general, deserve further consideration. At present, TRIMIN uses simplified SOPs as its input; therefore, it is applicable only to functions whose SOPs have a manageable size. Investigations are underway for adapting TRIMIN to large multilevel netlists, which can be realized as several interconnected OR-AND-OR networks. The approach is promising for implementation in CPLDs, because the macrocells and LABs in CPLDs have limited resources that can implement OR-AND-OR networks only up to a certain size. Other interesting problems are the synthesis of OR-AND-OR networks from multilevel netlists without generating the intermediate SOP representation, and the development of novel techniques for generating and covering CTs that can lead to exact minimization algorithms for OR-AND-OR networks.
