In the literature, it is generally overlooked that designers use functional models more frequently than behavioral or gate-level models. In functional modeling, the functionality of one or more components, such as arithmetic/logic units, memories, and counters, is described as separate concurrent blocks. We present an algorithm, called the Functional Synthesis Algorithm (FSA), for synthesis from these functional descriptions. Our algorithm automatically synthesizes the components needed to implement a functional description while minimizing hardware area and delay. Since a functional description uses standard operators in the hardware description language, a mismatch arises between the operators of the language and the functionalities provided by library components. FSA solves this functionality mismatch problem by pattern matching of the description against a library of function patterns. In addition, FSA clusters functions to maximally match components from a given library. Experimental results show that automated functional synthesis produces designs that are comparable to those produced by human designers.
INTRODUCTION
A design can be modeled at the behavioral or at the register-transfer (RTL) description level. The former corresponds to an algorithmic specification that describes the behavior of a complete system as a set of sequential actions to be executed over time, i.e., it follows the conventional programming language style. The latter corresponds to a concurrent description that decomposes the design into its major subcomponents and then specifies the functionality of each of these groups using concurrent statements from a hardware description language such as VHDL. A functional specification describes one or more (generic or technology-specific) components, like arithmetic/logic units, memories, and counters, as separate concurrent blocks. In Figure 1, we present examples of these two description styles for the Am2901 microprocessor, which is composed of an ALU, several shifters, a RAM, several registers, and selectors (Figure 1(a)). These components operate concurrently, each controlled by a different substring of the 9-bit control word I. A behavioral description of this design is given in Figure 1(b). It consists of one case statement (with $2^9$ cases) that decodes the control word I. Since the different subcomponents of the design operate independently from one another and concurrently, each branch of the case statement must describe the appropriate actions for all parts of the design. Note that the description is not and cannot be truly sequential, since all statements within a branch execute concurrently.
For the functional description style, we group the subcomponents of the Am2901 structure into sets of related functionalities, e.g., the RAM shift and the RAM are in one group (Figure 1(a)). The functionality of each group is described by a block of concurrent statements (Figure 1(c)). In this description: (1) the different parts of the design are explicitly described as operating concurrently, (2) the functionality of each subpart of the design is specified independently from all others, and (3) each block statement corresponds to a different subpart of the design. Clearly, this is a more compact and natural description of the Am2901.
As demonstrated above, some designs are more easily described by a functional rather than by a behavioral description style. Furthermore, designers use functional descriptions since they are more accustomed to the RTL level than to the algorithmic level of design. They first determine the major components that compose the overall design, and then give a functional description of each of these subsystems. These subsystems can then be synthesized separately, which decreases the complexity of the synthesis problem. In fact, the smaller scope of functional synthesis allows us to find optimal solutions in most cases, even when using algorithms that are not necessarily suitable for high-level synthesis. Given that both functional and behavioral abstractions are appropriate for describing certain types of designs, a production-quality synthesis system must handle both in order to be a useful tool. Figure 2 shows the relationship between high-level [15, 9, 12, 20], functional, and logic-level synthesis [1, 6]. High-level synthesis maps a behavioral description of the desired system to an RTL structure of generic components [15, 20]. Functional synthesis synthesizes a functional description of one or possibly several RTL components to component(s) from a given (technology-specific) library. It can thus work in synergy with behavioral synthesis by mapping (functional descriptions of) RTL designs produced by the latter onto actual hardware [3]. Lastly, logic synthesis corresponds to automated design and logic optimization at the gate level.
As can be seen in Figure 2, functional synthesis bridges the gap between behavioral and logic-level synthesis by addressing the technology mapping problem at the RTL level. Previous work has focused on translating a generic RTL structure into a technology-specific one [8, 3]. These approaches were structure-based, i.e., they work on structural netlists that capture designs as a set of black boxes and their interconnections rather than on functional descriptions. Functional synthesis is an alternative approach to RTL technology mapping that can accomplish more complex design trade-offs, such as resource sharing based on the mutual exclusiveness of functionalities of the design. A description at the functional level is also a good basis for the redesign of datapaths, particularly if some but not all components of the design are to be replaced with a newer technology.
The use of a hardware description language such as VHDL for functional descriptions complicates the problem of functional synthesis, since the language provides the modeler with several different constructs to describe the same functionality. Also, a single component may be described by several, often nested, conditional statements. These statements need to be grouped in order to map them to the same component. What makes mapping even more difficult is that standard operators in the language may not match the functions supported by the target library components. In this paper, we present a functional synthesis algorithm (FSA) which addresses these problems. FSA is composed of two parts: functionality recognition and component mapping. The first addresses the mismatch between the standard language operators used to specify a functional description and the functions supported by RTL library components. The second solves the problem that functional descriptions use multiple statements to describe the functionality of a single component. FSA also exploits the mutual exclusiveness of operators for unit sharing in order to minimize hardware area and delay.
Related research is presented in Section 2. The functional synthesis problem is formalized and our solution is outlined in Sections 3 ... [20, 15]. EMERALD [20] formulates allocation as a clique partitioning problem, but it can handle only straight-line blocks of assignment statements. Also, operators are grouped into sets of functions that do not necessarily correspond to the functionalities supported by existing RTL components. In our work, we overcome this problem by prepruning the solution space based on the mergeability information derived from the given component library. FSA uses a more accurate profit function than HAL [15]. In addition, HAL's [15] overall control strategy is greedy, and thus cannot guarantee an optimal solution. Given a fixed resource allocation and a schedule, Splicer [14] minimizes the number of connections between functional units and registers by using a branch-and-bound search with the number of multiplexers as criterion. Splicer uses a heuristic function in place of a proper bounding function, which removes the guarantee of finding an optimal solution. By combining the branch-and-bound methodology with clique partitioning, FSA succeeds in pruning the search space without losing the guarantee of an optimal solution.
In summary, FSA handles nested conditional branches. We do not assume a preallocation of functional units and unit binding as done in [15]. Our cost functions incorporate both interconnection costs (in terms of multiplexers) and control costs (in terms of function select logic). Lastly, due to the smaller scope of functional compared to high-level synthesis problems, FSA finds optimal solutions for most example designs.
Register-Transfer Level Technology Mapping
Previous work on RTL technology mapping has focused on translating a generic RTL structure into a technology-specific one [8, 3]. These approaches work on structural netlists that capture designs as a set of black boxes and their interconnections rather than on functional descriptions. SYNNER [7, 8] takes a localized approach to the problem by selecting a component to implement one data path node at a time based on some local criterion ([8], page 479). Neither internode dependencies nor absolute design goals, such as minimal total area, can be handled by this local optimization strategy ([8], page 480). SYNNER also performs some logic optimization by reducing certain cascaded logic operations into one logic operation, e.g., it replaces two 2-bit ANDs by one 3-bit AND ([8], page 31). In contrast, FSA's library-driven approach represents a general solution to functionality recognition.
Dutt and Kipps [3] address the mapping of generic RTL components (with a fixed set of functions) into technology-specific RTL library cells using a rule-based approach. Functional synthesis, as introduced in this paper, synthesizes an arbitrarily complex functional description into component(s) from a technology-specific library.
Logic Level Synthesis
Technology mapping at the logic level transforms optimized Boolean equations into an interconnection of technology-specific logic elements from a given library of gates [6, 1, 10]. Functional synthesis is concerned with technology mapping at a higher abstraction level of design, namely, functional design. The number of different patterns in functional synthesis is much smaller than in logic synthesis. A Boolean function can be described by many different combinations of logic operators. The functions of current RTL components are rather simple and thus can be described by one statement. Therefore, FSA uses a simple pattern matching and reduction algorithm rather than the sophisticated approaches proposed for logic synthesis [6, 10].
PROBLEM DESCRIPTION
The input for functional synthesis is a functional description of a design written in a hardware description language¹. The input description shown in Figure 3 consists of three concurrent conditional statements. The first and second statements form a nested condition, since the variable tmp1 is produced by the first and consumed by the second. The description is translated by a compiler into the internal flow graph representation² depicted in Figure 4. The functional synthesis problem can be stated using the following graph-theoretic formulation. Let $O = \{op_1, op_2, \ldots, op_n\}$ be a set of operators, and let $U = \{u_1, u_2, \ldots, u_m\}$, the unit table, be a set of functional unit ...

¹ The current prototype of FSA uses VHDL as input hardware description language. However, VHDL can easily be replaced by another language as long as the input compiler is modified accordingly.

² The ECDFG design representation [18] is an extension of the commonly known Control/Data Flow Graph model with powerful constructs such as timing constraints, structural bindings, and asynchronous events.

... Figure 4, $n_1$ and $n_5$ are mutually exclusive because the paths between these two nodes, $\langle n_1, a_3, n_2, a_4, d_1 \rangle$ and $\langle n_1, a_3, n_2, a_4, d_1 \rangle$, end at PORT1($d_1$) and at PORT1($d_1$) of $d_1$, respectively. If the edge $\langle n_5, n_9 \rangle$ were inserted (depicted as a dashed arrow), then $n_1$ and $n_5$ would no longer be mutually exclusive because not all paths from $n_5$ would go through $d_1$.
³ FSA actually utilizes a more general notion of mutual exclusiveness based on conditional expressions rather than on the existence of decision nodes; see the discussion in Section 6.1 and in [16].

An expression tree, $G_i = (V_i, A_i)$, is defined to be a connected subgraph of $G$ consisting only of operator nodes, i.e., $V_i \subseteq N$ and $A_i \subseteq A$. One node is designated as the root, and all paths in $G_i$ are directed from the leaves towards the root. $G_i$ is a complete subgraph of $G$: for all $n_j, n_k \in V_i$, if there is an edge $a_l = \langle n_j, n_k \rangle$ in $A$, then $a_l$ also exists in $A_i$. The function $op: G \to (O \cup \emptyset)$ is a mapping from an expression tree $G_i$ to the operation described by $G_i$.
Example 2. The expression tree $E_1$ in Figure 4 consists of $n_8$, $n_9$, and $n_{10}$, with $n_8$ the root. The function $op: G \to (O \cup \emptyset)$ is a mapping from $E_1$ to the operation described by $E_1$, i.e., $op(E_1) = op_{E_1}$. The expression tree $E_2$ corresponds to a single operator node $n_7$, i.e., $op(E_2) = operation(n_7) =$ "+".

$P$ is defined to be the collection of all partitions $P_i$ of the graph $G$ into subgraphs $G_i$. $M$ is defined to be the collection of all mappings $M_i: P \to 2^U$, with $U$ the unit table. A mapping $M_i$ from a partition $P_i$ of $G$ to a set of functional units from $U$ is defined to be a legal mapping if and only if the following constraints are fulfilled: (1) for all $G_i \in P_i$, $op(G_i) \in functionality(M_i(G_i))$, and (2) each pair of root nodes $n_i$ of $G_i$ and $n_j$ of $G_j$ with $M_i(G_i) = M_i(G_j)$ must be mutually exclusive in $G$. Lastly, the functional synthesis problem is to find a tuple $(P_i, M_i)$, where $P_i \in P$ is a partition of $G$ into subgraphs $G_i$ and $M_i \in M$ is a legal mapping from $P_i$ to a set of functional units from $U$, that minimizes the cost of the resulting design measured by a weighted sum of area and performance [16].
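The two legality constraints can be illustrated with a small check. This is only a sketch under simplifying assumptions that are not part of the original formulation: each subgraph is mapped to a single unit instance rather than to a set of units, and the names `is_legal_mapping`, `subgraphs`, `mapping`, `functionality`, and `mutually_exclusive` are hypothetical.

```python
def is_legal_mapping(subgraphs, mapping, functionality, mutually_exclusive):
    """Check constraints (1) and (2) for a candidate mapping.

    subgraphs:          dict subgraph name -> (root node, operation)
    mapping:            dict subgraph name -> unit instance (assumed single unit)
    functionality:      dict unit instance -> set of supported operations
    mutually_exclusive: set of frozenset({root_a, root_b}) pairs
    """
    # Constraint (1): the unit must support the operation of each subgraph.
    for name, (_, operation) in subgraphs.items():
        if operation not in functionality[mapping[name]]:
            return False
    # Constraint (2): subgraphs sharing a unit need mutually exclusive roots.
    names = sorted(subgraphs)
    for i, g in enumerate(names):
        for h in names[i + 1:]:
            if mapping[g] == mapping[h]:
                roots = frozenset((subgraphs[g][0], subgraphs[h][0]))
                if roots not in mutually_exclusive:
                    return False
    return True


subgraphs = {"G1": ("n1", "+"), "G2": ("n5", "-")}
mapping = {"G1": "u1", "G2": "u1"}
functionality = {"u1": {"+", "-"}}
shared_ok = is_legal_mapping(subgraphs, mapping, functionality,
                             {frozenset(("n1", "n5"))})
shared_bad = is_legal_mapping(subgraphs, mapping, functionality, set())
```

In this toy case, sharing `u1` is legal only when the two root nodes are recorded as mutually exclusive.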
OUR APPROACH TOWARDS FUNCTIONAL SYNTHESIS
In this section, we outline our approach to solving the functional synthesis problem, while more detailed algorithms are presented in later sections. The library-specific information used by FSA is kept in two tables, the functionality table and the unit table. The unit table contains a unique name for ...

[Figure 5.1: pattern matching and reduction algorithm; surviving fragment: else Mark n; end while]
The functionality recognizer uses a pattern matching and reduction algorithm (Figure 5.1). This algorithm matches the function patterns captured in the table $F$ against the graph $G$. It traverses $G$ in a bottom-up manner, such that each operator node $n \in G$ is visited once. For each $n \in G$, the function patterns $P_i \in F$ are matched against the expression tree $G_i \in G$ rooted at $n$. If more than one match is found, then the pattern $P_i$ with the largest cost reduction is selected. For instance, the pattern "A+B+1" will be chosen over "A+1". Once a pattern $P_i$ has been selected, the subgraph structure $G_i$ of $G$ that corresponds to $P_i$ is reduced to one operator node n3 with op(n3) = $op(G_i)$, as shown in Figure 6. The pattern matching is completed in one pass through the graph, and the algorithm's complexity is O((size of G) × (size of the pattern set)).
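The bottom-up matching and reduction described above can be sketched as follows. This is a minimal illustration, not FSA's implementation: expression trees are modeled as nested tuples, pattern variables are strings starting with "?", and the table contents, the names `FUNCTION_TABLE`, `match`, and `reduce_tree`, and the gain values are all assumptions made for the example.

```python
# Expression trees: (operator, child, child); leaves are signal names or constants.
# Hypothetical function table: (function name, pattern, cost reduction).
FUNCTION_TABLE = [
    ("INC", ("+", "?a", 1), 1),                  # A+1
    ("ADD_INC", ("+", ("+", "?a", "?b"), 1), 2)  # A+B+1
]


def match(pattern, tree, bindings):
    """Return True if tree matches pattern, recording variable bindings."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in bindings:
            return bindings[pattern] == tree
        bindings[pattern] = tree
        return True
    if isinstance(pattern, tuple):
        return (isinstance(tree, tuple) and len(tree) == len(pattern)
                and pattern[0] == tree[0]
                and all(match(p, t, bindings)
                        for p, t in zip(pattern[1:], tree[1:])))
    return pattern == tree


def reduce_tree(tree, table=FUNCTION_TABLE):
    """Visit each operator node once, bottom-up, and reduce the best match."""
    if not isinstance(tree, tuple):
        return tree  # leaf: nothing to reduce
    node = (tree[0],) + tuple(reduce_tree(child, table) for child in tree[1:])
    best = None
    for name, pattern, gain in table:
        bindings = {}
        if match(pattern, node, bindings) and (best is None or gain > best[0]):
            best = (gain, name, bindings)
    if best is None:
        return node
    _, name, bindings = best
    # Replace the matched subgraph by one operator node over its operands.
    return (name,) + tuple(bindings[var] for var in sorted(bindings))


reduced = reduce_tree(("+", ("+", "A", "B"), 1))
```

Both patterns match at the root of A+B+1, and the one with the larger assumed gain is selected, mirroring the preference for "A+B+1" over "A+1".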
6 COMPONENT MAPPING
Reformulation of the Component Mapping Problem
Two operator nodes $n_1$ and $n_2$ are defined to be mergeable with respect to a given unit table $U$ if and only if there is a unit $u \in U$ with $functionality(u) \supseteq \{op(n_1), op(n_2)\}$. They are said to be compatible with respect to $U$ if they are both mergeable and mutually exclusive. FSA creates a compatibility graph $CG = (N, E)$ from a flow graph $G = (V, A)$, with $N$ the set of operator nodes from $V$ and $E$ a set of undirected edges, called compatibility edges. There is an edge $e_i = \langle n_j, n_k \rangle$ in CG for each pair of compatible nodes $n_j, n_k \in N$.

Example 3. A flow graph and its corresponding compatibility graph are presented in Figure 7(a) and 7(b), respectively. They share the same set of operator nodes. First, an edge is inserted between each mutually exclusive operator node pair (Section 3). For instance, $n_2$ is connected with $n_7$ because the condition $E_1 = 0$ is mutually exclusive with $E_2 = 1$. Next, we check that the selected node pairs are also mergeable. For the library shown in Figure 7, all dashed lines in Figure 7(b) that connect logic with arithmetic operations indicate non-mergeable pairs, and they have to be removed from the graph. The final CG contains the four solid arcs.
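The construction of the compatibility graph can be sketched in a few lines. The representation below (dictionaries and frozenset pairs) and the names `build_compatibility_graph`, `ops`, `unit_table`, and `mutually_exclusive` are assumptions for illustration; the mutual exclusiveness relation is taken as given rather than derived from a flow graph.

```python
from itertools import combinations


def build_compatibility_graph(ops, unit_table, mutually_exclusive):
    """Return the set of compatibility edges as frozenset node pairs.

    ops:                dict operator node -> operation
    unit_table:         dict unit name -> set of supported operations
    mutually_exclusive: set of frozenset({node_a, node_b}) pairs
    """
    def mergeable(a, b):
        # Some library unit must support both operations.
        return any({ops[a], ops[b]} <= funcs for funcs in unit_table.values())

    edges = set()
    for a, b in combinations(sorted(ops), 2):
        pair = frozenset((a, b))
        # Compatible = mutually exclusive AND mergeable.
        if pair in mutually_exclusive and mergeable(a, b):
            edges.add(pair)
    return edges


ops = {"n1": "+", "n2": "-", "n3": "and"}
unit_table = {"ADDSUB": {"+", "-"}, "LOGIC": {"and", "or"}}
mutex = {frozenset(("n1", "n2")), frozenset(("n1", "n3"))}
cg_edges = build_compatibility_graph(ops, unit_table, mutex)
```

Here the pair (n1, n3) is mutually exclusive but connects an arithmetic with a logic operation, so it is dropped, just as the dashed lines are removed in Example 3.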
Compatibility Graph Reduction
A collection of operator nodes can be mapped to the same functional unit if and only if they are pairwise compatible (Section 3). In other words, a subgraph of CG that is completely connected by compatibility edges, called a clique, can be mapped to one unit. During the process of incrementally creating clique covers on CG, FSA adjusts the graph as described below in order to correctly maintain its structure.

Proposition 1. Let CG be a compatibility graph, $n$ an operator node, and $m$ a newly created multi-functional node composed of the original operator nodes $n_1, n_2, \ldots, n_j$ with $n \neq n_i$ for all $i$ from 1 to $j$. A compatibility edge $e_k = \langle n, n_k \rangle$ (for some $k \in \{1, \ldots, j\}$) can be used for future merges if and only if $n$ is compatible with all nodes $n_i$, i.e., the edges $e_i = \langle n, n_i \rangle$ exist in CG for all $i \in \{1, \ldots, j\}$. If this is the case, the following two edge reductions follow:

equivalent edge property: If all edges $e_i = \langle n, n_i \rangle$ exist in CG for $i \in \{1, \ldots, j\}$, then they are called equivalent to one another. Thus, CG can be reduced by replacing all of them by one edge, $e = \langle n, m \rangle$.

illegal edge property: If there is at least one operator node $n_k$ (for some $k \in \{1, \ldots, j\}$) for which no compatibility edge $e = \langle n, n_k \rangle$ exists, then none of the other edges $e = \langle n, n_i \rangle$ for $i \in \{1, \ldots, j\}$ can be used for future merges. Thus, CG can be reduced by deleting all of them.
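For a pairwise merge, both edge properties reduce to keeping only the common neighbors of the two merged nodes. The sketch below assumes an adjacency-set representation of CG and a hypothetical helper name `merge_nodes`; repeated pairwise merges then yield the multi-node case of Proposition 1.

```python
def merge_nodes(adjacency, a, b, merged):
    """Merge compatible nodes a and b of CG into one node named `merged`.

    adjacency: dict node -> set of neighbor nodes (undirected graph)
    """
    if b not in adjacency[a]:
        raise ValueError("only compatible nodes can be merged")
    # Equivalent edge property: neighbors of both a and b keep one edge to
    # the merged node. Illegal edge property: edges from nodes adjacent to
    # only one of a, b are deleted.
    common = adjacency[a] & adjacency[b]
    new_adjacency = {}
    for node, neighbors in adjacency.items():
        if node in (a, b):
            continue
        kept = neighbors - {a, b}
        if node in common:
            kept = kept | {merged}
        new_adjacency[node] = kept
    new_adjacency[merged] = set(common)
    return new_adjacency


cg = {
    "n1": {"n2", "n3", "n4"},
    "n2": {"n1", "n3"},
    "n3": {"n1", "n2"},
    "n4": {"n1"},
}
reduced_cg = merge_nodes(cg, "n1", "n2", "m")
```

In this example n3 is compatible with both n1 and n2, so its two edges collapse into one edge to m, while the edge from n4 is deleted because n4 is not compatible with n2.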
Operator Merging
During the process of incrementally creating clique covers on CG, FSA adjusts not only CG but also the flow graph $G$. This allows for the simultaneous consideration of (1) the connection costs due to the sharing of units and (2) the control logic costs for the selection of the correct function of a multi-functional unit. Our component mapping algorithms explore the two cases of whether or not to map two nodes $n_1$ and $n_2$ to the same component (expressed by $e = \langle n_1, n_2 \rangle$) using the following graph transformations:
case 1: Map $n_1$ and $n_2$ to the same unit: $CG \Rightarrow CG'$ by adjusting the compatibility edges according to the rules described in Section 6.2, and $G \Rightarrow G'$ by directly reflecting the sharing of units in the flow graph, as described below.

case 2: Do not map $n_1$ and $n_2$ to the same unit: $CG \Rightarrow CG'$ by simply deleting edge $e$ from CG, and $G \Rightarrow G'$ by simply setting $G' = G$.
The rules for flow graph transformations are given in [19], while below we give an example. Figure 9(a) describes how $n_1$ and $n_2$ are mapped to the same component by merging them into one multi-functional operator node $n_k$. In the transformed $G$, the new node $n_k$ is connected to the original input and output destinations of $n_1$ and $n_2$. The semantic equivalence of the original and the transformed flow graph is assured by inserting decision nodes that select the correct inputs for $n_k$ (right side in Figure 9(a)). We encode the function select logic of $n_k$ and store it in the associated decoder node (depicted by a bold outlined box in Figure 9(a)).
A Heuristic Function for Compatibility Edge Selection
The heuristic function estimates the change in the design cost (area costs and design performance) for a partial solution when mapping two operator nodes to the same hardware unit⁵. This estimate, $benefit(): N \times N \to \mathbb{R}$, with $N$ the set of operator nodes in $G$, is a function from a pair of operator nodes to a cost value defined by $benefit(n_1, n_2) = \alpha \cdot area\_benefit(n_1, n_2) + \beta \cdot delay\_benefit(n_1, n_2)$ with $n_1, n_2 \in N$, and $\alpha$ and $\beta$ the relative area and delay optimization parameters. The area benefit includes $area\_operator\_costs(n) := bound\_node\_area(n) - bound\_node\_area(n_1) - bound\_node\_area(n_2)$ (with $bound\_node\_area()$ the minimal cost of implementing an operator node [16]) plus function decode and connection costs. The delay benefit includes $delay\_operator\_costs(n) := \max(bound\_node\_delay(n) - bound\_node\_delay(n_1),\ bound\_node\_delay(n) - bound\_node\_delay(n_2))$ plus interconnection delay, with $bound\_node\_delay()$ the minimal delay of implementing an operator node [16].
The Heuristic Component Mapper
FSA reformulates the component mapping problem as a clique partitioning problem on CG [20]. The goal is to find a minimal cost clique partition of the set of operator nodes of CG such that each clique can be mapped to one multi-functional unit. We present two algorithms to solve this problem: the heuristic and the branch-and-bound component mapper.
The heuristic component mapper (Algorithm 6.1) merges mutually exclusive operators of $G$ based on the heuristic function, benefit(), that associates with each compatibility edge $e$ a measure of the benefit of using this edge for clique merging (Section 6.4). It searches for a minimal cost clique partition on CG by repeatedly selecting the edge $e$ with the largest benefit and utilizing it for operator merging (Section 6.3). It stops when no edge with positive benefit remains.

Generate a compatibility graph CG for the flow graph G (Sec. 6.1).
while (there is an edge e in CG with benefit(e) > 0) do
(1) Select edge e = $\langle n_1, n_2 \rangle$ from CG with the largest heuristic benefit(e) (Sec. 6.4).
(2) Map $n_1$ and $n_2$ to the same unit by transforming G and CG (Sec. 6.3).
end while;
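The greedy loop can be sketched as follows. This is a simplified illustration, not Algorithm 6.1 itself: it tracks only the cliques of CG and does not transform the flow graph, and the `benefit` callback, the `gains` table, and the function name `greedy_component_mapper` are hypothetical stand-ins for the heuristic of Section 6.4.

```python
def greedy_component_mapper(nodes, compatible, benefit):
    """Greedily merge cliques while a merge with positive benefit exists.

    nodes:      iterable of operator node names
    compatible: set of frozenset({a, b}) compatibility edges
    benefit:    callable (clique1, clique2) -> float, assumed heuristic
    """
    cliques = [frozenset([n]) for n in sorted(nodes)]

    def mergeable(c1, c2):
        # Clique property: every cross pair must be compatible.
        return all(frozenset((a, b)) in compatible for a in c1 for b in c2)

    while True:
        candidates = [(benefit(c1, c2), i, j)
                      for i, c1 in enumerate(cliques)
                      for j, c2 in enumerate(cliques)
                      if i < j and mergeable(c1, c2)]
        candidates = [c for c in candidates if c[0] > 0]
        if not candidates:
            return cliques  # no profitable edge remains
        _, i, j = max(candidates)  # edge with the largest benefit
        merged = cliques[i] | cliques[j]
        cliques = [c for k, c in enumerate(cliques) if k not in (i, j)]
        cliques.append(merged)


compatible = {frozenset(("a1", "s1")), frozenset(("a2", "s1"))}
gains = {frozenset(("a1", "s1")): 2.0, frozenset(("a2", "s1")): 1.0}


def benefit(c1, c2):
    return sum(gains.get(frozenset((a, b)), 0.0) for a in c1 for b in c2)


units = greedy_component_mapper(["a1", "s1", "a2"], compatible, benefit)
```

After a1 and s1 are merged, a2 can no longer join them because a2 is not compatible with a1, which corresponds to the illegal edge property of Section 6.2.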
Example 5. We apply the heuristic component mapper to the Adder/Subtracter design depicted in Figure 5(c). From this flow graph, we derive a compatibility graph that corresponds to the first (partial) solution node S1 in Figure 10(d). Next, FSA selects the edge e1 for operator merging in S1 based on the heuristic function (Section 6.4). The nodes AI and SI that are connected by edge e1 are merged into one multi-functional node. The clique property is utilized to reduce CG as demonstrated in Section 6.2. This results in S2, shown in Figure 10(d). FSA repeats this process of operator merging until no more profitable compatibility edge remains. The final design returned by FSA is S4 (also in Figure 5(d)).
The Branch-and-Bound Component Mapper
The component mapper described in Algorithm 6.2 replaces the greedy control strategy by a branch-and-bound one. The algorithm uses the inclusion (or exclusion) of a compatibility edge as the branching criterion. At each node in the search tree, the solution space is partitioned into two sets of potential solutions according to whether or not a given compatibility edge is used in the clique cover. If an edge is (not) used in a solution, then the corresponding pair of operator nodes is (not) being mapped to the same unit.
All partial solutions that can lead to a potentially complete solution are maintained in a list, called active-stack. The algorithm selects a partial solution b = (G, CG) using a last-in-first-out scheme. Then it applies the heuristic function benefit() to determine the next most profitable edge e (Section 6.4). It expands the current solution b using the selected edge e: the first child uses the edge e for operator merging, while the second child eliminates the edge e (Section 6.3). If a child cannot lead to a least-cost solution based on the bounding function [16], it is discarded. If a child represents a complete solution, then it is compared against the current best solution to determine the new best solution. This process is repeated until either no partial solution remains in the active list or the time limit given by the user is exceeded. The last-in-first-out scheme allows us to generate an initial 'good' solution fast, i.e., in polynomial time. The B&B algorithm is guaranteed to find an optimal solution if given sufficient time. This is desirable since most descriptions of RTL components we have come across thus far are reasonably sized.

Transform the flow graph into a compatibility graph CG (Sec. 6.1).
active-stack = {(G, CG)}; BEST = empty; BEST-COST = ∞;
while ((active-stack ≠ ∅) and (iteration-count > 0)) do
(1) Pop branching node b (partial design solution) from active-stack.
(2) Select edge e with the largest benefit(e) (Sec. 6.4).
...

Example 6. In this example, we synthesize the Adder/Subtracter design (Example 5) using the B&B component mapper. First, FSA selects e1 in S1 as the branching edge based on the heuristic function. Then the two children S2 and S7 of S1 are created. S7 is pushed on the active stack, and FSA continues with S2. FSA repeats this process until either a complete solution or a solution that can be bounded is found. The first complete solution found by FSA is S4. At this point, BSF = S4, UP = 3, and active-stack = {S5, S6, S7}.

Then, S5 is popped off the active-stack. S5, a complete solution, is not as good as S4 and therefore is discarded. Both S6 and S7 can be bounded immediately, since their bound is larger than the cost of S4. The final design returned by FSA is S4, which corresponds to the merged ECDFG graph shown in Figure 5.
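The include/exclude branching with a last-in-first-out stack can be sketched on the compatibility graph alone. The sketch makes several assumptions that differ from FSA: the cost of a solution is simply the number of units, the bounding function is the node count minus the number of remaining edges (valid for this cost, since each merge consumes at least one edge), the branching edge is chosen by a fixed ordering rather than by benefit(), and the flow graph is not transformed. All function names are hypothetical.

```python
def merge(adj, a, b):
    """Include edge (a, b): merge both nodes, keeping only common neighbors."""
    m = a | b
    common = adj[a] & adj[b]
    new = {n: (nb - {a, b}) | ({m} if n in common else set())
           for n, nb in adj.items() if n not in (a, b)}
    new[m] = set(common)
    return new


def drop_edge(adj, a, b):
    """Exclude edge (a, b): delete it from the compatibility graph."""
    new = {n: set(nb) for n, nb in adj.items()}
    new[a].discard(b)
    new[b].discard(a)
    return new


def lower_bound(adj):
    """Each further merge uses up at least one edge and saves one unit."""
    edges = sum(len(nb) for nb in adj.values()) // 2
    return max(1, len(adj) - edges)


def branch_and_bound(nodes, compatible, max_iterations=10_000):
    adj = {frozenset([n]): set() for n in nodes}
    for pair in compatible:
        a, b = (frozenset([x]) for x in pair)
        adj[a].add(b)
        adj[b].add(a)
    stack = [adj]  # active stack of partial solutions
    best, best_cost = None, float("inf")
    while stack and max_iterations > 0:
        max_iterations -= 1
        state = stack.pop()  # last-in-first-out
        if lower_bound(state) >= best_cost:
            continue  # bounded: cannot beat the best solution so far
        edges = sorted((tuple(sorted(a)), tuple(sorted(b)))
                       for a in state for b in state[a])
        if not edges:  # complete solution: no compatibility edge left
            best, best_cost = set(state), len(state)
            continue
        a, b = (frozenset(x) for x in edges[0])
        stack.append(drop_edge(state, a, b))  # child 2: exclude the edge
        stack.append(merge(state, a, b))      # child 1: explored first
    return best


compatible = {frozenset(("a", "b")), frozenset(("b", "c")), frozenset(("c", "d"))}
solution = branch_and_bound(["a", "b", "c", "d"], compatible)
```

Pushing the merge child last means it is explored first, so the first complete solution found corresponds to a greedy-style descent; the remaining stack entries are then either bounded or used to improve on it.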
Design Quality and Algorithm Performance
In the first experiment (Table 1), we explore the features and limitations of FSA using a variety of different parameter settings and the TTL library [21]. For instance, we ran area optimization ($\alpha = 1.0$) and delay optimization ($\alpha = 0.0$), as indicated in the second column of Table 1. We also compare FSA's performance when using the functionality recognition option versus when not using it, as indicated by the FR column. We also study the solution quality (both area and delay) achieved by FSA for the heuristic and for the B&B component mapper (columns CM1 and CM2). For each, the solution quality of the design is described by the transistor count (Area) and the maximal delay through the design (Delay) [16]. The computation time measured in CPU seconds on a SUN4 is given in the CPU column. The last column displays the results achieved by the human designer (meaning the best possible design we could find by hand).
One goal of this experiment was to evaluate the usefulness of the B&B over the greedy control strategy for component mapping. Thus, in the column labeled "CM2 vs CM1", we present the percentage of design quality improvement obtained by FSA using the B&B versus using the heuristic component mapper. The improvement is calculated by $(quality(CM1) - quality(CM2)) / quality(CM1)$, with quality being Area for $\alpha = 1.0$ and Delay for $\alpha = 0.0$. We found that FSA improves the design quality by about 18% in almost half of the thirty-six example runs when using the B&B algorithm (indicated by a "CM2 vs CM1" percentage larger than zero). For the remaining example runs, the best design was found even without running the complete B&B algorithm. This improved design quality is achieved at the cost of an increased running time of the algorithm.
For each group of example descriptions, the fifth row, labeled "FR%", indicates the percentage of improvement obtained by FSA when using the FR option over when not using the FR option. This design quality improvement is calculated by $(quality(\text{not FR}) - quality(\text{FR})) / quality(\text{not FR})$. The percentage listed in the Area (Delay) column is obtained by plugging the Area (Delay) measures taken from the rows with the parameter setting $\alpha = 1.0$ ($\alpha = 0.0$) into the formula. For six out of nine examples, FSA with the FR option improves the design quality by an average of 42%. This is true for both the heuristic and the B&B component mapping and for both area and delay optimization.
The hand-produced designs are nearly always equivalent to those of FSA (using ... Table 2. Table 2 shows that FSA produces better designs using the TTL library than when using the Mano or the Genus library. Similarly, FSA using Mano produces better designs than FSA using Genus. This is so because the TTL library has a richer set of components than Mano, and Mano has a richer set of components than Genus. For instance, FSA using Mano is able to reduce the second description to one unit, whereas FSA using Genus is not. There is no unit in the Genus library that directly supports some of the described functions, e.g., "A - B - 1". This function would therefore be implemented by two subtracters in sequence, which decreases the performance of the resulting design.
In Table 2, we also present the results of human designers using the three libraries. The last column of Table 2 indicates the percentage of area improvement of the designer's result over FSA's result, calculated by $(Area(FSA) - Area(designer)) / Area(FSA)$. Given a particular library, FSA almost always produced the same result as the designer (indicated by 0% improvement). The designer using TTL or Mano components was able to produce a better design for the CntLog example by using the commutativity of operators, but not for the Genus library. In this experiment, we also examined whether FSA is able to recognize the component(s) being described by the descriptions. The column labeled CR (for Component Recognition) indicates whether FSA was able to reduce the functional description to the described component. CR = 'yes' (CR = 'no') means that (no) proper component recognition took place. CR = '*' indicates that the algorithm succeeded in finding an alternative and even better design implementation for the given description. The number of component recognitions (CR = 'yes') decreases for simpler component libraries. For the TTL library, FSA recognizes (and possibly even improves) the design implementation for 88% of the examples; for the Mano library it is 66% of the examples, and for the Genus library it is 44% of the examples. In short, FSA is more likely to reduce functional descriptions to their intended component(s) when given more complex component libraries.
CONCLUSIONS AND FUTURE WORK
In this paper, we have defined a new problem, which we call the functional synthesis problem. Functional synthesis maps a functional description of one or possibly several RTL components to an interconnection of components from a given library. Most research in the literature concentrates instead on synthesis from behavioral descriptions. We present a solution to this problem in the form of a two-phase algorithm, called FSA, that solves both the RTL technology mapping and the functionality mismatch problem.
Our experiments show that in most cases FSA produces a design that is comparable to that of a human designer. In addition, we found that FSA improves the design quality (both area and delay) by 18% in about half of the example runs when using B&B over when using heuristic component mapping. Our experimental results have also shown that designs synthesized using functionality recognition are smaller in cost than those synthesized without it. FSA was able to improve 66% of the example designs when using functionality recognition; and the improvement in design quality (area and delay) was 42% on the average. Therefore we can conclude that functionality recognition is an essential ingredient of functional synthesis. We have thus succeeded in moving the functionality recognition approach, a fairly common ingredient to logic synthesis systems, to a higher level of abstraction.
Future work will address the incorporation of the functionality recognition task directly into the component merging phase in order to solve the two problems of expression tree reduction and of grouping functions to hardware units simultaneously. We may also want to study whether there is any gain in replacing the simple pattern matching procedure used by FSA for functionality reduction by a more sophisticated method, such as those used in Boolean technology mapping [10].
