Abstract - This paper presents a placement-driven technology mapping procedure based on fuzzy delay curves. The fuzziness has been introduced to deal with the inherent vagueness in wiring loads (derived from a dynamically updated placement) and is used by the mapper to calculate the signal arrival times. In the process we describe a number of fuzzy operations which are needed to generate the fuzzy delay curves and to select a minimum area mapping solution satisfying a set of timing constraints. This procedure has been implemented and the results are on average 1% and 26% (5% and 3%) better in terms of area and delay compared to a technology mapping procedure with zero (crisp) wire load values.
Introduction
In the past, researchers have separated logic synthesis from physical design and have relegated interconnect optimization to the physical design phase. However, with recent studies [10, 4] indicating that interconnections occupy more than half the total chip area and account for a significant part of the chip delay, it is appropriate that wiring be incorporated into the cost function for logic synthesis [1, 6]. This is because physical design, which is expected to address these issues, comes much later in the design hierarchy. By then, many of the key architectural and structural decisions have been made, hence limiting the capability of the physical design tools to generate the "best" solutions in terms of area and performance.
To elaborate on the importance of the wiring load, consider a two-input NAND gate driving an inverter gate through 0.2 cm of aluminum interconnect (2 µm wide, 0.5 µm thick, with a 1.0 µm thick field oxide beneath it). 0.2 cm is the expected length of a local interconnect line on a 2 cm × 2 cm chip [2]. We calculate the rise time (to 50% of its final value) at the input of the inverter gate using two methods. One method ignores the capacitance of the interconnect line and uses $\text{delay} = \tau_g + R_g C_g = 0.4$ ns, where $\tau_g$ is the intrinsic gate delay, $R_g$ is the on-resistance of the driver gate, and $C_g$ is the input capacitance of the fanout gate. The second method [9] takes the interconnect line into account.
In summary, with the existing technology, the capacitive term is dominated by the capacitance between the interconnection and substrate. For local aluminum lines, the resistive term is dominated by the on-resistance of the MOS transistor. As the chip dimension increases and the minimum feature size decreases, the interconnection capacitance bottoms out at about 1-2 pF/cm while the input gate capacitance decreases. Therefore, the RC delay of interconnect lines will become even more dominant in the future.
Given a Boolean network representing a combinational logic circuit optimized by technology independent synthesis procedures and a target library, the technology mapping process binds the nodes in the network to gates in the library such that the area of the final implementation is minimized and timing constraints are satisfied.
In [13] an approach is presented to solve the technology mapping problem for minimizing area under delay constraints. The authors first compute a range of "interesting" values for the required times at each node (by finding the minimum area and the minimum delay mapping solutions) and then divide this range into equal intervals. The best mapping solution for each of the required times is generated and stored at the node during a postorder traversal (from primary inputs to primary outputs) of the tree. The final mapping solution is generated during a preorder traversal (from primary outputs toward primary inputs) of the tree. In order to obtain high quality mapping solutions, this method requires a small time step, resulting in a large number of delay-area points.
[3] examines the problem of mapping a Boolean network to a circuit implementation using gates from a finite size cell library. The objective is to minimize the total gate area subject to constraints on signal arrival time at the primary outputs. This approach consists of two steps. In the first step, delay curves (that capture gate area-arrival time tradeoffs) at all nodes in the network are computed. In the second step, the mapping solution is generated based on the computed delay curves and the required times at the primary outputs. For a NAND-decomposed tree, subject to load calculation errors, this two step approach finds the minimum area mapping satisfying any delay constraint if such a solution exists. The algorithm has polynomial run time on a node-balanced tree and is easily extended to mapping a network modeled by a directed acyclic graph.
In [6] an attempt is made to increase the interaction between logic synthesis and technology mapping. The idea is to generate a "companion" placement solution for the circuit before it is mapped. This placement is then used to evaluate the cost of a matching gate during the mapping process. The placement is dynamically updated in order to maintain the correspondence between the logic and layout representations. In the end, a mapped network along with a placement solution is generated. The placement solution is then globally relaxed in order to produce a feasible placement according to the target layout style (e.g., standard-cell or sea-of-gates). Using these techniques, circuits with smaller area and higher performance have been synthesized.
A difficulty with the placement-driven approach is that after mapping, the network is very different from the one we started with. This makes the wiring estimation process during mapping inherently imprecise. The technology mapping results are therefore sensitive to the methods used for estimating the wiring and updating the placement. Fuzzy theory provides an effective method to remedy this difficulty by reducing the reliance on crisp wire values calculated based on the placement solution. Instead, the placement merely provides information about the wiring cost either in the form of a range of wire length values or in the form of a qualitative classification of wire length values (long, medium, or short).
In this paper we describe a placement-driven technology mapping procedure based on fuzzy logic. We start by calculating fuzzy locations for the nodes in the network. Using these fuzzy locations, we obtain a fuzzy wiring load for each gate, which will in turn be used to calculate a fuzzy arrival time at the output of each gate. Based on this information we calculate fuzzy area-delay curves at all nodes using a postorder traversal of the network and then find a minimum area solution satisfying timing constraints during a preorder traversal of the network.
The same fuzzy logic framework can be applied to many other problems in logic synthesis, timing and power analysis, and physical design where some parameter of interest or the objective function is imprecisely defined. The technology mapping procedure presented in this paper only serves as an example (see section 6).
The flow of the paper is as follows. In section 2 we review the process of technology mapping using area-delay curves. This process will be extended to include the fuzzy wiring load. In section 3, we describe the principles of fuzzy theory. Extensions of arithmetic operations in the crisp domain to the fuzzy domain are also presented. In section 4, details of fuzzy area-delay curve computation and fuzzy gate selection are presented. Experimental results and concluding remarks are given in sections 5 and 6.
Technology Mapping using Area Delay Curves
The technology mapping procedure used in this paper is based on the ad_map approach given in [3]. In the remainder of this section, we give an overview of that approach.
With each node in the network, we store a delay curve. A point on the delay curve represents the arrival time at the output of the node and the total gate area which is required to map its transitive fanin cone up to (and including) the node. In addition to the area and delay value, the matching gate and input bindings for the match are also stored with each point on the curve. Points on the curve represent various mapping solutions with different tradeoffs between area and speed. We are interested in a mapping with minimum area satisfying delay requirements. Consequently, we can drop point P1 on the curve if there exists another point P2 on the curve with lower area but equal or lower delay. By dropping inferior points, the delay curve can always be made monotonically non-increasing without loss of optimality.
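The pruning rule above can be sketched for crisp (arrival time, area) pairs as follows. This is only an illustration, not the code of [3]; the function name `prune_inferior` and the sample values are assumptions made for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (arrival_time, area); hypothetical layout


def prune_inferior(points: List[Point]) -> List[Point]:
    """Return the non-inferior points, sorted by increasing arrival time."""
    kept: List[Point] = []
    # Sorting by arrival time (ties broken by area) means every point already
    # kept is at least as fast as the point currently examined.
    for t, a in sorted(points):
        # Keep a point only if it is strictly cheaper than the cheapest point
        # that is at least as fast. This also drops equal-area, slower points,
        # which is slightly stricter than the rule in the text but harmless.
        if not kept or a < kept[-1][1]:
            kept.append((t, a))
    return kept


curve = prune_inferior(
    [(5.0, 10.0), (4.0, 12.0), (6.0, 11.0), (7.0, 8.0), (4.0, 15.0)]
)
print(curve)  # [(4.0, 12.0), (5.0, 10.0), (7.0, 8.0)]
```

After pruning, area strictly decreases as arrival time increases, which is the monotonic shape the text relies on.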
The technology mapping procedure consists of two graph traversal steps. Initially a postorder traversal of the NAND-decomposed network is performed, where for each node n and for each gate g matching at n (a candidate match), a new delay curve is produced by appropriately merging the delay curves at the inputs(n, g). The delay curves for successive gates g matching at n are then merged by applying a lower-bound merge operation on the corresponding delay curves. At a given node n, the resulting delay curve will describe the arrival time-area tradeoffs in propagating a signal from the network inputs to the output of n. The delay curve computation and merging are performed recursively until a circuit output is reached. The set of (t, a) pairs corresponding to the composite delay curve at the circuit output will define a set of arrival time-area tradeoffs for the user to choose from.
Given the required time t at the circuit output, a suitable (t, a) point on the delay curve for the circuit output is chosen. The gate g matching at the circuit output which corresponds to this point and its inputs are thus identified. The required times $t_i$ at the inputs are computed from t, g, and the fact that these inputs must now drive gate g. The preorder traversal resumes at the inputs of g, where $t_i$ is the constraining factor and a matching gate $g_i$ with minimum area $a_i$ satisfying $t_i$ is sought.
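The selection step of the preorder traversal can be sketched as picking the cheapest point on a node's curve that meets the required time. This is a minimal crisp illustration; the tuple layout, the function name `select_point` and the gate names are hypothetical and not taken from [3].

```python
from typing import List, Optional, Tuple

# (arrival_time, area, gate_name); an assumed layout for illustration only.
CurvePoint = Tuple[float, float, str]


def select_point(curve: List[CurvePoint], required_time: float) -> Optional[CurvePoint]:
    """Return the minimum-area point arriving no later than required_time."""
    feasible = [p for p in curve if p[0] <= required_time]
    if not feasible:
        return None  # no mapping at this node meets the constraint
    return min(feasible, key=lambda p: p[1])


curve = [(4.0, 12.0, "nand3"), (5.0, 10.0, "nand2"), (7.0, 8.0, "inv")]
print(select_point(curve, 6.0))  # (5.0, 10.0, 'nand2')
print(select_point(curve, 3.0))  # None
```

The chosen point identifies the gate and its input bindings, from which the required times at the inputs are derived before the traversal continues.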
In the above technology mapping procedure, all parameters are crisp numbers. In our approach, we model the unknown wire loads as fuzzy numbers. We thus generate fuzzy delay curves during a postorder traversal of the subject network using fuzzy arithmetic operations. In the process, it becomes necessary to remove inferior points from these curves. This will require sorting of the area-delay points. Fuzzy decision making is used at this stage to rank the points based on their area and delay values. Gates are assigned to nodes in the circuit during the preorder traversal of the network using fuzzy decision making.
In order to be able to compute the fuzzy delay curves, it is necessary to generalize the operations used by ad_map to the fuzzy domain. This will require a valid and efficient representation of the fuzzy numbers in addition to methods which will assist in operating on and making decisions based on the fuzzy parameters, as explained in the next section.
Fuzzy Logic
In general, it is difficult to model the real world by a precise model. One difficulty is that in real life problems, parameters are not exactly known. We might plan to minimize power consumption in a circuit where the switching rate of the circuit input is "approximately 0.25". Another difficulty is that in many cases goals are not clearly expressed. For example, the goal of an optimization process might be to achieve "a delay of essentially 10 nanoseconds or less". Fuzzy theory helps us deal with this imprecision in parameters and goals. These are represented by fuzzy numbers and fuzzy goals.
In the following section, extensions of the crisp arithmetic operations to the fuzzy domain are described.
Fuzzy Numbers
A fuzzy number $\tilde n$ is defined by a membership function $\mu_{\tilde n}(x)$ where $x$ is a real number. The value of the membership function represents the possibility of the event represented by this fuzzy number to assume a value of $x$. The membership function is normalized such that $0 \le \mu_{\tilde n}(x) \le 1$. The term "possibility" is used to emphasize the fact that the value of the membership function is not a probability value. If mutually exclusive events a and b have probabilities $p_a$ and $p_b$, then the probability of either event is $p_a + p_b$, and if the events are independent, the probability of both events is $p_a \cdot p_b$. On the other hand, if $p_a$ and $p_b$ represent the possibility of these events, then the possibility of either event is $\mathrm{Max}(p_a, p_b)$ and the possibility of both events is $\mathrm{Min}(p_a, p_b)$.
For practical purposes it is generally more appropriate to resort to a specic kind of membership function. Indeed, triangular fuzzy numbers [16] are often used. It is shown in [16] that using triangular fuzzy numbers does not considerably limit the generality of the fuzzy theory.
As an example, assume the arrival time at the input of gate g is represented by the fuzzy number $\tilde a$ shown in Figure 1. Given a delay value x, the fuzzy number $\tilde a$ represents the possibility of the input signal arriving at time x. In this example the possibility of the input signal arriving at time 10 is 1, while the possibility of the signal arriving at time 12 is 0.5. A triangular fuzzy number is written as $(m, l, r)$, where m is the middle value and l and r are the left and right spreads. The possibility of the signal arriving outside the range specified by $(m - l, m + r)$ is assumed to be zero.
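The membership function of a triangular fuzzy number can be sketched as below. The right spread r = 4 follows from the example (possibility 0.5 at time 12 with the peak at 10); the left spread l = 4 is an assumption, since Figure 1 is not reproduced here, and the function name `membership` is hypothetical.

```python
def membership(x: float, m: float, l: float, r: float) -> float:
    """Possibility of value x for the triangular fuzzy number (m, l, r)."""
    if x == m:
        return 1.0
    if m - l < x < m:          # rising edge, from m - l up to the peak
        return (x - (m - l)) / l
    if m < x < m + r:          # falling edge, from the peak down to m + r
        return ((m + r) - x) / r
    return 0.0                 # outside the range (m - l, m + r)


# Arrival time with peak 10 and right spread 4; left spread 4 is assumed.
print(membership(10, 10, 4, 4))  # 1.0
print(membership(12, 10, 4, 4))  # 0.5
print(membership(15, 10, 4, 4))  # 0.0
```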
Fuzzy Arithmetic
It is necessary to generalize the crisp mathematical concepts to the fuzzy domain. The basis for this generalization is provided by the "extension principle" [16], which can be expressed as follows: Given a function f mapping points in set X to points in set Y and given a fuzzy set $\tilde a$ specified by its membership function $\mu_{\tilde a}(x)$, the extension principle states that $\mu_{f(\tilde a)}(y) = \max_{x \in X,\ f(x) = y} \mu_{\tilde a}(x)$.
If more than one element of X is mapped by f to the same element y of Y, then the maximum membership value of these elements in the fuzzy set $\tilde a$ is chosen as the membership value of y.
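For a fuzzy set with finitely many elements, the extension principle reduces to a few lines. The sketch below is illustrative only: the dictionary representation, the name `extend` and the sample membership values are assumptions, and f is taken to be the absolute value so that collisions occur.

```python
from typing import Callable, Dict


def extend(f: Callable[[float], float], fuzzy_set: Dict[float, float]) -> Dict[float, float]:
    """Image of a discrete fuzzy set {x: membership} under a crisp function f."""
    image: Dict[float, float] = {}
    for x, mu in fuzzy_set.items():
        y = f(x)
        # Several x may map to the same y; keep the largest membership value.
        image[y] = max(image.get(y, 0.0), mu)
    return image


a = {-2: 0.3, -1: 0.7, 0: 1.0, 1: 0.5, 2: 0.1}
print(extend(abs, a))  # {2: 0.3, 1: 0.7, 0: 1.0}
```

Here -1 and 1 both map to 1, so the image takes the larger of 0.7 and 0.5.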
Using this principle, it is possible to generalize any function in the crisp domain to a function in the fuzzy domain which operates on fuzzy numbers. For example, the possibility distribution for the fuzzy number $\tilde a_3$, the sum of $\tilde a_1$ and $\tilde a_2$, can be obtained as follows. For each point y in the space of $\tilde a_3$, $\mu_{\tilde a_3}(y) = \max_{x_1 + x_2 = y} \min(\mu_{\tilde a_1}(x_1), \mu_{\tilde a_2}(x_2))$.
The extension principle is used to generalize the crisp arithmetic operations to the fuzzy domain in the case of triangular fuzzy numbers as follows. Given fuzzy numbers $\tilde a_1 = (m_1, l_1, r_1)$ and $\tilde a_2 = (m_2, l_2, r_2)$ and a crisp number $a$, the following fuzzy operations are defined: $\tilde a_1 + \tilde a_2 = (m_1 + m_2,\ l_1 + l_2,\ r_1 + r_2)$, $\tilde a_1 - \tilde a_2 = (m_1 - m_2,\ l_1 + r_2,\ r_1 + l_2)$, $a \cdot \tilde a_1 = (a m_1,\ a l_1,\ a r_1)$ for $a \ge 0$, and $a \cdot \tilde a_1 = (a m_1,\ -a r_1,\ -a l_1)$ for $a < 0$. In addition to the above standard operations, it is necessary to define a number of other fuzzy operations which are specifically defined for the technology mapping process. In this paper, the absolute value function is used to calculate the wire length given fuzzy locations for the node and its fanouts. The extension principle can be used to generalize the operation for finding the absolute value of a fuzzy number $\tilde a$.
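Define
$$\mu_{\tilde a_{pos}}(x) = \begin{cases} \mu_{\tilde a}(x) & x \ge 0 \\ 0 & x < 0 \end{cases} \qquad \mu_{\tilde a_{neg}}(x) = \begin{cases} \mu_{\tilde a}(-x) & x \ge 0 \\ 0 & x < 0 \end{cases}$$
Then $\mathrm{abs\_value}(\tilde a)(x) = \max(\mu_{\tilde a_{pos}}(x), \mu_{\tilde a_{neg}}(x))$. An example of this calculation is shown in Figure 2, where part (d) represents the triangular approximation of the fuzzy number generated in part (c).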
Since fuzzy numbers are used to represent the signal arrival times in the network, it is necessary to find the time when all input signals are available at the inputs of a given node. That is, given a fuzzy number $\tilde a$ representing the possibility of signal arrival time at an input of a gate and a time instance t, we need to calculate a fuzzy number representing the possibility of the signal being present at that input before time t. The presence fuzzy number $\tilde P$ is calculated from the fuzzy arrival time using the following equation: $\mu_{\tilde P}(t) = \max_{x \le t} \mu_{\tilde a}(x)$. The definition of the presence function is again derived from the extension principle, where the crisp function is the "maximum" operation. Application of this function to the fuzzy number $\tilde a$ is shown in Figure 3.
Given a gate with two inputs with fuzzy arrival times $\tilde a_1$ and $\tilde a_2$, we can find $\tilde M = \mathrm{max\_merge}(\tilde a_1, \tilde a_2)$, a fuzzy number representing the possibility of both signals being present at the input by using the following equation:
The application of "max_merge" operation to fuzzy numbers (6, 4, 4) and (4, 1, 7) is illustrated in Figure 4 .
If a gate has more than 2 inputs, the max_merge operation can be performed on the first two inputs and the third input can then be compared with the result of the first two inputs. This operation can be extended to any number of fanins. Note that max_merge will return one of the two fuzzy numbers if the ranges for these fuzzy numbers do not intersect.
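The presence and max_merge operations can be illustrated on membership functions sampled over a time grid. The sketch below rests on two assumptions that are not confirmed by the text: presence is taken as the running maximum of the arrival possibility, and max_merge is taken as the extension-principle maximum of two fuzzy numbers, $\max(\min(\mu_1, P_2), \min(\mu_2, P_1))$. The helper names `tri`, `presence` and the grid spacing are hypothetical. Under these assumptions the merge returns the later fuzzy number when the two ranges do not intersect, which agrees with the note above.

```python
from typing import List


def tri(x: float, m: float, l: float, r: float) -> float:
    """Membership of x in the triangular fuzzy number (m, l, r)."""
    if x == m:
        return 1.0
    if m - l < x < m:
        return (x - (m - l)) / l
    if m < x < m + r:
        return ((m + r) - x) / r
    return 0.0


def presence(mu: List[float]) -> List[float]:
    """Possibility that the signal has arrived by each grid time (running max)."""
    out: List[float] = []
    best = 0.0
    for v in mu:
        best = max(best, v)
        out.append(best)
    return out


def max_merge(mu1: List[float], mu2: List[float]) -> List[float]:
    """Assumed merge: the last arrival is at t if one input arrives at t
    and the other one is already present."""
    p1, p2 = presence(mu1), presence(mu2)
    return [max(min(a, q), min(b, p)) for a, b, p, q in zip(mu1, mu2, p1, p2)]


times = [0.5 * i for i in range(41)]            # 0.0, 0.5, ..., 20.0
early = [tri(t, 2, 1, 1) for t in times]        # range (1, 3)
late = [tri(t, 8, 2, 2) for t in times]         # range (6, 10), disjoint
print(max_merge(early, late) == late)           # True

# The two fuzzy numbers used for Figure 4 in the text.
merged = max_merge([tri(t, 6, 4, 4) for t in times],
                   [tri(t, 4, 1, 7) for t in times])
print(times[merged.index(max(merged))])         # 6.0
```

The merged result is in general not triangular, so a triangular approximation would still be needed before further fuzzy arithmetic.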
The operation for fz_max (fz_min) can be performed by comparing two fuzzy numbers (as described in the next section) and then returning the larger (smaller) fuzzy number.
Fuzzy Decision Making
When dealing with fuzzy numbers, finding the optimal solution will require a compare operation which is necessary to rank fuzzy numbers. This ranking will in turn be used for sorting purposes. In this paper, fuzzy decision making is used to rank the delay points for each node. Once this ranking is obtained, it is easy to remove the inferior points from the area-delay curve. The preorder phase of the procedure also makes use of fuzzy decision making by searching for the area-delay point which satisfies a given constraint.
A reasonable fuzzy compare operation should take into consideration the range and middle value of the fuzzy numbers. In [8], a fuzzy relation is presented which results in a total ordering of the fuzzy numbers being considered. This operation makes use of a preference equation which applies to normalized and convex fuzzy numbers. For the special case of triangular fuzzy numbers which are normalized and convex, this method greatly reduces the complexity of the compare operation as follows.
Fuzzy Delay Curves
For submicron technologies, the effect of interconnect on circuit delay is of more importance than its effect on the circuit area. Therefore, we only consider the former effect here. The latter effect can be easily captured in a similar fashion.
The delay curve at each node now consists of a set of noninferior points $\tilde p = (\tilde t, a)$ where $\tilde t$ is a fuzzy number representing delay, and a is the crisp area. Let the load at the output of a node n be represented as a fuzzy number $\tilde C_n$. The load at the output consists of two components: the gate capacitance of fanout nodes and the wiring load. The gate capacitance seen at the output is not known because the output nodes have not yet been mapped. However, it is possible to do timing recalculation [3] to adjust the delay curve for each node as the output nodes are mapped. Alternatively, one can use the default gate capacitance of a two-input NAND gate, which is obtained from the cell library. In any case, the gate capacitance is a crisp value. The wiring load is, however, a fuzzy number estimated from a "companion" placement solution as in [1].
The arrival time at the output is computed as: $\widetilde{arrival}(n, g, \tilde C_n) = \mathrm{max\_merge}_{n_i \in inputs(n, g)} \left( \tau_{i,g} + R_{i,g} \times \tilde C_n + \widetilde{arrival}(n_i, g_i, \tilde C_i) \right)$, where all the operations (e.g. +, ×, and max_merge) are the fuzzy operations defined in section 3.2. The gate intrinsic delay and drive resistance are assumed to be crisp numbers. However, if these parameters are specified by a range of values (to capture static and/or dynamic variations in gate parameters), they can be represented by fuzzy numbers and easily incorporated in the delay curve computation without much extra effort.
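For a single input pin, the term inside the merge can be evaluated with triangular arithmetic. The sketch below assumes the usual (m, l, r) rules, where spreads add under addition and scale under multiplication by a non-negative crisp number. The class `Tri`, the function `arrival_through_gate` and the numeric values (in arbitrary units) are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Tri:
    """Triangular fuzzy number: peak m, left spread l, right spread r."""
    m: float
    l: float
    r: float

    def __add__(self, other: Union["Tri", float]) -> "Tri":
        if isinstance(other, Tri):
            return Tri(self.m + other.m, self.l + other.l, self.r + other.r)
        # Adding a crisp number shifts the peak and leaves the spreads alone.
        return Tri(self.m + other, self.l, self.r)

    __radd__ = __add__  # allows crisp + fuzzy as well as fuzzy + crisp

    def scale(self, k: float) -> "Tri":
        """Multiply by a crisp number; a negative k swaps the two spreads."""
        if k >= 0:
            return Tri(k * self.m, k * self.l, k * self.r)
        return Tri(k * self.m, -k * self.r, -k * self.l)


def arrival_through_gate(tau: float, resistance: float, load: Tri, input_arrival: Tri) -> Tri:
    """tau + R x C + arrival at the input, for one input pin of the gate."""
    return tau + load.scale(resistance) + input_arrival


result = arrival_through_gate(1.0, 2.0, Tri(3.0, 1.0, 2.0), Tri(10.0, 2.0, 4.0))
print(result)  # Tri(m=17.0, l=4.0, r=8.0)
```

Only the load and the input arrival time are fuzzy here, matching the statement that intrinsic delay and drive resistance are crisp.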
Given a set of points on the delay curve of a node, it is necessary to remove the inferior points. The fuzzy compare operation described in section 3.3 is used to sort points on the curve using the fuzzy arrival time of each point. The inferior points on the curve are then removed by making sure that for each point, the area of the point is less than the area of the point which has the next earlier fuzzy arrival time.
The wire load is calculated as the product of wire length and the capacitance per unit length of interconnect. Fuzzy positions of nodes are required to compute the fuzzy wire lengths. At one extreme, the chip boundary is considered to be the fuzzy location for all nodes in a network before the network is placed. At the other extreme, the crisp position for each node after placement can be used. In our scheme the chip area is partitioned into regions. Instead of using the crisp position of the node, we use the region where the node is placed. By controlling the number of regions, the degree of fuzziness for each position can be controlled. For example, as the number of regions increases, the positions become less fuzzy.
The enclosing region for each node is specified by its "lower-left coordinate" LL and "upper-right coordinate" UR. Given a position for node n and its enclosing region, we can specify its fuzzy coordinates $\tilde x_n$ and $\tilde y_n$. For example, assume node n is placed at (3, 4), which is inside a region with coordinates LL = (1, 2) and UR = (4, 5) as shown in Figure 5. Then $\tilde x_n = (3, 2, 1)$ and $\tilde y_n = (4, 2, 1)$.
Once the fuzzy coordinates of a node and its fanouts are known, the wire length can be efficiently calculated by using a number of models. These models include the star connection model, the single trunk Steiner tree model and the enclosing rectangle approximation. In the following we use the latter model as it is accurate, yet easy to compute. We use the fz_max and fz_min operations to find a fuzzy bounding box for the node and its fanouts. This bounding box is specified by two fuzzy points representing the lower left and upper right coordinates. The half perimeter of this fuzzy bounding box, which is computed using fuzzy arithmetic, is then used to estimate the fuzzy wiring length between the node and its fanouts.
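One way to read the region-based fuzzy coordinates is sketched below. It assumes that each coordinate is a triangular number (m, l, r) whose peak is the placed position and whose spreads reach the edges of the enclosing region. This is an interpretation of the example in the text rather than a confirmed definition, and the function names are hypothetical.

```python
from typing import Tuple

Tri = Tuple[float, float, float]   # (m, l, r): peak, left spread, right spread
Point = Tuple[float, float]        # crisp (x, y)


def fuzzy_coordinate(pos: float, lo: float, hi: float) -> Tri:
    """Triangular coordinate peaking at pos and spanning the region [lo, hi]."""
    if not lo <= pos <= hi:
        raise ValueError("position must lie inside its enclosing region")
    return (pos, pos - lo, hi - pos)


def fuzzy_position(pos: Point, ll: Point, ur: Point) -> Tuple[Tri, Tri]:
    """Fuzzy x and y coordinates of a node placed at pos inside region (ll, ur)."""
    return (fuzzy_coordinate(pos[0], ll[0], ur[0]),
            fuzzy_coordinate(pos[1], ll[1], ur[1]))


# Node placed at (3, 4) in the region LL = (1, 2), UR = (4, 5).
print(fuzzy_position((3, 4), (1, 2), (4, 5)))  # ((3, 2, 1), (4, 2, 1))
```

With more regions the spreads shrink, so the positions become less fuzzy; with a single region covering the chip, the spreads reach the chip boundary.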
It is desirable to incrementally update the position of matched gates while the delay curves are being calculated [6] . This operation will result in gate positions which more accurately reflect the position of the primitives after the mapping procedure. For this purpose, once a gate g is mapped at a node n, the fuzzy position of g is updated by placing g at the median of the fuzzy positions for its fanins and fanouts as this is the optimal location for a floating node with respect to a set of fixed nodes which are connected to it.
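The median rule can be illustrated on crisp coordinates. The sketch below uses the middle values of the neighbors' fuzzy positions as stand-ins for the fuzzy positions themselves, which is a simplifying assumption; the function names and sample coordinates are hypothetical. Taking the median independently in x and y minimizes the total Manhattan distance to the fixed neighbors.

```python
from statistics import median
from typing import List, Tuple

Point = Tuple[float, float]


def median_location(neighbors: List[Point]) -> Point:
    """Coordinate-wise median of the fanin and fanout positions."""
    xs = [x for x, _ in neighbors]
    ys = [y for _, y in neighbors]
    return (median(xs), median(ys))


def total_wire(p: Point, neighbors: List[Point]) -> float:
    """Sum of Manhattan distances from p to every neighbor."""
    return sum(abs(p[0] - x) + abs(p[1] - y) for x, y in neighbors)


neighbors = [(1, 1), (4, 2), (6, 7)]
best = median_location(neighbors)
print(best)                              # (4, 2)
print(total_wire(best, neighbors))       # 11
print(total_wire((3, 3), neighbors))     # 13
```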
Decisions made during the preorder traversal have fuzzy goals. This is due to the fact that given the required arrival time at the output of a node, the required arrival times at the fanins are obtained by subtracting the delay through the gate from the required time at the output. Since the delay through the gate is fuzzy, the required arrival times at the inputs of the gate will become fuzzy numbers. The fuzzy required times for signals in the lower levels reflect the inexact information about the gate delays at the higher levels. The network is mapped one logic cone at a time, where a logic cone refers to a circuit output and all nodes in its transitive fanin. Since logic cones in a network intersect, it is possible for a node to be in two logic cones $c_i$ and $c_j$. If logic cone $c_i$ has already been processed while mapping cone $c_j$, it is possible to arrive at a node n which is already mapped. In this case, if the arrival time for the current mapping of node n meets the time required at n by cone $c_j$, the current mapping is accepted. However, if the mapping solution does not satisfy the required arrival time for cone $c_j$, the mapping procedure will continue until a node is reached that meets the timing requirements or the primary inputs are reached.
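The way a fuzzy gate delay turns a required time into a fuzzy goal can be sketched with triangular subtraction. The rule used below, in which the spreads of the subtrahend are swapped and added, is the standard one for (m, l, r) triples and is assumed to match the paper's definition; the function name and values are hypothetical.

```python
from typing import Tuple

Tri = Tuple[float, float, float]  # (m, l, r): peak, left spread, right spread


def fuzzy_sub(a: Tri, b: Tri) -> Tri:
    """a - b for triangular fuzzy numbers; b's spreads swap sides."""
    return (a[0] - b[0], a[1] + b[2], a[2] + b[1])


required_at_output: Tri = (20.0, 0.0, 0.0)   # crisp required time
gate_delay: Tri = (7.0, 2.0, 4.0)            # delay between 5 and 11, peak 7

required_at_input = fuzzy_sub(required_at_output, gate_delay)
print(required_at_input)  # (13.0, 4.0, 2.0), i.e. between 9 and 15, peak 13
```

Even with a crisp constraint at the circuit output, the required time one level down already has a nonzero spread, and the spreads keep growing toward the primary inputs.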
After each cone is mapped, the fuzzy region associated with each of the remaining nodes is reduced in size to reflect the fact that the current network is more similar to the final mapped network. Consequently as more cones are mapped, the wiring information will become less fuzzy.
Results
The procedure described in this paper was implemented in a program called FZ_MAP and the results were compared to those of SIS_MAP [13] and PL_MAP, which is similar to AD_MAP but generates delay curves with crisp wire values. The same optimized blif files were used as input for all cases. The circuits were first optimized using script.rugged [11]. They were then decomposed into NAND gates and mapped using SIS_MAP, PL_MAP and FZ_MAP. Finally, the circuits were placed using GORDIANL plus DOMINO [12] and routed using YACR [7]. All results are reported after layout is completed. Table 1 presents the total gate area and the longest path delay after technology mapping. All entries in the table are normalized with respect to the results for SIS_MAP. The results for FZ_MAP are on average 9% and 24% (2% and 1%) better in terms of area and delay compared to SIS_MAP (PL_MAP). Table 2 contains the CPU time spent on a Sparc Station II with 64 MByte of memory for each mapper. PL_MAP is on average 4 times slower than SIS_MAP, while FZ_MAP is 2 times slower than PL_MAP. Table 3 presents the total chip area and the circuit delay after placement and routing. The results for FZ_MAP after placement and routing are on average 1% and 26% (6% and 3%) better in terms of area and delay compared to SIS_MAP (PL_MAP). These results show that compared to PL_MAP, FZ_MAP was able to improve on both area and delay by using fuzzy wiring load values.
Concluding Remarks
We presented a technology mapping scheme based on fuzzy area-delay curves. The fuzziness is due to uncertainty in signal arrival time as a result of imprecise wiring loads. The use of fuzzy logic, however, reduced the reliance of placement-based technology mapping on the exact placement positions, and eliminated the need for sophisticated placement updating schemes.
The fuzzy technology mapper can be easily augmented to capture other imprecise parameters. For example, the same mapping procedure has been used to find the minimum power consumption solution subject to timing constraints [15]. Power consumption estimates, however, critically depend on the switching activities at internal nodes of the circuit. Switching activity calculation under a real delay model (which accounts for hazards) cannot be performed exactly as it requires a very time consuming and memory intensive symbolic simulation [5]. It can, however, be approximately performed, say, using the tagged probabilistic simulation [14]. Fuzzy switching activities can be calculated and used to derive fuzzy power-delay curves. Other applications of the fuzzy framework proposed here can be found in timing analysis, common subexpression extraction, etc.
