Abstract-We propose a logic synthesis flow that synthesizes a domino-cell network with a reduced crosstalk effect. The crosstalk-immunity property of the OR gate and the relations between wire adjacency and cell I/O are exploited in technology mapping. In addition, a metric that measures the crosstalk sensitivity of domino cells at the synthesis level is proposed. Experimental results demonstrate that the crosstalk sensitivity of the synthesized domino-cell network is reduced by 52% using our synthesis flow as compared with the conventional methodology. Furthermore, after placement and routing are performed, the ratio of the number of crosstalk-immune wire pairs to the number of total wire pairs is about 24% using our methodology as compared to 9% using conventional techniques, and the maximum wire coupling is reduced from 95% to 60%.
I. INTRODUCTION
In deep submicrometer (DSM) technology, coupling capacitance grows exponentially as the feature size shrinks. The resulting crosstalk effect degrades performance and, at worst, produces incorrect results. Crosstalk minimization has therefore become an important issue [1].
Domino logic is a well-developed circuit implementation style. Compared with the conventional CMOS implementation, domino logic has the advantage of higher speed. It is widely used in control logic and ALUs for high-performance designs. Domino logic operates in two phases: the precharge phase and the evaluation phase. In the precharge phase, the output wire is precharged from low to high; in the evaluation phase, the output wire is either discharged from high to low or remains high. Note that, once the output is discharged during evaluation, it cannot be restored until the next precharge phase.
The domino-cell design is especially vulnerable to crosstalk noise during the evaluation phase. In the evaluation phase, if an aggressor makes a transition from high to low while a victim should remain high, crosstalk may induce a high-to-low transition on the victim wire. Hence, the output is erroneously discharged and, as noted above, cannot be restored until the next precharge phase.
Recently, research on crosstalk minimization has focused on optimization at the physical level. The widely accepted methods are wire spacing and shielding, which are considered the most effective ways to solve the crosstalk problem in domino circuits [2]. However, these techniques incur a large area overhead. On the other hand, Im and Roy propose the placement of NMOS in each domino cell [3] so that the crosstalk effect of a domino cell can be minimized through its unique NMOS reordering technique. Ho et al. develop a multilevel-routing system to minimize the crosstalk effect on the routing wires [4]. Lou and Chen focus on crosstalk minimization before wire routing [5]. Ren et al. try to extract the implicit information in advance to perform an incremental crosstalk-aware placement [6]. However, crosstalk minimization at the physical level is far from proactive. The reason is that after the logic synthesis is performed, the functionalities [7], [8]. They further apply the property for power optimization [9]. The crosstalk-immunity property, computed at the synthesis level, can be effectively utilized in physical synthesis. However, computation of crosstalk immunity is very time consuming. If the immunity of wires is computed directly from Kim's formula, a huge computation time is needed. We find that the inputs and output of an OR gate are crosstalk immune. This property can be exploited efficiently in technology mapping. To further reduce the crosstalk effect at the synthesis level, we take crosstalk immunity into consideration in technology mapping.
On the other hand, whether two wires induce crosstalk can be checked only after placement and routing are performed. To synthesize a circuit with crosstalk taken into consideration, we must be able to accurately predict the wire adjacency of a layout at the synthesis level. For that purpose, we develop a cost function to predict wire adjacency at the synthesis level and propose two techniques to synthesize a crosstalk-aware network. The first technique is applied before technology mapping to construct a crosstalk-aware AND-OR network. The second is a crosstalk-aware technology mapping for domino cells. A crosstalk-aware synthesized network gives later placement and routing a better starting point.
The rest of this paper is organized as follows. The preliminaries are given in Section II. Section III presents our observations and cost function to reduce the crosstalk effect. The synthesis flow for domino-circuit generation is proposed in Section IV. Our experimental results are shown in Section V. Section VI concludes this paper.
II. PRELIMINARIES

A. Unateness of Domino Circuit
Since the phase transition in a domino circuit is required to be monotonic, a logic function to be implemented as a domino circuit must be unate. To produce a unate logic network, we can apply De Morgan's laws to push all inverters toward the primary inputs or primary outputs. The resultant unate network consists of AND gates and OR gates only. Previous work addresses this issue by minimizing the logic duplication caused by trapped inverters. Among others, Puri et al. propose a systematic method to assign the phases of primary outputs for dynamic logic [10]. Zhao and Sapatnekar formulate the phase assignment as a 0-1 integer-programming problem [11].
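The inverter-pushing step can be illustrated with a minimal sketch. It assumes a tree-shaped expression (so node sharing and the duplication cost discussed in [10] and [11] are ignored) and pushes inverters toward the primary inputs only; the `Var`, `Gate`, and `push_inverters` names are hypothetical and are not part of SIS or of the flow described in this paper.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str
    negated: bool = False  # True means the primary input is used inverted


@dataclass(frozen=True)
class Gate:
    op: str  # "AND", "OR", or "NOT"
    args: Tuple["Node", ...]


Node = Union[Var, Gate]


def push_inverters(node: Node, invert: bool = False) -> Node:
    """Return an equivalent AND/OR tree whose inversions sit only on inputs."""
    if isinstance(node, Var):
        return Var(node.name, node.negated != invert)
    if node.op == "NOT":
        # Absorb the inverter by toggling the pending inversion.
        return push_inverters(node.args[0], not invert)
    # De Morgan: an inverted AND becomes an OR of inverted operands, and vice versa.
    op = node.op
    if invert:
        op = "OR" if op == "AND" else "AND"
    return Gate(op, tuple(push_inverters(arg, invert) for arg in node.args))


# NOT(a AND (b OR NOT c)) becomes (NOT a) OR ((NOT b) AND c).
expr = Gate("NOT", (Gate("AND", (Var("a"),
                                 Gate("OR", (Var("b"),
                                             Gate("NOT", (Var("c"),)))))),))
unate = push_inverters(expr)
print(unate)
```

The result contains only AND and OR gates, with inversion recorded on the primary inputs, which is the form a domino mapper can consume.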
B. Crosstalk Immunity of Wires
At the physical level, where the functionalities of two wires are disregarded, their adjacency is viewed as the cause of crosstalk. However, if we take the functionality into consideration, two adjacent wires may not incur any crosstalk effect. Let the two adjacent wires a and v be the aggressor and victim, respectively. The crosstalk effect is induced by both the observability of the crosstalk effect and the satisfiability of the aggressor and victim signals [7]. This crosstalk-immunity property was first proposed by Kim et al. and is rephrased as follows.
Definition 1-Observability: The crosstalk error on signal v can be observed only if there exists an input pattern which propagates the transition of v to any primary output. The observability of wire v, denoted as $\mathrm{OBS}_v$, can be formulated as
$$\mathrm{OBS}_v = \left\{\, I \;\middle|\; \exists j:\ \frac{\partial F_j(I)}{\partial G_v(I)} = 1 \right\} \tag{1}$$
where $F_j$ is the jth primary output of function F, $G_v$ is the function of v, and $\partial F_j(I)/\partial G_v(I)$ represents the Boolean difference of $F_j$ with respect to $G_v$ [12].
Definition 2-Satisfiability: The crosstalk error occurs only if there exists an input pattern which satisfies a phase transition (0 → 1) of a and stable low of v. The satisfiability of wires v and a, denoted as $\mathrm{SAT}_{va}$, can be formulated as
$$\mathrm{SAT}_{va} = \{\, I \mid G_v(I) = 0 \ \text{and}\ G_a(I) = 1 \,\} \tag{2}$$
where $G_v(I)$ is the function of wire v, $G_a(I)$ is the function of wire a, and I is a primary input pattern.
Definition 3-Crosstalk (CT)-Immune: A victim wire v is crosstalk immune, or CT-immune, to its adjacent aggressor wire a if there exists no input pattern which satisfies both satisfiability and observability. It can be formulated as
$$\mathrm{OBS}_v \cap \mathrm{SAT}_{va} = \emptyset \tag{3}$$
Satisfiability is a necessary condition to set up the crosstalk error in the victim, while observability is a necessary condition to observe the crosstalk error in the victim at the output. Both are necessary criteria to induce and propagate the crosstalk error to the primary outputs. If either condition does not hold, the crosstalk effect of an aggressor a on a victim v can be ignored.
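As a worked illustration of Definitions 1-3 (a hand-built example, not taken from the original text), consider a hypothetical two-input AND gate $c = a \cdot b$ whose output c is the only primary output, with b as the victim and a as the aggressor. Observability is read here as the set of input patterns for which the Boolean difference of the output with respect to the victim is one, and satisfiability as the set of patterns that drive the aggressor to one while the victim stays at zero; both readings are assumptions drawn from the wording of the definitions.

```latex
\begin{align*}
\frac{\partial c}{\partial b} &= c|_{b=0} \oplus c|_{b=1} = 0 \oplus a = a
  &&\Rightarrow\quad \mathrm{OBS}_b = \{(a,b) \mid a = 1\},\\
\mathrm{SAT}_{ba} &= \{(a,b) \mid b = 0,\ a = 1\} = \{(1,0)\},\\
\mathrm{OBS}_b \cap \mathrm{SAT}_{ba} &= \{(1,0)\} \neq \emptyset .
\end{align*}
```

Because the intersection is non-empty, b is not CT-immune to a in this gate: with a = 1 and b = 0, a spurious rise on b would flip c from zero to one.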
III. RELATIONS OF WIRE ADJACENCY IN PHYSICAL LEVEL AND I/O OF MAPPED CELL IN SYNTHESIS LEVEL
As pointed out in Section I, a synthesis tool plays an important role in reducing crosstalk. However, whether two wires induce crosstalk can be checked only after placement and routing are performed. To synthesize a circuit with crosstalk taken into consideration, we must develop a useful cost function at the synthesis level that accurately predicts the wire adjacency of the final layout at the physical level. In this section, we first discuss the relations between wire adjacency and the I/O of mapped-domino cells in Section III-A. Then, in Section III-B, we show that the inputs to an OR gate are CT-immune to each other at the synthesis level. Finally, in Section III-C, we develop our cost function for technology mapping based on the results in Sections III-A and III-B.
A. Wire Adjacency and I/O of Mapped-Domino Cells
At the physical level, adjacency of two wires is a necessary condition for them to induce a crosstalk effect. Hence, our first challenge is to correlate the physical adjacency of wires with the interconnections of a synthesized logic network. Examining a placed and routed circuit, we find that many adjacent wires are the I/O signals of the same mapped-domino cell. To determine whether this observation holds in general, we perform a set of experiments. First, circuits from the Microelectronics Center of North Carolina (MCNC) benchmark are synthesized by SIS [13] and mapped to domino cells by the dynamic-programming-based domino technology-mapping algorithm proposed by Zhao and Sapatnekar [11]. After that, we perform placement and routing for these circuits using Dragon [14] and Multilevel Router [15]. Finally, pairs of adjacent wires in each layout are extracted to check whether they are I/O signals of the same mapped-domino cell. Table I shows the results. Column I/O Adj is the number of adjacent wire pairs that are I/O signals of the same mapped-domino cell. Column Tot Adj is the total number of adjacent wire pairs in the circuit. Column Ratio is the ratio of I/O Adj to Tot Adj. From Table I, we can see that about 41% of adjacent wire pairs are I/O signals of the same mapped-domino cell. This is an important finding. We can utilize this property at the synthesis level to synthesize a domino network that is likely to have more CT-immune adjacent wires at the physical level. That is, if a synthesized circuit has more cells whose I/O signals are CT-immune to each other, more adjacent wire pairs at the physical level will be CT-immune to each other.
B. Crosstalk Immunity of OR Gates
In the previous section, we observe that about 41% of adjacent wires are I/O signals of the same mapped-domino cell from circuit layouts. Based on this observation, in order to synthesize a circuit that is more likely to be CT-immune, we can pay attention to the I/O signals of a mapped-domino cell.
We find that, for the signals of an OR gate, the set of input patterns under which a crosstalk error is both set up and observed is empty. Since the crosstalk error is not propagated to a primary output unless both satisfiability and observability hold, we can exploit this property for crosstalk reduction. The following observations are derived.
Lemma 3.1: Given an OR gate, the inputs and the output of the OR gate are CT-immune to each other.
Proof: Let an OR gate have inputs a and b and output c. We prove that a, b, and c are CT-immune to each other. Assume a(0 → 1) is an aggressor and b(0 → 0) a victim, and let the transition in a induce an unwanted transition in b from zero to one. The error in b will never be observed, since the output c of the OR gate evaluates to one when a = 1. Similarly, when b is the aggressor and a is the victim, the error will not be observed at the output. Thus, the inputs a and b are immune to each other. Furthermore, suppose a(0 → 1) is an aggressor and c(0 → 0) a victim, so that the transition in a would induce an unwanted transition in c. This case is impossible, because the output c of the OR gate also makes a transition from zero to one whenever a does. Now suppose c(0 → 1) is an aggressor and a(0 → 0) a victim. Since c has already evaluated from zero to one, the unwanted transition induced in a will not be observed; hence, a and c are immune to each other. Similarly, b and c are immune to each other. Accordingly, the output of an OR gate is CT-immune to the inputs. Therefore, the inputs and output of the OR gate are CT-immune to each other.
Theorem 3.2: Given a tree-based OR network, all wires in the OR network are CT-immune to each other.
Proof: Given a tree-based OR network which has three OR gates A, B, and C in Fig. 1 and the inputs of A, B, and C are {a, b}, {c , d}, 3.1 a, b, c, d , and g are immune to each other due to the inputs and output of the collapsed OR gate. Thus, all signals of the tree-based OR network are CT-immune to each other.
From Theorem 3.2, we can see that the observability of crosstalk error does not hold among signals of a tree-based OR network. We should take advantage of the tree-based OR network to synthesize a domino logic with more wires which are CT-immune to each other.
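The immunity claims for OR gates and OR trees can be checked by exhaustive enumeration on small examples. The sketch below is an illustration under stated assumptions, not the procedure used in the paper (which relies on an ATPG package in Section V): observability is tested by forcing the victim wire to 0 and to 1 and comparing the primary outputs, satisfiability is tested as "aggressor evaluates to one while victim evaluates to zero", and the netlists, wire names (including the internal wires e and f), and function names are hypothetical.

```python
from itertools import permutations, product


def evaluate(netlist, inputs, pattern, forced=None):
    """Evaluate all wires for one input pattern; `forced` pins one wire to a value."""
    forced = forced or {}
    values = {name: forced.get(name, bit) for name, bit in zip(inputs, pattern)}
    for wire, (op, fanins) in netlist.items():  # gates listed in topological order
        if wire in forced:
            values[wire] = forced[wire]
            continue
        bits = [values[f] for f in fanins]
        values[wire] = int(all(bits)) if op == "AND" else int(any(bits))
    return values


def ct_immune(netlist, inputs, outputs, victim, aggressor):
    """True if no input pattern is both satisfiable and observable."""
    for pattern in product((0, 1), repeat=len(inputs)):
        nominal = evaluate(netlist, inputs, pattern)
        # Satisfiability: aggressor rises (evaluates to 1) while victim stays low.
        satisfiable = nominal[aggressor] == 1 and nominal[victim] == 0
        # Observability: flipping the victim changes at least one primary output.
        low = evaluate(netlist, inputs, pattern, {victim: 0})
        high = evaluate(netlist, inputs, pattern, {victim: 1})
        observable = any(low[o] != high[o] for o in outputs)
        if satisfiable and observable:
            return False
    return True


def all_pairs_immune(netlist, inputs, outputs):
    """Check every ordered (victim, aggressor) pair of wires in the network."""
    wires = list(inputs) + list(netlist)
    return all(ct_immune(netlist, inputs, outputs, v, a)
               for v, a in permutations(wires, 2))


AND_GATE = {"c": ("AND", ["a", "b"])}
OR_GATE = {"c": ("OR", ["a", "b"])}
# Hypothetical three-gate OR tree: e = a + b, f = c + d, g = e + f.
OR_TREE = {"e": ("OR", ["a", "b"]),
           "f": ("OR", ["c", "d"]),
           "g": ("OR", ["e", "f"])}

print(ct_immune(AND_GATE, ["a", "b"], ["c"], victim="b", aggressor="a"))  # False
print(all_pairs_immune(OR_GATE, ["a", "b"], ["c"]))                        # True
print(all_pairs_immune(OR_TREE, ["a", "b", "c", "d"], ["g"]))              # True
```

Enumeration is exponential in the number of primary inputs, so this is only practical for checking small cells by hand; it is meant to make the definitions concrete rather than to scale.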
C. Cost Function for Crosstalk in Technology Mapping
To model the crosstalk relationship between wires, one commonly used method is the 0-1 sensitivity matrix [9], [16], [17]. A 0-1 sensitivity matrix is an n-by-n matrix of zeros and ones. The element $s_{i,j}$ is zero if the phase transitions of wire i and wire j do not affect each other; otherwise, $s_{i,j}$ is one. A Boolean network with more zeros in its sensitivity matrix has more CT-immune wires. Although the immunity relationship between every pair of signals can be represented in a sensitivity matrix, much of the information it contains is redundant and irrelevant. The reason is that the most serious crosstalk effect occurs only between neighboring wires, as shown in Table I, rather than between every pair of wires in the whole circuit.
Based on our observations, we develop a new sensitivity metric for synthesis-level optimization. The metric builds on the observation that the I/O signals of a mapped cell are likely to be routed in the same neighborhood. Therefore, at the synthesis level, we focus on the crosstalk effect within each individual domino cell rather than across the whole Boolean network.
By using the concept of crosstalk immunity in OR network, we define the sensitivity of each domino cell in the following.
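Definition 3.3-[$\mathrm{SC}(in_j)$]: The sensitive contribution of the input signal $in_j$ in a domino cell can be formulated as
$$\mathrm{SC}(in_j) = \begin{cases} 1/n, & \text{if } in_j \text{ belongs to an OR network} \\ 1, & \text{otherwise} \end{cases} \tag{4}$$
where n is the number of inputs to the OR network.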
Definition 3.4-[$\mathrm{Sen}(dc_i)$]: The sensitivity of a domino cell $dc_i$ can be formulated as
$$\mathrm{Sen}(dc_i) = \frac{\sum_{j=1}^{\#\mathit{fanin}} \mathrm{SC}(in_j)}{\#\mathit{fanin}} \tag{5}$$
where #fanin is the number of fanins to the domino cell $dc_i$. By Theorem 3.2, input signals which belong to the same OR network are pairwise CT-immune to each other. Therefore, these fanins can be viewed as a single net, and we give a weight of 1/n to each of them. The other input signals are crosstalk sensitive to each other, and we give each a weight of one. We then sum the sensitive contributions of all fanin nets in the domino cell and divide the sum by the fanin number of the cell, which normalizes the result across cells of different sizes.
Note that the Sen(dc i ) of a cell is a fractional number greater than zero and less than or equal to one. A domino cell with large Sen(dc i ) tends to be more sensitive to the crosstalk effect.
Taking the cell G in Fig. 2 as an example, Sen(G) is calculated as follows. Because the fanin signals a, b, c, and d belong to the same OR network, the sensitive contributions SC(a), SC(b), SC(c), and SC(d) are each 1/4. For the other signals, e and f, SC(e) and SC(f) are both one. As a result, the crosstalk sensitivity of the domino cell G is Sen(G) = (4 × 1/4 + 1 + 1)/6 = 3/6 = 0.5.
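The calculation for cell G can be reproduced with a short sketch of the sensitivity metric. The function name and the way a cell is described (a list of OR-network input groups plus a list of remaining fanins) are hypothetical choices made for illustration; the weights follow the rule stated in the text, 1/n for a fanin inside an n-input OR network and one otherwise.

```python
from fractions import Fraction


def sensitivity(or_groups, other_inputs):
    """Cell sensitivity Sen for one domino cell.

    or_groups:    list of lists; each inner list holds the fanins of one OR network.
    other_inputs: fanins that do not belong to any OR network.
    """
    contributions = []
    for group in or_groups:
        n = len(group)
        contributions += [Fraction(1, n)] * n            # SC = 1/n inside an OR network
    contributions += [Fraction(1)] * len(other_inputs)   # SC = 1 otherwise
    # Normalize by the fanin count so cells of different sizes are comparable.
    return sum(contributions) / len(contributions)


# Cell G: fanins a, b, c, d share one OR network; e and f do not.
sen_g = sensitivity([["a", "b", "c", "d"]], ["e", "f"])
print(sen_g, float(sen_g))  # 1/2 0.5
```

Exact fractions are used so that the result matches the hand calculation without floating-point rounding. The sketch assumes the cell has at least one fanin.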
To check whether our metric is indeed a reasonable measurement, we compare the two measurements, the 0-1 sensitivity matrix and our crosstalk sensitivity metric, for domino cells. We take Fig. 3 as an example, where the function of cell X is X = (a + b The ratio of one in the sensitivity matrix is 0.86. From the above example, we can see that in reality there are more crosstalk-immune wires in X than in Y. Hence, we can say that the crosstalk effect in X is less serious than in Y. However, using the ratio of ones in the 0-1 sensitivity matrix as a metric, X and Y have almost the same immunity cost. On the other hand, our sensitivity metric, with Sen(X) and Sen(Y) being 0.25 and 0.5, respectively, reflects the immunity analysis more accurately.
IV. ALGORITHM
A. Flow
Our crosstalk-aware domino-logic-synthesis flow is conducted in four steps. First, we apply multilevel technology-independent logic optimization using SIS [13] with a standard script [18]. After that, we transform binate functions into unate functions with minimized node duplications [10]. Next, we select the output phases of the functions. Finally, a bottom-up parameterized technology mapping [11], [18] with crosstalk-aware consideration is performed. The output-phase selection and parameterized technology-mapping steps are described in detail below.
B. Output-Phase Selection
Since the fanins of an OR function are guaranteed to be CT-immune to each other, we generate a Boolean network with more OR nodes. One straightforward method is to use output-phase assignment to flip the AND nodes and the OR nodes of a Boolean network. Kim et al. have also proposed output-phase assignment to select proper output phases using the crosstalk-immunity property [7], [8].
To perform output-phase selection, we group outputs which have common internal nodes into one set. After grouping, all primary outputs are separated into several independent sets. The outputs within a set depend on each other; the sets themselves are independent. All outputs in one set must be assigned the same polarity, which prevents the network from duplicating extra nodes during output-phase selection. Once the outputs are separated into independent sets, we compare the number of AND nodes and the number of OR nodes in each output set. For those sets which have more AND nodes than OR nodes, we apply the aforementioned phase-flipping technique to generate a complemented Boolean network. The time complexity of our output-phase selection is O(X · N), where X is the number of primary outputs and N is the total number of internal nodes.
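The grouping and comparison steps can be sketched as follows. This is a minimal illustration, not the TM-C implementation: it assumes the fan-in cone of each primary output is already available as a set of internal node names, uses a union-find structure to form the independent sets (one possible way to do the grouping), and only reports which outputs should be flipped rather than building the complemented network. All names are hypothetical.

```python
def select_output_phases(cones, node_type):
    """Decide, per primary output, whether its phase should be flipped.

    cones:     output name -> set of internal nodes in its fan-in cone.
    node_type: internal node name -> "AND" or "OR".
    Returns    output name -> True if the output's set has more AND than OR nodes.
    """
    parent = {out: out for out in cones}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Outputs that share an internal node must land in the same set.
    owner = {}
    for out, nodes in cones.items():
        for node in nodes:
            if node in owner:
                parent[find(out)] = find(owner[node])
            else:
                owner[node] = out

    # Collect the internal nodes of every independent set.
    groups = {}
    for out, nodes in cones.items():
        groups.setdefault(find(out), set()).update(nodes)

    flip = {}
    for out in cones:
        nodes = groups[find(out)]
        n_and = sum(node_type[n] == "AND" for n in nodes)
        n_or = len(nodes) - n_and
        flip[out] = n_and > n_or  # flipping turns the AND majority into an OR majority
    return flip


cones = {"o1": {"n1", "n2"}, "o2": {"n2", "n3"}, "o3": {"n4"}}
node_type = {"n1": "AND", "n2": "AND", "n3": "OR", "n4": "OR"}
print(select_output_phases(cones, node_type))
# {'o1': True, 'o2': True, 'o3': False}
```

In the example, o1 and o2 share node n2 and therefore receive the same decision, while o3 forms its own set and keeps its phase.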
C. Parameterized Technology Mapping
Technology mapping is the final step of logic synthesis, which binds the optimized logic function to the process technology. Unlike traditional standard-cell CMOS design, which uses CMOS cells from a given cell library, domino-cell technology mapping synthesizes various types of cells on the fly. We use parameterized technology mapping [18] to preserve the flexibility of implementing domino cells. The constraints in the mapping process are the number of transistors in series and the number of transistors in parallel for each mapped-domino cell. To map a Boolean function into domino cells, the dynamic-programming-based bottom-up construction method [11] is used. Instead of minimizing the area only, we
In (6), Area(dc i ) represents the area of the domino cell in terms of the number of transistors. Sen(dc i ) is the cell sensitivity defined in (5). The cell cost of a domino cell is a weighted sum of the area and the sensitivity of the cell. The factor α is a weighting coefficient that sets the relative importance of circuit area and crosstalk effect.
V. EXPERIMENTAL RESULTS
Our proposed algorithm is implemented as a software tool, TM-C, written in C. The dynamic-programming-based domino technology-mapping algorithm proposed by Zhao and Sapatnekar is also implemented to generate area-minimized domino circuits [11]. Our experiments are performed on a SUN-Blade2500 with 4 GB of memory. The software platform is based on SIS [13]. The MCNC benchmark suite is used in our experiments. The experiments are conducted to generate domino circuits with minimum crosstalk effect. The constraints on both the number of transistors in series and the number of transistors in parallel for a domino cell are set to four. All benchmark circuits are first optimized in a technology-independent manner using the standard script provided by SIS [13]. Then, the optimized networks are technology mapped by the algorithm proposed in [11] and by TM-C.
The experimental results are shown in Tables II-V. The column labeled [11] is the result of mapped-domino circuits with minimized area. Column TM-C is the result of the circuits synthesized using our proposed flow, where both output-phase selection and the crosstalk-aware cost function for technology mapping are used. Empirically, the weighting coefficient α is set to 0.1 to obtain a proper tradeoff between circuit area and crosstalk cost.
In Table II, we compare the cell sensitivity of the circuits synthesized by the algorithm proposed in [11] and by TM-C. Column R Sensitivity is the ratio of the average cell sensitivity of circuits synthesized by the algorithm proposed in [11] to that of circuits synthesized by TM-C. From Table II, we can see that the cell sensitivity is greatly reduced after using output-phase selection and the crosstalk-aware technology-mapping cost function. On average, the cell sensitivity of the circuits synthesized by TM-C is reduced to 47% of that of the circuits synthesized by the algorithm proposed in [11].
Table III compares the circuit area and level. The area of a domino circuit includes the evaluation transistors, clocking transistors, output-inverting transistors, and fan-out transistor. The Ratio columns give the ratios of the area and the number of levels of circuits mapped by the algorithm proposed in [11] to those of circuits synthesized by TM-C, respectively. We can see that our mapper produces almost the same area, at the expense of a 6.72% increase in the number of levels.
To determine whether our synthesis tool can indeed reduce the crosstalk sensitivity at the physical level, we perform placement and routing for all benchmark circuits. The experiment is conducted as follows. The networks are first technology mapped to domino cells by the algorithm proposed in [11] and by our mapper, TM-C. Then, Dragon [14] and Multilevel Router [15] are applied to perform cell placement and wire routing, respectively. Finally, pairs of adjacent wires are extracted, and their crosstalk immunity is checked by (3) using an ATPG package. Table IV shows the crosstalk-immunity results. Column Imm Adj is the number of CT-immune wire pairs. Column Tot Adj is the total number of adjacent wire pairs. Column Immunity is the ratio of CT-immune wire pairs to the total number of adjacent wire pairs. The table shows that, with our crosstalk reduction techniques, the ratio of the number of CT-immune wire pairs to the number of total wire pairs is about 24%, as compared to 9% using the area-minimization algorithm. This demonstrates that our techniques are indeed effective.
Furthermore, we compute the wire coupling of the adjacent routing wires to verify whether our techniques reduce the crosstalk effect. Table V shows the wire-coupling results. Columns Max and Total are the maximum wire coupling and the total wire coupling of the circuit, respectively. Maximum wire coupling [7] is computed in two steps. First, the real (non-CT-immune) wire coupling of each net is summed; then, the maximum over all nets is taken. Column Sum is the sum of adjacent wire coupling. Column Real is the non-CT-immune wire coupling. From Table V, the maximum wire coupling is reduced from 95% to 60% after applying our synthesis flow. In addition, the total wire coupling is reduced from 91% to 72%. Note that the routing tool used in our experiment, Multilevel Router [15], is a general wire-length reduction router. If a crosstalk-driven router that takes CT-immune information into consideration were applied, the wire coupling would be further reduced.
VI. CONCLUSION
We have proposed a synthesis flow to minimize the crosstalk effect for domino circuits. The crosstalk-immunity property of the OR gate and the relations between wire adjacency and cell I/O are exploited in technology mapping. Output-phase selection for independent sets is used to generate a Boolean network with more OR nodes. To measure the crosstalk effect on domino cells, cell sensitivity is used as a metric for crosstalk minimization during technology mapping. The experimental results demonstrate that our synthesis methodology reduces the crosstalk effect by 52% at the logic-synthesis level as compared with the conventional methodology. After placement and routing are performed, the ratio of the number of CT-immune wire pairs to the number of total wire pairs is about 24% using our methodology as compared to 9% using conventional techniques, and the maximum wire coupling is reduced from 95% to 60%.
