# IP Watermarking Using Incremental Technology Mapping at Logic Synthesis Level

Aijiao Cui, Student Member, IEEE, Chip-Hong Chang, Senior Member, IEEE, Sofiène Tahar, Senior Member IEEE

Abstract— This paper proposes an adaptive watermarking technique by modulating some closed cones in an originally optimized logic network (master design) for technology mapping. The headroom of each disjoint closed cone is evaluated based on its slack and slack sustainability. The notion of slack sustainability in conjunction with an embedding threshold enables closed cones in the critical path to be qualified as watermark hosts if their slacks can be better preserved upon remapping. The watermark is embedded by remapping only qualified disjoint closed cones randomly selected and templates constrained by the signature. This parametric formulation provides a means to capitalize on the headroom of a design to increase the signature length or strengthen the watermark resilience. With the master design, the watermarked design can be authenticated as in non-oblivious media watermarking. Experimental results show that the design can be marked efficiently by our method with low overhead.

Index Terms — Digital Watermarking, IP Protection, Incremental Technology Mapping, Logic Synthesis.

## I. INTRODUCTION

he increase in integrated circuit complexity and shorter design turnaround time have boosted reuse-based design methodology to speed up new product development in the System-on-Chip (SoC) era. Although significant time and effort have been invested in the creation of reusable intellectual property (IP) cores, piracy extorts sizable revenue from IP producers and poses a severe threat to reusable design. Digital watermarking has evolved as a mature technology to protect the copyright of multimedia content [1]. When this technique is applied to VLSI design [2]-[15], it augments the IP owner's opportunity to reclaim his loss of revenue. The imperceptibility of multimedia watermark stems from the imperfection of human sensory system. In contrast, watermarking for IP protection (IPP) imposes that the watermarked IP must also remain functionally correct. The transparency of IP watermarking is achieved by making the changes induced by the watermark 'invisible' to an experienced designer so that they could not be easily detected. At the same time, the cost and performance (quality) of the watermarked IP shall not be unduly compromised. The

Copyright (c) 2008 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to pubs-permissions@ieee.org.

Manuscript received May 30, 2007; Revised Sept. 30, 2007 and Feb. 25, 2008. Aijiao Cui and Chip-Hong Chang are with Nanyang Technological University, Singapore. (e-mail: <a href="mailto:cuiaijiao@pmail.ntu.edu.sg">cuiaijiao@pmail.ntu.edu.sg</a>, <a href="mailto:echchang@ntu.edu.sg">echchang@ntu.edu.sg</a>). Sofiène Tahar is with Concordia University, Canada. (e-mail: <a href="mailto:tahar@ece.concordia.ca">tahar@ece.concordia.ca</a>).

watermarked IP should also be resilient to malicious attacks without deteriorating the design functionality and performance to an extent that renders the design unusable.

Modern VLSI design flow involves many optimization procedures that require solving various NP-hard constraint satisfaction problems (CSP). The solution space of these problems is normally enormous. Exploitation of the excess solution space of CSP has led to the first work in constraint-based watermarking for IPP by Lach et al. [3], [4] and Hong and Potkonjak [5], which was later formally articulated in [6], [7]. In this approach, the ownership credibility is determined by the probability of coincidence,  $P_c$  [8]. The proof requires revoking the CSP instance to demonstrate that the additional constraints are satisfied by the watermarked design. This process tends to expose the signature and secrecy of the well-formed grammar used to generate the constraints, making other similarly marked designs vulnerable to attacks. Except some local watermarking schemes [9], [10], [13], [14], all watermark bits in global watermarking are closely coupled into a set of design constraints, which cannot be independently extracted to detect the locality of design changes due to partial obliteration.

This paper presents a new constraint-based watermarking technique at logic synthesis level [14]. Instead of passively accepting the success or failure until the watermarking is completed, we use an originally optimized design to extract the excess bandwidth relative to the original timing specifications. Unlike other logic synthesis watermarking [9], [10], where the watermarked solution quality is limited by the technology mapping of overlapping maximum cones, we introduce independent disjoint closed cones for incremental technology mapping to maximally exploit the excess bandwidth. As timing criticality is obscure in a small disjoint closed cone, a slack sustainability is formally defined to determine its headroom for remapping. Using both slack and slack sustainability to qualify cones for remapping prevents the watermark bits from being conspicuously hosted in only the non-critical paths. This incremental mapping preserves as much optimality of the master design and can provide a well-defined control mechanism to trade the embedded capacity for a bounded overhead. To avoid the exposure of the grammar for constraints generation, an alternative verification analogous to the watermark retrieval of non-oblivious image watermarking is made possible by our embedding method. This watermark retrieval method also possesses some degree of fragility that enables the detection of maliciously corrupted watermark bits.

#### II. PRELIMINARIES

A bound logic network can be represented as a directed acyclic graph (DAG), G(V, E) where each node,  $v \in V$  represents an instance of a cell library. A directed edge,  $e_{i,j} \in E$  exists if  $\exists v_i$ ,  $v_j \in V$  such that the output of cell  $v_i$  is an input of cell  $v_j$ . The sets of fanin and fanout nodes of  $v_j$  are denoted by  $FI(v_j)$  and  $FO(v_i)$ , respectively. A *primary input/output* (PI/PO) has no fanin/fanout node. A *cone* at node v, denoted as  $C_v$ , is defined as a subgraph consisting of v and at least one of its predecessors. If all the predecessors of v are contained in the cone,  $C_v$  is called the *max cone* at v [16].

The set of nodes outside  $C_{\nu}$  which drive the nodes (or are driven by the nodes) in  $C_{\nu}$  will be denoted by  $FI(C_{\nu})$  (or  $FO(C_{\nu})$ ). Any node in  $C_{\nu}$  that drives (is driven by) at least a node outside  $C_{\nu}$  or a PO (PI) is called the output (input) node of  $C_{\nu}$ .  $C_{\nu}$  is a simple cone if it has only one output node, v. Any output node of  $C_{\nu}$  that drives at least one internal nodes in  $C_{\nu}$ , if it exists, is known as an implicit output node. A cone is *closed* if it has no implicit output nodes. If a cone contains any implicit output node, it is said to be *open*. A *complex* cone,  $C_{v_1,v_2,...,v_n} = C_{v_1} \cup C_{v_2} \cup ...$  $\cup C_{vn}$  is a conjunction of two or more connected cones with at least one common transitive fanin nodes between any two cones. A simple open cone can be made closed by either excluding all its implicit output nodes, or including the transitive fanin and fanout nodes of the implicit output nodes until the complex cone so formed contains no implicit output node. The distance, d(v, u)is the minimum number of transitive edges connecting nodes u and v. The distance, d(C) is defined as the maximum distance among all pairs of input and output nodes of cone, C.

Fig. 1 shows the DAG of a technology mapped network. Node u is an implicit output node of  $C_v$  as it drives an internal node t. By definition,  $C_v$  is a distance-2 open simple cone due to d(p, v) = 2 and node u.  $C_v$  is a closed cone obtained by eliminating u from  $C_v$ .  $C_{v,x}$  is a closed complex cone obtained by expanding  $C_v$  to cover x, which is the fanout node of u.



Figure 1: Examples of a DAG network, a max cone  $(MC_v)$ , a closed cone  $(C_v)$ , an open cone  $(C_v)$ , and a complex cone  $(C_{v,x})$ .

The following definitions are adopted from [17]. The *arrival time* of the signal at node v, denoted by  $t_a(v)$ , is the latest time when the signal at node v reaches its final stable state.

$$t_{a}(v) = \max_{u \in FI(v)} [t_{a}(u) + t_{d}(u, v)]$$
 (1)

The *required time* of the signal at node u, denoted by  $t_r(u)$ , is the earliest time when the signal at node v is required to be stable.  $t_r(u)$  can also be calculated recursively by (2).

$$t_{r}(u) = \min_{v \in FO(u)} [t_{r}(v) - t_{d}(u, v)]$$
 (2)

The *slack* of a node 
$$v$$
, denoted by  $t_s(v)$ , is given by 
$$t_s(v) = t_r(v) - t_a(v)$$
 (3)

### III. PROPOSED IP WATERMARKING SCHEME

### A. Watermark Insertion

An arbitrary length binary signature, W is encrypted using a public key cryptosystem (PKC) and then reduced by a message digest (MD) to a constant length sequence,  $W_e$ . To maximally utilize the embedding capacity of a design in incremental technology mapping,  $W_e$  is embedded into an originally optimized circuit generated by a logic synthesis tool with a cell library. This cover design is called the master design, G.

To minimize and isolate the topology perturbations to small regions of selected cells in technology mapping, disjoint closed cones instead of max cones are used as watermark hosts. There must be at least one substitutable template for any cell in a selected cone to enable remapping. There must also have enough spatially uncorrelated closed cones to reduce the odds that a randomly selected cone coincides with the watermark hosts. To provide a pool of fair candidates, each cell in the master design is first expanded to a disjoint closed cone of distance- $\tau$ , where  $\tau$  is empirically determined based on the circuit complexity, the watermark length and the cell library resources.

Fig. 2 shows the procedure to expand a cell, v, into a simple or complex closed cone of distance- $\tau$ . It can be implemented by a depth first traversal from v to include all its predecessors into  $C_v$  until the distance from the predecessors to v is  $\tau$ , or any PI or implicit output node is met. When an implicit output node is encountered, either a simple or complex closed cone is to be generated. As multi-vertex mapping requires more effort to preserve the timing constraints, to justify for a complex cone, all implicit output nodes of the closed cone must have greater slacks than node v. Therefore, a complex cone,  $C_{v1v2...vn}$  of distance- $\tau$  is generated only if the following criterion is met.

$$t_s(v_i) > t_s(v_1) \quad \forall v_i \in FO(C_{v_1 v_2 \cdots v_n}), v_i \neq v_1$$
 (4)

where  $v_1$  is the cell to be expanded into a distance- $\tau$  closed cone.

```
\begin{aligned} & \textbf{disjoint\_closed\_cone}(v, \tau) \; \{ \\ & C_v = \{v\}; \\ & \textbf{for each } (u \in FI(C_v) \text{ and } d(u, v) \leq \tau \text{ and } u \text{ is unmasked}) \; \{ \\ & \textbf{if } (u \neq PI) \; \{ \textbf{mask}(u); \; C_v = C_v \cup \{u\}; \; \} \\ & flag = 1; \\ & \textbf{for each } (x \in FO(u) \text{ and } x \notin C_v) \\ & \textbf{if } (t_s(x) \leq t_s(v) \text{ or } x \text{ is masked}) \; \{ \\ & \textbf{unmask}(u); \; C_v = C_v - \{u\}; \; flag = 0; \, \textbf{break}; \; \} \\ & \textbf{if } (flag = 1) \; \{ \textbf{mask}(FO(u)); \; C_v = C_v \cup FO(u); \; \} \\ & \} \\ & \textbf{if } (C_v \text{ is trivial}) \; \textbf{return } \varnothing; \; \textbf{else return } C_v; \\ \end{cases}
```

Figure 2: Generation of distance-τ disjoint closed cone.

As an example, consider node v of Fig. 1. For  $\tau = 2$ , the complex closed cone,  $C_{v,x}$ , is generated if  $t_s(x) > t_s(v)$ . Otherwise, node u is excluded and a simple closed cone,  $C_v$  is generated.

As the embedding capacity is limited by the ability of a design to preserve its timing constraints in remapping, to maximally utilize the available embedding capacity for robust watermarking, a metric is developed to formally qualify the disjoint closed cones generated for watermark embedding. The slack of a cone,  $C_v = t_s(v)$  provides a global view of the margin available for remapping the cells along the critical path of the

max cone rooted at v. However, a distance- $\tau$  closed cone,  $C_v$ , can preclude cells in the critical path of the max cone of v that are further than distance  $\tau$  from v. To determine how well a closed cone,  $C_v$  preserves its output timing, we can remap only  $C_v$  of G to obtain a remapped solution, G'. The *sustainability* of a closed cone  $C_v$  is defined as:

$$S(C_{v}) = 1 + \frac{\Delta t_{a}(v)}{t_{a}(v)} = 1 + \frac{t_{a}(v) - t'_{a}(v)}{t_{a}(v)}$$
(5)

where  $t_a(v)$  and  $t'_a(v)$  are the arrival times of the same node v in G and G', respectively. We add a bias of one to the fractional delay so that when  $S(C_v) = 1$ , the original slack is retained, hence the term slack sustainability. Timing is improved by remapping if  $S(C_v) > 1$  and aggravated if  $S(C_v) < 1$ .

A re-synthesis is needed for each closed cone to evaluate its sustainability. To increase the efficiency, a correlated metric to the sustainability is statically derived from the master design without the need for the physical re-synthesis process.

$$\hat{S}(C_{v}) = \hat{S}(v \in C_{v}) = \frac{1}{|FI(v)|} \sum_{u \in FI(v)} (\hat{S}(u \in C_{v}) + \delta(v, u))$$
(6)

where |FI(v)| is the number of fanin nodes of v and

$$\delta(v,u) = \frac{t_d(u,v)}{t_a(u) + t_d(u,v)} \tag{7}$$

 $\delta(v,u)$  accounts for the fractional delay a cell v contributes to its critical path through its fanin node, u.  $0 < \hat{S}(C_v) \le 1$ . The recursion of  $\hat{S}(u)$  terminates at  $u = FI(C_v)$ , and  $\hat{S}(u) = 0$   $\forall u \in FI(C_v)$ . If  $C_v$  is a max cone,  $\hat{S}(C_v) = 1$ . The higher the  $\hat{S}(C_v)$ , the better the closed cone preserves its timing slack upon remapping. The greater the slack, the easier the delay constraint is met by remapping. The headroom of a closed cone is thus defined as:

$$w(C_{y}) = t_{s}(y) \times \hat{S}(C_{y}) \tag{8}$$

To qualify closed cones for watermark embedding, an empirical threshold,  $w_T$  is defined such that all cones with headroom less than  $w_T$  are pruned. To reduce the probability of qualifying closed cones that could adversely affect the performance of the watermarked design,  $w_T$  is judiciously set to  $\alpha$  time the minimum slack of all path groups of G.

$$w_T = \alpha \times \min \left\{ t_s(PO_i(G)) \right\} \tag{9}$$

As the excess slacks on less critical path groups are expected to be traded for the diminished slacks on more critical path groups,  $\alpha$  provides the trade-off control of the embedding capacity of a design and its watermarked solution's timing specification.

Each qualified closed cone with  $w(C_v) \ge w_T$  is uniquely identified. To obfuscate the locality of the watermark, m qualified closed cones are randomly selected by a cryptographically strong pseudorandom number generator seeded with the m-bit encrypted watermark,  $W_e$ . This keyed one-way function ensures that with high probability a different set of closed cones will be selected when the bitstream in  $W_e$  is permuted. To embed a logic '1', a selected closed cone is coerced to remap by prohibiting the use of one randomly selected template inside the cone. To embed a logic '0', one template of the selected cone is preserved. The selected 1- and 0-watermarked cones are stored in  $Q_1$  and  $Q_0$ , and their

designated forbidden and preserved templates are stored in  $TP_1$  and  $TP_0$ , respectively, for watermark recovery. The watermarked solution, G' is generated by remapping only  $Q_1$  and  $Q_0$  of G according to their template constraints,  $TP_1$  and  $TP_0$ , respectively. The algorithm for watermark insertion is shown in Fig. 3.

```
watermark_insert(G, Lib, W, K, \tau, \alpha) {
    Compute and store timing information of t_s, t_a in G.
    i = 1; Initialize disjoint cone array, Q;
    for each (v \in G) {
          Q[i] = disjoint\_closed\_cone(v, \tau);
         if (Q[i] \neq \emptyset) { cone\_id[i] = identifier of v; i = i + 1; }
    w_T = \mathbf{threshold}(\alpha, G); \ \ Q = \mathbf{qualify\_closed\_cone}(Q, w_T);
     W_e = MD(PKC(W, K)); m = |W_e|; q = |Q|;
    Initialize array TP_1, TP_0, Q_1 and Q_0;
    index[1..m] = pseudo_random_generator(W_e, q);
    for (i = 1 \text{ to } m) {
       if (W_e[i] = 1) {
          TP_1 = TP_1 \cup \text{any template in } Q[index[i]]; Q_1 = Q_1 \cup Q[index[i]];
          mark the selected Q[index[i]] and its forbidden template in G;
          TP_0 = TP_0 \cup \text{any template in } Q[index[i]]; Q_0 = Q_0 \cup Q[index[i]];
          mark the selected Q[index[i]] and its sentinel template in G;
    G' = \mathbf{remap}(G, Q_1, Q_0, TP_1, TP_0, Lib);
```

Figure 3: Watermark insertion by incremental technology mapping.

#### B. Watermark Detection

To prove the presence of watermark in an IP, the original CSP instance is recalled to verify that the additional constraints imposed are satisfied by the watermarked design [8]. This process exposes the grammar used to generate the constraints. Here we present another watermark detection method which possesses some features of fragile watermarking. Other designs similarly marked with the same signature will not become more vulnerable by the information divulged in this process.

To facilitate the retrieval of the watermarked closed cones, correspondences between internal nets of the watermarked design and designated nets of the master design are identified by functional equivalence. The logic cones,  $C' \in G'$  and  $C \in G$  are said to be equivalent iff there exists a bijection between FI(C') and FI(C), and between FO(C') and FO(C). Let f(v) be the logical function of cell v, then  $C' \equiv C$  iff |FI(C')| = |FI(C)| and |FO(C')| = |FO(C)|, and  $\exists u \in FI(C')$  such that  $f(u) \equiv f(v)$   $\forall v \in FI(C)$ , where ' $\equiv$ ' denotes equivalence of logical functions.

Since incremental technology mapping preserves the functionality of the interface ports of remapped cones, the logic functions of these nets can be retrieved from the fanin and fanout nodes of closed cones saved in the master design, G. To recover the watermarked cone C' from a marked design, G', nodes with the same logic functions are extracted. Let  $C' \equiv C$  be a cone extracted from the watermarked design G' with n fanins and k fanouts.  $f(WP_i) \equiv f(P_j)$  for i, j = 1, 2, ..., n+k, where  $WP_i$  and  $P_j$  are the interface ports of C' and C, respectively. If C is used to embed a logic '0' and the designated template is found in C', a valid logic '0' is retrieved. If C is used to embed a logic '1' and the designated template is not found in C', a valid logic '1' is

recovered. Otherwise, the embedded bit has been corrupted. When equivalent fanin and fanout nodes of a watermarked cone cannot be found, it implies that either the cells within the cone or its neighboring cells have been modified. The authorship is proved by a perfect or high match between the recovered bit stream and the embedded watermark.

### IV. EXPERIMENTAL RESULTS

The strength and robustness of an IP watermarking scheme are generally evaluated by the probability of coincidence,  $P_c$ , and the probability of removal,  $P_r$ . As it is difficult to unambiguously or unconditionally measure the credibility of IC IPPs in practice, we conservatively consider the potential high quality solutions that are less likely to violate the design constraints. The closed cones that span this subspace have been qualified by the threshold  $w_T$  in our watermarking flow. Thus, the probability of m uniquely labeled closed cones selected from q closed cones that satisfy the headroom constraints is  $1/C_m^q$ . Let  $p_0$  and  $p_1$  be the probabilities that a template in the selected closed cone is preserved or changed in the remapping process, respectively. Assume that  $W_e$  has an equal number of '1' and '0' bits, the probability that a solution carries the watermark by coincidence is given by:

$$P_{c} = \frac{1}{C_{m}^{q}} (p_{0})^{\frac{m}{2}} (p_{1})^{\frac{m}{2}} \approx \frac{1}{C_{m}^{q}} (\frac{1}{2})^{m}$$
(10)

Assume that it is equally probable that a designated template is extricated or preserved when a closed cone is remapped,  $p_0 = p_1 \approx 0.5$ . From (10), it is obvious that increasing the signature length increases the watermark strength only if the design itself offers sufficient redundancy. Given a fixed signature length, m, a high exploitable redundancy space, q, also increases the watermark strength, provided that  $\alpha$  in (9) is judiciously selected to preserve the original timing constraints.

Since combinational circuit watermarking at logic synthesis level is not resilient against re-synthesis attack, the watermarked design is made available as technology-specific firm or hard IP. HDL codes that can be directly exploited for re-synthesis without substantial design and verification effort shall not be revealed. This makes node manipulation difficult for our method as it will affect the circuit timing even if the attacker can reverse engineer parts of the circuit netlist to recover some local functions with reasonable effort. To delete a large portion of the watermark bits and still preserve the solution quality, the attacker has to substantively perturb the watermarked netlist which results in a task of effort comparable to complete circuit optimization. Assume that a brute-force attack is performed to alter  $\gamma$  cells at random followed by a timing analysis to ensure that the constraints are still satisfied with reasonable effort and without changing the circuit functionality. To simplify the analysis, we further presume that it needs only to alter any cell of a watermarked cone to successfully erase one watermark bit. The probability of erasing i bits of an m-bit watermark from a design with N disjoint cones is given by:

$$P_{r}(n=i) = C_{i}^{\gamma} \prod_{j=0}^{i-1} \left(\frac{m-j}{N-j}\right) \cdot \prod_{j=0}^{\gamma-i-1} \left(\frac{N-m-j}{N-i-j}\right)$$
 (11)

The probability of removing more than k ( $k \le m$ ) watermark bits is given by:

$$P_r(n \ge k) = \sum_{i=k}^{m} P_r(n=i)$$
(12)

In the following experiments, we used Synopsys Design Compiler and its standard cell library to synthesize the circuits from ISCAS85, 89, 99 and LGSynth93 benchmark suites. All experiments were run on a 750-MHz Sun UltraSPARC-III with 2 GB of memory running Solaris operating system. The master design of each circuit was obtained by using the timing constraints derived by the synthesis tool from the initially unconstrained designs. 100 different signatures of length m were used for the watermark embedding on each master design and the average percentage increases in the area ( $\Delta A$ ), delay ( $\Delta D$ ) and power ( $\Delta P$ ) of the watermarked design over the master design were reported in Tables I and II. The columns '#cells', 'N', 'q', 'P<sub>c</sub>' and 'P<sub>r</sub>' are respectively the total number of combinational cells, the number of disjoint closed cones, the number of qualified closed cones with  $\tau = 1$ , the probability of coincidence and the probability of successfully removing more than three quarters of the watermark bits by randomly altering m cells, i.e., k = 3m/4,  $\gamma = m$ . For 'ex1010' and 'B22',  $\tau = 2$  was also experimented with 64-bit and 128-bit signatures.

TABLE I WATERMARKING RESULTS ON ISCAS BENCHMARKS

| Circuit      | #cells | N    | q    | m   | $\Delta A(\%)$ | $\Delta D(\%)$ | $\Delta P(\%)$ | $P_c$     | $P_r$    |
|--------------|--------|------|------|-----|----------------|----------------|----------------|-----------|----------|
| C2670        | 430    | 92   | 92   | 32  | 1.09           | -2.11          | 0.97           | 4.10E-35  | 4.97E-09 |
| C5315        | 721    | 125  | 125  | 32  | 3.19           | -0.43          | 1.80           | 3.76E-40  | 1.78E-12 |
| C7552        | 1069   | 243  | 243  | 64  | 1.79           | -3.25          | 0.96           | 1.33E-79  | 3.30E-23 |
| C6288        | 1979   | 466  | 145  | 64  | 1.29           | 1.07           | 1.83           | 4.95E-62  | 1.61E-37 |
| S5378        | 734    | 140  | 84   | 32  | 1.02           | 0.74           | 0.56           | 1.49E-33  | 9.83E-14 |
| S9234        | 896    | 191  | 187  | 64  | 1.86           | 5.53           | 0.30           | 5.83E-71  | 8.65E-18 |
| S38417       | 4927   | 1223 | 1127 | 64  | 0.30           | -1.53          | 0.32           | 2.02E-125 | 3.87E-58 |
| 330417       |        |      |      | 128 | 0.65           | 0.00           | 0.44           | 4.63E-211 | 1.60E-85 |
| S38584       | 6520   | 1204 | 1141 | 64  | 0.47           | -1.81          | -0.27          | 8.98E-126 | 8.29E-58 |
| 330304       |        |      |      | 128 | 0.73           | -2.72          | -0.12          | 8.66E-212 | 7.53E-85 |
| B22          | 5499   | 1313 | 1290 | 64  | -0.10          | -0.32          | 0.11           | 2.82E-129 | 1.22E-59 |
| DZZ          |        |      |      | 128 | 0.07           | 1.87           | 0.11           | 5.37E-219 | 1.44E-88 |
| B22          | 5499   | 1172 | 1150 | 64  | -0.11          | -0.65          | 0.11           | 5.35E-126 | 3.08E-57 |
| $(\tau = 2)$ | 3499   |      |      | 128 | 0.01           | -0.65          | 0.11           | 2.98E-212 | 1.08E-83 |
| B21          | 3605   | 886  | 875  | 64  | 0.40           | -0.65          | 1.63           | 3.76E-118 | 2.66E-51 |
|              |        |      |      | 128 | 1.01           | -0.65          | 2.04           | 5.29E-196 | 1.32E-71 |

For a given master design, when more watermark bits were inserted, higher overheads were incurred. As the number of cells increases, the overheads diminish and become negligible. From Table II, it is also observed that the area and power overheads of the 128-bit watermarked 'pdc' were both less than 2.6%, and the timing was actually improved (negative percentage difference). For an even larger circuit 'B22', the area and power overheads reduced to only 0.07% and 0.11%, respectively, while the delay increased slightly by 1.87% for  $\tau = 1$ . For  $\tau = 2$ , the watermark introduced only a negligible 0.01% area overhead and the timing was improved. For most designs, as more watermark bits were inserted, lower  $P_c$  was obtained. If q > 1000,  $P_c < 10^{-200}$  for m =128. When a design possesses more than 600 disjoint cones,  $P_r$  <  $10^{-50}$  for m = 128. These results show that it is more difficult to remove a sufficient number of watermark bits successfully from a larger design. We have also experimented with k = m/2. It is found that the attacker can easily remove at least half of the watermark bits for a small number of watermarked circuits with  $N/m \le 2$  and the  $P_r$  values remain very low when N/m > 3. When N > 700 and m = 128, the probability of successfully removing more than 64 watermark bits is less than  $10^{-19}$ . In general,  $\tau$  can be increased to lower the overhead of a watermarked solution. However, as  $\tau$  increases, the number of qualified disjoint closed cones is reduced and  $P_c$  and  $P_r$  become higher.

TABLE II WATERMARKING RESULTS ON LGS YNTH93 BENCHMARKS

| Circuit      | #cells | N    | q    | m   | $\Delta A(\%)$ | ΔD(%) | $\Delta P(\%)$ | $P_c$     | $P_r$    |
|--------------|--------|------|------|-----|----------------|-------|----------------|-----------|----------|
| i7           | 269    | 61   | 61   | 16  | 10.85          | -8.70 | 10.17          | 7.52E-20  | 1.38E-06 |
| i2           | 190    | 57   | 57   | 16  | 3.68           | 28.30 | -2.11          | 2.64E-19  | 3.29E-06 |
| i9           | 239    | 30   | 30   | 16  | 6.34           | 5.70  | 12.73          | 1.05E-13  | 1.40E-02 |
| frg2         | 389    | 58   | 58   | 16  | 1.34           | 2.34  | 1.09           | 1.91E-19  | 2.63E-06 |
| rot          | 374    | 76   | 76   | 16  | 1.13           | -1.64 | 1.98           | 1.41E-21  | 8.37E-08 |
| apex5        | 420    | 78   | 78   | 16  | 3.49           | -0.80 | 5.72           | 8.87E-22  | 6.03E-08 |
| alu4         | 748    | 122  | 120  | 32  | 3.68           | -1.48 | 4.35           | 1.70E-39  | 3.33E-12 |
| apex6        | 381    | 72   | 72   | 32  | 3.31           | -0.80 | 0.47           | 8.16E-31  | 3.07E-06 |
| х3           | 430    | 96   | 91   | 32  | 2.34           | 0.00  | 0.85           | 6.28E-35  | 1.64E-09 |
| k2           | 542    | 100  | 76   | 32  | 1.74           | 0.47  | 0.64           | 8.64E-32  | 5.67E-10 |
| i8           | 416    | 80   | 74   | 32  | 2.59           | -2.06 | 1.33           | 2.60E-31  | 1.93E-07 |
| dalu         | 356    | 93   | 93   | 32  | 3.06           | 4.74  | 0.67           | 2.69E-35  | 3.75E-09 |
| des          | 1680   | 267  | 173  | 64  | 0.89           | 17.84 | 1.05           | 2.69E-68  | 2.62E-25 |
| spla         | 2015   | 364  | 265  | 64  | 1.38           | 0.00  | 1.04           | 2.26E-82  | 3.85E-32 |
| mm30a        | 844    | 215  | 215  | 64  | 0.96           | 20.83 | 0.61           | 1.28E-75  | 1.84E-20 |
| pair         | 1087   | 314  | 314  | 64  | 2.69           | 5.59  | 3.70           | 1.08E-87  | 6.76E-29 |
| i10          | 1431   | 335  | 335  | 64  | 1.32           | -0.48 | 1.48           | 1.07E-89  | 2.54E-30 |
| pdc          | 2288   | 452  | 403  | 64  | 1.46           | -0.62 | 1.60           | 2.48E-95  | 7.39E-37 |
|              |        |      |      | 128 | 2.57           | -1.85 | 2.18           | 2.75E-147 | 6.50E-42 |
| ex1010       | 5411   | 1316 | 1316 | 64  | 1.53           | 10.13 | 11.72          | 7.61E-130 | 1.09E-59 |
| exitoro      | 3411   | 1310 |      | 128 | 2.05           | 10.97 | 16.27          | 3.66E-220 | 1.15E-88 |
| ex1010       | 5411   | 1082 | 1082 | 64  | 1.28           | 9.28  | 10.35          | 2.97E-124 | 1.52E-55 |
| $(\tau = 2)$ | 3411   |      |      | 128 | 1.90           | 10.97 | 15.70          | 1.18E-208 | 2.99E-80 |
| elliptic     | 3626   | 793  | 682  | 64  | 1.53           | -6.29 | 0.00           | 6.32E-111 | 6.14E-49 |
|              |        |      |      | 128 | 1.82           | 11.98 | 0.22           | 7.28E-181 | 8.88E-67 |

In Table III, we compare the overheads of our watermarking method with two other methods in [10] and [11]. The comparison was performed by excerpting the results in [10] and [11] for the same benchmark circuits. In [10], irregular signature lengths were used. Two different signature lengths based on the cases of 4% and 8% of gates being constrained as pseudo-primary outputs [10] were used to compare the embedding overheads and  $P_c$  values with [10]. In [11],  $P_c$  was not evaluated. Thus only the area overhead was compared using the same watermark length equivalent to the number of additional constraints of [11]. An empty entry indicates unavailable result. It is found that for most designs, our method provides the strongest proof of authorship with lower overhead.

### V. CONCLUSION

We presented a constraint-based watermarking method at the logic synthesis level for IP protection. In our approach, good localities for watermark insertion are identified based on the notion of slack sustainability. The formal qualification of hosting cones, in conjunction with an incremental technology mapping, has several distinct advantages. First, the original design constraints can be satisfied with very low embedding overhead. Second, the signature length can be adaptively tailored to the embedding capacity of a design to optimize the ownership proof, watermark obscurity and resilience with low impact on circuit quality. Third, the existence of the watermark can be explicitly detected by a direct extraction method by comparing

the master design and the watermarked design. The correlation check also provides the cue on attacker's attempt of obliteration.

The experimental results show that our method generally performs better than other watermarking methods at the logic synthesis level in terms of the overhead and watermark strength under the same signature length. The results also indicate that the probability of successful random obliteration descends rapidly with increased disjoint cones to signature length ratio. There is potential to integrate our method with dynamic watermarking at higher design level [15] to offer greater flexibility to reuse the watermarked IP cores with low risk of successful attacks.

TABLE III COMPARISON OF LOGIC SYNTHESIS WATERMARKING METHODS

| Circuit | М   | Pro    | posed     | [1     | [11]     |        |
|---------|-----|--------|-----------|--------|----------|--------|
| Circuit |     | ΔA (%) | $P_c$     | ΔA (%) | $P_c$    | ΔA (%) |
| i7      | 18  | 14.83  | 2.91E-21  | 2.88   | 4.22E-14 |        |
|         | 36  | 15.91  | 1.65E-28  | 1.44   | 6.98E-28 |        |
| i2      | 22  | 6.62   | 6.83E-23  | 10.74  | 1.87E-13 |        |
|         | 43  | 8.26   | 1.48E-26  | 26.45  | 1.00E-23 |        |
| i9      | 19  | 6.74   | 3.49E-14  | 8.57   | 2.33E-14 |        |
| alu4    | 25  | 2.44   | 7.14E-34  | 6.82   | 5.84E-11 |        |
| aiu4    | 49  | 4.18   | 1.37E-49  | 11.82  | 3.34E-20 |        |
| frg2    | 14  | 0.24   | 6.02E-18  |        |          | 2.04   |
|         | 21  | 0.52   | 1.43E-22  | 2.98   | 3.01E-10 |        |
|         | 41  | 4.93   | 2.30E-27  | 8.61   | 4.15E-18 |        |
|         | 20  | 1.88   | 8.75E-25  |        |          | 1.53   |
| rot     | 24  | 2.27   | 1.58E-27  | 4.53   | 2.73E-12 |        |
|         | 48  | 3.89   | 7.13E-36  | 8.36   | 1.04E-22 |        |
|         | 20  | 3.75   | 3.06E-24  |        |          | 11.24  |
| apex6   | 26  | 3.75   | 5.40E-28  | 5.37   | 6.41E-16 |        |
|         | 51  | 6.58   | 5.75E-34  | 10.74  | 2.29E-29 |        |
| C2670   | 19  | 2.08   | 8.34E-26  |        |          | 2.55   |
|         | 29  | 2.49   | 2.63E-33  | 7.27   | 9.51E-16 |        |
|         | 58  | 4.75   | 1.94E-43  | 12.42  | 7.22E-29 |        |
| x3      | 18  | 1.61   | 8.08E-25  |        |          | 5.92   |
|         | 28  | 1.68   | 1.67E-32  | 7.89   | 5.93E-16 |        |
|         | 55  | 2.03   | 9.70E-43  | 14.66  | 5.12E-29 |        |
| k2      | 33  | 2.02   | 3.24E-32  | 2.91   | 1.84E-10 |        |
| K2      | 66  | 2.83   | 1.42E-32  | 6.05   | 3.00E-19 |        |
|         | 15  | 2.28   | 1.67E-20  |        |          | 0.86   |
| i8      | 34  | 3.46   | 4.24E-32  | -4.45  | 1.06E-10 |        |
|         | 67  | 6.77   | 3.77E-30  | -10.83 | 4.53E-23 |        |
| dalu    | 43  | 4.40   | 1.81E-40  | 4.19   | 1.07E-19 |        |
|         | 86  | 8.24   | 1.36E-36  | 11.26  | 3.88E-36 |        |
| C5315   | 55  | 4.53   | 2.24E-53  | 6.30   | 2.92E-32 |        |
|         | 110 | 7.57   | 8.50E-53  | 14.57  | 4.40E-59 |        |
| pair    | 19  | -0.27  | 1.46E-36  |        |          | 0.38   |
|         | 58  | 1.95   | 3.39E-82  | 6.73   | 3.99E-31 |        |
|         | 115 | 4.36   | 1.34E-123 | 14.62  | 6.88E-57 |        |
| C6288   | 97  | 1.93   | 9.36E-69  | 10.72  | 7.01E-51 |        |
| C7552   | 98  | 3.68   | 4.15E-100 | 8.25   | 1.48E-52 |        |
|         | 196 | 6.32   | 2.27E-110 | 17.93  | 4.42E-96 |        |
| des     | 112 | 2.12   | 5.23E-82  | 0.79   | 3.21E-55 |        |
| i10     | 119 | 2,24   | 7.27E-130 | 4.26   | 5.98E-50 |        |
|         | 238 | 3.88   | 1.34E-158 | 8.67   | 4.48E-94 |        |
| apex5   | 4   | 1.48   | 4.38E-08  |        |          | 0.38   |
| S9234   | 19  | 0.38   | 4.09E-32  |        |          | 2.61   |
| S5378   | 20  | 1.73   | 8.88E-26  |        |          | 2.60   |
| mm30a   | 20  | -0.04  | 1.30E-34  |        |          | 1.36   |

## REFERENCES

- [1] C. H. Chang, Z. Ye and M. Zhang, "Fuzzy-ART based adaptive digital watermarking scheme," *IEEE Trans. on Circuits and Syst. for Video Technology*, vol. 15, no. 1, Jan. 2005, pp. 65-81.
- [2] A. T. Abdel-Hamid, S. Tahar and E.M. Aboulhamid: "A Survey on IP Watermarking Techniques," *Design Automation for Embedded Syst.*, vol. 10, 2005, pp. 1-17, Springer Verlag.

- [3] J. Lach, W. H. Mangione-Smith and M. Potkonjak, "Fingerprinting digital circuits on programmable hardware," *Information Hiding*, Lecture Notes in Computer Science, vol. 1525, Springer Verlag, 1998, pp. 16-31.
- [4] J. Lach, W. H. Mangione-Smith and M. Potkonjak, "FPGA fingerprinting techniques for protecting intellectual property," *Proc. IEEE\_Custom Integrated Circuits Conf.*, Santa Clara, California, May 1998, pp. 299-302.
- [5] I. Hong and M. Potkonjak, "Techniques for intellectual property protection of DSP designs," *Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing*, Seattle, Washington, USA, vol.5, May 1998, pp. 3133-3136.
- [6] G. Qu and M. Potkonjak, "Hiding Signatures in Graph Coloring Solutions," *Proc. 3rd International Workshop on Information Hiding*, Dresden, Germany, Sept. 1999, pp. 348-367.
- [7] G. Qu and M. Potkonjak, *Intellectual Property Protection in VLSI Design: Theory and Practice*. Boston: Kluwer Academic Publishers, 2003.
- [8] A. B. Kahng, J. Lach, W. H. Mangione-Smith, S. Mantik, I. L. Markov, M. Potkonjak, P. Tucker, H. Wang and G. Wolfe, "Constraint-based watermarking techniques for design IP protection," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Syst.*, vol. 20, no. 10, Oct. 2001, pp. 1236-1252.
- [9] D. Kirovski, Y. Y. Hwang, M. Potkonjak and J. Cong, "Intellectual property protection by watermarking combinational logic synthesis solutions," *Proc. IEEE/ACM Int. Conf. on CAD*, San Jose, California, Nov. 1998, pp.194-198.
- [10] D. Kirovski, Y. Y. Hwang, M. Potkonjak and J. Cong, "Protecting Combinational Logic Synthesis Solutions," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Syst.*, vol. 25, no. 12, Dec. 2006, pp. 2687-2696.
- [11] S. Meguerdichian and M. Potkonjak, "Watermarking while preserving the critical Path," *Proc. ACM/IEEE Design Automation Conf.*, Los Angeles, California, June 2000, pp. 108-111.
- [12] M. Moiz Khan and S. Tragoudas, "Rewiring for watermarking digital circuit netlists," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Syst.*, vol. 24, no. 7, July 2005, pp. 1132-1137.
- [13] D. Kirovski and M. Potkonjak, "Local Watermarks: Methodology and Application on Behavioral Synthesis," *IEEE Trans. on Computer-Aided Design of Integrated Circuits and Syst.*, vol. 22, no. 9, Sept. 2003, pp.1277-83.
- [14] A. Cui and C. H. Chang, "Stego-signature at logic synthesis level for digital design IP protection," *Proc. IEEE Int. Symp. on Circuits and Syst.*, Kos, Greece, May 2006, pp. 4611-4614.
- [15] A. T. Abdel-Hamid, S. Tahar and E. M. Aboulhamid, "A public-key watermarking technique for IP designs," *Proc. Design, Automation and Test in Europe*, vol. 1, Munich, Germany, Mar. 2005, pp. 330-335.
- [16] J. Cong and H. Huang, "Depth optimal incremental mapping for field programmable gate arrays," *Proc. ACM/IEEE Design Automation Conf.*, Los Angeles, California, USA, June 2000, pp.290-293.
- [17] J. Y. Jou and D. S. Chou, "Sensitisable-path-oriented clustered voltage scaling technique for low power," *IEE Proc. Computers and Digit. Tech.*, vol. 145, no. 4, July 1998, pp. 301-307.



Aijiao Cui (S'06) received her B. Eng. degree and M. Eng. degree in electronics from Beijing Normal University, Beijing, China, in 2000 and 2003, respectively. From July 2003 to December 2004, she has been working as a teaching assistant in Beijing Jiaotong University.

Since January 2005, she has been pursuing the Ph.D. degree in Electrical and Electronic Engineering in Nanyang Technological University, Singapore. Her main

research interest is digital watermarking for IP protection.



**Chip-Hong Chang** (S'92–M'98–SM'03) received his B.Eng. (Hons) from National University of Singapore in 1989, and his M.Eng. and Ph.D. from Nanyang Technological University (NTU), Singapore in 1993 and 1998, respectively.

Dr. Chang served as Technical Consultant in industry prior to joining the School of Electrical and Electronic Engineering, NTU in 1999, where he is now an Associate Professor. He holds joint appointments at the university as

Deputy Director of the Centre for High Performance Embedded Systems (CHiPES) since 2000, and Program Director of the Centre for Integrated Circuits

and Systems (CICS) since 2003. His current research interests include low power arithmetic circuits, digital filter design, application specific digital signal processing and digital watermarking for IP protection. He has published three book chapters and more than 130 research papers in international refereed journals and conference proceedings. He serves as an Editorial Advisory Board Member of The Open Electrical and Electronic Engineering Journal (OEEE) and has been involved as organizing and program committee of various international conferences. He is listed in the 2008 and 2009 Marquis Who's Who in the World.

Dr. Chang is a Fellow of the IET, UK.



Sofiène Tahar (M'96–SM'07) received in 1990 the Diploma degree in computer engineering from the University of Darmstadt, Germany, and in 1994 the Ph.D. degree with Distinction in computer science from the University of Karlsruhe, Germany. From 1995 to 1996, he was a postdoctoral fellow at the Université de Montréal, Quebec, Canada. Currently he is Professor in the Department of Electrical and Computer Engineering at Concordia University, Montreal, Quebec, Canada.

Dr. Tahar is founder and director of the Hardware Verification Group at Concordia University. He has made contributions and published papers in the areas of formal hardware verification, microprocessor and system-on-chip verification, VLSI design automation, and communications architectures and protocols.

Dr. Tahar has been organizing and involved in various international conference program committees as well as national research grant selection committees. In 1998, he received a Canada Foundation for Innovation (CFI) Researcher Award. From 2001 to 2006, he held a junior Concordia University Research Chair in Formal Verification of Microelectronics Systems. In 2007 he was named University Research Fellow upon receiving the University Research Award. In the same year, he has been appointed senior Concordia University Research Chair in Formal Verification of System-on-Chip.

Dr. Tahar is a Professional Engineer in the Province of Quebec.