Watermarking is proposed as a means to protect the intellectual property content of electronic systems from copyright infringement. The technique consists of implanting indelible stamps in the circuit's inner structure, while not disrupting its functionality or degrading its performance. In this paper a novel method is proposed for the creation of watermarks in regular sequential functions operating on finite input/output sets. This is an important class of circuits, as it is the basis of most digital controllers. Algorithms are proposed for implanting and detecting watermarks so as to minimize implementation overhead for a specified level of robustness.
Introduction
Today, electronic systems are built in large part using stand-alone, individually packaged chips, assembled on ad hoc printed circuit boards. The industry is currently shifting to a new design paradigm based on the system-on-chip concept. Future systems will be assembled by integrating several building blocks, so-called virtual components, on the same silicon substrate. Virtual components, associated with intellectual properties (IPs), will be designed by independent firms, possibly for a number of technologies and applications. To ensure that proper mechanisms exist to govern the exchange and management of IPs, a set of standards and interfaces is currently being defined [1].
One of the fundamental requirements to promote a practical system-on-chip design paradigm is that copyrights of the design and of its building blocks be safeguarded. In particular, it will become essential that the industry find ways to fight potential IP copyright infringement. Currently, design copyright laws are enforced by means of non-disclosure agreements and patents. However, the costs involved in preventing or containing IP infringement and tracking espionage, if at all possible, may be too high. A promising alternative is deterrence. One such scheme requires the capability of effectively detecting and subsequently tracking IP infringement cases. This task can be accomplished by embedding a unique code, or watermark, exploiting the IP's unique features. Fundamental requirements for a watermark are that it be (1) transparent, i.e. not interfering with the design functionality, (2) robust, i.e. hard to remove or forge, and (3) detectable, i.e. easy to extract from the design. The process used for managing watermarks need not be proprietary, while the code used in the encryption process should be secret for any released IP.
Recently, watermarking has been applied to digital audio-visual IPs [2, 3]. Similar schemes have also been proposed for electronic IPs [4, 5, 6, 7]. At least two types of watermarking schemes exist. The first scheme, known as active watermarking, consists of integrating the watermark as a part of the design process, thus allowing the creation of an arbitrarily high number of unique watermarks. The second scheme, known as passive watermarking, is aimed at adding the watermark to a design making use of existing structures, thus requiring no redesign but allowing limited tracking flexibility. Both approaches are robust, since the deletion of the watermark results in removing wanted functionality. IP protection based on watermarking consists of two phases: synthesis and detection. In this paper we propose a set of algorithms for synthesizing watermarks in an important class of circuits which implement regular sequential functions operating on finite input/output sequences [8]. Sequential functions are modified to generate a predictable output when unexpected signals are applied to the input. In this context, the watermark is a pair of input/output sequences of symbols which cannot occur during normal operation. Such sequences are hidden among "legal" input/output sequences, thus making it extremely time-consuming to track and remove them, with the risk of accidentally modifying intended functionality.
Sequential circuits are generally complex and highly optimized automata, developed both in stand-alone and embedded processors. In order to maximally exploit the advantages of a particular technology, there is little room for overhead in the form of additional circuits and/or signals. For this reason, the proposed algorithms operate in both active and passive synthesis regimes, and they are designed to prevent excessive implementation overhead for a specified level of detection confidence.
The synthesis phase is fully characterized by (a) a set of algorithms translating design features into a unique watermark, (b) $t$, the worst-case time required to forge and/or delete the watermark, and (c) $P_c$, the odds that a design carries an unintended watermark in part or in its totality. The detection phase is fully characterized by (d) $P_m$, the probability of a miss, and (e) $P_f = P_c$, the probability of a false alarm. Watermarking should be performed simultaneously at various levels of abstraction [4]. The goal is to improve the robustness of the approach and to allow quick and accurate tracking of the last licensee, who ultimately caused the infringement. The paper is organized as follows. A formulation of the problem is presented in Section 2. Section 3 outlines the process of modifying the inner structure of regular sequential functions to add the watermark. Detection techniques are presented in Section 4, examples in Section 5.
General Problem Formulation
Regular sequential functions operating on finite input/output sets can be specified by means of a Finite State Machine (FSM). An FSM is a discrete dynamical system translating sequences of input vectors into sequences of output vectors, and it is generally represented by State Transition Graphs (STGs) and State Transition Tables (STTs).
In the remainder of the paper we will restrict our focus to deterministic FSMs, using the same notation as [8] and [9]. Completely specified FSMs (CSFSMs) contain every element of set $\Sigma^*$, i.e. every input sequence in $\Sigma^*$ results in a unique output sequence in $\Delta^*$. An incompletely specified FSM (ISFSM) is one in which there exist some transition relations with unspecified destination and/or output, i.e. there exists a set of input sequences for which no output is specified. Call this set $I_u \subset \Sigma^*$. Conversely, there exists a set of output sequences which can be produced only by unspecified input sequences. Call this set $O_u \subset \Delta^*$. The problem of minimizing the number of states in CSFSMs can be solved in polynomial time [10]. For ISFSMs the problem is known to be NP-complete [11].
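As a minimal illustration of these definitions, the sketch below stores an FSM as a state transition table and checks whether it is completely specified. The dictionary layout, the toy machine and the names `run` and `is_completely_specified` are assumptions made for illustration only; they are not notation taken from [8] or [9]. For simplicity, an unspecified transition is modeled as lacking both destination and output.

```python
from itertools import product

# Hypothetical STT encoding: (state, input) -> (next_state, output).
# A missing key models an unspecified transition.
STT = {
    ("q0", "0"): ("q0", "0"),
    ("q0", "1"): ("q1", "0"),
    ("q1", "0"): ("q0", "1"),
    # ("q1", "1") is left unspecified, so this toy machine is an ISFSM.
}
STATES = ["q0", "q1"]
INPUTS = ["0", "1"]


def run(stt, reset, inputs):
    """Translate an input sequence into an output sequence.

    Returns None as soon as an unspecified transition is reached.
    """
    state, outputs = reset, []
    for symbol in inputs:
        if (state, symbol) not in stt:
            return None
        state, out = stt[(state, symbol)]
        outputs.append(out)
    return outputs


def is_completely_specified(stt, states, inputs):
    """True when every (state, input) pair has a specified transition."""
    return all((q, a) in stt for q, a in product(states, inputs))
```

In this toy machine the input sequence `1, 1` has no specified output, which is the kind of sequence the watermark later exploits.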
where x and are constraints on the watermark robustness. Problem 1 can be partitioned into two tasks. The first task consists of computing the size of IO signature U so as to satisfy the constraints on the confidence. The second task is that of finding the actual IO signature so as to minimize the overhead of $M''$. The IO signature must be generated with some degree of randomness to ensure that an identical code cannot be generated simply by running the same algorithm again. The randomized algorithm is controlled by key k. The overhead accounts for added states and logic.
Synthesizing watermarks in CSFSMs requires first that the machine be translated into an ISFSM. This can be accomplished by extending the input and/or output alphabets $\Sigma$ and $\Delta$. The resulting machine is then handled by solving Problem 1. Hence, the procedure can be seen as a preprocessing step to a general watermark synthesis step.
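One possible way to carry out this preprocessing step is sketched below, under the assumption that input vectors are encoded as bit strings and that the input alphabet is extended by a single bit. The text does not prescribe how the extension is performed; `extend_input_alphabet`, `count_free` and the toy machine are hypothetical names and data.

```python
from itertools import product

# Toy CSFSM with one input bit: (state, input) -> (next_state, output).
CSFSM = {
    ("q0", "0"): ("q0", "0"),
    ("q0", "1"): ("q1", "1"),
    ("q1", "0"): ("q0", "1"),
    ("q1", "1"): ("q1", "0"),
}


def extend_input_alphabet(stt, normal_bit="0"):
    """Prefix every input vector with one extra bit.

    The original behaviour is kept when the new bit equals normal_bit;
    every configuration with the opposite value stays unspecified.
    """
    return {(q, normal_bit + a): target for (q, a), target in stt.items()}


def count_free(stt, states, n_input_bits):
    """Number of (state, input) pairs without a specified transition."""
    inputs = ["".join(bits) for bits in product("01", repeat=n_input_bits)]
    return sum((q, a) not in stt for q in states for a in inputs)


ISFSM = extend_input_alphabet(CSFSM)
```

Here the two-state machine has no free configuration before the extension and four afterwards, at the price of one additional input signal.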
A passive watermarking scheme consists of generating signature U from a given ISFSM without modifying the machine itself. The process consists of first minimizing the FSM, thus synthesizing a CSFSM. Then, all sections of the non-specified IO mapping are designated as an IO signature. Randomization of the signature, controlled by key k, resides in the FSM minimization algorithms. Hence, the probability of accidentally synthesizing the same watermark is bounded by the degrees of freedom of the algorithm and/or by its level of randomization.
IO Signature Generation
In this section a solution to Problem 1 is proposed. At least two approaches exist for the generation of IO signature U. The first involves the generation of new transition relations in the FSM's STG or STT, while the second calls for the augmentation of $\Sigma$, $\Delta$ or $Q$. All these modifications are likely to increase the size of the machine, but do not necessarily do so.
Let $q' \in Q'$ denote a state in an ISFSM $M'$ and let $q'_0$ be its reset state. Let $I^{(q')}$ be the set of all the input configurations in $q'$ for which no next state is specified; call such configurations free. Define $U'$ to be the set of all the states with incompletely specified transition relations, i.e. $U' = \{q' \in Q' : |I^{(q')}| > 0\}$. The total number of free input configurations $n$ is bounded as follows:
$$n \le n_{max} = \sum_{q' \in Q'} |I^{(q')}|. \qquad (2)$$
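The quantities defined above can be computed directly from a state transition table. The sketch below assumes a hypothetical dictionary encoding of the STT with bit-string inputs; the function names and the toy machine are illustrative and not part of the original formulation.

```python
from itertools import product

# Toy ISFSM with two input bits: (state, input) -> (next_state, output).
STATES = ["q0", "q1", "q2"]
STT = {
    ("q0", "00"): ("q1", "0"), ("q0", "01"): ("q2", "0"),
    ("q0", "10"): ("q0", "1"), ("q0", "11"): ("q0", "1"),
    ("q1", "00"): ("q0", "1"), ("q1", "01"): ("q2", "0"),
    ("q2", "00"): ("q0", "0"),
}


def free_configurations(stt, states, n_input_bits):
    """Map each state q' to its list of free input configurations."""
    inputs = ["".join(bits) for bits in product("01", repeat=n_input_bits)]
    return {q: [a for a in inputs if (q, a) not in stt] for q in states}


def incompletely_specified_states(free):
    """The set U' of states with at least one free configuration."""
    return {q for q, configs in free.items() if configs}


def n_max(free):
    """Upper bound (2): total number of free configurations."""
    return sum(len(configs) for configs in free.values())


FREE = free_configurations(STT, STATES, 2)
```

For this toy machine, state q0 is fully specified, q1 has two free configurations and q2 has three, so the signature can be at most five symbols long.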
Every state $q' \in U'$ must necessarily be reachable $|I^{(q')}|$ times, using each time one of the remaining free input configurations in $I^{(q')}$. Suppose that a sequence $x$ of all the visited states exists; call $s$ the input sequence which forces $x$. The resulting output sequence $d$, of length $n$, will be one of $[2^{|\Delta|}]^n$ possible implementations. Hence, the odds that an identical sequence be produced by $M$ are given by (3).
The second term of the denominator is given by the fact that one such sequence will result from the given input sequence in the CSFSM. By bounding $P_c$ as required by (1) and solving (3) with respect to $n$, one obtains the lower bound $n_{min}$ of (4). In some cases it is not possible to satisfy both (2) and (4) to meet specification (1), i.e. $n_{min} > n_{max}$. Hence, either (1) must be relaxed and/or $n_{max}$ must be increased.
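The sizing step can be illustrated numerically. Since equations (3) and (4) are not reproduced here, the sketch below assumes one form consistent with the description, namely a single favourable outcome over a denominator with two terms, $P_c = 1 / ([2^{|\Delta|}]^n - 1)$, with $|\Delta|$ read as the number of output bits. This form, the bound `p_bound` and all function names are assumptions, not the authors' stated equations.

```python
from fractions import Fraction


def coincidence_probability(n, output_bits):
    """Assumed form of (3): 1 / ((2^output_bits)^n - 1)."""
    if n < 1 or output_bits < 1:
        raise ValueError("n and output_bits must be at least 1")
    return Fraction(1, (2 ** output_bits) ** n - 1)


def n_min(p_bound, output_bits):
    """Smallest signature length whose assumed P_c does not exceed p_bound."""
    if p_bound <= 0:
        raise ValueError("p_bound must be positive")
    n = 1
    while coincidence_probability(n, output_bits) > p_bound:
        n += 1
    return n


def feasible(p_bound, output_bits, n_max):
    """True when the length required by the bound fits within n_max."""
    return n_min(p_bound, output_bits) <= n_max
```

Under this assumption, a bound of one in a million needs 20 signature symbols with a single output bit and 10 with two output bits, which shows why expanding the output alphabet can rescue a machine with too few free configurations.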
Suppose constraints (2) and (4) are satisfied; then an output sequence $d_U \in \Delta^*$ and the states which can produce it must be selected.
The wanted output is generated by an $n$-long sequence of states in $U'$. The sequence can be seen as a path $p_U = (q'_0, u'_1, \ldots)$ covering a subset of the states in $U'$, with or without repetition. It is assumed, but it is not necessary, that $q'_0 \in U'$. If this were not the case, a different first state, say $q'_1 \in U'$, could be selected for $p_U$, and input sequence $s_U$ would need to be augmented with an input sequence $s$ such that $\delta'(q'_0, s) = q'_1$. The generation of $p_U$ does not contribute to the probability of coincidence $P_c$, but it does determine the impact state minimization will have on the final machine. The second factor impacting the effectiveness of the optimization is the selection of input sequence $s_U$.
Several alternatives are proposed for the generation of the input sequence $s_U$ to minimize overhead. The first method consists of performing an exhaustive search of the decision tree. For each path a CSFSM is synthesized and the smallest machine is selected. The second method is a Monte Carlo approach, in which a set of input sequences is selected at random from all the feasible ones. The CSFSMs corresponding to such sequences are generated and the smallest one is selected. The third method is based on a branch-and-bound search. At each level of the tree an estimate is computed for the machine associated with each sub-tree underlying any decision. Such an estimate is computed using a Monte Carlo approach. All the sub-trees with higher estimates are pruned, while the surviving sub-trees are explored further.
As a byproduct of Step 6 the FSM is synthesized. A passive watermarking scheme is applied to ISFSMs only. The method assumes that randomization can be introduced by the ISFSM synthesis. It consists of converting the original ISFSM into a CSFSM using a given optimization criterion. Then, an IO signature is selected at random from all the possible ones available.
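The Monte Carlo alternative for active watermarking can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the STT encoding, the toy machine and all function names are hypothetical, and `minimized_size` is a crude stand-in for a real minimizer such as STAMINA. It completes leftover free configurations with a default self-loop and merges equivalent states by partition refinement, without exploiting don't-cares or removing unreachable states.

```python
import random
from itertools import product

# Toy ISFSM: (state, input) -> (next_state, output); missing keys are free.
STATES = ["q0", "q1", "q2"]
INPUTS = ["00", "01", "10", "11"]
STT = {
    ("q0", "00"): ("q1", "0"), ("q0", "01"): ("q2", "1"),
    ("q1", "00"): ("q0", "1"), ("q1", "01"): ("q1", "0"),
    ("q2", "00"): ("q2", "0"), ("q2", "01"): ("q0", "1"),
}
D = ["1", "0", "1"]  # output half of the signature to embed


def free_inputs(stt, state):
    return [a for a in INPUTS if (state, a) not in stt]


def embed_once(stt, reset, d, rng):
    """Embed d along one random path of free configurations.

    Returns (input_sequence, watermarked_stt), or None on a dead end.
    """
    wm, state, s = dict(stt), reset, []
    for i, out in enumerate(d):
        free = free_inputs(wm, state)
        if not free:
            return None
        a = rng.choice(free)
        wm[(state, a)] = None  # reserve it before choosing a successor
        last = i == len(d) - 1
        successors = [q for q in STATES if last or free_inputs(wm, q)]
        if not successors:
            return None
        nxt = rng.choice(successors)
        wm[(state, a)] = (nxt, out)
        s.append(a)
        state = nxt
    return s, wm


def _renumber(signatures):
    """Replace each distinct signature by a small integer block id."""
    ids = {}
    return {q: ids.setdefault(sig, len(ids)) for q, sig in signatures.items()}


def minimized_size(stt):
    """State count after default completion and merging equivalent states."""
    full = {(q, a): stt.get((q, a), (q, "0")) for q, a in product(STATES, INPUTS)}
    block = _renumber({q: tuple(full[(q, a)][1] for a in INPUTS) for q in STATES})
    while True:
        refined = _renumber({
            q: (block[q],) + tuple(block[full[(q, a)][0]] for a in INPUTS)
            for q in STATES
        })
        if len(set(refined.values())) == len(set(block.values())):
            return len(set(block.values()))
        block = refined


def monte_carlo_embed(stt, reset, d, trials=10, seed=0):
    """Keep the candidate whose completed machine is smallest."""
    rng = random.Random(seed)  # the seed plays the role of key k
    best = None
    for _ in range(trials):
        result = embed_once(stt, reset, d, rng)
        if result is None:
            continue
        s, wm = result
        size = minimized_size(wm)
        if best is None or size < best[0]:
            best = (size, s, wm)
    return best
```

Calling `monte_carlo_embed(STT, "q0", D)` returns the smallest machine found together with the input sequence that triggers the embedded output. The default of ten trials mirrors the limit used in the experiments; the exhaustive and branch-and-bound variants would replace the random sampling loop with a systematic traversal of the same decision tree.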
The only way to synthesize a CSFSM from the original ISFSM which contains an identical IO signature is to use the same synthesis engine with an identical set of parameters and optimization criteria. Hence, $P_c$ can be derived in this case as the inverse of the number of all possible machines which can be generated from an ISFSM of a certain size and structure with the given engine.

Watermark Detection
In the previous sections we have proposed techniques to generate an IO signature $U = \{s_U, d_U\}$ and to embed it in the machine. To properly analyze the effects of tampering, let us consider the following scenarios: (1) specifications on the IO mapping are known, (2) the IO mapping is not known but the STG of the CSFSM is known, and (3) no STG is known. In case (1), infringement cannot be prevented, since the aggressor can resynthesize the FSM from the specifications using techniques proposed, e.g., in [9]. In case (2), the IO mapping can be derived from the STG, but usually at an extreme computational/storage cost, which makes it impractical in most cases. In this case the aggressor may either (a) modify state transition relations, i.e. change the output or next state associated with a transition relation, or (b) apply the techniques proposed in this paper to watermark CSFSMs. In both cases, part or the totality of the watermark will be unchanged, but it may be corrupted locally. Tampering (a) may in fact result in a change in the functionality of the machine; it is therefore counterproductive. Tampering (b) will only result in literal swaps.
As an example, consider the IO signature $U = \{(\ldots, 01, 00, 10, 01, 00, 11, 10, 01); (0, 1, 1, 0, 1, 0, 1, 1, 1)\}$.
Suppose that tampering has removed or corrupted the median section of $d_U$, i.e. (0, 1, 0); then the sections of the IO signature which are still intact can be matched to U using the genome-search algorithm described in detail in [4]. The algorithm returns an estimate of the probability that the design does in fact contain the watermark. Note that, by construction, it is known when the reset state is reached. Hence, the boundary symbols, or operons, of each "gene" are known. An alternative method is to use correction schemes such as CRC to detect and correct corrupted subsequences.
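The matching of intact sections can be illustrated with a much simpler stand-in than the genome-search algorithm of [4], which is not described here. The sketch below aligns the observed output with the expected output sequence position by position, reports the maximal runs that still agree, and estimates how likely that much agreement would be by chance, assuming independent and uniformly distributed output symbols. The function names and the corrupted sequence are hypothetical.

```python
from fractions import Fraction

# Output half of the example signature, and a copy whose median
# section (0, 1, 0) has been overwritten by tampering.
EXPECTED = (0, 1, 1, 0, 1, 0, 1, 1, 1)
OBSERVED = (0, 1, 1, 1, 0, 1, 1, 1, 1)


def intact_sections(expected, observed):
    """Return (start, length) for each maximal run where both agree."""
    runs, start = [], None
    for i, (e, o) in enumerate(zip(expected, observed)):
        if e == o and start is None:
            start = i
        elif e != o and start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, min(len(expected), len(observed)) - start))
    return runs


def chance_of_match(runs, output_bits=1):
    """Probability that the matched symbols agree purely by chance.

    Assumes independent, uniformly distributed output vectors.
    """
    matched = sum(length for _, length in runs)
    return Fraction(1, (2 ** output_bits) ** matched)


RUNS = intact_sections(EXPECTED, OBSERVED)
```

With the median section corrupted, two runs of three symbols survive, and under the uniformity assumption the chance of such a match arising accidentally is 1/64. A realistic signature would be considerably longer, making the residual evidence correspondingly stronger.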
Results
In our experiments we have used FSMs from the IWLS93 benchmark set. The tools were implemented in C/C++ and run under UNIX and Linux operating systems. Watermarking was performed on ISFSMs as well as CSFSMs. The robustness constraint was selected so as to require, in some cases, expansion of $\Sigma$ and/or $\Delta$. The increase in the number of states $|Q|$ and input/output bits $|\Sigma|$ is expressed by the area estimates.
The estimates are based on technology mapping obtained with SIS [12] using the Msu script. Table 1 lists all relevant experimental data and specifications on the robustness of the watermark. For the FSM minimization stage in the algorithm of Figure 3 the tools STAMINA and NOVA [8] were used. As expected, larger FSMs require less overhead for comparable robustness. Note, as shown in benchmark ex1, that overhead can be traded for smaller values of z. The overhead of benchmark s27 was extremely high due to the increase of the output alphabet. Such expansion was, however, necessary to boost the watermark's confidence. Exhaustive search could be performed on only one benchmark due to the extreme computational complexity of the method. For the other circuits an estimate or a lower bound of the time required by the search was computed. Such time estimates were derived by multiplying the time required by one minimization by the minimum number of free configurations, i.e. $2^{|I^{(min)}|} n_{min}$, where $|I^{(min)}| = \min_{q'} |I^{(q')}|$. In the Monte Carlo approach a maximum of ten input sequences $s_U$ were explored. Alternatively, one could select such an upper bound based on some estimate or measurement of the standard deviation of the minimized machine's size.
Conclusions
A watermark-based scheme has been proposed to protect the intellectual property content of sequential functions operating on finite input/output sets. By modeling such functions as finite state machines and exploiting some unutilized input vectors, modifications were introduced so as to trigger a specific response with known input excitations. We have shown how the odds of reproducing an identical behavior can be made arbitrarily small. We have demonstrated how machines, which have been infringed upon, are effectively detected.
