Abstract-Physical unclonable functions (PUFs) can extract chip-unique signatures from integrated circuits (ICs) by exploiting the uncontrollable randomness due to manufacturing process variations. These signatures can then be used for many hardware security applications including authentication, anticounterfeiting, IC metering, signature generation, and obfuscation. However, most of these applications require error correcting methods to produce consistent PUF responses across different environmental conditions. This paper presents a novel method to enable lightweight, secure, and reliable PUF-based authentication. A two-level finite-state machine (FSM) is proposed to correct erroneous bits generated by environmental variations (e.g., temperature, voltage, and aging variations). In the proposed method, each PUF response is mapped to a key during the design phase. The actual key can be determined from the PUF response only after the chip is fabricated. Because the key is not known to the foundry, the proposed approach prevents counterfeiting. The performance of the proposed method and other applications are also discussed. Our experimental results show that the cost of the proposed self-correcting two-level FSM is significantly less than that of the commonly used error correcting codes. It is shown that the proposed self-correcting FSM consumes about 2× to 10× less area and about 20× to 100× less power than the Bose-Chaudhuri-Hocquenghem codes.
effort should only be focused on networks and software is no longer valid given globalization of integrated circuits and systems design and fabrication. In 2011, the Semiconductor Industry Association pegged the cost of electronics counterfeiting at U.S. $7.5 billion per year in lost revenue and tied it to the loss of 11 000 U.S. jobs [1] . From a national defense perspective, unsecured devices can be compromised by the enemy, putting military personnel and equipment in danger. Therefore, securing integrated circuit (IC) chips is extremely important.
The physical unclonable function (PUF) is an emerging security primitive and a powerful tool for chip authentication and cryptographic applications [2], [3]. Contrary to standard digital systems, PUFs extract secrets from complex properties of a physical material rather than storing them in a nonvolatile memory. It is nearly impossible to predict, clone, or duplicate PUFs. When a PUF is provided with an input (or challenge), the output (or response) should satisfy the following three properties: 1) unique output due to interchip variation; 2) random output that is difficult or impossible to model; and 3) reliable output that is consistent across different environmental conditions. The challenge and response pairs (CRPs) of a PUF are used to generate chip-unique signatures for an authentication system. Unfortunately, PUFs are noisy in nature. The response of a PUF is affected by intrachip variation sources such as temperature changes, voltage drifts, and aging effects. The PUFs reported in the literature, such as the optical PUF [4], multiplexer PUF [5], ring oscillator PUF [3], butterfly PUF [6], SRAM PUF [7], sensor PUF [8], bistable ring PUF [9], memristor PUF [10], and spin-transfer torque MRAM PUF [11], are not 100% stable. However, cryptography in general relies on the existence of precisely reproducible keys. As a result, plain PUF responses are not suitable as cryptographic keys. One solution to this problem is to add a stage that corrects the errors after collecting the PUF response, based on error correcting codes (ECCs) such as Bose-Chaudhuri-Hocquenghem (BCH) codes [12], [13] or fuzzy extractors [14], [15]. In order to obtain reliable responses from PUFs while still keeping PUFs attractive for low-cost hardware applications, the error correction technique must be implemented in hardware as well.
Moreover, its implementation should be area-efficient; otherwise, it will defeat the purpose of using PUFs in lightweight hardware devices. These error correcting techniques are not tailored to PUF applications. In fact, the BCH codes are usually used in storage devices and communication systems [16] , and the first fuzzy extractor construction is aimed at biometric applications [17] .
This paper presents a two-level finite-state machine (FSM) architecture which can be used to authenticate a chip or IP inside a chip. Besides, a novel self-correcting approach is also proposed based on the two-level FSM which eliminates the use of high overhead error correcting techniques. In the literature, FSM-based techniques have been incorporated into a number of works on hardware device protection, which include active metering [18] , remote IC activation [19] , field-programmable gate array (FPGA) IP binding [20] , and obfuscation [21] , [22] . The major advantage of using FSM is that it is not extractable from the synthesized design. Thus, even for an adversary who has access to the synthesized hardware IP, extracting or changing the FSM would require significant redesign of all the stages [23] . By utilizing the benefit of FSM, our proposed approach can achieve a lightweight, secure, and reliable authentication approach. Different from previous works, we incorporate self-correction into the FSM, which could eliminate the use of high-overhead error correcting methods. This paper perhaps is the first work to propose an approach for PUF-based local authentication without using ECC or fuzzy extractor. In the proposed method, each PUF response is mapped to a key during design phase. The actual key can be determined from the PUF response only after the chip is fabricated. Because the key is not known to the foundry, the proposed approach prevents counterfeiting. Furthermore, the key can be changed by changing the challenge of the PUF. Thus, keys can be updated in a future time as well; this provides additional flexibility in securing used chips.
The proposed research is motivated by several applications. In one application, third party IP can be protected using the proposed approach where each IP includes a PUF. In the proposed approach, the IP vendor creates a key for each possible response where the key can be longer or of same length as the PUF response. This key is integrated into the design in the proposed approach. The IP vendor provides the response-key mapping to the designer or owner of the chip. The foundry does not have access to this mapping. After fabrication, the PUF response is determined. Unique key of each chip can be determined by the designer from the PUF response and mapping provided by the IP vendor. This key can be supplied to the customer or a trusted third party to authenticate each chip or IP within the chip. Since the foundry does not have access to the response-key mapping, it cannot profit by selling excess parts. In other applications such as sensor networks, devices can be authenticated locally using the proposed approach without requiring Internet connection. A key contribution of this paper is the error-correcting ability of the two-level FSM if the PUF response is not error-free. Up to m bits of the error can be corrected by the FSM without requiring an ECC decoder.
The rest of this paper is organized as follows. Section II introduces the background of PUF-based authentication and error correcting techniques used for these PUF-based authentication schemes. In Section III, we present the two-level FSM design that can be used for local authentication and IP binding. Section IV describes the novel low-cost self-correcting method that is integrated into the two-level FSM. In Section V, we discuss other applications of the proposed self-correcting FSM. We report the performance of the proposed method and comparison with the BCH codes in Section VI. We also introduce the concept of hierarchical authentication in Section VII. Finally, Section VIII presents remarks, conclusions and future directions.
II. BACKGROUND
A. PUF-Based Authentication
1) Remote Authentication: As the PUF response is unique and unpredictable for each IC, it is straightforward to use a PUF for IC authentication. Authentication using PUF response bits as secret keys has been explored in many previous works [12], [13], [24], [25]. Most existing PUF-based authentication schemes are remote: they involve a device and a trusted party (the server), with a communication link established between them. During the enrollment phase, the trusted party applies randomly chosen challenges to the device to collect unpredictable responses and stores these CRPs in a database for future authentication operations. Later, if a device initiates an authentication request, the trusted party sends a challenge to the device and obtains the PUF response through the communication link. The device is considered authentic only if the response is the same as the previously recorded one or differs from it only within a predefined range. However, remote authentication schemes suffer from man-in-the-middle attacks: after collecting a set of CRPs, the adversary can perform a modeling attack and build a software model of the PUF.
2) Local Authentication: The problem can be resolved by using a local authentication scheme, as CRPs are not transmitted through a communication link. Local authentication can be used to verify that each component inside a system is authentic and has not been tampered with. Local authentication is applicable to different layers of the design hierarchy: for instance, a controller can authenticate each IC in a system, an IC can authenticate each IP block, an IP can authenticate each functional unit, and so on. Moreover, local authentication is particularly useful in applications where the communication link between the server and the device cannot be established or the server is not available. Note that a local authentication scheme can also achieve IC metering [18] and IP binding [20]. Fig. 1 illustrates the basic process of PUF-based local authentication. Unlike remote authentication, where the authenticated CRPs are kept on a trusted server, a local authentication scheme has to store the secret information on chip. For example, an authenticated response for a given challenge can be programmed into the memory of the chip after fabrication. During the authentication phase, a regenerated response of the PUF is compared to the values in the on-board memory to check the authenticity of the device. Alternatively, the designer can provide the authenticated CRPs to the customers as a key that must be entered during the authentication process, which protects their ownership.
B. Error Correction
One major issue for PUF-based local authentication is the robustness of the system, since environmental variations also affect the PUF response. In a server-based remote authentication system, this issue can be easily resolved by tolerating a certain number of bit errors. For example, the server could authenticate a PUF response whose Hamming distance (HD) to the desired response is less than a certain threshold. However, this method cannot be used in a local authentication scheme due to the high overhead. Additionally, it is not feasible to store a large number of CRPs on chip. As mentioned in Section I, ECC and fuzzy extractors can be incorporated into PUF-based authentication protocols to improve their robustness, and they could also be used in the local authentication scenario. However, given that a PUF is usually a compact circuit, the use of error correcting techniques significantly increases the design complexity. Moreover, there is a security concern that the syndrome or helper bits will reveal information about the secret bits. Therefore, error correction has to be secure, robust, and efficient.
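The server-side tolerance rule described above can be sketched in a few lines. This is only an illustration: the function names and the threshold value are assumptions, and responses are modeled as bit strings.

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def server_accepts(enrolled: str, regenerated: str, threshold: int) -> bool:
    """Accept when the HD to the enrolled response is less than the threshold.

    The threshold is a hypothetical, server-chosen parameter.
    """
    return hamming_distance(enrolled, regenerated) < threshold
```

With a threshold of 2, for instance, a regenerated response with a single flipped bit would be accepted, while a response from an unrelated chip would be rejected with high probability.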
In a typical error correction setting for PUF, during an initialization phase, a PUF response is generated and the error correcting syndrome is computed based on this response. The syndrome or helper data is public information which is later sent to the PUF along with the challenges to perform correction on response bits. Equivalently, the syndrome can be stored locally on chip. To regenerate the same PUF output, the PUF first produces a response from the circuit. Then, the PUF uses the syndrome from the initialization step to correct any errors in the circuit output. In this way, the PUF can consistently reproduce the output from the initialization step if the device is authentic.
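As a concrete illustration of the helper-data flow above, the following minimal sketch uses a 3x repetition code with XOR masking. The choice of code, the repetition factor, and all function names are assumptions made for illustration; practical designs typically use stronger codes such as BCH.

```python
REP = 3  # assumed repetition factor: corrects one bit error per 3-bit group


def encode(bits):
    """Repetition-encode a list of bits."""
    return [b for b in bits for _ in range(REP)]


def decode(codeword):
    """Majority-decode each group of REP bits."""
    return [int(sum(codeword[i:i + REP]) > REP // 2)
            for i in range(0, len(codeword), REP)]


def enroll(response, secret_bits):
    """Initialization: helper data = response XOR encoded secret (public)."""
    codeword = encode(secret_bits)
    if len(codeword) != len(response):
        raise ValueError("response length must equal REP * len(secret_bits)")
    return [r ^ c for r, c in zip(response, codeword)]


def regenerate(noisy_response, helper):
    """Regeneration: use the helper data to recover the enrolled response."""
    noisy_codeword = [r ^ h for r, h in zip(noisy_response, helper)]
    clean_codeword = encode(decode(noisy_codeword))
    return [h ^ c for h, c in zip(helper, clean_codeword)]
```

Because the helper data is public, it can leak information about the response, which is the security concern raised in the text.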
However, most previous works on correcting PUF responses did not consider the area complexities of the ECCs or the fuzzy extractors, and did not explain in detail how to choose the algorithm and the parameters [12], [13]. In fact, the implementation costs and hardware overheads of the commonly used ECCs are relatively high compared to the PUF circuit [26]. Besides ECC and fuzzy extractors, other error correcting methods have also been explored in the literature. The use of majority voting to reduce errors has been demonstrated in [27]. The use of repetition codes along with conventional syndrome generation using XOR masking has been proposed for PUFs in [15]. Soft-decision encoders and decoders have also been employed to correct PUF response errors [28]. However, the hardware implementations of these methods have been based only on FPGAs, and the area complexities have not been addressed [15], [27]-[29]. Furthermore, authentication schemes using pattern matching algorithms [25], [26] need to maintain a database to store pattern information, which makes them unsuitable for local authentication as well. In contrast to the above works, this paper proposes a novel secure, reliable, and efficient error correcting method which can be used for PUF-based local authentication.
III. TWO-LEVEL FSM ARCHITECTURE
This section presents a basic two-level FSM scheme which can be used for PUF-based authentication, IP binding, and IC metering. The general concept is illustrated in the state transition graph (STG) of Fig. 2. The PUF response and an authentication key (i.e., key in Fig. 2) are used to determine the state of the FSM. The PUF response is the input of the first-level FSM, while the key is the input of the second-level FSM. The first-level FSM is designed to transit to a unique intermediate state for each unique PUF response. The FSM enters the correct intermediate state only if the actual PUF response is the same as the desired PUF response. Furthermore, only one key value will transit the second-level FSM from a given intermediate state to the desired state (i.e., Auth in Fig. 2). The desired state can then be used to authenticate the circuit or activate the correct functionality of certain blocks. Basically, this architecture requires unique mapping pairs between PUF response and key, i.e., $(R_i, K_i)$.
It is important to note that the $(R_i, K_i)$ pairs can be arbitrarily designed. Thus, all such possible pairs are known only to the designer. For an $N$-bit PUF response, there will be $2^N$ intermediate states. The length of the key will be at least $N$ if we ensure that only one value of the key can transit the FSM into the desired state. Note that the PUF response and the key need not have the same length, i.e., the mapping does not have to be $N$-to-$N$. A longer key can be used to increase the complexity of the structure. At the expense of increasing the probability of key collision, multiple PUF responses can also be mapped into one intermediate state, or multiple key values can be designed as correct inputs for a PUF response. Moreover, the $(R_i, K_i)$ mappings can be designed differently for different chips. As a result, an adversary with access to the response and key authentication records from other devices will still be unable to authenticate a new device. For example, consider the 3-bit PUF response shown in Fig. 3. Without loss of generality, we can assume the values of $R_i$ as marked in Fig. 3. The correct $(R_i, K_i)$ pairs are summarized in Table I, where the values of $K_i$ can be arbitrarily chosen.
We can add another state, Unauth, as shown in Fig. 4. If the key entered is wrong, the FSM will transit into the Unauth state. This state can be used to lock the chip or trigger an alarm to report a possible attack.
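A behavioral sketch of the basic two-level FSM, including the Unauth state, is given below for a 3-bit response. The key values in the table are arbitrary placeholders chosen for illustration and are not the entries of Table I; the function name is likewise hypothetical.

```python
# Hypothetical (R_i, K_i) table for a 3-bit response. The designer may choose
# any one-to-one assignment; these key values are illustrative only.
PAIRS = {
    "000": "101", "001": "010", "010": "111", "011": "000",
    "100": "110", "101": "001", "110": "011", "111": "100",
}


def authenticate(response: str, key: str) -> str:
    """Return the final FSM state for a given PUF response and key.

    First level: the response selects a unique intermediate state.
    Second level: only the key paired with that state reaches Auth.
    """
    if response not in PAIRS:
        raise ValueError("unexpected response width")
    intermediate_state = response  # one intermediate state per response
    return "Auth" if PAIRS[intermediate_state] == key else "Unauth"
```

In hardware the table is not stored explicitly; it is absorbed into the synthesized next-state logic, which is what makes the mapping hard to extract.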
The global flow of the local authentication scheme by utilizing the proposed technique is shown in Fig. 5 . For example, Company X designs the circuit and integrates a PUF and the self-correcting FSM into the design. The (R i , K i ) pairs are designed at this stage. After that, Company X sends the detailed manufacturable design specifications to Foundry Y who makes the mask and manufactures multiple chips implementing this design. Each chip will be uniquely locked after fabrication due to the interchip variations of the PUFs. For each chip, the PUF response needs to be tested, which can be conducted by either Foundry Y or Company X if Foundry Y sends the manufactured chips back to Company X. If the tests are done by Foundry Y, the PUF responses should be sent back to Company X. According to the PUF response (which will be considered as the desired PUF response for the authentication phase), only Company X has knowledge of calculating the key for entering into the correct authenticated state. The key could be programmed by another honest vendor after the chips have been fabricated or sold to the customer, i.e., Company Z. Therefore, the correct key is a strong proof of ownership. At the beginning of the authentication phase, the FSM enters into an intermediate state based on the variability-induced response of PUF, as shown in Fig. 6 . Only Company Z, who has the possession of the correct key, can authenticate the chip.
The global flow can also prevent IC piracy through overbuilding, which is a concern because modern chip designs are usually outsourced for fabrication. For example, it is conceivable that a dishonest manufacturing plant could create more chips than ordered and sell the additional chips at a lower cost, subverting the profits of the legitimate owner. However, with the proposed two-level FSM, overproduced chips without the correct keys cannot function properly, since the manufacturing plant (Foundry Y, for example) does not know the correct $(R_i, K_i)$ pairs.
IV. ERROR CORRECTION BASED ON THE TWO-LEVEL FSM
A. Self-Correcting Functionality
We propose a novel FSM structure which not only supports PUF-based authentication, but can also correct a certain number of PUF response bit errors caused by environmental variations, thereby improving robustness.
The two-level FSM structure presented in Section III can be extended to incorporate the error correcting functionality by allowing a second key attempt, as shown in Fig. 6. In the conventional two-level FSM authentication scheme, if the PUF response varies due to environmental noise, the FSM will enter a different intermediate state and the device cannot be authenticated with the correct key. The problem can be solved by introducing a pathway from an intermediate state $S'_i$ toward the authenticated state, where $S'_i$ represents a state whose corresponding PUF response is within an HD of $m$ bits of the correct PUF response. Note that $S'_i$ and $S'_j$ as shown in Fig. 6 may represent multiple candidate states. If the PUF response varies within $m$ bits at the first level of the FSM structure, the first entry of the correct key $K_i$ brings the FSM back into the desired intermediate state $S_i$, and the second entry of $K_i$ then takes it to the authenticated state. The advantage of the proposed approach is the inherent redundancy built into the self-correcting FSM: entering the key twice in succession eliminates the need for an extra ECC. We can also design a scheme in which the key is always internally entered to the FSM twice.
A 3-bit example of such a structure is shown in Fig. 7. For example, if the response of the PUF has been tested to be 010 during the enrollment phase, the designer can calculate the correct key as $K_3$. However, during the authentication phase, the PUF response may vary by 1 bit due to environmental variations, e.g., 000 instead of 010. In the proposed structure, the FSM can transit into the desired black state $S_3$ from the gray state $S_1$ when the correct key $K_3$ is entered for the first time. Then, if the authenticated user enters $K_3$ once more, the FSM will transit to the authenticated state. For other PUF responses whose HD to 010 is 2 or greater, $K_3$ would not be able to bring the FSM to the authenticated state (see Table II).
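The 3-bit example can be simulated with a short behavioral sketch. The labeling $R_i = \mathrm{binary}(i-1)$ is inferred from the example (response 010 pairs with $K_3$, response 000 with state $S_1$); the key values, `M`, and the function names are hypothetical placeholders.

```python
M = 1  # assumed correction capability in bits

# R_1..R_8 = 000..111 (assumed labeling); key values are illustrative only.
RESPONSES = [format(i, "03b") for i in range(8)]
KEYS = dict(zip(RESPONSES,
                ["101", "010", "111", "000", "110", "001", "011", "100"]))
STATE_OF_KEY = {k: r for r, k in KEYS.items()}


def hd(a: str, b: str) -> int:
    """Hamming distance between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))


def step(state: str, key: str) -> str:
    """Apply one key entry; intermediate states are named by their response."""
    if state in ("Auth", "Unauth"):
        return state
    target = STATE_OF_KEY.get(key)
    if target is None:
        return "Unauth"
    if target == state:
        return "Auth"              # correct key for this intermediate state
    if hd(target, state) <= M:
        return target              # self-correcting edge S'_i -> S_i
    return "Unauth"


def run(response: str, key: str, attempts: int = 2) -> str:
    """Enter the same key `attempts` times, starting from the response state."""
    state = response
    for _ in range(attempts):
        state = step(state, key)
    return state
```

Under these assumptions, `run("000", KEYS["010"])` reaches Auth after two key entries, whereas a response at HD 2 or more from 010 ends in Unauth.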
B. Advantages and Limitations
Similar to ECC or a fuzzy extractor, the error-correction capability is achieved through HD comparison. The role of the proposed two-level FSM is essentially to correct a limited number of errors relative to the correct input. These error correcting methods (including ECC and fuzzy extractors) can be expressed as: if $\mathrm{HD}(R_i, R') \le m$, correct the errors or unlock the system, where $R'$ is the actual input (the PUF response in our case) and $m$ is the error-correcting threshold. In our approach, we map a correct $K_i$ to the correct $R_i$, i.e., $R_i = F(K_i)$. If $\mathrm{HD}(F(K_i), R') \le m$, the system will be authenticated.
However, we neither directly compare $R$ and $K$ in our design nor actually compute the mappings from $R_i$ to $K_i$ or from $K_i$ to $R_i$, i.e., $R = F(K_i)$ or $K = P(R_i)$. Instead, both $R$ and $K$ are treated as inputs of the FSM, and the state of the FSM is determined by their values. The whole process is shown in Fig. 8. Moreover, these mappings can be designed arbitrarily and are hidden deeply inside the FSM after synthesis. It is also important to note that the proposed approach is not only capable of correcting PUF response errors, but also achieves a local authentication scheme.
1) Key Collision: Also similar to ECC or a fuzzy extractor, adding the error correcting functionality will degrade the level of security. For a 1-bit correction scheme, the probability that the adversary guesses a valid key value for a given PUF response increases from $1/2^N$ to $(1 + N)/2^N$ (see Table III). For example, besides $K_3$, three other key values $K_1$, $K_4$, and $K_7$ can also authenticate the PUF response 010. However, when $N$ is large (e.g., $N = 256$), the value of $(1 + N)/2^N = 257/2^{256}$ is still very small for a 1-bit correction scheme. Even for $m = 7$ and $N = 256$, the value of
$$\frac{1}{2^N}\sum_{i=0}^{m}\binom{N}{i}$$
is still $1.17 \times 10^{-64}$. Furthermore, a requirement for a practical PUF is that the PUF responses should have a large interchip variation (50% HD ideally), so that even in the presence of noise it is possible to distinguish responses originating from different devices. Therefore, key collision in the proposed self-correcting approach would not be an issue for PUF-based authentication. A set of distinguishable keys can be obtained for different chips even if the $(R_i, K_i)$ pairs are designed identically. Moreover, as discussed in Section III, we can increase the length of the key to improve the security.
An FSM can be described by a 6-tuple $(I, O, S, S_0, F, G)$, where $S$ is a finite set of internal states, $I$ and $O$ represent the finite sets of inputs and outputs of the FSM, respectively, $F$ is the next-state function, $G$ is the output function, and $S_0$ is the initial state. The extra transition edges introduced by the error correcting functionality affect only the next-state logic of the FSM, while the output logic and the size of the state registers remain the same, as shown in Fig. 9. As a result, we can expect the extra design complexity of the self-correcting FSM structure to be relatively small, as it only involves an $N$-to-$N$ combinational logic synthesis.
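The key-collision probabilities quoted above can be checked numerically. The sketch below counts the responses within HD $m$ of a given $N$-bit response and divides by $2^N$; the function name is an assumption.

```python
from fractions import Fraction
from math import comb


def collision_probability(n_bits: int, m: int) -> Fraction:
    """Fraction of key values accepted for one response with m-bit correction.

    Assumes one key per response, so the accepted keys correspond to all
    responses within Hamming distance m of the given one.
    """
    accepted = sum(comb(n_bits, i) for i in range(m + 1))
    return Fraction(accepted, 2 ** n_bits)


if __name__ == "__main__":
    print(collision_probability(3, 1))                  # 4 of 8 keys
    print(f"{float(collision_probability(256, 7)):.3e}")
```

For $N = 3$ and $m = 1$ this gives 4 of the 8 keys, matching the four key values that authenticate response 010 in the example.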
Fig. 9. Only the next-state function of the FSM needs to be redesigned after adding the self-correcting functionality.
Furthermore, although the number of states increases exponentially with the length of the PUF response, the size of the state register, as shown in Fig. 9, increases only linearly. For an $N$-bit PUF response, there are $2^N$ intermediate states in the proposed two-level FSM. However, the state register only needs to store $N + 1$ bits in total (including $S_0$, Auth, and Unauth). For example, when $N = 3$ as shown in Fig. 7, only a 4-bit state register is required, as there are 11 states in the design. Therefore, the proposed self-correcting FSM scheme can enable lightweight yet reliable PUF-based authentication.
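The state-register sizing can be verified with a small calculation, assuming a plain binary state encoding and the state set described above ($2^N$ intermediate states plus $S_0$, Auth, and Unauth). The function names are illustrative.

```python
def state_count(n_bits: int) -> int:
    """2^N intermediate states plus S_0, Auth, and Unauth."""
    return 2 ** n_bits + 3


def register_bits(n_bits: int) -> int:
    """Minimum register width for a binary encoding of all states."""
    return (state_count(n_bits) - 1).bit_length()
```

For $N = 3$ this yields 11 states and a 4-bit register; the width equals $N + 1$ for every $N \ge 2$.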
3) Security: Another advantage is that the proposed scheme is more secure, compared to ECC. For example, in the applications of ECC, when testing a PUF response, an error correcting syndrome for that response is also computed and saved for later use. The syndrome is information that allows for correcting bit-flips in regenerated PUF outputs. The generated syndrome is public information and can be stored anywhere (on-chip, off-chip, or remotely on a server). Clearly, the syndrome reveals information about the PUF response. In general, given the b-bit syndrome, attackers can learn at most b bits of the N-bit PUF response, where b is less than N and is dependent on the specific parameters of the employed ECC. The soft-decision syndrome coding scheme proposed in [28] could limit the amount of leaked information to improve the security of PUF-based authentication. However, the generated syndrome is still associated with the PUF response.
As opposed to these prior works, the key in our proposed self-correcting FSM is not public, and only the designer or the authenticated users know the correct key. Furthermore, the PUF response and authentication key pairs $(R_i, K_i)$ can be arbitrarily designed and are not based on any algorithm such as ECC, which also enhances the security. There is no inherent equation for the FSM. In other words, even if the adversary knows the PUF response (or key) in an $(R_i, K_i)$ pair, it is still infeasible to guess the corresponding key (or PUF response). Another advantage is that the valid key values need not be close to each other, i.e., their HD is not bounded by a certain threshold. For example, even for two pairs $(R_i, K_i)$ and $(R_j, K_j)$ whose PUF responses $R_i$ and $R_j$ have an HD of only 1, the HD of the two keys $K_i$ and $K_j$ could be very large. Additionally, under the assumption discussed in Section II and claimed in [23] that extracting the STG of the FSM is a computationally intractable task, the proposed scheme could achieve a very high level of security.
C. Security Assessment of the FSM
The proposed PUF-based local authentication with self-correction rests upon the assumption that the manufacturer or the adversary does not know and cannot compute the correct values of the key based on the PUF responses. Otherwise, the manufacturer or the adversary could simply program these values on overproduced copies, and the IP could not be protected. Therefore, the security of the proposed method depends largely on how hard it is for the adversary to find a key $K$ for a PUF response $R$ that unlocks the system.
In this section, we discuss the possible attacks on the proposed two-level FSM and design considerations that can achieve higher security. Note that the PUF response and key mappings can be embedded into the target design of a specific application that already contains an appropriate FSM, which further enhances the security.
The goal of an attack is to determine the correct values of the key inputs. A naive brute-force search does not work: if the length of the key is $N$, on average $2^{N-1}$ attempts are required to obtain the correct key value for a given PUF response, which is clearly not practical. One of the most likely attacks on the proposed FSM is to predict the correct key value for a given PUF response after collecting a large number of PUF response and key pairs, which is similar to the scenario of modeling attacks on PUFs. Modeling attacks on PUFs attempt to model the functional behavior of the PUF circuit after collecting a set of CRPs. For example, an arbiter (MUX) PUF can be modeled with a linear additive delay model [30], [31]. The collected CRPs can be used to solve the linear equations for the delay difference of each stage. The estimated delay differences can then be used to compute the response for a new challenge. The PUF and the system built on the PUF are compromised if the modeling attack achieves a high accuracy. In summary, the modeling attack is successful if the parameters of the PUF circuit equations can be correctly determined by using the collected CRPs.
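To make the linear additive delay model concrete, the sketch below uses the commonly cited parity-feature formulation for an arbiter PUF, in which the response is the sign of a weighted sum of challenge-dependent features. The weight vector stands in for the per-stage delay differences an attacker would estimate from CRPs; the names and the sign convention are assumptions for illustration.

```python
def features(challenge):
    """Parity features: phi[i] = product over j >= i of (1 - 2 * c[j]).

    A trailing constant 1 is appended for the bias term.
    """
    phi = [1] * (len(challenge) + 1)
    for i in range(len(challenge) - 1, -1, -1):
        phi[i] = phi[i + 1] * (1 - 2 * challenge[i])
    return phi


def response(weights, challenge):
    """Model response: 1 if the accumulated delay difference is positive."""
    delta = sum(w * p for w, p in zip(weights, features(challenge)))
    return 1 if delta > 0 else 0
```

Because the model is linear in the features, a set of observed CRPs constrains the weights directly, which is why such PUFs are vulnerable to modeling attacks.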
As opposed to the PUF circuit, the proposed two-level FSM will be significantly less vulnerable to this type of attack, since the PUF response and key pair mappings can be designed arbitrarily. No single equation describes all two-level FSM designs. For the PUF circuit, the equation for a certain type of PUF is always the same, while the parameters (e.g., the delay difference of each stage in an arbiter PUF) differ between PUF instances. There will be strong correlations between the responses to challenges whose HD is small. For example, if two challenges differ in only 1 bit, the outputs of the PUF circuit are very likely to be the same. However, for the proposed two-level FSM, the correlation of key values for close PUF responses will be minimal, as the FSM can be arbitrarily designed. As discussed in Section IV-B, even for two pairs $(R_i, K_i)$ and $(R_j, K_j)$ whose PUF responses $R_i$ and $R_j$ have an HD of only 1, the HD of the two keys $K_i$ and $K_j$ could be very large.
However, we still need to design the two-level FSM carefully, even though there is no inherent equation for the FSM mapping. Clearly, simple mappings will be easy to solve after collecting a number of PUF response and key pairs. For example, if a two-level FSM only involves a simple bitwise comparison, the adversary will be able to find the relation with only a small number of PUF response and key pairs. In order to ensure high security, complex functions can be used to implement the two-level FSM, including various types of nonlinear functions whose outputs are not heavily repetitive. For example, instead of the linear model in the arbiter PUF, a polynomial function can be used to design the PUF response and key pair mappings. The detailed tradeoffs between the complexity of the FSM and the feasibility of reverse engineering are beyond the scope of this paper.
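The risk of an overly simple mapping can be illustrated with a deliberately weak, hypothetical design in which every key is the response XORed with a fixed mask. This is not a mapping proposed in the paper; it only shows that a single observed pair can reveal such a relation.

```python
def xor_bits(a: str, b: str) -> str:
    """Bitwise XOR of two equal-length bit strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))


def weak_key(response: str, mask: str) -> str:
    """Hypothetical weak mapping: K = R XOR mask (do not use in practice)."""
    return xor_bits(response, mask)


def recover_mask(observed_response: str, observed_key: str) -> str:
    """An adversary recovers the mask from a single (R, K) pair."""
    return xor_bits(observed_response, observed_key)
```

Once the mask is known, the key for any other response follows immediately, whereas an arbitrarily designed table offers no such shortcut.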
D. Summary
The properties of the two-level FSM can be summarized as below.
1) Introducing error correction into the FSM will degrade the security, as more key values will pass the authentication for a given PUF response. However, from this perspective, the proposed self-correcting FSM behaves the same as all other error correcting techniques used to correct PUF response errors. This is essentially a tradeoff between security and reliability.
2) The error correcting functionality only adds extra design complexity to the next-state logic of the FSM, while the output logic and the size of the state registers remain the same. The cost of introducing error correction will be small.
3) Although the number of states increases exponentially with the length of the PUF response, the size of the state register only increases linearly. Therefore, the proposed two-level FSM scheme is applicable to any PUF, including strong PUFs with an exponential number of CRPs. We can expect the total area and power consumption to be relatively small compared to other error correcting approaches.
4) The key will not reveal information about the PUF response, since the PUF response and authentication key pairs $(R_i, K_i)$ can be arbitrarily designed. Therefore, the security will be improved.
V. OTHER APPLICATIONS
In fact, the proposed self-correcting two-level FSM can be extended for other applications. In this section, we discuss a number of other possible applications.
A. Two-Factor Authentication
The proposed self-correcting two-level FSM can also be used for so-called two-factor authentication [32]. The challenge of a PUF is combined with the key to achieve stronger hardware protection. The authenticated device, the correct PUF challenge, and the correct key are all required for two-factor authentication. In other words, the $(R_i, K_i)$ pair is extended to $(C_i, R_i, K_i)$. The state of the proposed two-level FSM is determined by $R_i$ and $K_i$, while $R_i$ is produced from the challenge $C_i$ by a given PUF. However, a PUF is a one-way function in the sense that it is hard to reconstruct the challenge from the response. Therefore, even if the adversary knows the desired $(R_i, K_i)$ pair, it is still infeasible for the adversary to compromise the device without knowing the correct challenge. Additionally, as described in Section III, the $(R_i, K_i)$ pairs can be designed differently for different devices. As a result, the $(C_i, R_i)$ and $(R_i, K_i)$ pairs will be unique for each chip. The security can be greatly improved by the proposed two-factor authentication. The security properties are summarized below.
1) The device cannot be duplicated.
2) The user is unable to authenticate without the device.
3) The device cannot be used by someone else to authenticate successfully without the correct key.
4) An adversary with access to the response and key authentication records from other devices is still unable to authenticate a new device without the correct challenge.
5) The device does not need to store any information.
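The two-factor check described above can be illustrated with a minimal software sketch. It is not the hardware FSM: `toy_puf`, `TwoFactorDevice`, `N_BITS`, and the example challenge and key values are all hypothetical names and values introduced here, and the PUF is replaced by a keyed hash purely as a stand-in for a chip-unique one-way function. The attributes held by the object play the role of the hard-wired (R i , K i ) transitions; in the proposed scheme the device itself stores nothing.

```python
import hashlib

N_BITS = 64  # response length of this toy model (an assumption)


def toy_puf(chip_secret: bytes, challenge: bytes) -> int:
    """Hypothetical stand-in for a PUF: a chip-specific keyed hash."""
    digest = hashlib.sha256(chip_secret + challenge).digest()
    return int.from_bytes(digest[: N_BITS // 8], "big")


class TwoFactorDevice:
    """Toy model of a device whose FSM is wired for one (C_i, R_i, K_i) triple."""

    def __init__(self, chip_secret: bytes, enrolled_challenge: bytes, key: int):
        self._chip_secret = chip_secret
        # Stands in for the (R_i, K_i) pair fixed in the FSM transition logic.
        self._response = toy_puf(chip_secret, enrolled_challenge)
        self._key = key

    def authenticate(self, challenge: bytes, key: int) -> bool:
        # The response is never supplied directly; it is derived from the challenge.
        response = toy_puf(self._chip_secret, challenge)
        return response == self._response and key == self._key


device = TwoFactorDevice(b"chip-A", b"challenge-1", key=0x5A5A)
print(device.authenticate(b"challenge-1", 0x5A5A))  # correct challenge and key
print(device.authenticate(b"challenge-2", 0x5A5A))  # wrong challenge
print(device.authenticate(b"challenge-1", 0x1234))  # wrong key
```

Only the first call should succeed: knowing the key alone is not enough, because the response can only be produced by presenting the correct challenge to the physical device.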
B. Signature Generation
The proposed self-correcting FSM architecture can also be utilized for reliable signature generation. The FSM can be modified as shown in Fig. 10 to regenerate the same PUF output, which is more straightforward from the error correction aspect. Instead of having two states (i.e., Auth and Unauth) at the last stage, the FSM structure can be extended to enable an N-bit output by adding 2^N extra output states (OSs) at the last stage. Note that OS i and OS j may represent multiple states, while R′ i and R′ j may represent multiple incorrect PUF responses. Each output state OS i will generate a unique output R i , which corresponds to the desired PUF response. For example, if the actual PUF response R′ i deviates within a tolerated range from the correct PUF response R i , the FSM will enter an intermediate state S′ i . With the transition edges of the proposed self-correcting functionality added to the STG, the FSM can transit to the desired intermediate state S i when the correct key K i is entered for the first time. When K i is entered the second time, the FSM will transit into OS i and the correct PUF response R i will be generated. Key values other than K i cannot bring the FSM from state S i to the last stage. As a result, the same PUF response cannot be regenerated with an incorrect key. In this case, the key value can be made public, which is similar to the role of the syndrome in ECC or the helper data in a fuzzy extractor. As mentioned in Section IV, it is still infeasible to predict the corresponding PUF response even if the adversary knows the key value. Note that in this application, key values are considered public information, which differs from the authentication application, where key values are secret. Different two-level FSM designs can be used in different applications.
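The regeneration behavior described above can be sketched as a small behavioral model. This is an illustration under stated assumptions, not the synthesized design: the class `SignatureFsm`, its state names, the Hamming-distance test for "within a tolerated range", and the choice that an exactly correct response needs only one key entry are all assumptions made here for illustration.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


class SignatureFsm:
    """Behavioral toy model of response regeneration with up to m bit errors."""

    def __init__(self, correct_response: int, correct_key: int, m: int):
        self.correct_response = correct_response
        self.correct_key = correct_key
        self.m = m
        self.state = "IDLE"

    def apply_response(self, response: int) -> None:
        distance = hamming_distance(response, self.correct_response)
        if distance == 0:
            self.state = "S"        # desired intermediate state
        elif distance <= self.m:
            self.state = "S_ERR"    # erroneous but tolerated response
        else:
            self.state = "REJECT"   # too many bit errors (assumed dead state)

    def enter_key(self, key: int):
        """Return the regenerated response, or None if not yet (or never) produced."""
        if key != self.correct_key or self.state not in ("S", "S_ERR"):
            self.state = "REJECT"
            return None
        if self.state == "S_ERR":
            self.state = "S"        # first key entry corrects the state
            return None
        self.state = "OS"           # output state emits the correct response
        return self.correct_response


fsm = SignatureFsm(correct_response=0b10110010, correct_key=0b0110, m=2)
fsm.apply_response(0b10110000)      # one flipped bit
first = fsm.enter_key(0b0110)       # moves from S_ERR to S
second = fsm.enter_key(0b0110)      # moves from S to OS
print(first, bin(second))
```

In this model the key acts like public helper data: it steers the FSM back to the correct path, but the output is only produced when the applied response is already close to the enrolled one.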
VI. HARDWARE IMPLEMENTATION
In this section, we present the performance of the proposed two-level self-correcting FSM. All the circuits are synthesized using Synopsys Design Compiler with optimization parameters set for minimum area and mapped to a 65 nm standard cell library. Note that we use the same bit-length for the PUF response and key in our implementations. We first examine the performance with respect to the PUF response bit-length N, and the number of tolerated error bits m. Then we compare the proposed error correcting technique to one commonly used ECC for PUF-based authentication, i.e., the BCH codes.
A. Implementation Details
In our experiment, we wrote a script to automatically generate the Verilog code of the self-correcting two-level FSM based on two parameters: 1) the PUF response bit-length N and 2) the number of tolerated error bits m. Using the script, we can assign (R i , K i ) pairs manually with a certain function or randomly with the built-in pseudorandom number generator.
In an effort to simplify the implementation and reduce the length of the final generated Verilog code, we choose to design the two-level FSM with the following steps in our experiment.
1) Write a module that implements a 4-bit permutation (a permutation of the series from 0 to 15).
2) Call the 4-bit permutation module (N/4) times to generate the correct (R i , K i ) pairs. Thus, in our experiment, the length of the PUF response should be a multiple of 4.
3) Manually or randomly permute the N output bits of all (N/4) modules to generate the final correct (R i , K i ) mappings.
4) Generate the next-state function of the FSM in the Verilog code automatically using the script, based on the (R i , K i ) pairs obtained from the above steps, and complete the output function of the two-level FSM.
5) According to the error-correcting capability parameter m set in the script, add the extra transition edges to the FSM in the Verilog code.
It is important to note that the presented design method is only one option. The two-level FSM can be designed arbitrarily, and even with different bit-lengths for the PUF response and the key.
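Steps 1) to 3) above can be sketched in software as follows. This is a rough illustration, not the authors' generation script: the function name `build_response_to_key`, the `seed` parameter, and the use of Python's `random` module are assumptions, and a single 4-bit permutation is reused for every group of 4 bits. Steps 4) and 5), which emit the Verilog next-state logic and the extra error-correcting edges, are not covered.

```python
import random


def build_response_to_key(n_bits: int, seed: int = 0):
    """Return a function mapping an n_bits response to an n_bits key."""
    if n_bits % 4 != 0:
        raise ValueError("response length must be a multiple of 4")
    rng = random.Random(seed)

    # Step 1: one 4-bit permutation (a permutation of 0..15).
    nibble_perm = list(range(16))
    rng.shuffle(nibble_perm)

    # Step 3: a permutation of the N output bit positions.
    bit_perm = list(range(n_bits))
    rng.shuffle(bit_perm)

    def response_to_key(response: int) -> int:
        # Step 2: apply the 4-bit permutation to each of the N/4 nibbles.
        substituted = 0
        for i in range(n_bits // 4):
            nibble = (response >> (4 * i)) & 0xF
            substituted |= nibble_perm[nibble] << (4 * i)
        # Step 3: move bit `src` of the intermediate word to position `dst`.
        key = 0
        for src, dst in enumerate(bit_perm):
            key |= ((substituted >> src) & 1) << dst
        return key

    return response_to_key


mapping = build_response_to_key(8, seed=1)
keys = {mapping(r) for r in range(256)}
print(len(keys))  # 256: every response maps to a distinct key
```

Because both stages are permutations, the resulting mapping is one-to-one, so each response has exactly one matching key.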
B. Area and Power
Tables IV and V show the area and power consumption, respectively, of the FSM shown in Fig. 6 for different design parameters (i.e., N and m). Note that when m = 0, the implemented structure reduces to the FSM without self-correction shown in Fig. 4 . The results are averages of the area and power over a number of different implementations, where the PUF response and key value mappings are randomly designed (including both manual simple bitwise comparison and highly random perturbation). Note that when the Verilog code is generated automatically by the script, the area and power vary only slightly with different random (R i , K i ) pairs.
As expected, the area and power consumptions are not very significant. For example, the area of the proposed FSM for a 128-bit PUF response with 7 bits error correction is only equivalent to 2061 NAND2 gates, while the power consumption is about 16 µW. This can be compared to 1399 gates and 11 µW with no error correction for a 128-bit PUF response.
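As a quick check on the figures quoted above, the relative overhead of adding 7-bit error correction to the 128-bit design can be computed directly from the reported gate counts and power values. The helper name `relative_overhead` is introduced here only for illustration.

```python
def relative_overhead(with_correction: float, without_correction: float) -> float:
    """Extra cost of error correction as a fraction of the uncorrected design."""
    return (with_correction - without_correction) / without_correction


# N = 128, m = 7 versus m = 0, using the values quoted in the text.
area_overhead = relative_overhead(2061, 1399)   # NAND2-equivalent gates
power_overhead = relative_overhead(16, 11)      # microwatts (rounded values)

print(f"area overhead:  {area_overhead:.1%}")   # 47.3%
print(f"power overhead: {power_overhead:.1%}")  # 45.5%
```

The power figure is only approximate, since it is derived from the rounded values of about 16 µW and 11 µW.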
For better illustrations, we plot the gate counts for m = 0 (without self-correction) and m = 2 (with 2 bits error correction) as shown in Fig. 11 . It can be observed that when the bit-length N is doubled, the area is also almost doubled for a fixed m. The power consumption exhibits a similar trend.
We also plot the gate counts for different m for N = 64 as shown in Fig. 12 , normalized to the gate count of the FSM when m = 0 and N = 64. It can be seen that the overhead is about 30% for adding 1-bit error correcting functionality to the conventional two-level FSM. However, as m increases, the additional overhead in area or power consumption becomes smaller and smaller. For example, the overhead is 47% when m = 7, while the overhead is already 39% when m = 2. Therefore, we can expect that the overhead of the proposed self-correcting FSM will be reasonable even for a large m. Note that we can draw similar conclusions for the area and power consumption of the FSM with other values of N, as shown in Tables IV and V. When comparing with the PUF circuit, we find that the area of the proposed two-level FSM is usually greater than the area of the PUF circuit, since PUFs are very compact. For example, the area of the proposed FSM for a 64-bit PUF response with 2 bits error correction is 1.32 times that of the 64-stage arbiter PUF. This is also why the design of low-overhead error correcting methods for PUF-based authentication is very important.
C. Comparison to BCH Codes
As illustrated in Section II, the hardware implementations of the error correcting techniques in prior works on PUF-based authentication were either not considered or only realized on FPGAs. However, error correcting techniques that consume large area or power are not suitable for lightweight and low-cost devices. For example, the majority voting method requires an additional 9N registers to store the responses (each challenge is repeated 10 times) [27] , while the state registers only need to store N + 1 bits in total in the proposed FSM. Note that each 1-bit register needs an equivalent of 5-8 gates. If we assume that a 1-bit register is equivalent to five gates, then the majority voting approach requires at least 2880 gates to store nine additional PUF responses for a 64-bit PUF response without counting the voting logic, which is significantly greater than the gate count of the proposed FSM for a 64-bit PUF response. The area of the majority voting approach will therefore be much larger than that of the proposed two-level FSM. In addition, the majority voting method also increases the latency. Furthermore, to the best of our knowledge, there is no other work in the literature on efficient hardware implementation of an error correcting method that is particularly well suited for PUF-based authentication. As a result, it is important to compare the performance with other state-of-the-art low-cost error correcting techniques. In order to achieve a fair comparison, we also synthesized comparable BCH decoders. It can be seen from our experimental results that the proposed self-correcting FSM consumes about 2× to 10× less area and about 20× to 100× less power than the BCH codes. Therefore, it can be concluded that the cost of correcting the PUF response can be significantly reduced by the proposed approach. In particular, the power consumption can be reduced to 1%-5% of that of the BCH codes.
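The storage estimate for majority voting follows from simple arithmetic, reproduced in the sketch below. The function names and default arguments are illustrative assumptions. The second function counts only the N + 1 state register bits of the proposed FSM, not its combinational logic, so it is not a full area figure.

```python
def majority_voting_storage_gates(n_bits: int, repeats: int = 10,
                                  gates_per_register: int = 5) -> int:
    """Gates needed to store the (repeats - 1) additional N-bit responses."""
    return (repeats - 1) * n_bits * gates_per_register


def fsm_state_register_gates(n_bits: int, gates_per_register: int = 5) -> int:
    """Gates for the N + 1 state register bits only (logic not included)."""
    return (n_bits + 1) * gates_per_register


print(majority_voting_storage_gates(64))                         # 2880
print(majority_voting_storage_gates(64, gates_per_register=8))   # 4608
print(fsm_state_register_gates(64))                              # 325
```

Even at the lower bound of five gates per register, the response storage alone accounts for the 2880 gates quoted in the text.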
Additionally, as discussed above, the extra overhead of the proposed self-correcting FSM will be small for a large number of tolerated error bits m. However, for the BCH codes, it can be observed from Tables VI and VII that both the area and power consumption increase linearly with the number of tolerated error bits, which is also consistent with the results in [33] . Therefore, we can expect that the area of the proposed method will be significantly less than that of the BCH codes for a large m.
Furthermore, it is important to note that the proposed FSM architecture not only corrects the errors, but also provides the capability for PUF-based authentication. If we only consider the design complexity of the error correcting functionality itself, the proposed approach is much more lightweight and low-cost than the BCH codes. For example, we consider the overhead of introducing 4 bits error correcting functionality to the two-level FSM without error correction. The area and power overhead results for both the proposed self-correcting FSM and the BCH codes are normalized to the cost of the two-level FSM without error correction, as shown in Figs. 13 and 14 , respectively. It can be seen that the normalized overheads of the BCH codes are significantly greater than those of the proposed self-correcting FSM. For instance, when N = 128 and m = 4, the normalized area overhead of the proposed self-correcting FSM is 9× less than that of the BCH codes, while the normalized power overhead of the proposed self-correcting FSM is 167× less than that of the BCH codes. It can also be observed that the overhead incurred by the BCH codes decreases as N increases. However, the length of the PUF response used for authentication is usually relatively small (N ≤ 256). Therefore, it can be concluded that the overhead of the proposed self-correcting FSM is significantly less than that of the BCH codes for PUF-based authentication.
VII. HIERARCHICAL AUTHENTICATION
As mentioned in Section II, the proposed method is applicable to different layers of the design hierarchy. For example, a system could consist of a central control part and a number of different components. A component may also consist of several sub-components (e.g., IP blocks). The key associated with the central control is used to authenticate the device by a trusted server. The central control then can authenticate components locally. A component then can also authenticate its sub-components, if necessary. The device will be functional only after it passes both remote and local authentications.
The flow of a three-level hierarchical authentication is shown in Fig. 15 . In this example, party B1 obtains the designs from other parties C1, C2, and C3, and incorporates these designs in a subsystem. Similarly, party B2 obtains the designs from other parties C4 and C5, and also incorporates them in a subsystem. Each party at level C also provides the correct (R i , K i ) mappings to the corresponding party at level B. Then, A incorporates the two subsystems B1 and B2 in the final system and obtains the correct (R i , K i ) mappings from IP providers B1 and B2. Each component at level B and level C individually integrates a PUF and a two-level FSM for local authentication. Device A also has an on-chip PUF for server-based remote authentication. After system A is assembled, all sub-components in the hierarchy need to be authenticated. The CRPs for A will be stored on a server, while the responses of the PUFs at levels B and C will be authenticated sequentially through local authentication. Finally, A stores all the key values that are required for local authentication at the lower levels. At the beginning of an authentication process of the hierarchical system, remote authentication is performed to verify the authenticity of the whole system by comparing the CRPs generated by A with the CRPs stored on the server. If the system passes this authentication, each component is then authenticated based on the correctness of the key value for each part.
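The authentication order described above (remote check of A first, then sequential local checks of the level B and level C components) can be sketched as follows. This is a simplified model under stated assumptions: `Component`, `authenticate_system`, the arithmetic `device_puf`, and the example key values are hypothetical, and each component's PUF and two-level FSM are abstracted into a single comparison against `correct_key`.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    correct_key: int                  # key accepted by the component's local FSM
    children: list = field(default_factory=list)


def authenticate_system(device_puf, server_crps, subsystems, key_store) -> bool:
    """Remote authentication of the top level, then local authentication below."""
    # Remote step: compare the responses generated by A with the server's CRPs.
    for challenge, expected_response in server_crps.items():
        if device_puf(challenge) != expected_response:
            return False
    # Local step: A supplies the stored key for each component, level by level.
    pending = list(subsystems)
    while pending:
        component = pending.pop(0)
        if key_store.get(component.name) != component.correct_key:
            return False
        pending.extend(component.children)
    return True


def device_puf(challenge: int) -> int:
    """Toy stand-in for the on-chip PUF of device A (an assumption)."""
    return (challenge * 7 + 3) % 256


b1 = Component("B1", 11, [Component("C1", 21), Component("C2", 22), Component("C3", 23)])
b2 = Component("B2", 12, [Component("C4", 24), Component("C5", 25)])
server_crps = {c: device_puf(c) for c in (1, 2, 3)}
key_store = {"B1": 11, "B2": 12, "C1": 21, "C2": 22, "C3": 23, "C4": 24, "C5": 25}

print(authenticate_system(device_puf, server_crps, [b1, b2], key_store))  # True
```

A single wrong key in the store, or a top-level device whose PUF does not match the server's CRPs, causes the whole system to fail authentication in this model.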
Hierarchical authentication leads to the following advantages.
1) Degrees of Freedom in Authentication: Depending on the security requirements of various IP blocks, appropriate authentication circuits and obfuscation approaches can be adopted for each IP. This allows heterogeneity in the levels of security for different blocks.
2) Third-Party IP Authentication: The components in a device may come from different sources. Counterfeit or malicious parts can be integrated into devices without being noticed along the design flow. An integrated circuit is very vulnerable when a key component fails. Therefore, authentication needs to be performed not only for the whole system, but also for selected components of the device.
3) Hierarchy in Security Levels: A hierarchy of privilege can be realized through hierarchical protection, such that different users can be granted access to the functionality of each component depending on the access rights desired by the IP owners and the user.
Current work is directed toward a hierarchical authentication scheme where local authentication together with remote server-based authentication will be optimized based on the performance requirements at each hierarchy level.
VIII. CONCLUSION
This paper has presented a novel two-level self-correcting FSM, which can be used for PUF-based authentication while tolerating a certain number of bit errors generated by environmental variations. The applications of the proposed method to local authentication and reliable signature generation have also been discussed. The performance of the proposed two-level self-correcting FSM with respect to the PUF response bit-length and the number of tolerated error bits has been studied. We have also shown that the proposed technique achieves significantly lower cost than the error correcting methods previously used for PUF-based authentication. Additionally, we have introduced the concept of hierarchical authentication. A formal security analysis of the proposed method is beyond the scope of this paper. Future work will be directed not only toward security analysis of the proposed method but also toward comparing its security with that of error correction with helper data or the BCH method. Future research also needs to investigate the tradeoffs between the complexity of the FSM and the feasibility of reverse engineering.
DISCLOSURE
The University of Minnesota plans to file a patent application on the content of this paper.
