Hardware security is a relatively new scientific discipline which focuses on adversarial threats to hardware components of complex systems and infrastructures, including manipulation, unauthorized access to confidential data, and intellectual property theft. State-of-the-art test methods can contribute to detection and elimination of security threats but also may introduce new vulnerabilities. This article will explain this dichotomy and outline current research challenges.
Introduction
Information and communication systems are subject to numerous security threats, including manipulation, gaining unauthorized access to confidential data, and intellectual property theft. Historically, software and communication infrastructure were the primary target of attacks, giving rise to the discipline of IT security [1] . More recently, the actual hardware of the systems was recognized as a major vulnerability [2, 3] . The attacker with physical access to a hardware circuit which implements a security-critical function may compromise its integrity by either passively observing or actively influencing the calculations it performs. Hardware security is a scientific discipline that studies vulnerabilities of hardware to adversarial attacks and develops countermeasures to eliminate such vulnerabilities.
Since manufacturing processes are prone to defects, hardware circuits must undergo thorough testing before being shipped to the customer [4]. With technology miniaturization, the requirements on testing increase, as the circuits become larger and more complex and new types of defects occur. In general, testing involves applying stimuli, or test patterns, to the circuit inputs and observing the responses at the circuit outputs. Sometimes, non-functional parameters such as timing, current flowing through the circuit or power consumption are also observed during the testing process. A further related task is diagnosis, which goes beyond detecting the presence of a defect in the circuit and attempts to identify the location and the type of the detected defect in order to optimize the manufacturing process. Moreover, some failures may occur in the field, after the circuit has been shipped. Causes of such failures are noise from various sources and circuit deterioration due to aging. These failures are generally beyond the scope of post-manufacturing testing and are addressed by fault-tolerant design, error detection and correction, or redundancy.
State-of-the-art test and diagnosis methods are based on the assumption that a defect may be located anywhere in the circuit. Good test patterns attempt to make as many internal lines of the circuit as possible observable, that is, to induce different responses at the outputs in the presence and in the absence of the defect. This requirement is most pronounced in the case of diagnosis methods that must decide which lines are fault-free and which are affected by the failure. If the test patterns do not achieve high observability, design-for-testability (DFT) techniques such as test points or scan chains can be employed to provide easy access to hard-to-observe lines. Obviously, this objective will interfere with the necessity to protect confidential data stored and processed by the circuit. An adversarial attacker may simply use the test patterns and the available DFT mechanisms to gain unauthorized access to protected data.
This article focuses on the relationship between hardware security and test methods and their fundamental dichotomy. On the one hand, test methods can and should be used to identify adversarial manipulations of the circuit and attempts to obtain unwarranted access to the processed data. On the other hand, test and especially diagnosis inherently require access to the internal circuit logic, and these access mechanisms can potentially be abused by an attacker. Several research challenges, many lacking a convincing solution, will be outlined in this article.
Current technology trends point towards massive utilization of hardware circuits in larger cyber-physical systems that are interacting with the physical environment via sensors and actuators and at the same time are integrated via open networks, most notably the Internet [5] . Moreover, they interact with each other, forming Systems of Systems that exhibit highly complex, emergent behavior and constantly change their boundaries, with new subsystems continuously entering and leaving [6] . These developments put extremely high requirements on quality (and therefore testability) of the circuit as well as their security: the circuit should neither cause system failure because it contained a defect that was overlooked during test, nor should it be vulnerable to adversarial threats and serve as a potential backdoor for attackers. For example, an (in)security analysis of car electronics can be found in [7] .
The remainder of this article is organized as follows. The next section gives a brief overview of relevant topics in hardware security. Section 3 explains the similarities and differences between faults and defects that occur during circuit manufacturing and utilization, and malicious attacks and manipulations by human adversaries. Section 4 discusses the capabilities and the limits of test methods in detecting specific security threats. Section 5 covers the implications of DFT and self-test blocks to security of the circuit. Section 6 concludes the article. A (naturally subjective) list of open research questions follows in Section 7.
Hardware security overview
Hardware security considers a number of separate threats that are addressed using different countermeasures. The common property of the topics is that the threat originates from a hardware circuit. In many cases, the circuit does not function in a way desired by its designer, and the reason for the malfunctioning is at least partially due to a deliberate, malicious intent rather than a purely natural cause. Some key areas are reviewed next.
Side-channel analysis
A side-channel attack [8] refers to unauthorized access to protected data processed by the circuit, such as the secret key of a cryptographic algorithm. Side-channel attacks are based on observation of non-functional properties of a circuit, including execution time [8], power consumption [9], cache response times [10] or electromagnetic emissions [11]. For example, consider a circuit that stores and processes a 64-bit secret key $k_1 \ldots k_{64}$ that requires protection. Assume that an operation performed by the circuit depends on the value of the first bit $k_1$: if $k_1 = 1$, a complex calculation (for instance, a matrix multiplication) is performed by a dedicated sub-module, but if $k_1 = 0$, the calculation is omitted. An attacker who knows about this principle may measure the power consumption during execution; if it exceeds a certain threshold, $k_1$ must be 1 with high probability. This is illustrated in Figure 1. Knowing one bit reduces the number of potential secret keys by one half. The attacker may continue to obtain further bits in a similar way, successively reconstructing the complete key.
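The threshold decision described above can be mimicked in a few lines of Python. This is only a minimal sketch under stated assumptions: the power model (`BASE_POWER`, `EXTRA_POWER`, bounded noise `NOISE`) and all function names are hypothetical, and the noise is deliberately kept so small that the threshold test never fails. It also assumes that every key bit controls its own optional calculation, which generalizes the single-bit example.

```python
import random

BASE_POWER = 20.0   # assumed baseline consumption (arbitrary units)
EXTRA_POWER = 10.0  # assumed cost of the optional complex calculation
NOISE = 1.0         # assumed bound on the measurement noise
THRESHOLD = BASE_POWER + EXTRA_POWER / 2

def measure_power(key_bit, rng):
    """Simulated power reading of one operation that depends on key_bit."""
    power = BASE_POWER + rng.uniform(-NOISE, NOISE)
    if key_bit == 1:
        power += EXTRA_POWER  # the dedicated sub-module is active
    return power

def recover_key(key_bits, rng):
    """Attacker's view: one measurement per key-dependent operation."""
    return [1 if measure_power(b, rng) > THRESHOLD else 0 for b in key_bits]

rng = random.Random(1)
secret = [rng.randint(0, 1) for _ in range(64)]  # hypothetical 64-bit key
guess = recover_key(secret, rng)
```

In this idealized model `guess` equals `secret`; with realistic noise a single measurement per bit would only give the correct value with some probability.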
In practice, it may not be necessary for the attacker to derive all the bits. If the number of remaining, unknown bits is sufficiently small, brute-force search can be applied.
For instance, assume that the circuit in question is a 64-bit cipher that calculates the encoding function $c = E(p, k)$, where $p$, $k$ and $c$ are the plaintext, the secret key, and the ciphertext, respectively. If 34 out of 64 secret key bits are known, the attacker may simply simulate up to $2^{30}$ encryptions using all possible combinations of unknown key bits, until one combination yields a ciphertext that is consistent with the ciphertext calculated by the circuit.
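The brute-force step can be sketched as follows. The example is scaled down (a 16-bit key with 10 unknown bits instead of 64 and 30) so that it runs instantly, and `toy_encrypt` is a hypothetical stand-in built from a hash function, not a real cipher; only the search loop reflects the procedure described above.

```python
import hashlib

KEY_BITS = 16      # scaled down from 64 bits
UNKNOWN_BITS = 10  # scaled down from the 30 unknown bits in the text

def toy_encrypt(plaintext, key):
    """Hypothetical stand-in for the encoding function; NOT a real cipher."""
    data = key.to_bytes(KEY_BITS // 8, "big") + plaintext
    return hashlib.sha256(data).digest()[:8]

def brute_force(plaintext, ciphertext, known_high_bits):
    """Try every combination of the unknown low-order key bits."""
    for low in range(2 ** UNKNOWN_BITS):
        candidate = (known_high_bits << UNKNOWN_BITS) | low
        if toy_encrypt(plaintext, candidate) == ciphertext:
            return candidate
    return None

secret_key = 0b1011001110001101
p = b"attack at dawn"
c = toy_encrypt(p, secret_key)        # ciphertext produced by the circuit
known = secret_key >> UNKNOWN_BITS    # bits already leaked via side channel
recovered = brute_force(p, c, known)
```

Each additional leaked bit halves the search space, which is why a partial side-channel leak can already be sufficient.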
Countermeasures against side-channel attacks aim at minimizing information leakage, i.e., correlation between processed data and non-functional observables. For example, the measured power consumption of the circuit may be related to the number of 1 and 0 bits processed by the circuit (Hamming weight). Knowing the Hamming weight is useful for the attacker because it restricts the number of valid combinations and may enable brute-force search. One possible protection mechanism is dual-rail encoding: each bit is represented by two values, one of which is 1 and the other is 0. This scheme can be refined to eliminate information leakage caused by asymmetric power consumption due to switching activity on both rails [12]. Such countermeasures incur significant overhead in terms of area and power consumption.
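The effect of dual-rail encoding on the Hamming weight can be checked with a short sketch. The function names are illustrative only, and the model ignores the switching-activity refinements of [12].

```python
def hamming_weight(bits):
    """Number of 1 bits in a word."""
    return sum(bits)

def dual_rail_encode(bits):
    """Represent each bit b by the pair (b, 1 - b)."""
    encoded = []
    for b in bits:
        encoded.extend((b, 1 - b))
    return encoded

words = [[0, 0, 0, 0], [1, 0, 1, 1], [1, 1, 1, 1]]
plain_weights = [hamming_weight(w) for w in words]
dual_weights = [hamming_weight(dual_rail_encode(w)) for w in words]
```

The plain words have data-dependent weights, whereas every encoded word of $n$ bits has weight exactly $n$, at the price of doubling the number of signals.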
Hardware Trojans
Hardware Trojans [3, chap. 14] refer to malicious modification of the circuit design by an untrusted third-party manufacturer, as illustrated in Figure 2 . This rather new threat is facilitated by outsourcing of circuit manufacturing to overseas foundries. A typical Hardware Trojan consists of two parts: activation trigger and payload. The payload constitutes the actual malicious functionality: it may deactivate the circuit (denial of service), change the functionality of the circuit, or establish a hidden side channel through which information can be leaked as discussed in Section 2.1. The activation trigger initiates the malicious activity upon checking a condition, which can be external (for instance, some combination of values on internal signals which rarely occurs in regular operation but which the attacker can enforce) or internal (for example, the number of clock cycles since power-on, measured by a counter which was not part of the original design but was maliciously added by the untrusted manufacturer).
A very recent Trojan insertion technique based on a hard-to-detect manipulation of p-type and n-type dopants in transistors of a circuit [13] received broad public interest. It manipulated a random number generator circuit that is part of an Intel processor such that the quality of random numbers was no longer sufficient for secure applications while escaping detection. Two versions of the Trojan were presented: one affected the circuit's logical function in a way similar to a manufacturing defect, and the other was applied to an advanced dual-rail encoding [12] to re-establish the correlation between current consumption and the processed data which that encoding was supposed to eliminate.
Counterfeiting and hardware metering
An untrusted third-party manufacturer may opt to produce a larger number of circuits than contractually agreed and sell the surplus circuits on the grey market ("overbuilding"). While this is clearly illegal, the threat of prosecution must be complemented by technical protection, known as hardware metering [14]. The fundamental approach is to equip every manufactured circuit with a unique identifier that can be checked by the system in which the circuit is later employed. The key requirement is that the malicious manufacturer is not able to forge the identifier to make the counterfeit circuit appear legitimate. In particular, the identifier must be constructed such that simply copying, or cloning, the identifier is not sufficient for identification. For example, storing a digital identification number in a register is vulnerable to cloning, as it is possible to write the same number into the same register of the counterfeit circuit.
[...] compensated by error correction). State-of-the-art hardware metering approaches [16] combine unclonable identifiers with locking/unlocking functionality: the circuit is non-functional upon fabrication and must receive a unique unlocking sequence in order to start working. This sequence is calculated by the designer from the unclonable identifier data provided by the manufacturer, but cannot be calculated by the manufacturer alone. Therefore, legitimate circuits that the manufacturer has reported to the designer can be used whereas the overbuilt circuits cannot [3, chap. 5].
Figure 3 shows an example approach to overbuilding protection. Each manufactured circuit can only be used after an unlocking sequence has been applied. This sequence is circuit-specific and depends on the circuit's unique identifier id. Upon manufacturing, the unique identifier id of every authorized circuit is read out and communicated to the designer, who calculates the unlocking sequence incorporating both id and the designer's secret key. Therefore, every authorized circuit can be used after its individual sequence has been applied, but this sequence does not unlock other circuits. An overbuilt circuit that is kept secret from the designer cannot be used because no unlocking sequence can be calculated without knowing the designer's secret key.
Fault-based attacks
Fault-based attacks [17] are applied to cryptographic circuits that perform encryption and/or decryption. In the course of encryption, plaintext $p$ is transformed to ciphertext $c$ using the secret key $k$ that is stored within the device and protected against unauthorized access. Since $p$, $c$ and the encoding function $E$ are known, it is theoretically possible to obtain the key $k$ by solving the equation $E(p, k) = c$. The cryptographic strength of the procedure is based on the difficulty for the attacker to solve this equation within reasonable time.
A fault-based attack consists of a repeated encryption during which the physical circuit is manipulated such that its calculated result deviates from the fault-free ciphertext $c$. In order to derive the secret key $k$, the attacker repeats the encryption while injecting a fault into the device and obtains the faulty ciphertext $c'$. The key $k$ is derived by differential cryptanalysis of $c$ and $c'$. For example, one can supplement the equation $E(p, k) = c$ by the findings from the differential cryptanalysis that quickly guide the solution procedure towards the key [18]. Therefore, fault-based attacks target the implementation of the cryptographic algorithm rather than the algorithm itself and thus overcome its cryptographic strength. A number of techniques are known for introducing physical faults; these include manipulation of the clock signal, voltage supply, and pinpointed irradiation by a laser [18].
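The differential idea can be illustrated on a deliberately tiny model. The sketch below is not the method of [18]: it assumes a hypothetical 4-bit last round that computes S(x) XOR key (with the S-box of the PRESENT cipher used purely as a convenient example) and an attacker who can flip exactly one bit of the intermediate state x without knowing which one. Every pair of fault-free and faulty ciphertexts rules out key guesses that are inconsistent with a single-bit fault.

```python
# 4-bit S-box of the PRESENT cipher, used here only as an example
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
INV = [SBOX.index(v) for v in range(16)]
SINGLE_BIT = {1, 2, 4, 8}  # assumed fault model: exactly one flipped bit

def last_round(x, key, fault=0):
    """Toy last round; `fault` is XORed onto the state before the S-box."""
    return SBOX[x ^ fault] ^ key

def candidates(c, c_faulty):
    """Key guesses for which the implied state difference is a 1-bit flip."""
    return {k for k in range(16)
            if (INV[c ^ k] ^ INV[c_faulty ^ k]) in SINGLE_BIT}

def recover(key, states, faults):
    """Intersect the candidate sets of several fault injections."""
    remaining = set(range(16))
    for x, e in zip(states, faults):
        c = last_round(x, key)            # fault-free ciphertext
        c_faulty = last_round(x, key, e)  # ciphertext with injected fault
        remaining &= candidates(c, c_faulty)
    return remaining

survivors = recover(0xA, states=[3, 7, 12, 5], faults=[1, 4, 2, 8])
```

The true key always survives the filtering, and each additional faulty ciphertext can only shrink the set of remaining candidates; the attacker never has to solve the encryption equation directly.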
The generic principle is illustrated in Figure 4 for the (hypothetical) cipher that consists of multiple rounds [...]
While the problem is recognized by the industry, it is currently addressed in a rather ad-hoc and unsystematic manner. For example, the invited talk from NXP Semiconductors at the COSADE workshop 2012 bears the title "700+ Attacks published on smart cards: The need for a systematic counter strategy" [19]. Relevant countermeasures include shielding (to prevent manipulations), light sensors (to identify attempts to disassemble the package) and error-detecting codes [21].
Faults and defects vs. adversarial attacks
Many security threats have manifestations that are conceptually similar to the effects of defects and faults due to natural causes. Hardware Trojans modify the function and/or non-functional properties of a circuit, similar to permanent manufacturing defects. Therefore, they can generally be detected by testing. Fault-based attacks result in sporadic bit-flips that could also have happened due to transient or intermittent faults. Consequently, fault-tolerant design techniques designed for detecting or correcting transient faults are capable of addressing fault-based attacks in principle. A major difference between adversarial attacks and natural faults is that attacks are caused by an intelligent human who intends to avoid detection. For example, assume that an attacker has the technical capability to inject single or double faults in order to perform an attack, and that a test set that detects 100% of single faults and 99.9% of double faults is available. In the context of natural faults, this set would most probably be acceptable, because double faults are less likely to occur than single faults and the probability of occurrence of the 0.1% undetected double faults is negligible. However, an intelligent attacker would find out which double faults are undetectable and intentionally inject these faults.
Moreover, the attacker has the ability to learn about the developed countermeasures and adapt his attack over time to circumvent them. In the context of regular post-manufacturing test, if a new defect type turns out to be relevant in a recent technology, the response typically consists of a combination of improving the technology to avoid the defect and developing new test methods to identify and sort out residual defects. In contrast, if a countermeasure against an attack has been developed and published, the attacker will aim at enhancing the attack so as to make the countermeasure ineffective. For example, if the design of a circuit has been altered to substantially reduce the correlation between the processed data and current consumption, making simple side-channel analysis infeasible, the attacker may resort to performing a large number of measurements followed by statistical post-processing [9].
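Why a large number of measurements helps can be seen in a small simulation. This is a sketch under strong assumptions and not the analysis of [9]: the leakage `LEAK`, the Gaussian noise level `SIGMA`, the number of traces `N` and the availability of reference traces recorded with known data are all hypothetical. A single trace is dominated by noise, but the noise on the average of $N$ traces shrinks by a factor of $\sqrt{N}$.

```python
import random

LEAK = 1.0    # assumed data-dependent current difference (arbitrary units)
SIGMA = 5.0   # assumed noise standard deviation, five times the leak
N = 20000     # number of measurements per averaged value

def trace(bit, rng):
    """One noisy current measurement while the circuit processes `bit`."""
    return 100.0 + LEAK * bit + rng.gauss(0.0, SIGMA)

def average(bit, rng, n=N):
    """Mean of n measurements; its noise is SIGMA / sqrt(n)."""
    return sum(trace(bit, rng) for _ in range(n)) / n

def recover_bit(secret_bit, rng):
    reference = average(0, rng)          # traces recorded with known data
    observed = average(secret_bit, rng)  # traces involving the secret bit
    return 1 if observed - reference > LEAK / 2 else 0

rng = random.Random(0)
bit_one = recover_bit(1, rng)
bit_zero = recover_bit(0, rng)
```

With these parameters the standard deviation of the difference of the two averages is about 0.05, so a leak of 1.0 stands out clearly even though it is invisible in any individual trace.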
One line of defense against hardware security threats consists of keeping the details of the circuit and the protective mechanisms secret in order to prevent the attacker from taking this information into account. This "security-by-obscurity" approach is, however, a double-edged sword. On the one hand, the attacker has to put effort into understanding the circuit before even starting to plan the attack. On the other hand, if the attacker does succeed in finding a vulnerability, the designer of the circuit has no chance to learn about it and the attacker will continue exploiting the vulnerability indefinitely. In contrast, if the relevant details are disclosed, not only the malicious attackers but also researchers from the entire scientific community will work on finding weaknesses and publishing them. As a result, compromised circuits can be put out of use and weaknesses can be fixed in new designs.
This dilemma already played a role in the past in the context of cryptographic algorithms. Some authors opted to keep their ciphers secret; for example, the GOST 28147-89 cipher developed by the Soviet authorities in the 1970s was only declassified in 1994. Today, all popular ciphers, including the relevant industry standards, adhere to Kerckhoffs's principle, which requires security even if everything about the system except the key is known to the attacker. All algorithmic details are documented and made available to the general public, including potential attackers as well as researchers who are interested in identifying vulnerabilities before attackers do.
In the field of hardware security threats, the industry has so far been rather reluctant to share information on employed countermeasures. As a result, the majority of secure hardware circuits that are in use today have protective mechanisms that are approved by the responsible certification authorities but cannot be independently evaluated by the research community. At the same time, concealing information from attackers becomes less feasible over time. For example, the publicly available Degate software (http://www.degate.org/) semi-automatically creates a gate-level netlist from microphotographs of individual layers of a manufactured circuit. This information can assist an attacker in choosing the location of the attack such as to prevent its detection by testing.
Addressing hardware security threats by test methods
We will evaluate, for each threat mentioned in Section 2, the scope and the limits of test methods to identify and address a specific issue.
Side-channel analysis
Post-manufacturing testing does not play a major role in identifying side-channel vulnerabilities. Side-channel analysis relies on the regular operation of the circuit and does not require any defects or other deviations to be present. It is possible to measure the current drawn by the circuit during test (known as $I_{DDQ}$ testing); however, the measured value does not directly point to a vulnerability. An indirect indication of susceptibility to side-channel attacks would be an excessive variation in current consumption among different test patterns. Many circuits consist of a small secure sub-module that handles protected data and much larger circuitry used for other purposes. It may be challenging for the attacker to observe the secure sub-module in isolation. For example, if only the current drawn by the whole circuit can be measured, fluctuations in current consumption in non-security-related circuitry may by far exceed the differences that carry information useful for the attacker. If the circuit provides test-access mechanisms that allow switching different sub-modules on and off for testing purposes, the attacker may abuse this functionality to keep the secure sub-module running while deactivating everything else, thus improving the correlation between current consumption and the protected data.
Hardware Trojans
Hardware Trojans are conceptually similar to manufacturing defects and could, in principle, be activated during test using their trigger, followed by observing the effects of their payload. The fundamental difference is that the attacker will actively attempt to avoid detection and thus will conceal this functionality as well as possible. The test procedures are generated with rather simplistic models of defects in mind, which are adequate for random, natural defects. Activation and detection of Trojans may require highly complex test sequences with exorbitant application costs that could be difficult to justify given that the very existence of a Trojan in the design is only suspected in most cases. For example, Trojans of the "time-bomb" type are activated when an internal counter reaches some value selected by the attacker. It is clearly infeasible to continue testing for an indefinite time, just based on the suspicion that a Hardware Trojan might trigger. Therefore, there is significant interest in identifying Trojans in ways other than by triggering them.
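The "time-bomb" problem can be made concrete with a toy model. The class below is purely hypothetical: it models a single AND gate whose output is inverted once a hidden cycle counter reaches `trigger_cycle`, and a functional test that applies all four input patterns repeatedly for a bounded number of cycles.

```python
class TimeBombCircuit:
    """Hypothetical AND gate with a counter-triggered Trojan."""

    def __init__(self, trigger_cycle):
        self.trigger_cycle = trigger_cycle  # chosen by the attacker
        self.cycles = 0                     # maliciously added counter

    def step(self, a, b):
        """One clock cycle: correct output until the trigger fires."""
        self.cycles += 1
        correct = a & b
        if self.cycles >= self.trigger_cycle:
            return correct ^ 1  # payload: corrupt the output
        return correct

def passes_test(circuit, n_cycles):
    """Apply all input patterns cyclically for n_cycles clock cycles."""
    patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
    for i in range(n_cycles):
        a, b = patterns[i % 4]
        if circuit.step(a, b) != (a & b):
            return False
    return True

late_trojan_passes = passes_test(TimeBombCircuit(10**6), 1000)
early_trojan_passes = passes_test(TimeBombCircuit(500), 1000)
```

The test only exposes the Trojan if it runs longer than the attacker's threshold, and the attacker is free to choose a threshold beyond any affordable test time.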
One structural approach is based on the assumption that Trojans are triggered by values on existing signal lines in the circuit that are maliciously routed by the untrusted manufacturer to the Trojan logic [22] . Each line used to control the Trojan receives an additional fanout branch that was not present in the original design. The testing scheme is based on detecting the small additional delay induced by the fanout by considering the paths in the circuit in detail. A big advantage is that this effect is observable even when the Trojan is passive. However, great care must be taken to distinguish Trojan-induced extra delay from regular path delay fluctuations due to process, voltage and temperature variations. In general, testing for hardware Trojans may necessitate the observation of non-functional parameters such as delay, current, or power consumption.
Counterfeiting and hardware metering
Testing is a cornerstone of hardware metering. If an unclonable identifier is used to authenticate the circuit, its stored values can be accessed by a challenge-response scheme, where "challenge" refers to applying a test input to the physical entity that implements the identifier, and "response" depends on the physical principle of the identification. Possible responses are logical values, discretized delays or local currents measured by an embedded sensor. There is a trade-off between the number of applied test patterns and the reliability of identification. The circuit is unlocked and ready for use if all applied challenge-response pairs were consistent.
Special protocols, often using public-key cryptography, are required to obtain challenge-response pairs that the untrusted manufacturer cannot forge. For example, the manufacturer may record the raw identifier information of each circuit after fabrication and send it to the designer, who would compute a separate unlocking sequence for each circuit using the designer's private key which is not known to the manufacturer. The unlocking circuitry present in every circuit incorporates the designer's public key, and the circuit becomes functional only if the unlocking sequence has been created based on the designer's private key. Practical protocols are more complex in order to eliminate several further vulnerabilities [16] .
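The public-key unlocking idea can be sketched with textbook RSA. This is an illustration only and not the protocol of [16]: the primes are tiny and trivially breakable, no padding is used, and all names (`designer_unlock_sequence`, `circuit_unlocks`, the identifier `b"chip-0001"`) are hypothetical. The circuit contains only the public values `N` and `E`, while the private exponent `D` stays with the designer.

```python
import hashlib

# Textbook RSA with tiny primes: for illustration only.
P, Q = 61, 53
N = P * Q   # 3233, public modulus embedded in every circuit
E = 17      # public exponent embedded in every circuit
D = 2753    # designer's private exponent, E * D = 1 mod (P-1)*(Q-1)

def id_digest(circuit_id):
    """Map the raw identifier data to a number below N."""
    h = hashlib.sha256(circuit_id).digest()
    return int.from_bytes(h, "big") % N

def designer_unlock_sequence(circuit_id):
    """Computed by the designer; requires the private exponent D."""
    return pow(id_digest(circuit_id), D, N)

def circuit_unlocks(circuit_id, sequence):
    """On-chip check; uses only the public values N and E."""
    return pow(sequence, E, N) == id_digest(circuit_id)

sequence = designer_unlock_sequence(b"chip-0001")
accepted = circuit_unlocks(b"chip-0001", sequence)
rejected = circuit_unlocks(b"chip-0001", (sequence + 1) % N)
```

Because the circuit stores no secret, a manufacturer who inspects or copies the unlocking circuitry learns nothing that helps to compute valid sequences for overbuilt circuits, provided realistic key sizes are used.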
Fault-based attacks
Fault-based attacks are applied to circuits that have been tested and shipped, and therefore their detection has to rely on fault-tolerance techniques designed for natural faults, as well as on specific detection mechanisms such as light sensors. The classical approach to fault tolerance in hardware circuits, known as self-checking design, is based on representing and processing information in encoded form using special error-detecting or error-correcting codes (EDC). The circuit continuously monitors that all processed data are codewords of the employed EDC. If a fault occurs, one or several bits of information will flip and a non-codeword will be generated with some sufficiently high probability, resulting in detection. While such schemes are practical for memories, communication channels and restricted classes of circuits including adders [23], they result in significant area and energy overhead of over 50% for typical circuits. As a consequence, today's fault-tolerant systems often use dual-modular redundancy (DMR) for fault detection: the circuit is duplicated and the responses of both copies are compared. This approach is not much more expensive than an EDC-based architecture, is much simpler and has a very comprehensive coverage of natural faults: all faults affecting one copy and the vast majority of faults affecting both copies are detected.
The situation changes completely when it comes to adversarial attacks. If the attacker has the capability to inject a certain fault into one circuit in a DMR system, it is not unrealistic to assume that the same fault can also be injected into the second copy at the same time. The comparator will then not flag an error and the manipulation will remain undetected. An EDC-based protection is harder to circumvent, and its effectiveness strongly depends on the particular code used. For example, all linear codes, including the popular parity and Hamming EDC, are ineffective if the attacker is capable of flipping arbitrary bits of the codeword. Consider a circuit protected by the parity code, and assume that the attacker can flip two arbitrary bits, e.g., the first and the second bit of the codeword. The resulting manipulated value will have the same parity as before and will therefore again be a valid (yet incorrect) codeword, so the manipulation is not detected. A special class of non-linear robust codes with a guaranteed minimum probability of detecting adversarial manipulations has been investigated in [24]. The application of robust codes in actual circuits requires a better understanding of the relationship between information-theoretical bounds known for such EDC and their effects in actual circuits [25].
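Both circumvention scenarios can be reproduced with a few lines of Python. The sketch uses a single even-parity bit and models DMR as a plain comparison of two result words; the helper names are illustrative.

```python
def parity(bits):
    return sum(bits) % 2

def encode(data_bits):
    """Append one even-parity bit to the data word."""
    return data_bits + [parity(data_bits)]

def is_codeword(word):
    """A word is a valid codeword if its overall parity is even."""
    return parity(word) == 0

def flip(word, positions):
    """Invert the bits at the given positions (the injected fault)."""
    return [b ^ 1 if i in positions else b for i, b in enumerate(word)]

def dmr_mismatch(copy_a, copy_b):
    """DMR comparator: flags an error if the two copies disagree."""
    return copy_a != copy_b

word = encode([1, 0, 1, 1])
single = flip(word, {0})      # natural-looking single fault
double = flip(word, {0, 1})   # adversarial double fault

single_detected = not is_codeword(single)
double_detected = not is_codeword(double)
dmr_one_copy = dmr_mismatch(single, word)       # fault in one copy only
dmr_both_copies = dmr_mismatch(single, single)  # same fault in both copies
```

The single fault is caught by both mechanisms, whereas the deliberately chosen double fault passes the parity check and the identical fault in both copies passes the DMR comparator.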
Hardware security versus DFT and BIST
As already mentioned in the introduction, design-for-testability circuitry, such as test points and scan chains, may establish a new channel through which the attacker can access protected data [26]. This is especially dangerous if DFT mechanisms are inserted at the end of the design process by DFT engineers who are not fully aware which parts of the circuit are sensitive. One possible countermeasure is to avoid DFT altogether. This may be feasible for actual cryptographic blocks, which in practice are often small and easy to test even by functional (sequential) test patterns. It is impractical, however, to eliminate DFT everywhere in the chip just because it contains a block for which DFT is not usable. If DFT is avoided in the security-critical part of the circuit and employed everywhere else, it may not provide its full benefits. For example, if all memory elements of the circuit are included in scan chains, the complexity of test pattern generation decreases drastically [4]. This advantage is significantly reduced by a single block without scan, even if this block is small. If DFT cannot be avoided, it must be designed so as not to compromise the protected information. A number of techniques, such as scan chain scrambling (placing memory elements in the scan chain in an order unknown to the attacker) [27], are based on the "security by obscurity" paradigm and violate Kerckhoffs's principle "the enemy knows the system". A different approach is to consistently distinguish between the test mode, in which the secret information is not used, and the operational mode, in which the DFT circuitry is disabled. This can be achieved by physically destroying the access to the scan chains after the end of manufacturing test by blowing fuses [28]. However, it is possible that the attacker re-establishes the destroyed electrical connections, obtaining a circuit with operational scan chains.
A rather radical solution is to resort to built-in self-test (BIST). The circuit is equipped with an on-chip test pattern generator (TPG), which provides test data to the block under test, and a test response evaluator (TRE), which observes its responses and decides whether the test has been passed or failed. Only very limited information (in the extreme case, just the pass/fail status) is communicated outside the circuit. This solution provides good protection of secret data, but effectively prevents diagnosis and improvement of the manufacturing process and its yield. In general, all information that is useful for diagnosis is also useful for the attacker. Therefore, it appears that any practical solution along these lines must be a compromise.
Using BIST in security-related circuits is also associated with some problems. Low-cost BIST schemes tend to have low fault coverage and/or require DFT structures (test points) to detect a sufficient number of faults. Therefore, an attacker may inject faults or incorporate Trojans that are not identified by BIST. Moreover, the very fact that BIST blocks are deterministic and disconnected from the outside world simplifies their manipulation. For example, the above-mentioned dopant Trojan [13] was able to circumvent two levels of protection by BIST mechanisms employed in the Intel processor with the objective of avoiding DFT.
The first level of protection consisted in calculating a checksum from the consecutive outputs of the circuit. It was bypassed by another hardware manipulation: several constants were set to values that yielded a sequence of outputs which in turn led the BIST block into calculating the correct checksum even though the functionality of the circuit was altered. This trick is only applicable because the BIST mechanism is fully deterministic and the expected checksum is fixed and known in advance; the circumvention would be harder if additional test data would be applied from outside. The second level of protection was a functional test checking the randomness of the sequences generated by the manipulated block (a deterministic random number generator). This test was fooled by manipulating only some of the generator's bits, such that the resulting sequence looked sufficiently random to the BIST procedure but in reality had far less randomness than necessary for adequate security levels.
Conclusions
Hardware security is an arena of steady competition between attackers, who look for vulnerabilities to exploit, and designers, who attempt to protect their designs against known and expected threats. Testing is an essential tool in the portfolio of protective methods, but it is also available to the adversaries who are working on circumventing this protection by designing attacks that are hard to identify. Moreover, some test procedures, in particular DFT, can open up backdoors for attackers in unanticipated ways. The design of future secure circuits will have to consider quality of shipped circuits (driven by testing), yield learning and improvement (driven by diagnosis), reliability (driven by fault-tolerant design) and security (driven by countermeasures, some of which were mentioned in this article). In addition, the usual design constraints of area, performance and power consumption still have to be met in order to design a product that is competitive on the market. Balancing all these requirements is a major challenge that will require a thorough understanding of many factors, including those which are often ignored today.
Open research problems
1. How to model side-channel leakage such as to balance between accuracy and simplicity? Detailed electrical models based on non-linear functions or differential equations are difficult to use for automatic attack construction and analysis due to their high complexity. In contrast, simple indicators, such as Hamming weight, are easy to incorporate into such procedures but they may not match the actual observed behavior.
2. Can Hardware Trojans be detected by traditional structural test approaches with sufficient confidence? Are design-time techniques like provisions to isolate small parts of the circuit for detailed analysis necessary to make Hardware Trojans detectable? What are their implications to side-channel analysis and other vulnerabilities?
3. Is there a fundamental mechanism to construct provably unique identifiers? Today's solutions seem to rely on particular sources of variability. What can be done if variability will become much better controlled in the future?
4. Are there automatic methods to derive a fault-based scenario for a given crypto circuit? The existence of such generic methods could pave the way to the formal quantification of vulnerability of circuits to such attacks.
5. Is error detection a viable generic mechanism against fault-based attacks, and does the necessary infrastructure introduce new side-channel vulnerabilities?
