Abstract-We develop a region-based timing characterization approach to detect hardware Trojans (HTs) on integrated circuits (ICs). In order to ensure the scalability of the approach, we partition the target IC into well-formed and non-overlapping regions and detect hardware Trojans on all circuit locations by examining the timing properties of the transistor paths. Based on the circuit partition, we insert a minimal number of test points that provide additional observation interfaces for the delay measurements of all circuit locations. Our evaluations on ISCAS and ITC benchmarks show that the region-based Trojan detection via test points can detect hardware Trojans accurately with well controlled area overhead and test time.
I. INTRODUCTION
Integrated circuit (IC) outsourcing has become a trend in the IC industry in order for the IC companies to increase their revenue. In the current IC design and manufacture model, the IC is exposed to a variety of threats such as hardware Trojans (HTs) [19] [9] and unauthorized intellectual property (IP) usage [16] , due to the fact that the IC designers and the manufacturing foundries are completely separate and different entities. Since ICs are the fundamental building blocks of all digital computing and communications platforms for the modern business, military, and government affairs, the potential vulnerabilities against security attacks have become a major concern in the IC industry.
A malicious circuitry, or hardware Trojan (HT) [19] [9] , is a malicious modification on an IC, which is intentionally conducted by untrusted designers or foundries in order to attack the target hardware. There is a wide consensus that HTs have huge impact on the security and integrity of the IC systems. For example, HTs may alter the functionality of the IC, change the original characteristics (propagation delay or leakage power), or even leak confidential information. Similar to the trend of software Trojans in the early days, the detection of HTs has triggered a great deal of attention in the IC community, due to the fact that the consequences of HTs can be extremely severe for security-sensitive systems.
As the potential risk arising from HTs increases, many research efforts have been put on how to detect and prevent the embedded HTs effectively. Among them, side channelbased monitoring schemes [8] [2] [11] have been considered as effective and economic compared to the otherwise expensive physical inspections. In the existing approaches, leakage power is the most often monitored property for the checking against HTs [15] [23] , because any unexpected malicious components on the circuit would result in a systematic bias in the total leakage consumption. Therefore, the leakage power-based approach can provide us with a full coverage of the circuit in terms of identifying HTs. However, as a side effect caused by the full coverage detection, most of the existing power tracingbased approaches require creating and solving huge sizes of power equations that cover essentially all gates on the target circuit, making it difficult to scale to modern IC designs with millions of gates on each chip. Furthermore, the existence of process variation (PV) during the IC manufacturing process may compromise the detection scheme, since it is difficult to distinguish the leakage power variation caused by PV from that caused by HTs.
Besides leakage power, circuit frequency (or delay) has also been considered as a side channel for HT detection [8] [11] . Delay-based detection methods are in general scalable, since only a small subset of gates that are on the same path are under consideration. On the other hand, it is challenging for the delay-based methods to cover all the locations on the circuit for HT detection, due to the fact that there exist paths that are in parallel and reconverge to a single point, and it is difficult to map the measured path delay to a specific path for the consideration of HTs. As shown in Figure 1 , there is a HT gate H embedded on the path A. Even though we can measure the delay between inputs I1/I2 and output O, we are unable to determine whether the measured delay is for path A or path B. In fact, since path B (gates 1-3-4-5-6-7) is much longer than path A (gates 1-2-H-7), it is difficult to detect the addition of HT gate H using delay measurements from inputs I1/I2 to output O. We develop a gate-level delay characterization approach that covers all locations on the circuit for HT detection. Although the lack of observability in delay caused by reconvergences, as shown in Figure 1 , can be addressed by inserting additional test points (i.e., flip-flops) [22] , we argue that the problem of identifying all the reconvergent points is 978-1-4799-0601-7/13/$31.00 c 2013 IEEE NP-complete, which cannot scale to large designs. In order to solve this issue, we employ a circuit partition method to scale down the problem space to a limited number of nonoverlapping regions. In this way, we ensure that our timing characterization at all circuit locations is scalable to large IC designs. Also, based on the circuit partition, we introduce a test point insertion scheme that separates the parallel paths and enables their observability via delay measurements. To summarize, our innovations and contributions include the following:
• a region-based circuit partition approach to reduce the overhead and ensure the scalability of the timing characterization;
• a malicious circuitry detection approach using delay characterization that covers all circuit locations; and
• a test points insertion scheme to break the reconvergences and measure the required paths for their timing properties. 
II. RELATED WORK

III. PRELIMINARIES
A. Hardware Trojan Threat Model
HT attack is a common IC threat that has drawn a great deal of attention in the hardware security community [21] [28] [18] . There are typically two types of HT attacks: (1) adding malicious components into the IC, and (2) altering the properties of the existing IC components. In this paper, we focus on the first category, since it is more commonly used by an attacker during the IC design and manufacture process and would induce more severe security attacks. In particular, we assume that attackers tend to hide and prevent the HT from exposing to delay measurements. For example, they may size the HT ultra small to minimize its observable delay. Furthermore, they can hide the HT under one of the parallel paths to bypass direct delay measurements.
B. Delay Model and Measurements
The delay of a single logic gate can be expressed as d = gh + p, where g and h are logical effort and electrical effort, respectively; and p is parasitic delay. In particular, We use the delay model in [14] that connects the gate delay to its sizing and operating voltages:
where L is effective channel length, V th is threshold voltage, W is gate width, V dd is supply voltage, n is substreshold slope, μ is mobility, C ox is oxide capacitance, D is clock period, φ t is thermal voltage φ t = kT /q, σ is drain induced barrier lowering (DIBL) factor, subscripts i and i + 1 represent the driver and load gates, respectively, γ is the ratio of gate parasitic to input capacitance, and k tp and k fit are fitting parameters.
In order to measure the delay of a transistor path, we leverage the delay fault testing technique introduced in [13] . In particular, we insert flip-flops into the target circuit that serve as additional observation and control interfaces to facilitate the delay measurements based on delay faults. In order to reduce the cost of the embedded test points, we ensure that the flipflops are only activated during the test mode and deactivated during normal IC operations.
C. Gate-Level Characterization (GLC)
Gate-level Characterization (GLC) [24] [25][27] [22] is the process of identifying the process variation in the manufactured IC. The effect of process variation can be represented as a scaling factor towards the gate-level manifestational properties, such as delay. Then, a system of linear equations can be obtained by summing up the gate-level properties and measuring the total delay:
whered j is the delay of the path when input vector j is applied; s i is the process variation scaling factor of gate i, which represents the deviation of the delay from its nominal design values due to process variation; K ij is the nominal delay for the gate at input state j; and e sj and e rj are systematic and random measurement errors, respectively. Furthermore, we add one additional variable H in each of the equations that represents the additional delay caused by malicious circuitries. Although it is unknown whether any HTs exist or not, we assume there is a HT on each delay path while formulating the equations. Our idea is that the H would capture the systematic bias caused by HTs, if there are any, or becomes an ultra-small value if the circuit is HT-free. Therefore, by solving the system of linear equations and evaluate the value of H, we are able to detect the existence of HTs.
IV. OVERALL FLOW Figure 2 shows the overall flow of our HT detection approach. We employ three systematic steps in order to determine whether any hardware Trojans have been embedded in the target circuit. First, we partition the circuit into small regions, which consist of limited number of gates under one or multiple delay paths, to ensure scalability of our approach toward large designs. We employ a maximum fanout free cone (MFFC)-based method [7] for the original circuit partition. Then, we merge the MFFCs to form the regions that are non-overlapping and that can cover all circuit locations. After obtaining the well-formed regions for delay measurements, we insert test points, i.e., scan chain flip-flops, to the source or destination gates of the paths that are not directly observable through inputs, outputs, or existing flip-flops. Secondly, we manipulate the input vector of the circuit and generate delay variations for each path. With the delay measurements using the test points, we are able to formulate a linear program for each path concerning the path delay and the delays of individual gates. Finally, for the detection of hardware Trojans, we insert an additional HT variable in the linear program in order to capture the systematic bias caused by the HT. By evaluating the solved HT variable values from the linear program, we determine whether any malicious circuitry exists in the target circuit. Overall flow of the region-based Trojan detection using delay characterization.
V. REGION-BASED DELAY CHARACTERIZATION
A. Motivation
We note that we must address the following three issues regarding the overhead and scalability of our approach, in order to provide a reliable and scalable HT detection solution.
• Area overhead. The area overhead caused by the inserted test points (i.e., flip-flops) cannot be ignored, as the area of a flip-flop is often 4 to 6 times larger than a regular gate. It is essential to take into account of this critical overhead toward the design and manufacture of the ICs.
• Test Time. The test time required by the delay measurements is an important metric for the cost of the proposed approach. Since it requires input vector control to conduct gate-level timing characterization, the GLC of multiple paths cannot be fully parallelized. Therefore, we must minimize the total number of measured paths to reduce the test time while still covering all circuit locations.
• Scalability. the search of reconvergence points in all circuit locations is a NP-complete problem that raises scalability concerns. The complexity grows exponentially with the increase of the number of inputs, outputs, or gates. The efficiency and scalability of the identification procedure must be improved for the consideration of large designs, e.g., with millions of transistors, in the modern IC industry.
To address these issues, we further develop a circuit partition scheme that partitions the large IC into smaller regions, so that the scope of the timing characterization is reduced to the partitioned regions to address the scalability issue. Also, we aim to minimize the number of test points that we must add during the course of circuit partition, while the goal is still to cover all the circuit locations in terms of gate-level timing characterization.
B. Problem Formulation
We define the circuit partition problem as a specialized graph (netlist) partition problem with the goal of minimizing the required test points while controlling the number of gates in each region:
Circuit partition problem for timing characterization. Given a graph G = (V, E, P I, P O) that represents the netlist of an IC, where V is the set of vertices (gates)
, E is the set of edges (wires), P I is the set of primary inputs, and P O is the set of primary outputs, find a graph partition that consists of k subgraphs (regions) so that (1) each region consists of P i (i = 1...k) paths, which ensures that the total number of source and destination nodes of the paths that do not belong to P I ∪ P O is minimized; and (2) the number of gates in each component N i (i = 1...k) < T h, where T h is a configurable threshold determined by the gate-level timing characterization approach.
After partitioning the circuit, we can conduct the reconvergence identification and delay characterization in the scope of each single region. Since the size of the problem domain is reduced by factor of k, we argue that the complexity of the identification and characterization processes is reduced exponentially.
C. MFFC-based circuit partition
Our intuitions in addressing the circuit partition problem are three-fold: (1) We should find regions that contain paths traversing from primary outputs toward the direction of primary inputs. In other words, the addition of nodes in the region during the search process should go vertically (i.e., depth first). This is based on the consideration of leveraging as many primary inputs/outputs as possible, in order to reduce the additional test points. (2) The number of gates in each region should be maximized as long as it can be handled by the characterization approach (i.e., below threshold T h), which would reduce the number of regions and thus the number of delay measurements. (3) It is beneficial that there is no or little overlap between different regions, in order to reduce unnecessary measurements and characterizations. Based on the above intuitions and thoughts, we develop a circuit partition method using maximum fanout free cones (MFFCs) [7] . a MFFC for a node g i is a sub-netlist where each node g j is a transitive fan-in of g i , and all the transitive fanin's of g j is included in the MFFC. In other words, a MFFC is a self-contained region that grows maximally from P O toward P I of the circuit G, which satisfies our intuitions (1) and (2) . Furthermore, the MFFCs have an important property that two different MFFCs are either non-overlapping or one contains the other, which satisfies our intuition (3) as long as we remove the smaller MFFC from consideration in terms of circuit partition. Our circuit partition algorithm using MFFCs is presented in Algorithm 1. We first find the MFFC for each gate in the circuit using known algorithms [7] . Then, we sort the found MFFC set in the ascending order in terms of the size (i.e., number of gates) in the MFFC. Finally, we remove the MFFCs that have been fully covered by at least one other MFFC to eliminate the redundancy. if ∃C j where C j ⊃ C i then 7: Remove C i from C 8: end if 9: end for 10: Output C as the circuit partition. Figure 3 shows a small example of the proposed circuit partition method using MFFC. We partition the 6-gate circuit into 3 regions. Each region is a maximum fanout free cone from the bottom node. With this partition, the delay of each path is measurable, and it only requires one test point at location 16, since all other observation points are primary inputs or outputs of the target circuit. 
D. Test Point Insertion
After partitioning the circuit, we insert test points in the target circuit to facilitate the delay measurements of all circuit locations. There are two types of locations where we must insert test points: (1) where there are no input, output, or flipflops in the original design that can be used to measure the delay, such as the location 16 in Figure 3 ; and (2) where the corresponding path is not measurable due to reconvergence, such as the two paths in Figure 1 .
The Type (1) test points are highly dependent on the partition of the circuit. We must make sure that the source and destination gates of all the paths in the selected regions are covered by either inserted test points or observation points (i.e., primary input/output or existing flip-flops) in the original design. For the Type (2) test points, we employ a depth-first search algorithm to identify all the reconvergences within the selected regions. Since the size of a region is significantly smaller than that of the entire circuit, the complexity of the reconvergence identification is greatly reduced even though it is NP-complete.
VI. EVALUATION RESULTS
A. System Implementation and Metrics
We evaluate our delay characterization-based Trojan detection method on a set of ISCAS'85, ISCAS'89, and ITC'99 benchmarks. We simulate the 45nm technology and generate the variation of threshold voltage following the Gaussian process variation model [3] . Also, we employ the spatial correlation of effective channel length following the quad-tree model [6] . For the prototype implementation, we use Synopsys Design Compiler Version E-2010.12-SP5 to synthesize. The target library is the FreePDK 45nm Standard cell library [12] . Finally, we use Cadence Soc Encounter Version 9.1 to place and route our design.
We consider the following two metrics for the evaluation of our approach: (1) the accuracy of HT detection, which is represented by the false positives and false negatives in the detection results; and (2) the overheads caused by the inserted test points, which include the area overhead (i.e., the additional area caused by the inserted flip-flops) and the cost of test (i.e., the test time required by measuring the delay of the selected paths).
B. Accuracy of Hardware Trojan Detection
We evaluated the accuracy of our HT detection approach on a set of ISCAS'85 benchmarks. For each benchmark, we repeat the detection process 50 times and solve the value of the HT variable H using a LP solver. Table I shows the statistics of the HT variable (H) in each tested benchmark. For the ease of comparison, we evaluated two cases where there is a HT gate inserted at a random location on the circuit, and where the circuit is HT-free. We observe from the results that there are large gaps and little overlaps in terms of the HT variables between the HT-free case and the HT-present case. This enables us to draw a decision line between the H in the two cases and use it to determine whether HTs exist or not. 
C. Test Points Overhead
Our delay characterization process incurs the following sources of overhead: (1) the area cost for the inserted test points; and (2) the test time required by the delay measurements.
We evaluate overhead (1) by identifying the number of test points (i.e., flip-flops) that are required to measure the delay paths and cover all the gates in the target circuit. The column "# Test Points" in Table II shows the total number of test points required by our approach in each benchmark. We see that the number of test points is relatively small compared to the size of the circuit. In particular, we evaluate the area overhead by referring to the transistor counts, as reported in [17] , which are needed to implement the test points as well as the existing gates in the target circuit. Figure 4 shows evaluation results of the area overhead, which is calculated as the ratio between the transistor count of the inserted test points and that of the original design. The results indicate that the area overheads in all tested benchmark circuits are below 25%.
In Table II , we show the number of characterized paths required by our approach as the indicator of overhead (2) (i.e., the cost of test). Our intuition is that the number of measured paths is an important metric to evaluate the test time of the hardware Trojan detection approach, since it is often the case that the measurements of multiple paths cannot be parallelized. Therefore, the overall test time is approximately the sum of test time on each path. The results show that our regionbased approach requires a limited number of delay paths to be measured, compared to the huge number of paths in the circuit. The low cost of test is obtained from the non-overlapping circuit partition. 
VII. CONCLUSION
We have developed a region-based test point insertion method that can be used for gate-level delay characterization to detect the hardware Trojans. We partitioned the circuit into small non-overlapping regions using maximum fanout free cones. The test point insertion approach based on the circuit partition ensured that all the components on the target circuit are characterizable in terms of gate-level delay properties. Consequently, our HT detection method, which uses an additional HT variable during the course of delay characterization, is capable of covering all possible circuit locations. We evaluated the approach on a set of ISCAS and ITC benchmarks. The results indicated accurate HT detection with limited area overhead and test time.
