The recent surge in hardware security research is driven largely by the offshoring of proprietary Intellectual Property (IP). One distinct dimension of this threat is malicious logic insertion, also known as a Hardware Trojan (HT). An HT stealthily subverts the normal operation of a device. Because HTs differ widely in their activation mechanisms and in their location within a design, no catch-all detection technique exists. In this paper, we propose to apply principal features of social network analysis to the security analysis of Register Transfer Level (RTL) designs against HTs. The approach is based on investigating design properties, and it extends current detection techniques. In particular, we perform both node- and graph-level analysis to determine the direct and indirect interactions between nets in a design. This technique helps find not only vulnerable nets that can act as HT triggering signals but also the interactions through which they influence a particular net to act as the HT payload signal. We evaluate the technique on 420 combinational HT instances and, on average, detect both triggering and payload signals with an accuracy of up to 97.37%.
I. INTRODUCTION
The sheer complexity of modern electronic systems and their globally distributed supply chains are growing concerns for establishing trust in Integrated Circuits (ICs) and systems. Untrusted third parties involved in fabrication, assembly, packaging, testing, and validation can make malicious modifications to ICs to undermine the security of the system. Such an intentional change, known as a Hardware Trojan (HT), is typically made without the designer's knowledge and works as a backdoor to achieve undesired behavior. The potential threats of HTs are aligned with the attackers' objectives, which include but are not limited to leaking secret information (e.g., a key), incorrect functionality, and early failure of the device. Design vulnerabilities are often exploited during the pre- and/or post-silicon stages, where a small HT circuit (3-5x smaller than the regular design) observes certain events and becomes active at an infrequent time when the device is in the field. The parts of the design with limited controllability and observability can provide the stealthy behavior an HT requires. Furthermore, the spectrum of HTs, which differ in their physical, activation, and action characteristics [1], makes detection nontrivial for current approaches [2].
The construction of concurrent HTs is concerned with the nets that switch rarely within a design. Rare triggering nets directly influence the location and insertion phase (gate-level or layout-level). As more and more Commercial Off-The-Shelf (COTS) components are integrated into systems without security verification in order to meet time-to-market demands, the attack surface becomes more dynamic. Hence, traditional detection approaches need to be adapted to account for newly published HTs.
During pre-silicon, a defender analyzes security by performing logic testing of the design against HT vulnerability. Although logic testing is independent of process variations, it is limited by the simulation cost, which depends on the design size and the modeling of the simulator. Such a simulation-based technique involves generating random test vectors [3] or high-level statistical modeling [4]-[6] of the design to propagate the test vectors through it. Regardless of the vector generation technique, analyzing the switching activity of all nodes in a design can identify the nets with the lowest activity. Based on this analysis, a designer can generate region-specific test vectors to improve the controllability and observability of these nets [7], [8].
On the contrary, Side-Channel Analysis (SCA) is performed during post-silicon to detect minute variations in design parameters (area, power, and delay) caused by the presence of an HT [9]. SCA treats the design as a whole, collects its signature, and compares it with that of an HT-free golden design. The effectiveness of SCA is challenged by the decreasing size of HTs and the increasing process variations in sub-nm designs [10], [11]. Many works on HT detection and prevention require pre-characterization of an HT-free design, which may not always be available. For example, IP vendors provide limited information about the internal construction of IPs in their published datasets. Furthermore, in logic/functional testing it is nearly infeasible for the defender to generate the same test vectors the attacker relied on, because the logical space is vast, and reliability-failure mechanisms may even mask the presence of an HT. Hence, the common assumption of a golden design makes HT detection more challenging, even intractable, since the defender must learn which nets an attacker has perturbed and identify them through SCA.
978-1-7281-5842-6/19/$31.00 © 2019 IEEE

In this paper, we utilize concepts from social network analysis to delineate the relation between design properties and locate HTs within a design. As a first step, we perform a formal treatment of circuit edges (nets), as they are at the heart of HT triggering. We use essential attributes of edges, since adding or removing edges affects the neighborhood of other edges as well as design parameters. Then, the identified characteristics of edges are ranked to understand the potential perturbations over edges. This ranking also gives insight into an evolving network (triggering signals) and how it influences other edges (payload). As multiple such networks can exist, we analyze each network cluster that can collude by strategically adding or removing edges. Then, based on both individual edges and the group network, compact inter-cluster test vectors can be generated. These test patterns would be completely localized without resorting to logic simulation a priori and would magnify the probable HT location (if any). The novelty and contributions of the proposed approach are:
• There is no assumption about the underlying design (HT-free or HT-affected); hence, the technique is uniform.
• A comprehensive framework for combinational HT detection based on individual (node) and cluster (network) analysis to identify rare triggering nets.
• A localized test pattern generation technique for neighborhood isomorphism.

The rest of the paper is organized as follows. Section II provides an overview of related works on HT localization based on local vs. global excitation. Section III describes the attack model, related definitions, and the bottom-up method to detect HTs. Section IV presents the detection capability of the proposed technique without any knowledge of a golden design. Section V draws the conclusion, followed by future work.
II. BACKGROUND AND RELATED WORK
In this section, we present the existing HT localization techniques for both pre- and post-silicon. As our focus in this paper is the detection of combinational HTs, we review the works that apply heuristics to particular nodes to make them act as HT triggering signals and payloads. Many of these works rely on extracting (rare) transition activity before they use a specific HT detection methodology. Hence, we broadly classify simulation techniques under two conditions: (a) individual edge attributes are considered, and (b) the combined properties of only a subset of edges are considered without checking individual nets.
Zhou et al. [12] proposed enhancing the transition probability of nets far from the primary inputs by using two-input MUXes. The approach is practical for gate-level netlists, but it does not scale to other abstraction levels (e.g., RT-level). Zhou et al. [13] approached combinational HTs through (stuck-at) fault simulation. The authors considered only one fault type during the excitation of a net; however, an attacker can enumerate multiple nets with different fault models.
For the sub-network-based approach, Banga and Hsiao [8] presented a region of interest around a gate that contains flip-flops. The technique was successful in detecting sequential HTs; however, the proposed approach is applicable to combinational HTs. Koushanfar and Mirhoseini [14] applied a gate profiling technique to detect side-channel parameters. As the HT detection problem is NP-hard, the impact of process variations on an individual gate can be significant and hence can mask the presence of an HT. Wei and Potkonjak [15] presented a gate-level characterization approach in which the authors partitioned the design based on input vector control and profiled the region-based current. The technique, however, fails to detect HTs that are located in multiple regions. Banga and Hsiao [16] presented a miter circuit to check equivalence between the golden and HT-infected designs. The miter is unsatisfiable (UNSAT) only if the two designs are equivalent, so a satisfying assignment reveals an HT.
Unlike previous techniques, our technique (a) does not rely on HT excitation through expensive simulation but instead applies network parameters, and (b) is scalable to both combinational and sequential HTs, as it classifies the most influential nodes for HT triggering without any heuristics.
III. PROPOSED APPROACH

A. Threat Model
We assume that out-of-spec components can be included in the legacy design during both pre- and post-silicon. In pre-silicon, third-party IP vendors can modify parts of the design and sell it to an IP buyer, which may be a fabless design house. Similarly, a system integrator can act as an insider attacker who has access to design internals and can subvert any IP core included in the System-on-Chip (SoC). Even if the IP core is encrypted, the integrator, as a valid and regular user, would have the correct key to unlock the core. During post-silicon, an attacker in the foundry would have access to the GDSII file, which he can reverse engineer to insert an HT by changing process parameters (e.g., the lithographic mask).
B. Problem Statement
A combinational circuit can be modeled as a Directed Acyclic Graph (DAG), G(V, E), where V(G) denotes the set of nodes and E(G) is the directed edge set. The problem involves identifying the set of edge properties that mark potential candidates for HT triggering and payload. To do so, we convert G to the edge-labeled graph G'(E, V), where E(G') denotes the nodes (that is, the edges of G) and V(G') denotes the edge set (that is, the nodes of G). To ensure localization of a possible HT, we need to identify at least k candidate edges in G with high confidence using the properties defined in Section III-C.
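The conversion from G to G' can be illustrated with a minimal sketch. It assumes the design is available as a plain list of directed (driver, sink) pairs; the function name `edge_labeled_graph` and the two-gate toy netlist are hypothetical and are not taken from the paper or from a benchmark.

```python
from collections import defaultdict

def edge_labeled_graph(edges):
    """Return G' as an adjacency dict: each node of G' is an edge (u, v) of G,
    and G' has an arc (u, v) -> (v, w) whenever both edges exist in G."""
    out_edges = defaultdict(list)
    for u, v in edges:
        out_edges[u].append((u, v))
    return {(u, v): list(out_edges[v]) for u, v in edges}

# Hypothetical two-gate DAG (illustrative names only):
# inputs a, b, c; gate g1 is driven by a and b; gate g2 by g1 and c; g2 drives output o.
G_edges = [("a", "g1"), ("b", "g1"), ("g1", "g2"), ("c", "g2"), ("g2", "o")]
G_prime = edge_labeled_graph(G_edges)

for net, successors in G_prime.items():
    print(net, "->", successors)
```

Every net of G becomes a node of G', so net-level properties can be computed with ordinary node metrics. NetworkX provides the same construction as `networkx.line_graph` for directed graphs.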
C. Definitions and Terminology
Centrality, C: returns the number of nodes adjacent to a particular actor and includes both the in- and out-degree of a node. Hence, nodes having more impact on HT should have lower centrality and follow an inverse power-law distribution. If A_ij is the adjacency matrix of G', C of a node is computed from |A_ij|, the total number of locations where an edge exists (i.e., A_ij = 1).

Closeness Centrality, CC: determines the shortest path length from a node to all other nodes. The nodes in G' showing the lowest CC (maximum average path length) should be identified, as they can manifest themselves as nodes with low controllability and observability [17]. If s_ij is the number of edges traversed on the shortest path between two nodes i and j, CC is computed from s_ij, where |s_ij| denotes the total number of mutually exclusive shortest paths.

Betweenness Centrality, BC: measures how often a node acts as a bridge on shortest paths between two other nodes. In general, the higher the BC value of a node in G', the higher the chance that the node acts as an HT payload, as the node is reachable from the two other nodes showing the lowest CC. Conversely, a new node can be added to G' to perform collusion between two sub-groups in a design. BC of a node n_i is computed [18] from the indicator I^p_{j,k}(n_i), whose value is '1' if a shortest path exists between j and k and it passes through i.

Eigenvector Centrality, EVC: computes the centrality of a node by accounting for the centrality of the nodes in its neighborhood. Once we calculate the most influential nodes in G' from the C and CC metrics, EVC can estimate the relative importance of other node(s) in the periphery of the influential node. Similar to BC, the node(s) having the highest EVC can work as payload, and we can compute EVC by taking the principal eigenvalue (λ) of the adjacency matrix between the HT triggering nodes and the payload node.
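For reference, the commonly used textbook forms of these four metrics are sketched below. They are an assumption about the intended definitions and may differ from the paper's own equations in normalization. Here n is the number of nodes in G', A_ij = 1 if G' has an edge from i to j, s_ij is the shortest-path length from i to j, σ_jk is the number of shortest paths from j to k, σ_jk(i) is the number of those passing through i, and x_i is the EVC score of node i; the symbols n, σ, and x are introduced here for illustration.

```latex
\begin{align}
C(i)  &= \frac{1}{n-1} \sum_{j \neq i} \left( A_{ij} + A_{ji} \right) \\
CC(i) &= \frac{n-1}{\sum_{j \neq i} s_{ij}} \\
BC(i) &= \sum_{j \neq i \neq k} \frac{\sigma_{jk}(i)}{\sigma_{jk}} \\
\lambda \, x_i &= \sum_{j} A_{ji} \, x_j
\end{align}
```

For a strictly acyclic directed graph, all eigenvalues of A are zero, so a symmetrized adjacency matrix is one way to keep the last relation non-degenerate; which variant the paper uses is not stated.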
PageRank, PR: estimates the probability that a node can be inserted in G' based on the edge information of neighboring nodes. As the location of future edge(s) is essential and the attributes of nodes are entirely available in the design [19], we apply the intimacy between neighbors to describe the likelihood of an edge.
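A minimal power-iteration sketch of PageRank is given below to show how the damping value enters the computation. It is a generic formulation, not the paper's implementation; the function name `pagerank`, the uniform handling of nodes without successors, and the four-net toy graph are assumptions made for illustration.

```python
def pagerank(adj, damping=0.85, iterations=100):
    """Power-iteration PageRank on an adjacency dict {node: [successors]}.
    Rank held by nodes without successors is spread uniformly over all nodes."""
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not adj[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in adj[v]:
                new_rank[w] += damping * rank[v] / len(adj[v])
        rank = new_rank
    return rank

# Hypothetical fragment of G': nets n1 and n2 both feed n3, which feeds n4.
toy = {"n1": ["n3"], "n2": ["n3"], "n3": ["n4"], "n4": []}

for d in (0.5, 0.85):
    ranks = pagerank(toy, damping=d)
    print(d, {v: round(r, 3) for v, r in ranks.items()})
```

Nets that collect edges from several predecessors, directly or transitively, accumulate rank, which matches the intuition that a high-PR node is a natural sink for triggering signals. Changing `damping` shifts how strongly rank follows the edges versus the uniform baseline.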
k-Cores, k_c: a k-core is a connected sub-graph of the network in which each node has degree at least k. Once we compute C, CC, and BC, we can find the "degeneracy" of the HT triggering cluster (the highest value of k).
Density, D: denotes the connectivity of a graph or sub-graph. For a strongly connected graph/sub-graph, D is close to 1, and an attacker would avoid it, as inserting an HT there would have a larger influence on design parameters. If information about the centrality and its derivatives is not available, we can start with k_c = 1 to get the density of each sub-network in the graph and rank them from lowest to highest. The cluster with the lowest rank can be chosen to embed an HT.
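The graph-level step can be sketched as follows: compute a core number for every node, group nodes by core number, and rank the groups by density. Grouping by core number is an assumption about what clustering "according to k_c" means, the graph is treated as undirected for simplicity, and the six-node toy graph and helper names are hypothetical.

```python
from collections import defaultdict

def core_numbers(nbrs):
    """Core number of every node of an undirected graph {node: set(neighbours)},
    obtained by repeatedly peeling a node of minimum remaining degree."""
    degree = {v: len(nbrs[v]) for v in nbrs}
    remaining = set(nbrs)
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=lambda x: degree[x])
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in nbrs[v]:
            if w in remaining:
                degree[w] -= 1
    return core

def density(members, nbrs):
    """Undirected density 2m / (n (n - 1)) of the sub-graph induced by members."""
    members = set(members)
    n = len(members)
    if n < 2:
        return 0.0
    m = sum(len(nbrs[v] & members) for v in members) / 2
    return 2 * m / (n * (n - 1))

# Hypothetical graph: a triangle a-b-c with a tail c-d-e-f attached.
nbrs = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
        "d": {"c", "e"}, "e": {"d", "f"}, "f": {"e"}}

core = core_numbers(nbrs)
clusters = defaultdict(set)
for v, k in core.items():
    clusters[k].add(v)

# Rank clusters from lowest to highest density; the first entry is the sparsest.
ranked = sorted((density(members, nbrs), k) for k, members in clusters.items())
print(ranked)
```

In this toy graph the sparse tail forms the lowest-density cluster, which is where, following the argument above, an attacker would prefer to embed an HT.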
D. Motivational Example
We address the problem of HT localization using a bottom-up technique, which is a fine-grained approach. We start by removing the Primary Inputs (P.I.) and Primary Outputs (P.O.) from the DAG of the design, as an HT inserted at the periphery of a design can be easily detected. Then, we convert the node-labeled DAG (G) into the edge-labeled DAG (G'). We concentrate on two crucial analyses (node- and graph-level). Node-level analysis is mainly concerned with finding C, CC, BC, EVC, and PR. Performing graph-level analysis without any heuristics is more challenging than node-level analysis. There can be multiple combinations of nodes in G' depending on the various possibilities of node-level parameters, and the impact of considering the wrong candidate set for HT could be high. Therefore, we do a pairwise comparison of the C, CC, and BC of nodes before the information is passed down to calculate the other properties and confirm the node status (Trigger or Payload).
With node-level data, we can further verify two sub-graph properties (k_c and D). In the top-down approach, we start with the graph-based analysis (a coarse-grained technique). Similar to bottom-up, we only consider internal nodes and nets, excluding primary inputs (P.I.) and primary outputs (P.O.). The nodes are topologically sorted according to their degree and further classified into network clusters according to k_c. We compute the density (D) of all clusters and rank them from lowest to highest. Based on user preference, we can perform further analysis of the nodes within the lowest-ranked cluster. Analytically, both approaches should locate the same rare nets (Fig. 1). One can also adopt a top-down approach; however, the success in HT detection would depend on the initial heuristics.

Consider a generic c17 circuit (node-labeled) from the ISCAS85 benchmark and its corresponding edge-labeled circuit in Fig. 2. The design has 5 P.I. and 2 P.O. in the node-labeled graph, and the edge-labeled graph has 14 nodes, which are the edges of the node-labeled graph. Each edge in the node-labeled graph is labeled incrementally. In the edge-labeled graph, the third parameter associated with each node denotes the label of that node. We remove nodes N1, N2, N3, N6, N7, N22, and N23 from further consideration. In the rest of the section, we refer to the edge-labeled graph representation for analysis. The attributes of the nodes are shown in Table II. We can see that five nodes have a centrality value of 0.2 or less; closeness centrality is also calculated for all nodes. Then, we discard node (9, 10) from further consideration, as it is reachable from the rest of the nodes (C = 0.4). As a higher CC indicates maximum closeness, we sort the CC column and find that the nodes (9, 12) and (8, 11) both have minimum CC as well as moderate centrality (C = 0.2). However, for BC, we see that the value of each node is '0'. This is because no shortest path exists between two nodes that would cross any other node in the table. Hence, the nodes (9, 12) and (8, 11) can act as HT triggering signals.
On the contrary, EVC and PR are closely related, as both can approximate the structure of incoming edge(s). The node having the largest EVC and PR values should be chosen as a possible HT payload signal. Hence, the node (12, 13) could be used as a payload.
Based on the above relations between nodes, we can classify the possible HT payload into two categories: (a) explicit payload and (b) implicit payload. An explicit payload exhibits a change in regular path delay, whereas an implicit payload can circumvent a path-delay-based fingerprint (i.e., no change in path delay). The impact of both payloads is shown in Fig. 3. The label (x/y) indicates a change in labeling, where the previous label was x and the new label is y.
For the explicit payload, we use the existing nets (8, 11) and (9, 12) as HT triggering signals (HTT), and the edge between N12 and N13 is modified to work as the HT payload (HTP) due to the inclusion of the output of HTT. In comparison to Fig. 2(a), the labeling of two edges (N12 → N13 and N13 → N23) is modified, which denotes that the delay of the critical path has changed.
For the implicit payload, we add an edge between nodes N8 and N9, which acts as one input to HTT, the other input being the edge between nodes N9 and N12. However, we have four choices for creating additional nodes between (8, 11) and (9, 12) that can act as HTT, as shown in Table III. Among the four possibilities, we choose node (8, 9) as the additional HTT signal. The output of HTT directly modifies the output of N13. If N13 is replaced with a three-input NAND, its output becomes '0' only when all inputs are HIGH ('1'), which is a rare condition (1 out of 8 combinations). Hence, the output of HTT should be '1' to change the output of N13. Unlike the explicit payload, there is no additional gate acting as HTP, and this also ensures that the delays of the paths towards the P.O. are modified simultaneously.
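The "1 out of 8" figure for the three-input NAND can be checked by enumerating its truth table. The sketch below assumes the three inputs are independent and equally likely to be 0 or 1; the helper `nand` is illustrative.

```python
from itertools import product

def nand(*inputs):
    """N-input NAND: the output is 0 only when every input is 1."""
    return 0 if all(inputs) else 1

rows = list(product((0, 1), repeat=3))
zero_rows = [bits for bits in rows if nand(*bits) == 0]

print(zero_rows, "->", len(zero_rows), "of", len(rows), "input combinations")
```

Only the all-ones input drives the output to '0', which is what makes the modified N13 behave normally for the remaining seven combinations.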
E. Algorithm for combinational HT localization
TABLE III: Choices for an additional HTT node between (8, 11) and (9, 12)
Node       Effect
(8, 9)     Impact both outputs' paths without any delay difference
(8, 12)    Impact one output and present path anomaly
(11, 12)   Create a back-edge
(9, 11)    Reachable by visiting node N10

We propose an algorithm as shown above to locate nodes that can act as combinational HT. The design is considered to be represented as a DAG and edge-labeled. We describe the major procedures in the algorithm as follows:
• Node extraction: For all nodes in the graph, we mark the P.I. and P.O. and remove them from further consideration.
• Triggering node location: For the rest of the nodes, we calculate the C, CC, and BC values. We sort the nodes in ascending order of C. The node(s) in the higher position(s) have larger C values and are marked as influential node(s). Such nodes have a higher fan-in and fan-out. Hence, adding a new node (edge) to these nodes would have a significant impact on design parameters, which the attacker always wants to avoid. Therefore, we remove the influential nodes. For the remaining nodes in the list, we calculate an objective function (F_i). The nodes showing the lowest F are given priority as HT triggering signals.
• Payload node location: Once the HT triggering signals are known, we sort the nodes in ascending order of BC. The node at the bottom of the sorted list is marked as the payload node. We choose this node because it should satisfy the following property: it acts as a bridge (and is hence reachable) if any shortest paths exist between HT triggering nodes. Furthermore, the payload node should also exhibit higher EVC and PR values. A higher EVC value confirms that the node is closest to the most influential nodes in the graph, which would manifest themselves as HT triggers. Similarly, depending on the damping factor, PR establishes that the HTT signals would create a new edge to the node having the highest PR.
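The three procedures can be put together in a minimal sketch. Several choices below are assumptions, since the text does not fix them: the objective function is taken as F_i = C_i + CC_i, "influential" means the nodes with the maximum C, C and CC are computed on the undirected view of the graph, BC is computed on directed shortest paths, and the EVC/PR confirmation step is omitted. All function names and the toy graph are hypothetical.

```python
from collections import deque

def undirected(adj):
    """Undirected neighbour sets of a directed adjacency dict."""
    nbrs = {v: set() for v in adj}
    for v, succ in adj.items():
        for w in succ:
            nbrs[v].add(w)
            nbrs[w].add(v)
    return nbrs

def closeness(nbrs):
    """Closeness: reachable nodes divided by the sum of BFS distances."""
    cc = {}
    for s in nbrs:
        dist, queue = {s: 0}, deque([s])
        while queue:
            v = queue.popleft()
            for w in nbrs[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        cc[s] = (len(dist) - 1) / total if total else 0.0
    return cc

def betweenness(adj):
    """Unnormalized betweenness on directed shortest paths (Brandes' algorithm)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist, queue = {s: 0}, deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc

def localize(adj, boundary, k=2):
    # Node extraction: drop the nodes marked as P.I. / P.O.
    g = {v: [w for w in succ if w not in boundary]
         for v, succ in adj.items() if v not in boundary}
    nbrs = undirected(g)
    c = {v: len(nbrs[v]) / (len(g) - 1) for v in g}
    cc, bc = closeness(nbrs), betweenness(g)
    # Triggering nodes: remove the most influential nodes, then take the lowest F.
    top = max(c.values())
    candidates = [v for v in g if c[v] < top]
    score = {v: c[v] + cc[v] for v in candidates}  # assumed objective F_i
    triggers = sorted(candidates, key=lambda v: (score[v], v))[:k]
    # Payload node: highest BC among the remaining candidates.
    rest = [v for v in candidates if v not in triggers]
    payload = max(rest, key=lambda v: (bc[v], v))
    return triggers, payload

# Hypothetical G' with boundary nodes in1, in2, out1.
toy = {"in1": ["t1"], "in2": ["t2"], "t1": ["m"], "t2": ["m"], "m": ["p"],
       "p": ["o"], "hub": ["p", "o", "x", "y"], "x": [], "y": [],
       "o": ["out1"], "out1": []}
triggers, payload = localize(toy, boundary={"in1", "in2", "out1"})
print("triggers:", triggers, "payload:", payload)
```

The high-degree node is excluded as influential, the two low-centrality, low-closeness nets are proposed as triggers, and the node that bridges their shortest paths is proposed as the payload. A different objective function would change the trigger ranking, so the choice of F_i should be treated as a tunable parameter of this sketch.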
IV. EXPERIMENTAL RESULTS
We evaluate the socio-network analysis for HT localization using NetworkX [20]. We tested the technique on combinational HTs for the ISCAS85 benchmarks [21]. As mentioned in Section III-C, each parameter influences the HT localization. For each design (HT-free), we perform the same experiment using different configuration parameters (e.g., a change in the damping value of PR), and we measure the success rate (%) for all HT types under a particular design. For each trial, we use all HT instances of the design to calculate the success rate. We use a total of ten snapshots of socio-network parameters. We quantify the success in detection using three parameters. We assume that the HT triggering signals contain at least four nets, whereas the HT payload is only one signal. We did not perform any additional experiments, such as controllability and/or observability analysis or gate-level simulation, to complement our approach. We also did not set a transition threshold for designating a net as triggering. The HT-infected designs do not contain any implicit payload signal, and we always observe a change in the path delay. Hence, we only present the success rate for the explicit payload.

Table IV shows the success rate for four combinational designs. On average, we can detect HTT 96.25% of the time and HTP 98.5% of the time. Given that there can be multiple HTP signals, the FP and FN rates could be higher. For example, in many HT instances of c6288, there is more than one payload signal. Similarly, around 10-15 instances create a cycle when combining nets to trigger the HT. As our algorithm assumes an acyclic graph representation, the method cannot detect these HTs. We did not experience any memory bottleneck due to the large size of the designs. On average, 70% of the time is spent pre-characterizing the nets, and the rest goes to checking the HT database.
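As a quick arithmetic check, and assuming (the text does not state this) that the overall accuracy quoted in the abstract is the mean of the average HTT and HTP success rates, the two figures combine as:

```latex
\[
\frac{96.25 + 98.5}{2} = 97.375
\]
```

This is consistent with the reported 97.37% when the value is truncated to two decimal places.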
V. CONCLUSION AND FUTURE WORK
We present a framework to detect diversity-agnostic HTs based on social network parameters. With only knowledge of the design, a designer can find possible HT locations easily without any expensive simulation or side-channel analysis. We also present an algorithm to rank the parameters with lower time complexity. We describe the feasibility of the bottom-up analysis approach. However, given enough run time and good heuristics, top-down analysis is also possible. Future work includes reviewing the same parameters for sequential HTs and at a higher abstraction level (e.g., the algorithmic level).
