Trojan side channels (TSCs) are serious threats to the security of cryptographic systems because they leak secret keys to attackers via covert side channels that are unknown to designers. To tackle this problem, we present a new hardware Trojan detection technique for TSCs. Specifically, we first investigate general power-based TSC designs and discuss the tradeoff between their hardware cost and the complexity of the key cracking process. Next, we present our TSC identification technique, which is based on the correlation between the key and the covert physical side channels used by attackers. Experimental results demonstrate the effectiveness of the proposed solution.
INTRODUCTION
With the increasing demand for secure computation and communication in the era of the Internet of Things (IoT), hardware cryptographic modules are not only widely used in secure applications such as smartcards and set-top boxes, but also proliferate in all sorts of "smart" devices connected to the Internet. As cryptographic hardware provides the "root of trust" in the system, it is essential to ensure its own security. However, while the cryptographic algorithms themselves are extremely difficult (if not impossible) to break mathematically [1], their implementations suffer from the well-known side-channel attack: unless the hardware is carefully designed and implemented, secret information may leak through side channels such as power consumption, timing information and even sound. Side-channel attacks have thus become a serious industrial concern, and significant research effort has been spent on designing sophisticated attacks (e.g., differential power attack [2]) and the corresponding countermeasures [3].
Recently, a new type of hardware security threat, the hardware Trojan (HT), has emerged. HTs are malicious circuits introduced by adversaries in the design team, third parties or even foundries to serve as back doors in the system [4, 5]. Various types of hardware Trojans with different kinds of malicious functionality have been presented in the literature [6]. In particular, for cryptographic hardware, Lin et al. [7] proposed the so-called Trojan side-channel (TSC) concept, which leaks secret information via covert Trojan-induced side channels, and showed that Trojans with a size of 14 LUTs can reveal the secret keys of an AES core implemented in an FPGA. Liu et al. [8] demonstrated a silicon implementation of an AES-based wireless cryptographic chip with an embedded TSC and showed that it could leak secret keys while passing conventional verification and test procedures.
In [7, 8], the authors also presented potential TSC identification techniques for their specific TSC designs. Lin et al. [7] briefly introduced a potential identification solution for their TSCs, but the details are missing; they also admitted that the proposed technique would not be able to detect sophisticated TSC designs. In [8], the authors employed existing HT detection techniques based on side channel analysis, such as [9, 10], to differentiate their fabricated TSC-free chips from TSC-infected chips. In practice, however, it is often quite difficult, if not impossible, to obtain known TSC-free chips as a golden reference for side channel analysis.
The basic idea of TSC design is to embed circuitry that is closely related to the on-chip secret key, thereby inducing or amplifying the key information leaked via physical side channels. On the one hand, a more direct TSC-induced correlation between the key and the physical side channels makes it easier for attackers to extract the secret key, but it also enables relatively simple side channel analysis for TSC identification. On the other hand, a more sophisticated TSC-induced correlation is harder to detect (without TSC-free chips as a golden reference), but it also makes the key harder for attackers to extract. Consequently, it is interesting and relevant to investigate TSC design tradeoffs and the corresponding identification techniques, which is the subject of this paper.
To be specific, the main contributions of this work include:
• We present a general power-based TSC design methodology and the corresponding key cracking procedure, with an arbitrary combination of key bits, plaintext bits and random bits as the source of key information leakage. We then conduct a systematic design tradeoff analysis considering the TSC size, the complexity of key cracking via the TSC, and the TSC's stealthiness against the proposed identification technique.
• Leveraging the correlation between TSCs and the secret key, we propose a novel TSC identification technique that is applicable to general TSC designs without requiring Trojan-free chips as a golden reference, and we discuss its detection capability and limitations.
The remainder of this paper is organized as follows. Section 2 presents preliminaries and surveys related work. In Section 3, we present the theoretical study on general TSC designs and the corresponding key extraction procedure. Next, we describe the proposed TSC identification technique in Section 4. Experimental results are then presented in Section 5. Finally, Section 6 concludes this paper.
PRELIMINARIES
In this section, we present existing TSC design and identification techniques.
Trojan Side Channel Design
A side channel attack tries to break a cryptosystem based on information gained from its physical implementation (e.g., power consumption and timing information). For example, a vulnerable smart card has distinct power consumption when operations are performed with correct secret keys and with wrong ones. Differential power attack [2] is able to exploit this property for key extraction even with noisy power traces, thanks to its signal processing and error correction properties. Various types of countermeasures have been proposed to mitigate this severe security threat. For instance, designers could use power analysis-resistant logic (e.g., dual-rail logic) or masking logic when implementing cryptographic hardware [3].
Figure 1: An example TSC design: MOLES [12]
In recent years, hardware Trojans have emerged as a serious security threat because today's IC designs involve many third parties during the design and manufacturing process. For example, Skorobogatov et al. [11] reported a backdoor found in a military-grade FPGA device. Since cryptographic hardware modules are often used as the "root of trust" in a system, they are a natural focus of hardware Trojan threats. Lin et al. [7] first introduced the TSC concept, which leaks secret information via covert Trojan-induced side channels. Later, the same authors presented a concrete TSC design named MOLES in [12]. Gallais et al. [13] extended TSC to general-purpose processors on which cryptographic software is executed. Recently, Liu et al. presented a silicon implementation of an AES-based wireless cryptographic chip with an embedded TSC in [8].
Let us take MOLES [12] embedded in an AES design as an example to illustrate how a TSC works (see Fig. 1). In this TSC design, the secret key bits, K(0) to K(7), are XORed with a random sequence R(0) to R(7) generated by a pseudo random number generator (PRNG), represented as $X_i = K(i) \oplus R(i)$. $X_i$ then drives a leakage circuit (LC) that leaks the key bit information through additional power consumption $P_{0-1}$ whenever $X_i$ has a 0-to-1 transition. Therefore, the total power of a MOLES-infected chip, denoted by $P_{total}$, can be modeled as the sum of the power consumption of the MOLES circuitry, denoted by $P_{MOLES}$, the power consumption of AES, denoted by $P_{AES}$, and other power consumption that can be modeled as additive white Gaussian noise (AWGN), denoted by $P_{AWGN}$:
$$P_{total} = P_{MOLES} + P_{AES} + P_{AWGN}. \tag{1}$$
$P_{MOLES}$ can be further represented as:
$$P_{MOLES} = \sum_{i=0}^{7} P_{K(i)} + P_{PRNG}, \tag{2}$$
where $P_{K(i)}$ and $P_{PRNG}$ represent the power consumption of the LC connected to K(i) and that of the PRNG, respectively. During the key cracking process, attackers extract one key bit at a time with differential power analysis (DPA). Let us start by guessing the value of K(0). With the guessed value of K(0) and the known R(0), attackers can calculate $X_0 = K(0) \oplus R(0)$. According to the transition of $X_0$, power traces are grouped into group 0, associated with a 0-to-1 transition, and group 1, associated with a 1-to-0 transition. Suppose there are $m_0$ power traces in group 0 and $m_1$ power traces in group 1.
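The differential mean power, denoted by $DP$, is the mean power in group 0 minus the mean power in group 1, given by
$$DP = \frac{1}{m_0}\sum_{j \in \text{group } 0} P_{total}^{(j)} - \frac{1}{m_1}\sum_{j \in \text{group } 1} P_{total}^{(j)} \approx \sum_{i=0}^{7}\left(\frac{1}{m_0}\sum_{j \in \text{group } 0} P_{(j,K(i))} - \frac{1}{m_1}\sum_{j \in \text{group } 1} P_{(j,K(i))}\right), \tag{3}$$
where $P_{total}^{(j)}$ is the total power of power trace j and $P_{(j,K(i))}$ denotes the power consumption caused by K(i) in power trace j. In Eq. 3, $P_{PRNG}$ and $P_{AWGN}$ cancel out given sufficient power traces, since they are not correlated with the key bits; $P_{AES}$ can be safely ignored as well because designers would minimize the correlation between key bits and normal side-channel signals (otherwise the key bits could be cracked even without HTs). Now, let us first consider the differential mean power caused by K(0), denoted by $DP_{K(0)}$, given by
$$DP_{K(0)} = \frac{1}{m_0}\sum_{j \in \text{group } 0} P_{(j,K(0))} - \frac{1}{m_1}\sum_{j \in \text{group } 1} P_{(j,K(0))}. \tag{4}$$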
If K(0) is correctly guessed, each power trace in group 0 consumes the power $P_{0-1}$ while no power trace in group 1 consumes any power from this LC. Thus, $DP_{K(0)} \approx P_{0-1} > 0$. If K(0) is wrongly guessed, on the contrary, no power trace in group 0 consumes any power from this LC while each power trace in group 1 consumes the power $P_{0-1}$. Thus, $DP_{K(0)} \approx -P_{0-1} < 0$. For the differential mean power caused by any other key bit K(i) (where $i \neq 0$), let $m_{i0}$ and $m_{i1}$ be the numbers of power traces associated with a 0-to-1 transition of $X_i$ in group 0 and in group 1, respectively. As a result,
$$DP_{K(i)} = \left(\frac{m_{i0}}{m_0} - \frac{m_{i1}}{m_1}\right) P_{0-1}. \tag{5}$$
Since grouping based on K(0) is uncorrelated with the other key bits, we would have $m_{i0}/m_0 \approx m_{i1}/m_1$ and hence $DP_{K(i)} \approx 0$ for $i \neq 0$. With the above, $DP = \sum_{i=0}^{7} DP_{K(i)} > 0$ if K(0) equals the guessed value; otherwise, K(0) equals the opposite of the guessed value. The other key bits can be extracted similarly with the above procedure.
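The key cracking procedure above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the MOLES implementation: the function names, the unit leakage power `p01`, the noise level `sigma` and the lumping of AES and all other activity into Gaussian noise are hypothetical choices made for illustration. The attacker is assumed to know every random bit R(i) in every clock cycle.

```python
import random

def simulate_traces(key_bits, n_cycles, p01=1.0, sigma=2.0, seed=1):
    """Return (random_bits, power) per clock cycle for a MOLES-like leak.

    Assumption: each X_i = K(i) XOR R(i) adds p01 whenever it goes 0 -> 1,
    and everything else on the chip is lumped into Gaussian noise.
    """
    rng = random.Random(seed)
    n = len(key_bits)
    prev_r = [rng.getrandbits(1) for _ in range(n)]
    rand_bits, power = [prev_r], [0.0]   # cycle 0 has no transition
    for _ in range(n_cycles):
        r = [rng.getrandbits(1) for _ in range(n)]
        rises = sum(
            1 for i in range(n)
            if (key_bits[i] ^ prev_r[i]) == 0 and (key_bits[i] ^ r[i]) == 1
        )
        rand_bits.append(r)
        power.append(p01 * rises + rng.gauss(0.0, sigma))
        prev_r = r
    return rand_bits, power

def recover_bit(i, rand_bits, power, guess=0):
    """DPA on key bit i: the sign of the differential mean power tests the guess."""
    group0, group1 = [], []   # 0->1 and 1->0 transitions of the guessed X_i
    for t in range(1, len(power)):
        x_prev = guess ^ rand_bits[t - 1][i]
        x_now = guess ^ rand_bits[t][i]
        if x_prev == 0 and x_now == 1:
            group0.append(power[t])
        elif x_prev == 1 and x_now == 0:
            group1.append(power[t])
    dp = sum(group0) / len(group0) - sum(group1) / len(group1)
    return guess if dp > 0 else guess ^ 1

def recover_key(n_bits, rand_bits, power):
    """Extract the key one bit at a time, as in the procedure above."""
    return [recover_bit(i, rand_bits, power) for i in range(n_bits)]
```

With a few thousand cycles, the contribution of the other key bits and of the noise averages out in both groups, so only the sign of the term caused by the targeted bit remains.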
Trojan Side Channel Identification
A number of HT detection techniques have been proposed in the literature, and they fall broadly into two categories: trust verification techniques [14, 15], used to detect HTs inserted at the design stage, and side channel analysis techniques [9, 10], used to detect HTs inserted during fabrication. Generally speaking, trust verification techniques perform HT detection by identifying the rare trigger signals used in an HT. However, TSC designs are often "always-on" without any dedicated trigger signals and hence cannot be caught by these solutions. Liu et al. [8] adopted a side channel analysis technique to successfully differentiate their fabricated TSC-free and TSC-infected chips. However, in practice, it is often quite difficult to obtain TSC-free ICs as a golden reference.
The above techniques are for general HT detection. In [7] , the authors briefly introduced a dedicated TSC identification technique for their proposed TSC designs, but the details were not presented. While effective for their specific TSC designs, it was mentioned in [7] that their method was not applicable for more sophisticated TSC designs that leak secret information via a complex combination of multiple key bits, plaintext bits and random bits.
HT design and HT detection resemble an arms race, in which attackers constantly update their tactics to intrude into a system while defenders respond with more security measures to protect it. Existing TSC designs and the corresponding identification techniques such as [7, 8] are case studies that reveal new security concerns for cryptographic hardware, and it is essential to investigate whether there are other forms of TSC designs and, if so, how to defend against them. This has motivated the study of a general TSC design methodology and the corresponding TSC identification technique in this work.
GENERAL POWER-BASED TSC DESIGN
In this section, we provide the theoretical study on the general power-based TSC design methodology as well as the corresponding key cracking process by using TSC.
Before discussing the details, we define the symbols listed in Table 1. Let $K$ be the key variable, wherein $K(i)$ is the $i$-th bit of $K$, and let $k$ be the realization of the key variable. Let $\mathcal{K} = \{0, 1\}^{n_k}$ be the whole key space, where $n_k$ is the number of key bits of $K$, and hence $k \in \mathcal{K}$. Let $K^*$ be the variable of the subkey composed of certain key bits of $K$, wherein $K^*(i)$ is the $i$-th bit of $K^*$, and let $k^*$ be the realization of $K^*$. Let $\mathcal{K}^* = \{0, 1\}^{n_{k^*}}$ be the key space of $K^*$, where $n_{k^*}$ is the number of key bits of $K^*$, and hence $k^* \in \mathcal{K}^*$. Let $R$ be the random number, wherein $R(i)$ is the $i$-th bit of $R$, and let $r$ be the realization of $R$. Let $\mathcal{R} = \{0, 1\}^{n_r}$ be the space of $R$, where $n_r$ is the number of bits of $R$, and hence $r \in \mathcal{R}$.
Typically, a TSC contains multiple leakage modules (LMs), which are used together to crack the whole key. Each LM, serving as a separate channel that leaks key bits via the power side channel, is composed of two components, the leakage information generation circuit (GC) and the physical leakage circuit (LC), as shown in Fig. 2.
Specifically, the GC generates the actual logic leakage information, which is the key masked by a random number. For ease of presentation, we model a plaintext bit as a special random bit. Fig. 2(a) presents the LM of MOLES, which leaks one key bit XORed with one random bit, while Fig. 2(b) shows the general LM, which leaks multiple key bits masked by multiple random bits. Thus, the actual leakage information, denoted by $X$, can be represented by
$$X = F(K^*, R). \tag{6}$$
Previous work considered a TSC secure on the grounds that only attackers, who know $F$ and $R$, can recover the key. Thus, if an LM is driven by multiple key bits and multiple random bits, it becomes much more difficult for designers to determine $F$ and $R$. We discuss the stealthiness of TSCs in Section 4.3. We assume that the LC can output only one power value, denoted by $P_{LC}$, and whether it consumes this power is determined by the value or the transition of the input of the LC, i.e., $X$ in Eq. 6, given by
$$P(X) = \begin{cases} P_{LC}, & X = 1 \text{ (or } X \text{ has a 0-to-1 transition)}, \\ 0, & \text{otherwise}. \end{cases} \tag{7}$$
For ease of presentation, in the following we assume that whether the LC consumes power depends on the value of its input, unless otherwise specified. The proposed general TSC design and the corresponding detection technique can easily be extended to TSCs whose LC power consumption depends on the transition of the input. The design of the LC itself is out of the scope of this paper. Based on Eq. 6 and Eq. 7, we model an LM's behavior as a leakage power matrix, defined as follows.
DEFINITION 1. The leakage power matrix of an LM, $TP$, of size $2^{n_{k^*}} \times 2^{n_r}$, describes the leakage power of the LM under different values of $K^*$ and $R$. If the LM consumes power under $k^*$ and $r$, we set $TP(k^*, r) = 1$; otherwise we set $TP(k^*, r) = 0$.
For a specific key $k^*$, we use $TP_{k^*}$, defined as the row of $TP$ wherein $K^* = k^*$, to denote its leakage power pattern. A simple example of a leakage power matrix $TP$ can be found in Fig. 3.
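As a minimal sketch of Definition 1, the leakage power matrix can be tabulated by enumerating all subkey and random values. The functions `moles_f` and `example_f` are hypothetical stand-ins for the generation circuit $F(K^*, R)$, and the sketch assumes that the LC draws power exactly when $X = 1$; the second function is purely illustrative and does not come from the paper.

```python
from itertools import product

def leakage_power_matrix(f, n_key_bits, n_rand_bits):
    """Build TP as a dict: subkey k* -> row of 0/1 entries, one per value of r.

    `f` stands in for the generation circuit F(K*, R); it takes two bit
    tuples and returns 0 or 1. Power is assumed to be drawn when X = 1.
    """
    keys = list(product((0, 1), repeat=n_key_bits))
    rands = list(product((0, 1), repeat=n_rand_bits))
    return {k: [f(k, r) for r in rands] for k in keys}

def moles_f(k, r):
    # MOLES-style module: one key bit XORed with one random bit.
    return k[0] ^ r[0]

def example_f(k, r):
    # Illustrative module with two key bits and two random bits (hypothetical).
    return (k[0] & r[0]) ^ (k[1] | r[1])
```

Each row of the returned dictionary is the leakage power pattern $TP_{k^*}$ of one subkey, with columns ordered as `(0, ..., 0)` to `(1, ..., 1)`.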
Table 1: Symbol definitions
Symbol          Definition
$n_k$           The number of key bits of $K$
$K^*$           The subkey variable, all of whose key bits belong to $K$
$K^*(i)$        The $i$-th bit of $K^*$
$n_{k^*}$       The number of key bits of $K^*$
$R$             The random number
$R(i)$          The $i$-th bit of $R$
$r$             The realization of $R$
$\mathcal{R}$   The random number space, $\mathcal{R} = \{0, 1\}^{n_r}$, $r \in \mathcal{R}$
$n_r$           The number of random bits of $R$
$k$             The actual key value of $K$ used by the design
$k^*$           The actual key value of $K^*$ driving an LM
$n_{lm}$        The number of leakage modules
By embedding a TSC with $n_{lm}$ LMs, attackers can exploit their knowledge of the implementation of the LMs, represented by $F(K^*, R)$ and $P(X)$, as well as the random number $R$ generated by the PRNG. Next, we detail how to crack the entire key via the embedded TSC.
The key cracking process is as follows. First, attackers are assumed to be able to measure the power of the chip and collect a large number of power traces under different plaintexts. Next, with these power traces and their knowledge of the implementation of each LM, attackers calculate one set of key candidates for each LM, denoted by $\mathcal{K}_1, \mathcal{K}_2, \ldots, \mathcal{K}_m$, using demodulation techniques such as DPA. Finally, by intersecting all sets of key candidates, given by
$$\mathcal{K}_1 \cap \mathcal{K}_2 \cap \cdots \cap \mathcal{K}_m, \tag{8}$$
attackers are able to reduce the key candidates to a size that is reasonable for brute force, or directly obtain the genuine key. For a particular LM ($LM_i$), the procedure of calculating the key candidate set involves two steps. Let us take the leakage power matrix shown in Fig. 3 to illustrate the details.
The first step is to apply DPA to identify the leakage power pattern $TP_{k^*}$, where $k^*$ denotes the actual value of $K^*$. As attackers know $R$, the identification is achieved by grouping the power traces based on the value of $R$ and calculating the mean power difference between the $R = 0$ group and the $R = 1$ group, denoted by $DP_{0-1}$. According to the value of $DP_{0-1}$, we can obtain $TP_{k^*}$ as follows.
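• If $DP_{0-1} \approx P_{LC}$, then $TP_{k^*} = [1, 0]$ (columns ordered as $r = 0$, $r = 1$).
• If $DP_{0-1} \approx -P_{LC}$, then $TP_{k^*} = [0, 1]$.
• If $DP_{0-1} \approx 0$, then $TP_{k^*} = [1, 1]$ or $TP_{k^*} = [0, 0]$.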
Conventional DPA is unable to differentiate $TP_{k^*} = [1, 1]$ from $TP_{k^*} = [0, 0]$, since both lead to $DP_{0-1} \approx 0$. The power effects of AES, the PRNG and AWGN are assumed to be removed by DPA with sufficient power traces, as discussed in Section 2. The power impact of other LMs can be removed by DPA as well, since the power traces containing the power of other LMs are evenly distributed between the two groups associated with $R = 0$ and $R = 1$ for this LM. This property is stated by Lemma 1 in Section 3.3. For a TSC driven by $n_r$ random bits, $TP_{k^*}$ can be determined in a similar way by calculating the differential mean power of power traces with different random numbers. Again, however, DPA is unable to differentiate $TP_{k^*} = [1, 1, \ldots, 1]$ from $TP_{k^*} = [0, 0, \ldots, 0]$.
The second step is to extract the key candidate set. With the identified leakage power pattern $TP_{k^*}$, the extraction can be done by simply finding the keys that lead to the same leakage power pattern. As a result, we obtain the key candidate set for $LM_i$, denoted by $\mathcal{K}^*_i$, wherein every candidate $k' \in \mathcal{K}^*_i$ satisfies $TP_{k'} = TP_{k^*}$, as shown in Fig. 3. In order to intersect the key candidate sets obtained from different LMs, we define $\mathcal{K}_i \subseteq \mathcal{K}$ based on $\mathcal{K}^*_i$: for any $k \in \mathcal{K}_i$, the subkey of $k$, denoted by $k^*$, belongs to $\mathcal{K}^*_i$. Correspondingly, we define $\mathcal{K}_1, \mathcal{K}_2, \ldots, \mathcal{K}_m$.
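The two steps and the final intersection can be sketched as follows. This is an illustrative model rather than the authors' tooling: `observable`, `candidate_subkeys` and `crack` are hypothetical names, each LM is described by an assumed triple of a leakage function, the positions of its key bits and its number of random bits, and the true key is used only to emulate the pattern that an attacker would measure with DPA.

```python
from itertools import product

def observable(row):
    """What DPA can see of a leakage pattern: constant rows look the same."""
    return "flat" if len(set(row)) == 1 else tuple(row)

def candidate_subkeys(f, n_key_bits, n_rand_bits, observed):
    """Subkeys whose leakage pattern matches the one identified by DPA."""
    rands = list(product((0, 1), repeat=n_rand_bits))
    return {
        k for k in product((0, 1), repeat=n_key_bits)
        if observable([f(k, r) for r in rands]) == observed
    }

def crack(modules, n_total_bits, true_key):
    """Intersect the per-LM candidate sets over the whole key space.

    `modules` is a list of (f, key_bit_positions, n_rand_bits).
    """
    candidates = set(product((0, 1), repeat=n_total_bits))
    for f, positions, n_r in modules:
        rands = list(product((0, 1), repeat=n_r))
        true_sub = tuple(true_key[p] for p in positions)
        # Step 1 (emulated): the pattern DPA would identify for this LM.
        observed = observable([f(true_sub, r) for r in rands])
        # Step 2: all subkeys that produce the same observable pattern.
        subs = candidate_subkeys(f, len(positions), n_r, observed)
        # Lift to full keys and intersect with the candidates so far.
        candidates = {
            k for k in candidates
            if tuple(k[p] for p in positions) in subs
        }
    return candidates
```

For a MOLES-like module, every subkey has its own pattern and a single LM per key bit pins the key down exactly; for a module whose rows are mostly constant, several subkeys remain in the candidate set.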
Leakage Module Design
The key issue in a TSC design is the LM design, and hence we discuss its details in this subsection.
During the key cracking process for one LM, the main point is to remove the power impact of other LMs by DPA. Therefore, LM should be designed in such a manner that satisfies the following lemma. LEMMA 1. For ∀k * ∈ K * , if we randomly select r where r ∈ R, we have
Lemma 1 ensures that the power traces containing the power of one LM are evenly distributed among the groups when the power traces for this LM are randomly allocated. According to the above key cracking procedure, attackers in fact use an LM to distribute all key candidates into several predefined key sets. For a general LM, the number of available key sets is given by the following lemma.
LEMMA 2. Consider an LM driven by $n_{k^*}$ key bits and $n_r$ random bits. This LM is able to distribute the $2^{n_{k^*}}$ key candidates into at most $N_{KS}$ key candidate sets, where $N_{KS}$ is given by
$$N_{KS} = \min\{2^{(2^{n_r})} - 1,\ 2^{n_{k^*}}\}. \tag{10}$$
PROOF. For an LM driven by $n_{k^*}$ key bits and $n_r$ random bits, the size of $TP_{k^*}$ is $2^{n_r}$. Therefore, there are at most $2^{(2^{n_r})}$ possible values of $TP_{k^*}$. DPA is able to differentiate $2^{(2^{n_r})} - 1$ of them, because the all-zero and all-one patterns cannot be told apart. Therefore, this LM is able to distribute the $2^{n_{k^*}}$ key candidates into $\min\{2^{(2^{n_r})} - 1, 2^{n_{k^*}}\}$ sets.
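The counting argument in the proof can be checked by brute force for small $n_r$. The sketch below uses hypothetical helper names and merges the all-zero and all-one rows, since both give zero differential mean power between any two groups.

```python
from itertools import product

def n_key_sets(n_key_bits, n_rand_bits):
    """Upper bound N_KS on distinguishable key candidate sets (Lemma 2)."""
    return min(2 ** (2 ** n_rand_bits) - 1, 2 ** n_key_bits)

def count_distinguishable_patterns(n_rand_bits):
    """Enumerate every 0/1 pattern of length 2**n_r and count what DPA can
    tell apart, treating all constant patterns as one."""
    seen = set()
    for row in product((0, 1), repeat=2 ** n_rand_bits):
        seen.add("flat" if len(set(row)) == 1 else row)
    return len(seen)
```

For one random bit, the three distinguishable patterns are `[0, 1]`, `[1, 0]` and the merged constant pattern, so an LM with a single key bit is limited by its two subkey values rather than by the number of patterns.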
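Attackers always want each LM to eliminate as many key candidates as possible, and the following lemma gives the minimum expected number of remaining key candidates.
LEMMA 3. Consider an LM driven by $n_{k^*}$ key bits and $n_r$ random bits. The minimum expected number of key candidates remaining after reduction by this LM is given by
$$\frac{2^{n_{k^*}}}{N_{KS}}. \tag{11}$$
PROOF. Suppose the LM partitions the key space $\mathcal{K}^*$ into $N_{KS}$ subsets, denoted by $\mathcal{K}^*_1, \mathcal{K}^*_2, \ldots, \mathcal{K}^*_{N_{KS}}$. Let $n, n_1, n_2, \ldots, n_{N_{KS}}$ be the sizes of $\mathcal{K}^*, \mathcal{K}^*_1, \mathcal{K}^*_2, \ldots, \mathcal{K}^*_{N_{KS}}$, so that $n = 2^{n_{k^*}} = \sum_{i=1}^{N_{KS}} n_i$.
Since the actual key $k^* \in \mathcal{K}^*$ is randomly chosen by designers, the probability of $k^*$ falling within $\mathcal{K}^*_i$ is given by
$$\Pr[k^* \in \mathcal{K}^*_i] = \frac{n_i}{n}. \tag{12}$$
With the above, the expected number of key candidates left by this LM, denoted by $E(n_{kc})$, is calculated by
$$E(n_{kc}) = \sum_{i=1}^{N_{KS}} \frac{n_i}{n} \cdot n_i = \frac{1}{n}\sum_{i=1}^{N_{KS}} n_i^2. \tag{13}$$
By minimizing $E(n_{kc})$ subject to $\sum_{i} n_i = n$, we have
$$n_1 = n_2 = \cdots = n_{N_{KS}} = \frac{n}{N_{KS}}. \tag{14}$$
Therefore, attackers need to distribute the keys evenly into the $N_{KS}$ sets to achieve the minimum expected number of key candidates, which is then $n / N_{KS} = 2^{n_{k^*}} / N_{KS}$. With the above, Lemma 3 is proved.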
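The effect of an even partition can be checked numerically. The sketch below assumes, as in the proof, that the actual subkey is uniformly distributed, so a set of size $n_i$ is hit with probability $n_i / n$ and then leaves $n_i$ candidates; the function name is hypothetical.

```python
from fractions import Fraction

def expected_candidates(set_sizes):
    """Expected number of remaining candidates for a partition with the
    given set sizes, assuming a uniformly random subkey."""
    n = sum(set_sizes)
    return sum(Fraction(size, n) * size for size in set_sizes)
```

For eight subkeys split into four sets, the even partition `[2, 2, 2, 2]` leaves 2 candidates on average, whereas the uneven partition `[5, 1, 1, 1]` leaves 3.5.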
With Lemma 3, it is possible to estimate the number of LMs required to crack the whole key, denoted by n lm . Suppose the acceptable complexity of enumerating all remaining key candidates is 2 nb and assume every LM is perfectly designed to expect to shrink the key space by 1 m according to Lemma 3, where 
Therefore, we have
To estimate n lm by Eq. 15, we simply consider m i = 2 (2 nr ) − 1 and m i = 2 n k * , assuming attackers would fully use each LM, and consider n r and n k * are constant for all LMs. Therefore, for
for m i = 2 n k * , we have
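A numerical sketch of this estimate follows. It rests on an interpretation that should be read as an assumption: every LM shrinks the candidate set by the same factor $m$, the reductions of different LMs multiply, and the attack is feasible once at most $2^{n_b}$ of the $2^{n_k}$ keys remain. The function name and the parameter values in the usage note are hypothetical.

```python
def estimate_n_lm(n_k, n_b, m):
    """Smallest number of LMs whose combined shrink factor m**n_lm reduces
    2**n_k keys to at most 2**n_b candidates.

    Uses exact integer arithmetic to avoid rounding issues with logarithms.
    """
    if m < 2:
        raise ValueError("each LM must split the keys into at least two sets")
    target = 2 ** max(n_k - n_b, 0)
    n_lm, covered = 0, 1
    while covered < target:
        covered *= m
        n_lm += 1
    return n_lm
```

For instance, with a 128-bit key and an assumed brute-force budget of $2^{32}$, MOLES-like modules with $m = 2$ would need 96 LMs under this model, while modules with $m = 16$ would need 24.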
Overhead and Complexity
The total area cost of TSC, denoted by C T SC , comes from GC denoted by C GC and LC denoted by C LC and is estimated by
$C_{GC}$ increases approximately exponentially with $(n_{k^*_i} + n_{r_i})$ because of the cost of implementing $F(K^*, R)$. $C_{LC}$ is determined by attackers in consideration of the ratio between the power of the LM and the power of the whole chip. In terms of $C_{GC}$, the best choice for a TSC is to set $n_{k^*_i} = 1$ and $n_{r_i} = 1$, just like MOLES. The overall complexity of building the TSC and cracking the key is given by
The first part, O(∑ elements of $TP_{k^*}$. The fourth part, $O(2^{n_b})$, denotes the complexity of obtaining the key from the key set in a brute force manner. With the above, to minimize the complexity of building the TSC and cracking the key, attackers are advised to set $n_{k^*_i} = 1$ and $n_{r_i} = 1$, just like MOLES. In other words, MOLES has the lowest hardware cost and the lowest key cracking complexity. However, it is also easily detected by our proposed TSC identification technique, as detailed in Section 4.
THE PROPOSED TSC IDENTIFICATION SOLUTION
In this section, we present the proposed TSC identification solution. We first illustrate the existence of the correlation between the key and the leaked power via TSC-induced side channels. Next, we present how to identify such correlation for TSC detection. Finally, we estimate the detection capability of our approach and discuss its limitations.
Observation
The observation used to detect TSC is that all key bits driving one LM are correlated with the leaked power. Next, we describe the existence of this correlation.
Consider an LM that is driven by $K^*$ and $R$. From the perspective of designers, power traces are generated with the same plaintext but different key values and are collected at the same time spot, which guarantees that different keys are masked by the same value of $R$. After collecting the power traces, we group them according to the value of $K^*$, so that the power traces in the same group are generated with the same value of $K^*$.
Among all groups of power traces, we choose two groups, group x and group y, and let $k^*_x$ and $k^*_y$ denote the values of the key bits for group x and group y, respectively. These two groups should satisfy $F(k^*_x, r) \neq F(k^*_y, r)$; such a pair of groups must exist according to the design of LMs discussed in Section 3.
We calculate the differential mean power of these two groups of power traces, given by
where $n_x$ and $n_y$ denote the numbers of power traces in group x and group y, and $P_{(j,K^*_i)}$ denotes the power caused by the LM driven by $K^*_i$ in power trace j. With the assumption used in TSC, $P_{AES}$, $P_{PRNG}$ and $P_{AWGN}$ in Eq. 20 are removed by DPA. To calculate $DP_{(grpx,grpy)}$, let us consider the differential mean power caused by each LM separately. For the targeted LM driven by $K^*$, the differential mean power caused by this LM, denoted by $DP_{(grpx,grpy)}(K^*)$, is given by
Since $F(k^*_x, r) \neq F(k^*_y, r)$, $DP_{(grpx,grpy)}(K^*) \neq 0$. For any other LM driven by $K^*_z$, let $n_{(x,z0)}$, $n_{(x,z1)}$, $n_{(y,z0)}$ and $n_{(y,z1)}$ be, respectively, the number of power traces in group x whose key makes $F_z(k^*, r) = 0$, the number of power traces in group x whose key makes $F_z(k^*, r) = 1$, the number of power traces in group y whose key makes $F_z(k^*, r) = 0$, and the number of power traces in group y whose key makes $F_z(k^*, r) = 1$. The differential mean power caused by this LM, denoted by $DP_{(grpx,grpy)}(K^*_z)$, is given by
where $P(0)$ and $P(1)$ are calculated from Eq. 7. Since the grouping based on $K^*$ is not correlated with the distribution of $n_{(x,z0)}$, $n_{(x,z1)}$, $n_{(y,z0)}$ and $n_{(y,z1)}$, we would have $n_{(x,z0)} \approx n_{(x,z1)} \approx n_{(y,z0)} \approx n_{(y,z1)} \approx \tfrac{1}{2} n_x \approx \tfrac{1}{2} n_y$ with sufficient power traces. Therefore, we obtain $DP_{(grpx,grpy)}(K^*_z) \approx 0$.
By considering the power caused by all LMs together, we have $DP_{(grpx,grpy)} = \sum_{i=1}^{n_{lm}} DP_{(grpx,grpy)}(K^*_i) \neq 0$, which indicates the existence of the correlation between the key bits ($K^*$) and the leaked power. This observation enables us to detect a TSC by identifying such correlation.
TSC Detection Algorithm
The proposed TSC detection method is based on identifying the correlation between key bits and leaked power. However, the TSC designs discussed above are intended to hide this correlation in two ways. On the one hand, the key information is masked by the random number; on the other hand, designers do not know which key bits are correlated with the leaked power.
It is relatively easy to overcome the impact of the random number by collecting power traces masked with the same random number. To achieve this, we can sample the power under the same plaintext at the same time spot while varying the keys. However, without knowing which key bits are selected to drive the leakage module, designers would need to try all key bit combinations to identify a TSC in the worst case. In the following, we show that designers can achieve a high detection probability with only a few attempts.
Overall Algorithm
Algorithm 1 illustrates the detection algorithm for TSCs. We start by setting the number of key bits verified at a time, $N_{k^*}$, and the number of attempts to identify an LM, $N_v$. How to set $N_{k^*}$ and $N_v$ is detailed in Section 4.3. Then, in each attempt, we randomly select $N_{k^*}$ key bits $K^*$ from all key bits $K$ and verify whether they drive any LM, denoted by IdentifyLeakageModule($K^*$). Whenever an LM is detected with a selected $K^*$, the algorithm stops. In the following, we first present the process of identifying an LM, and then discuss the detection capability of the proposed detection algorithm as well as its limitations.
Identifying Leakage Module
Based on the observation in Section 4.1, we formulate the problem of identifying an LM as identifying the correlation between the given key bits and the leaked power. The process is illustrated by Lines 9-16 in Algorithm 1. We group power traces based on the value of $K^*$ and try $N_t$ pairs of groups, where $N_t$ is chosen to guarantee a high detection probability. For each pair of groups, we calculate the differential mean power. If any differential mean power is clearly nonzero, we conclude that there are LMs embedded in the chip.
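The identification step can be sketched on a simulated chip. Everything below is an assumption made for illustration and is not Algorithm 1 itself: the chip model in `measure`, the constants `P_LC` and `SIGMA`, the XOR-based leakage functions in `EXAMPLE_LMS`, and the use of a z-score threshold to decide whether a differential mean power is clearly nonzero.

```python
import random
import statistics

P_LC = 1.0    # assumed leakage power of one LM (arbitrary units)
SIGMA = 1.0   # assumed noise from the rest of the chip

# Hypothetical TSC: each entry is (key bit positions, leakage function F).
EXAMPLE_LMS = [
    ((0, 1), lambda k, r: k[0] ^ k[1] ^ r),
    ((5,), lambda k, r: k[0] ^ r),
]

def measure(key, r, lms, rng):
    """Hypothetical chip model: noise plus P_LC for every active LM."""
    leak = sum(f(tuple(key[p] for p in pos), r) for pos, f in lms)
    return P_LC * leak + rng.gauss(0.0, SIGMA)

def group_traces(n_k, selected, value, r, lms, rng, n_traces):
    """Traces with the selected key bits fixed to `value`, other bits random."""
    traces = []
    for _ in range(n_traces):
        key = [rng.getrandbits(1) for _ in range(n_k)]
        for pos, bit in zip(selected, value):
            key[pos] = bit
        traces.append(measure(key, r, lms, rng))
    return traces

def identify_leakage_module(n_k, selected, lms, n_pairs=10, n_traces=500,
                            z_threshold=6.0, seed=0):
    """Return True if some pair of key groups shows a clear power difference."""
    rng = random.Random(seed)
    r = rng.getrandbits(1)   # same plaintext and time spot for all traces
    for _ in range(n_pairs):
        x = tuple(rng.getrandbits(1) for _ in selected)
        y = tuple(rng.getrandbits(1) for _ in selected)
        if x == y:
            continue   # identical groups carry no information
        gx = group_traces(n_k, selected, x, r, lms, rng, n_traces)
        gy = group_traces(n_k, selected, y, r, lms, rng, n_traces)
        dp = statistics.fmean(gx) - statistics.fmean(gy)
        stderr = ((statistics.variance(gx) + statistics.variance(gy))
                  / n_traces) ** 0.5
        if abs(dp) > z_threshold * stderr:
            return True
    return False
```

In this model, selecting key bits 0 and 1 (or bit 5) exposes a module, whereas selecting bits that cover no complete module yields differential mean powers that stay within the noise.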
However, the key bits $K^*$ chosen by designers may drive no LM, one LM or multiple LMs. Let us discuss these cases one by one.
• $K^*$ does not contain all key bits of any LM. In this case, we cannot detect any of the LMs, because the power traces containing the power of the LMs are evenly distributed among the groups based on the value of $K^*$ according to Lemma 1, which makes $DP_{(grpx,grpy)} \approx 0$.
• $K^*$ contains all key bits of one LM. In this case, we are able to detect this LM whenever we choose two groups of power traces whose keys, $k^*_x$ and $k^*_y$, satisfy $F(k^*_x, r) \neq F(k^*_y, r)$, according to Section 4.1. Assuming the groups are independent, and according to Lemma 1, the probability of detecting this leakage module is given by
$$\Pr[DP_{(grpx,grpy)} \neq 0] = \Pr[F(k^*_x, r) \neq F(k^*_y, r)] = \frac{1}{2}. \tag{23}$$
• $K^*$ contains all key bits of one LM and some extra key bits. In this case, we are able to detect this LM, and the extra key bits have no influence on identifying it. Let $K^*_1$ be all key bits driving this LM. This LM will be detected whenever the keys of the two groups of power traces, $k^*_{1x}$ and $k^*_{1y}$, satisfy $TP_{k^*_{1x}}(r) \neq TP_{k^*_{1y}}(r)$. Since key bits are independent, the probability that a key bit value selected from $K^*$ belongs to any given key set is the same as for $K^*_1$. As a result, we have
$$\Pr[DP_{(grpx,grpy)} \neq 0] = \Pr[TP_{k^*_{1x}}(r) \neq TP_{k^*_{1y}}(r)] = \frac{1}{2}. \tag{24}$$
• $K^*$ contains all key bits of more than one LM. In this case, we are able to detect the LMs. Suppose there are $n_l$ such LMs. To detect them, we have to find two groups of power traces, group x and group y, satisfying $DP_{(grpx,grpy)} \neq 0$. Suppose all LMs consume the same power $P_{LC}$ when activated. Thus, for any $k^* \in \mathcal{K}^*$, there are $n_l + 1$ possible power values caused by these $n_l$ LMs: $n_l P_{LC}, (n_l - 1) P_{LC}, \ldots, P_{LC}$ and 0. With the above, there must exist two groups that make the $n_l$ LMs consume different power.
Next, let us discuss the probability of the $n_l$ LMs consuming different power under $k_x$ and $k_y$ with the same $r$. For any $k^* \in \mathcal{K}^*$, the probability of the $n_l$ LMs consuming $t P_{LC}$ in total is given by $C_{n_l}^{t} \frac{1}{2^{n_l}}$.
Therefore, the probability of $k_x$ and $k_y$ making the $n_l$ LMs consume different power is given by
$$1 - \sum_{t=0}^{n_l} \left( C_{n_l}^{t} \frac{1}{2^{n_l}} \right)^2, \tag{25}$$
which can be proven to be greater than or equal to $\frac{1}{2}$.
• $K^*$ contains all key bits of more than one LM and some extra key bits. In this case, we are able to detect the LMs, and the extra key bits have no influence on detecting them. This follows easily from the above discussion.
With the above, we can conclude that if K * contains all key bits of any LMs, our approach is able to detect them whenever two chosen groups of power traces have clear and stable differential mean power.
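The bound of one half for the multi-LM case can be checked numerically. The sketch assumes, as in the discussion above, that each of the $n_l$ LMs is active with probability 1/2 independently for each key, so the number of active LMs is binomially distributed; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def prob_different_power(n_l):
    """Probability that two independent keys make n_l LMs draw different
    total power, with the number of active LMs Binomial(n_l, 1/2)."""
    same = sum(Fraction(comb(n_l, t), 2 ** n_l) ** 2 for t in range(n_l + 1))
    return 1 - same
```

The value is exactly 1/2 for a single LM and grows with the number of LMs covered by the selected key bits, for example 5/8 for two LMs.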
We try $N_t$ pairs of groups of power traces in order to have a high detection probability. With $N_t$ tries, the probability of detecting LMs is given by
$$1 - \left(1 - \Pr[DP_{(grpx,grpy)} \neq 0]\right)^{N_t}. \tag{26}$$
As a result, $N_t$ is determined according to a user-defined confidence, denoted by $P_{confidence}$. Since $\Pr[DP_{(grpx,grpy)} \neq 0] \geq \frac{1}{2}$ according to Eq. 23, Eq. 24 and Eq. 25, we have about a 99.9% probability of detecting an LM with $N_t = 10$ if $K^*$ contains all key bits of any of the LMs.
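The relation between $N_t$ and the confidence can be worked out directly. The sketch below assumes that the $N_t$ tries are independent and that each succeeds with the same probability `p_single`; both function names are hypothetical.

```python
def detection_probability(p_single, n_t):
    """Probability that at least one of n_t independent tries succeeds."""
    return 1.0 - (1.0 - p_single) ** n_t

def pairs_needed(p_single, confidence):
    """Smallest N_t whose detection probability reaches the confidence."""
    if not 0.0 < p_single <= 1.0 or not 0.0 <= confidence < 1.0:
        raise ValueError("need 0 < p_single <= 1 and 0 <= confidence < 1")
    n_t, miss = 0, 1.0
    while 1.0 - miss < confidence:
        miss *= 1.0 - p_single
        n_t += 1
    return n_t
```

With the worst-case per-try probability of 1/2, ten tries miss with probability $2^{-10}$, which gives the figure of about 99.9% quoted above.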
Finally, the complexity of the proposed TSC detection algorithm is approximately given by
where NPT denotes the number of power traces required.
Detection Capability
To estimate the detection capability of Algorithm 1, let us first summarize the TSC detection problem. Let $n_k$ be the number of key bits. Attackers build $n_{lm}$ LMs for the TSC, and each LM is driven by $n_{k^*}$ different key bits. Designers, in contrast, aim to detect any of the attackers' LMs. The detection flow is as follows. In each attempt, designers randomly select $N_{k^*}$ key bits from the $n_k$ key bits. If the $n_{k^*}$ key bits of one of the LMs are all among the $N_{k^*}$ selected key bits, the TSC is detected. Thus, if designers repeat this process $N_v$ times, the probability of detecting the TSC is given by
where $P_{DeTSC}(N_v, N_{k^*}, n_{lm}, n_{k^*})$ denotes the probability of detecting the TSC. If $N_{k^*} < n_{k^*}$, our approach definitely misses the TSC, as discussed. If $N_{k^*} \geq n_{k^*}$, $P_{DeTSC}$ is influenced by $N_v$, $N_{k^*}$, $n_{lm}$ and $n_{k^*}$. Therefore, let us discuss the detection capability of our approach from two aspects: the impact of $n_{lm}$ and $n_{k^*}$, set by attackers, and the impact of $N_v$ and $N_{k^*}$, set by designers. We study the four cases shown in Table 2 and present the estimated $P_{DeTSC}$ in Fig. 4. From the perspective of attackers, we study case 1 and case 2, where we vary $n_{k^*}$ and $n_{lm}$. As can be seen, $P_{DeTSC}$ decreases more dramatically when $n_{k^*}$ increases than when $n_{lm}$ decreases. This is because the number of key bit combinations that can be used to build an LM, $C_{n_k}^{n_{k^*}}$, increases significantly with $n_{k^*}$, which increases the detection complexity. Therefore, an LM driven by multiple key bits is more difficult to detect, while the hardware cost of the LM and the complexity of cracking the key increase exponentially with $n_{k^*}$, as shown in Eq. 18 and Eq. 19.
From the perspective of designers, we study case 3 and case 4, where we vary $N_{k^*}$ and $N_v$. We find that increasing $N_{k^*}$ improves $P_{DeTSC}$ more than increasing $N_v$ does. This is because more key bit combinations ($C_{N_{k^*}}^{n_{k^*}}$) are verified at once with a larger $N_{k^*}$. Moreover, increasing $N_{k^*}$ does not add to the detection complexity given in Eq. 27.
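The detection flow described in this subsection can be mimicked numerically. The following sketch assumes that the LMs are driven by disjoint sets of key bits and that each trial draws its key bits uniformly at random without replacement. It is an illustration under these assumptions and is not a reproduction of the closed-form detection probability; all function and parameter names are hypothetical.

```python
import random
from math import comb


def p_single_trial(n_k, n_sel, n_lm_bits):
    """Probability that n_sel bits drawn from n_k contain one fixed LM's bits."""
    if n_sel < n_lm_bits:
        return 0.0  # too few bits selected: the LM can never be covered
    return comb(n_k - n_lm_bits, n_sel - n_lm_bits) / comb(n_k, n_sel)


def simulate_detection(n_k, n_sel, n_lm, n_lm_bits, n_trials,
                       runs=2000, seed=0):
    """Monte Carlo estimate of detecting at least one LM within n_trials."""
    if n_lm * n_lm_bits > n_k:
        raise ValueError("not enough key bits for disjoint LMs")
    rng = random.Random(seed)
    # Assumption: LM j is driven by key bits j*n_lm_bits .. (j+1)*n_lm_bits - 1.
    lms = [set(range(j * n_lm_bits, (j + 1) * n_lm_bits)) for j in range(n_lm)]
    hits = 0
    for _ in range(runs):
        for _ in range(n_trials):
            chosen = set(rng.sample(range(n_k), n_sel))
            if any(lm <= chosen for lm in lms):  # all bits of some LM chosen
                hits += 1
                break
    return hits / runs


if __name__ == "__main__":
    # Example: 128 key bits, 80 selected per trial, one LM driven by 2 bits.
    print(p_single_trial(128, 80, 2))
    print(simulate_detection(128, 80, n_lm=8, n_lm_bits=2, n_trials=3))
```

In this model, selecting more key bits per trial raises the per-trial coverage probability directly, whereas additional trials only compound a fixed per-trial probability.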
Figure 5: Power curves indicating the detection of $TSC_1$ with $N_{k^*} = 1$ and the first eight key bits chosen separately
As a result, the key to our approach is how to set $N_{k^*}$ so that designers have a high probability of choosing a $K^*$ that contains all key bits of one of the LMs. If $N_{k^*} \gg n_{k^*}$, our approach can detect the TSC quickly, as shown by Eq. 28. Therefore, if $n_k = 128$, we could set $N_{k^*} = 80$, since the number of key bits driving an LM is usually limited by the hardware cost of implementing the LM. In the extreme case $N_{k^*} = n_k$, the algorithm would detect the TSC extremely fast. In this ideal case, however, the assumption that $P_{AES}$ and $P_{AWGN}$ can be removed with sufficient power traces no longer holds, because we are unable to generate sufficient power traces with the same random number by varying the values of the key bits. If the random number does not depend on the plaintexts, designers are able to obtain power traces by varying the plaintexts. This is why using plaintext bits in the TSC is recommended.
Discussion
The proposed solution for TSC identification has the following two advantages. Compared to existing HT detection techniques using side-channel analysis, our approach does not require TSC-free chips as a golden reference and is hence practical for TSC identification. Compared to the method in [7], our approach works for power-based TSCs with any GC design that uses a combination of arbitrary key bits, plaintext bits and random bits, while their method cannot handle such sophisticated TSCs.
Our approach works well against always-on TSCs but may fail for trigger-based TSCs. This is because our approach requires a sufficient number of power traces containing leaked information from the Trojan-induced side channels, which may not be available if the trigger condition is not satisfied. However, designers can use existing HT detection solutions such as [15] to identify HT triggers, and these solutions are compatible with our TSC identification technique.
EXPERIMENTAL RESULTS

Experimental Setup
We validate the proposed TSC identification technique by conducting experiments on an FPGA. We adopt two TSCs in the experiments. One is MOLES, whose architecture is shown in Fig. 1, denoted by $TSC_1$. Each LM of $TSC_1$ is implemented by XORing one key bit and one random bit, given by $X_i = K(i) \oplus R(i)$. The other one, denoted by $TSC_2$, is designed as follows. Each LM of $TSC_2$ is implemented by XORing two key bits and two random bits. For both $TSC_1$ and $TSC_2$, the random number is generated by a linear feedback shift register, while the LC is realized by flip-flops and the load of the LC is controlled by the number of flip-flops. We insert $TSC_1$ and $TSC_2$ into an AES cryptosystem implemented on the FPGA. We then measure the transient power of the entire AES cryptosystem with an oscilloscope.
The experimental parameter settings are shown in Table 3. All power values shown in Figs. 5-8 have the bias power, 1.188 W, removed in order to clearly show the power difference between power curves. Thus, a value of 0 on the y-axis corresponds to 1.188 W.
Results and Discussion
We validate the performance of our proposed TSC identification technique from three aspects: (i) the TSC detection capability; (ii) the impact of the number of key bits ($N_{k^*}$) chosen for TSC identification; and (iii) the impact of the LC load.
Detection Capability
Let us consider the detection of $TSC_1$ first. Fig. 5 plots the mean power curves with an increasing number of power traces (NPT). For every LM in $TSC_1$, there are only two groups, based on the value of $K(i)$. As can be observed in Fig. 5(a), all the power curves gradually become stable as NPT increases. We then detail the mean power of all groups when NPT ranges from $1.7 \times 10^4$ to $1.9 \times 10^4$ in Fig. 5(b). From this figure, we can observe that the difference between the two groups of each LM is clear and stable. Such a power gap is evidence of the correlation between the key bit and the power consumption, and hence our approach is able to detect all eight LMs in $TSC_1$.
Next, let us validate our TSC identification technique for $TSC_2$. Fig. 6 shows the mean power curves of the four groups when $K(0)K(1)$ are chosen during Trojan detection and they drive one LM. A close examination of Fig. 6 reveals the following two observations. First, $TSC_2$ is successfully detected by our approach, since there exists a pair of groups with a clear power gap (e.g., $K(0)K(1) = 00$ and $K(0)K(1) = 01$), indicating the correlation between $K(0)K(1)$ and the power consumption. Second, we obtain similar mean power for the pair of groups associated with 00 and 11 of $K(0)K(1)$, and likewise for the pair associated with 01 and 10. This is because the value of $X_i$ is the same within each of these pairs, which leads to the same leakage power caused by the LM.
While we only validate the proposed solution for $TSC_1$ and $TSC_2$, we believe its effectiveness does not depend on the particular power-based TSC design, according to the theoretical analysis in Section 4.
Impact of $N_{k^*}$
In the above experiments, we have shown that when $N_{k^*} = n_{k^*}$, our approach is able to detect the TSC as long as all key bits driving one LM are chosen during TSC detection. In this experiment, we study the cases $N_{k^*} > n_{k^*}$ and $N_{k^*} < n_{k^*}$.
For $N_{k^*} > n_{k^*}$, we set $N_{k^*} = 2$, which is larger than $n_{k^*} = 1$, during the detection of $TSC_1$. Fig. 7 shows the four power curves when $K(0)$ and $K(1)$ are chosen and each of them drives one LM. We observe that the mean power associated with 00 of $K(0)K(1)$ is the largest; the mean powers associated with 01 and 10 are close to each other and slightly smaller; the mean power associated with 11 is the smallest. This is because both LMs leak power when their driving key bits are 0. Our approach is able to detect $TSC_1$ whenever any two groups with a clear power gap are compared.
For $N_{k^*} < n_{k^*}$, we set $N_{k^*} = 1$, which is smaller than $n_{k^*} = 2$, during the detection of $TSC_2$. Fig. 8 presents the case when $K(1)$ is chosen for TSC identification. As can be observed, we cannot obtain a reliable power gap between the two mean power curves associated with $K(1) = 0$ and $K(1) = 1$. This is because the power of the LMs is randomly distributed across the chosen groups and cannot be differentiated.
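The qualitative behaviour reported for the two TSCs can be reproduced with a toy model. The sketch below is not Algorithm 1 and uses no measured data. It assumes that an LM adds a fixed leakage power whenever its XOR output is 1, that the random bits take the same value in every group, and that all remaining power is zero-mean Gaussian noise. The function names, the leakage and noise values, and the 8-bit key are hypothetical.

```python
import random
from statistics import mean


def lm_output(key_bits, rand_bits):
    """XOR of the key bits and random bits that drive one LM."""
    x = 0
    for bit in list(key_bits) + list(rand_bits):
        x ^= bit
    return x


def mean_power_by_group(chosen, lm_key_idx, rand_bits, n_key=8,
                        n_traces=8000, leak=0.01, noise=0.02, seed=1):
    """Mean synthetic power per group, grouped by the chosen key bits.

    chosen      indices of the key bits selected by the designer
    lm_key_idx  indices of the key bits that actually drive the LM
    rand_bits   random bits of the LM, held fixed (modelling assumption)
    """
    rng = random.Random(seed)
    groups = {}
    for _ in range(n_traces):
        key = [rng.randint(0, 1) for _ in range(n_key)]
        x = lm_output([key[i] for i in lm_key_idx], rand_bits)
        power = rng.gauss(0.0, noise) + leak * x  # noise plus LM leakage
        label = tuple(key[i] for i in chosen)
        groups.setdefault(label, []).append(power)
    return {label: mean(values) for label, values in groups.items()}


if __name__ == "__main__":
    # One key bit per LM, and that bit is chosen: a gap appears.
    print(mean_power_by_group([0], [0], [1]))
    # Two key bits per LM, both chosen: 00/11 and 01/10 pair up.
    print(mean_power_by_group([0, 1], [0, 1], [1, 0]))
    # Two key bits per LM, only one chosen: the gap averages out.
    print(mean_power_by_group([1], [0, 1], [1, 0]))
```

In this model the gap vanishes in the last case because the unchosen key bit flips the LM output in half of the traces of each group, so both groups receive the same average leakage.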
Impact of LC load
Finally, we study the impact of the LC load on the number of power traces required for TSC identification, and the results are shown in Table 4. It is clear that fewer power traces are needed with a higher capacitive load of the leakage source (implemented as flip-flops in this work). This result shows that, while a large leakage source helps attackers to extract keys easily, it also makes the TSC easier to detect with our proposed TSC identification technique.
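The trend can be related to a standard averaging argument. The sketch below assumes that each trace carries independent noise with standard deviation `sigma` and that a larger LC load produces a larger mean power gap `gap` between two groups. It then estimates how many traces per group are needed before the gap exceeds `z` standard deviations of the difference of the two group means. The function name and all numbers are illustrative and are not taken from Table 4.

```python
import math


def traces_per_group(gap, sigma, z=3.0):
    """Traces per group so that gap >= z * sigma * sqrt(2 / n).

    The difference of two group means, each averaged over n traces with
    per-trace standard deviation sigma, has standard deviation
    sigma * sqrt(2 / n); solving for n gives the bound below.
    """
    return math.ceil(2.0 * (z * sigma / gap) ** 2)


if __name__ == "__main__":
    for gap in (0.005, 0.01, 0.02):  # larger LC load -> larger gap (assumed)
        print(gap, traces_per_group(gap, sigma=0.02))
```

Under these assumptions, doubling the leakage gap cuts the required number of traces by roughly a factor of four.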
CONCLUSION
In this paper, we first investigate the general architecture of power-based TSCs and introduce the corresponding key cracking process. Next, we present a novel TSC identification technique that exploits the correlation between the key and the power side channel. Experimental results for two TSC designs inserted into an AES cryptosystem implemented on an FPGA demonstrate the effectiveness of our proposed solution.
ACKNOWLEDGMENT
