Abstract: A hardware Trojan (HT), which is usually activated under rare conditions associated with low transition bits in a circuit, can lead to circuit functional failure or information leakage. Effectively activating hidden HTs is a major challenge during the HT detection process. In this study, the authors propose a novel approach for efficiently activating Trojans hidden in digital signal processing (DSP) circuits by increasing the transition activity of rare bits. In particular, the bit-level transition activity can be increased by controlling signal word-level statistical properties, such as standard deviation and autocorrelation, and their propagation through various operators involved in DSP circuit design. As a result, the proposed approach can generate appropriate test vectors, which effectively activate internal rare nodes and trigger HTs. The experimental results show that, using the proposed approach, the transition activity of rare bits is significantly increased and various HTs inserted into DSP circuits are activated in reduced time. Compared with an existing activation approach working at the bit level, the proposed approach reduces test vector generation time by up to 9 times and HT activation time by up to 66 times.
Introduction
Malicious extra logic, namely a hardware Trojan (HT), could be integrated into an integrated circuit (IC) during design through third-party IP cores and CAD tools, or implanted during the fabrication process [1] . HTs can cause circuit failure or leak confidential information, which raises serious concerns about the trustworthiness of ICs used in the Internet of Things (IoT) [2] and consumer electronics (CE) [3] , as well as in mission-critical applications [4] . The recent prosperity of IoT and CE devices depends on the evolution of digital signal processing (DSP). Therefore, security in DSP circuits has become a major challenge for IoT and CE devices, where HTs must be taken seriously for the sake of trustworthy off-the-shelf commercial electronic devices [3] .
Based on the facts that HT could add redundant logic, cause functional errors and change the side-channel parameters, various detection approaches have been developed. In terms of whether HTs need to be active or not during detection, HT detection approaches can be classified into static detection and dynamic detection. Static detection does not require HTs to have activity, and determines their existence by formal verification to identify redundant and unused circuits [5] [6] [7] or by leakage power measurement [8] . High complexity of the host designs, the small size of HTs and process variations create great challenges to the static approaches. Dynamic detection finds HTs by observing their impact on circuit functionality and side-channel parameters when HTs are active. Therefore, the efficiency of dynamic HT detection significantly relies on HT activation. This is because usually HT circuits are quietly hidden in their host design and can only be triggered in rare conditions [9] . As a result, dynamic HT detection approaches require increasing HT's activity to reveal its impacts under process and environmental variations [10, 11] .
The trigger inputs of HTs are usually circuit internal nodes which have low transition probabilities. We call these nodes rare nodes. The main objective of HT activation is to increase the toggling rate of the rare nodes. Increasing the transition probabilities of the rare nodes can improve the probability of activating a HT and thus facilitate HT detection [1] . However, due to the large space of possible HTs, it is infeasible in practice to enumerate all possible HT instances to generate deterministic input vectors to activate each HT [1] . To deal with this problem, Salmani et al. [9] proposed an approach at the logic gate level that identifies rare nodes whose transition probability is less than a threshold and inserts dummy scan flip-flops to ensure all circuit nodes have transition probabilities higher than the threshold. The inserted dummy scan flip-flops increase circuit area. In [12] , a statistical approach was developed to generate a set of input vectors which can drive the rare nodes to their rare logic values multiple times. In this way, a HT with a triggering condition composed of the rare nodes is highly likely to be activated. The set of input vectors is determined in an iterative manner with random inputs, and in each iteration only the input vector which increases the number of nodes satisfying their rare values is accepted into the final set. The approach was improved in [13] by combining genetic algorithm and Boolean satisfiability approaches in the input vector generation process, during which trigger coverage and Trojan coverage are considered. This kind of statistical approach for HT activation is computationally more tractable than deterministic approaches and does not introduce silicon overhead.
This paper makes further progress along the direction of the statistical approach for HT detection in DSP cores that may be integrated into a system. Recently, IP protection approaches specifically for DSP circuits were proposed [14, 15] . From a different standpoint, we look at whether DSP cores introduce security threats to the systems or not. A simple power analysis resistant methodology was proposed for DSP-embedded processors [16] . In [2] , a HT was inserted into a coordinate rotation digital computer (CORDIC) core which is capable of performing DSP calculations. The HT was designed to carry out a denial of service (DoS) attack. The golden reference fingerprint method was used to detect the inserted HT by finding the difference in register transfer level (RTL) schematics or in path delay [2] . In practice, the golden reference is not always available, e.g. for third-party IPs. In addition, given DSP cores and a system incorporating multiple DSP cores, it is extremely time consuming to analyse and detect a HT at the gate level using approaches such as [9, 12] . For efficient detection, an activation approach working on a high-level abstraction of circuits and using word-level statistical properties of signals is proposed in this paper. Here, a word is simply a bounded array of bits and each bit is a circuit node. For example, for the add operation x 3 [3:0] = x 1 [3:0] + x 2 [3:0] the proposed approach will consider the four-bit signals x 1 , x 2 and x 3 as three words, respectively, and analyse the statistical propagation properties of the signal words through the adder. Through word-level analysis we can identify which bits in x 3 are rare nodes and derive inputs for x 1 and x 2 to increase the transition probabilities of the rare nodes.
Abstracting the circuit description to the word level and analysing the signal properties of multiple bits under the same operation together offers two main advantages in reducing complexity over the bit level. First, word-level analysis eases logic analysis due to the simplified representation and computation it implies. Second, the word-level representation of a circuit has fewer variables during signal propagation, which makes propagation significantly faster than in the bit-level description [17] .
Specifically, the proposed approach works on a data flow graph (DFG) of a circuit composed of various DSP operators. The approach first locates an internal rare node b i by signal transition activity calculation based on the word-level statistical property autocorrelation (ρ). Then, the expected statistical property (μ i , σ i , ρ i ) is determined for the signal word x in which the rare node b i is located, such that the transition activity of b i is maximised. Finally, a back-tracking algorithm is applied to determine the statistics (μ, σ, ρ) of primary input signals based on statistical property propagation. Such input signals applied to the primary inputs can ensure that x has the expected property. An early version of the approach was published in [18] and in this paper we extend the approach in three aspects: (i) a constant assignment method is developed for propagation control of word-level signal properties of DSP operators to reduce the complexity, (ii) back-tracking algorithms for circuits with non-reconvergent structure and reconvergent structure are presented and (iii) various HTs are designed and inserted into classical DSP circuits to evaluate the approach in practice. The proposed approach can be applied to both pre-silicon and post-silicon stages.
The contributions of this paper are listed as follows:
• We exploit the relationship between bit-level transition activity and word-level signal statistics, to enhance the transition activity of internal rare nodes. The approach increases the possibility of activating HTs while avoiding the complexity of bit-level logic analysis.
• By analysing propagation properties of the operators involved in DSP circuits, we establish the conditions for monotonous control of signal statistical properties from the primary input ports to the internal nodes. Such monotonousness facilitates word-level signal statistics management.
• Based on the monotonous control, we propose the back-tracking algorithm to determine the statistical properties of primary input signals in the circuits with non-reconvergent structure and reconvergent structure. The purpose is to increase the transitions of the internal rare nodes from the primary inputs, so that no extra access facilities are needed.
• The proposed approach is evaluated with several typical DSP circuits and various HTs. The evaluation includes the efficiency of the proposed approach in increasing transition activity of rare nodes, the capability in HT activation and the runtime overhead of the approach, by comparison with existing approaches.
The rest of the paper is organised as follows. Section 2 briefly introduces related works on HT detection. Section 3 presents the principle of the proposed approach. Section 4 presents the proposed approach in detail. Section 5 shows evaluation results and the conclusion is given in Section 6.
Related work
A HT is a piece of circuit that is added to the design or is modified from the original design for malicious purposes. In terms of activation characteristics, HTs could be always on or condition based. To stay hidden in chips, HTs are usually designed to be silent most of the time. A typical HT contains a trigger and a payload [19] . The payload circuit is responsible for implementing HT attacks, which may result in serious effects such as DoS, confidential information leakage and chip reliability degradation.
The trigger monitors a certain condition, which could be a specific logic state, a particular input pattern or a specific counter value.
A comprehensive survey of the state-of-the-art HT attacks and corresponding protection approaches was carried out in [1] . There are generally three classes of countermeasures against HT attacks: HT detection approaches, design and manufacturing approaches preventing HT insertion and run-time monitoring approaches. Among the countermeasure approaches, HT detection is the most widely investigated one. It is a process that determines whether any HTs exist in the circuit. HT detection approaches can be further classified into static and dynamic detection approaches.
In static detection approaches, HTs are not required to be active. Formal verification [20, 21] and leakage power measurement [22] are the widely used techniques. These static approaches could assure 100% coverage when applicable. However, their practical application faces great challenges. First, as circuit size increases the complexity of formal analysis increases exponentially, resulting in unaffordable memory and time requirements. Second, given small circuit modification, large process variations and environmental noise, the leakage power difference introduced by the HTs may not be observable.
Therefore, dynamic detection approaches are developed, which require the HTs to be running to disclose their effects. In [23] , a criterion for HT detection is defined, which says that sufficient switching activity (e.g. > 10 ) at the inputs of a gate is needed to detect a HT. Therefore, HT activation is a precondition for dynamic HT detection. In [10] , through input control a circuit is divided into multiple regions and only one region is activated at a time while keeping other parts quiet. In [24] , a sustained vector technique is applied to both genuine circuit and Trojan circuit with the constant primary inputs for several clock cycles, in order to reduce extraneous toggles within the genuine circuit. These approaches help magnify the HT's impact on the circuit transient power in order to improve the efficiency of the side-channel analysis. A random sampling approach [12] is proposed to generate effective input vectors for HT activation. The basic concept is to detect low probability conditions at the internal nodes and then derive an optimal set of vectors to activate the rare nodes multiple times. The probability of activating a Trojan is improved by increasing the transitions of nodes that are random-pattern resistant. In addition, at the design stage, dummy flip-flops [9] and 2-to-1 MUXs [25] are inserted into circuit designs to increase the transition probabilities of rare nodes. Moreover, the dual modular redundant (DMR) scheme is applied to detect possible HTs in third-party IPs and a high-level synthesis approach for low-cost DMR scheduling is proposed in [26] . Detection of the HTs in the DMR scheme still needs to activate HTs. Over the past few years, machine learning techniques [27, 28] have started to be used in dynamic HT detection.
Idea of the proposed approach
The present approach focuses on condition-based HTs and aims at statistically maximising the transition probability of internal rare circuit nodes, so that the HT is activated more easily and can be detected.
In this paper, a rare node refers to a bit in the circuit behavioural description with low 0-1 transition activity. In the rest of the paper, rare node and rare bit are used interchangeably. Let x(n) be a signal word and b i (n) be the value of bit i of x(n) at time index n. The transition activity t i of bit i can be calculated as [29] 
where p i is the probability of bit i being 1, and ρ i is the temporal autocorrelation of bit i, i.e. the correlation between b i (n) and b i (n − 1). Therefore, the transition activity of bit i is closely related to p i and ρ i . The existing HT activation approaches increase the transition activity mainly from the aspect of p i . For example, in [12] a set of input vectors is generated and in [9] dummy scan flip-flops are inserted, to increase the probability of generating a specific trigger condition as below:
where the trigger condition is associated with q nodes, the rare value is assumed as 1 (otherwise using 1 − p i in (2)), and c ≥ 1. However, it was shown in [29, 30] that ρ i has a direct impact on the transition activity of bit i. Moreover, when considering multiple bits together x(n) = {b i (n)} 0 ≤ i ≤ m − 1 , the word-level statistical properties of x(n) such as mean μ, standard deviation σ and temporal autocorrelation ρ can be used to estimate the transition activities of each bit i. This observation was exploited in [29, 30] for power consumption investigation on hardware implementation of DSP algorithms. We innovatively leverage this observation for HT activation. The main advantage of our proposed approach over existing ones is that word-level statistical analysis and management are more efficient for logic analysis and signal propagation and thus dramatically reduce time for generating input vectors. This allows us to generate efficient inputs in limited time, which have high probability to activate HTs.
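The dependence of bit-level transition activity on both p i and ρ i can be checked numerically. The following minimal sketch assumes a stationary first-order Markov bit stream, which is an assumption made here for illustration and not a model taken from the paper. For such a stream the 0-1 transition probability is p i (1 − p i )(1 − ρ i ). The function names and parameters are hypothetical.

```python
import random


def expected_rate(p, rho):
    """0->1 transition probability of a stationary Markov bit stream."""
    return p * (1.0 - p) * (1.0 - rho)


def zero_to_one_rate(p, rho, n=200_000, seed=0):
    """Simulate a bit stream with P(bit=1)=p and lag-1 autocorrelation rho,
    and return the measured fraction of 0->1 transitions."""
    rng = random.Random(seed)
    p11 = p + rho * (1.0 - p)  # P(b(n)=1 | b(n-1)=1)
    p01 = p * (1.0 - rho)      # P(b(n)=1 | b(n-1)=0)
    prev = 1 if rng.random() < p else 0
    rises = 0
    for _ in range(n):
        cur = 1 if rng.random() < (p11 if prev else p01) else 0
        if prev == 0 and cur == 1:
            rises += 1
        prev = cur
    return rises / n


if __name__ == "__main__":
    for p, rho in [(0.5, 0.0), (0.5, 0.9), (0.1, 0.5)]:
        print(p, rho, zero_to_one_rate(p, rho), expected_rate(p, rho))
```

Under this model, with p i = 0.5, raising ρ i from 0 to 0.9 lowers the expected 0-1 rate from 0.25 to 0.025, which illustrates why reducing ρ i is a lever in addition to moving p i towards 0.5.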
Proposed approach for Trojan activation
In this section, we present the proposed approach for activating HTs. First, the threat model considered in this paper is described. Second, we present the way to increase transition activity of rare bits by use of word-level signal statistical properties. Third, we show the method to control the internal signal statistical properties from the primary inputs, based on signal statistical property propagation analysis. This allows us to increase the toggling rate of internal rare bits from the primary inputs of a circuit. Various operators involved in DSP circuits are discussed. Finally, we propose a back-tracking algorithm to generate input vectors with the desired statistical characteristics for HT activation. The proposed approach is comprehensively discussed with consideration of circuits with non-reconvergent and reconvergent structures.
Threat model
DSP plays a critical role in a large number of applications, such as video processing, wired and wireless communications, speech processing and biomedical signal processing [15] . The threat model considered in this paper is that condition-based digital HTs are inserted into ICs along with DSP circuits by untrusted members of DSP design teams, through integrated untrustworthy third-party DSP cores or by untrusted members of manufacturing parties. By inserting HTs into the DSP part, attackers could steal private information or cause critical damage to the electronic systems under certain conditions.
In terms of trigger condition, digital HTs can be classified into pattern-based HTs and counter-based HTs [31] . Pattern-triggered HTs include single-pattern (SP), case-pattern (CP) and state-machine-pattern (SM) HTs. For the SP triggered HT, there exists only one specific input pattern that can activate the Trojan. For the CP triggered Trojan, multiple input patterns exist to trigger the Trojan. A specific sequence of input patterns is needed to trigger the SM Trojan. For a counter-based Trojan, the Trojan will not be activated until the counter reaches a specific value or a certain range of values. Both pattern-based HTs and counter-based HTs rely on rare conditions that are not easily satisfied during normal operation of the circuits. Fig. 1 shows typical structures of the HT triggers, which are targeted in this paper. The rare conditions are often designed based on multiple rare nodes (T 1 , T 2 , …, T q ) from the original circuit.
The HT payload could change the chip's side-channel properties for privacy interception, or functionality for system damage. To accelerate HT detection, the proposed approach efficiently generates a set of input vectors to fully or partially activate the HTs. Full activation of HTs refers to payload functioning, i.e. HTs operate and implement attacks. Partial activation refers to generating transitions inside the HT circuits while the HTs do not actually operate. For HTs affecting circuit function, full activation is required in order to detect the HTs. The generated input vectors could be applied at both functional verification stage (pre-silicon) and testing stage (post-silicon). For HTs which change the chip side-channel parameters, partial activation could improve the effectiveness of side-channel analysis-based HT detection methods at the chip testing stage. For example, combined with the power side-channel analysis tools such as [32] , the proposed approach could find the HTs by observing abnormal power patterns in short time.
Transition activity enhancement based on word-level signal statistical properties
Based on the theories in [29, 30] , m bits in a word x(n) are partitioned into four regions and ρ i of bit i in each region is calculated as below:
where i is the bit position, ρ is the temporal autocorrelation of x(n), and BP 0 , BP 1 and BP 2 are the breakpoints separating the regions. For a specific signal x(n), the breakpoints BP 0 and BP 1 are calculated as [29, 30] BP 0 = log 2 σ + log 2 
where [k] is the number nearest to k. BP 2 can be calculated by computing the common most significant bits in the binary representations of the maximum and minimum numbers of x(n) in its dynamic range [29] . For a rare node i (i > BP 1 ), moving it to the high transition activity region means the breakpoint BP 0 should move towards the most significant bit, such that
Substituting (4) into (6) shows that σ of x(n) should be at least
to ensure that bit i falls in the high transition activity region [0, BP 0 − 1]. Therefore, by manipulating the statistical properties ρ and σ of the word-level signal x(n), one can increase the transition activity of a rare bit of x(n). The proposed approach works in the following flow.
First, the DFG of a DSP circuit composed of various operators is extracted. The DFG G(V, E) is a set of vertices v i ∈ V and a set of edges e i j ∈ E. Vertex v i represents a primary input/output (PI/PO) port or an operator i, and e i j shows data flow from v i to v j . Define a word-level signal x i j on each edge e i j coming out from v i and going to v j . Fig. 2 shows an example of such DFG. In the example, there are three PIs, two operators and one primary output. We define five word-level signals, {x 15 , x 24 , x 34 , x 45 , x 56 }. Among them, the first three are PI signals.
Second, a normal test set is propagated forward along the DFG, to obtain the statistical properties of the internal signals x i j , e.g. x 45 and x 56 in Fig. 2 . The temporal correlation ρ i j k of the kth bit of signal x i j is determined according to (3) . Sorting all ρ i j k in non-ascending order will reveal the rare bits.
Third, if the kth bit of signal x i j is selected to be the target of transition activity enhancement, the input vector generation involving a back-tracking process starts. For example, if the kth bit of x 56 is the suspicious circuit node, then σ of x 56 (σ x 56 ) is determined first according to (7) , then σ x 45 and σ x 15 are derived such that σ x 56 can be established, and similarly in the last step σ x 24 and σ x 34 are derived. The signals x 15 , x 24 and x 34 with the derived σ are the generated input vectors for the three inputs. This work realises the signal property back-tracking in the next section based on four operators widely used in DSP algorithms [29] , including adder, multiplier, multiplexer and delay.
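The effect of the word-level σ on a rare bit, which the flow above relies on, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the signal is a zero-mean Gaussian AR(1) process (not the ARMA signals used in the paper), words are 16-bit two's complement with clipping, and the σ values, ρ = 0.9 and bit position 8 are arbitrary choices. All function names are hypothetical.

```python
import math
import random


def ar1_words(sigma, rho, n, seed=0, bits=16):
    """Zero-mean Gaussian AR(1) samples with standard deviation sigma and
    lag-1 autocorrelation rho, rounded and clipped to signed `bits`-bit words."""
    rng = random.Random(seed)
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    innov = math.sqrt(1.0 - rho * rho)
    x = rng.gauss(0.0, sigma)
    words = []
    for _ in range(n):
        x = rho * x + innov * rng.gauss(0.0, sigma)
        words.append(max(lo, min(hi, round(x))))
    return words


def bit(word, k):
    """Bit k of a two's complement word (Python shifts sign-extend)."""
    return (word >> k) & 1


def rise_rate(words, k):
    """Measured fraction of 0->1 transitions on bit k."""
    rises = sum(1 for a, b in zip(words, words[1:])
                if bit(a, k) == 0 and bit(b, k) == 1)
    return rises / (len(words) - 1)


if __name__ == "__main__":
    k = 8
    low = rise_rate(ar1_words(sigma=50, rho=0.9, n=50_000), k)
    high = rise_rate(ar1_words(sigma=4000, rho=0.9, n=50_000), k)
    print(f"bit {k}: sigma=50 -> {low:.3f}, sigma=4000 -> {high:.3f}")
```

With the small σ, bit 8 behaves like a sign bit and toggles only when the strongly correlated signal changes sign. With the large σ, the same bit lies in the low-order, nearly random region and toggles far more often.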
Propagation control of word-level signal statistical properties over DSP operators
We have seen that increasing σ or decreasing ρ at the word level can increase the transition activity of a rare bit. In this subsection, we exploit the word-level propagation properties of signals to control: (i) σ monotonous increase or (ii) ρ monotonous decrease. In this way, increasing σ or decreasing ρ of the PI signals will have a direct impact on the internal signals.
To realise the desired monotonous control, we define the following two conditions. For an operation
which ensures σ monotonous increase. Condition II: x 3 (n)'s ρ is not greater than that of x 1 (n) or x 2 (n)
which ensures ρ monotonous decrease. Note that, the two conditions are independent, and depending on the difficulty of control management different circuits may choose to meet Condition I or Condition II. To meet Condition I or Condition II, each operator has to satisfy certain specific conditions. The specific conditions for the four DSP operators were proved and shown in [18] . Here, we present a simplified version of the conditions, by setting one of inputs x 1 (n) and x 2 (n) as a constant.
The constant assignment offers several advantages to the proposed approach. First, the constant assignment could help to maximise the transition probability of a node. Fig. 3 shows an example at the gate level. The transition probability of node i is calculated as p i (1 − p i ), where p i is the probability of being 1. Therefore, the maximum transition probability is 0.25. In Fig. 3a , when random input vectors are given to the input ports, the transition probability of the output node of gate Tg is 1887/16,384. In Fig. 3b , one input of gate Tg is assigned a constant 1, which is achieved by assigning 1 to one of the inputs of the OR gate and the AND gate as shown in the figure, and the remaining inputs are random vectors. In this way, the transition probability of the output node of gate Tg becomes 0.25. As mentioned earlier, according to (1), the transition activity can now be improved by managing the temporal correlation factor. Section 4.4 will present how the constant assignment is applied and the application constraints. Second, the constant assignment can significantly reduce the complexity of signal propagation and the subsequent back-tracking procedure as shown below. Third, the constant assignment ensures that only the nodes in the signal propagation path of interest have high transition activity and the other parts of the circuit stay static, avoiding large power consumption.
Adder:
The μ 3 , σ 3 and ρ 3 of output signal x 3 (n) of the adder can be computed as follows [33] :
where ρ x 1 x 2 and ρ x 2 x 1 are the cross-correlation between x 1 (n) and x 2 (n). The conditions for ρ x 1 x 2 and ρ x 2 x 1 to meet Conditions I or II were shown and proved in [18] . If we set x 2 (n) as a constant C, then (11)-(13) become as follows:
This simplification significantly reduces the complexity of derivation of signal properties while meeting either of the two conditions. Increasing σ 1 is equivalent to increasing σ 3 , and reducing ρ 1 is equivalent to reducing ρ 3 .
Multiplier:
After setting x 2 (n) = C, the signal properties of output x 3 (n) of the multiplier are presented as below:
By setting |C | ≥ 1, Conditions I and II are also satisfied.
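The constant-input behaviour of the adder and multiplier can be checked empirically. The sketch below assumes a Gaussian AR(1) test signal and uses sample estimates of μ, σ and the lag-1 autocorrelation ρ. The helper names and the constants 7 and −3 are hypothetical choices for illustration. It relies only on the elementary facts that adding a constant shifts the mean and leaves σ and ρ unchanged, and that multiplying by a constant C scales σ by |C| and leaves ρ unchanged.

```python
import math
import random
import statistics


def ar1(sigma, rho, n, mu=0.0, seed=1):
    """Gaussian AR(1) samples with mean mu, std sigma, lag-1 autocorr rho."""
    rng = random.Random(seed)
    innov = math.sqrt(1.0 - rho * rho)
    z = rng.gauss(0.0, sigma)
    out = []
    for _ in range(n):
        z = rho * z + innov * rng.gauss(0.0, sigma)
        out.append(mu + z)
    return out


def word_stats(x):
    """Return sample (mu, sigma, rho) of a sequence."""
    mu = statistics.fmean(x)
    sigma = statistics.pstdev(x)
    num = sum((a - mu) * (b - mu) for a, b in zip(x, x[1:]))
    den = sum((a - mu) ** 2 for a in x)
    return mu, sigma, num / den


def add_const(x, c):
    """Adder with x2(n) fixed to the constant c."""
    return [a + c for a in x]


def mul_const(x, c):
    """Multiplier with x2(n) fixed to the constant c."""
    return [a * c for a in x]


if __name__ == "__main__":
    x1 = ar1(sigma=100.0, rho=0.8, n=5000)
    print("input     ", word_stats(x1))
    print("x1 + 7    ", word_stats(add_const(x1, 7.0)))
    print("x1 * (-3) ", word_stats(mul_const(x1, -3.0)))
```

Because σ 3 = |C| σ 1 for the multiplier, choosing |C| ≥ 1 guarantees that the output σ is not smaller than the input σ, which is the σ-related requirement stated in the text.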
Multiplexer and delay:
For the multiplexer, its control signal has probability p c of being 1. Assume that 0 and 1 on the control signal select x 1 (n) and x 2 (n), respectively. After setting
will meet Condition I. For Condition II, when a signal propagates through the multiplexer, the output signal is one of the input signals under the probability p c . Therefore, the correlation factor of the output will be smaller than the correlation of the input signals. For the delay operator, the statistical properties of the output signal are identical to those of the input. Conditions I and II are satisfied. For a datapath composed of multiple DSP operators, if we can find the PI signals which make the conditions for all operators satisfied, the signals can control the transition activities of the internal nodes. Next, we present the back-tracking algorithms to find such signals.
Back-tracking algorithms
The proposed back-tracking method is used to find the desired signals for PIs related to the rare nodes. Here, a PI/PO is related to a rare node if it lies on a signal propagation path that contains the rare node. For the PIs not related to the rare nodes, the properties of the original input signals are kept. Therefore, in the rest of the paper, PIs/POs refer to the rare node-related ones.
In the following, we propose the back-tracking algorithms for DFGs without reconvergence and with reconvergence, respectively, to determine the input signal with properties (μ * , σ * , ρ * ) for each PI, so that the transition activities of certain rare nodes are increased.
Non-reconvergent structure:
The back-tracking process for DFGs without reconvergent structures is shown in Algorithm 1 (see Fig. 4 ). It first generates all paths from PIs to the word-level signal S r containing the rare node (line 1). Each path from one of the PIs to S r can be visited by the depth-first search (DFS) algorithm considering S r as the root. A set of such paths is labelled TraPaths. Each path has a logic depth, which is measured as the number of vertices in the path. The path l with the smallest logic depth L min is selected from TraPaths (line 2). Then, for each path i in TraPaths − {l}, the constant assignment is applied (lines 3-13). The process starts from the vertex farthest from the PI in each path. Afterwards, based on the constant assignment, the back-tracking process starts on path l from the vertex where S r is located and is applied to the operator vertices one by one (lines 14-16). In the end, a statistical signal (μ * , σ * , ρ * ) is derived and assigned to the PI of the path. At the same time, constants are assigned to the PIs of the other paths in TraPaths. As there is no reconvergent structure, the back-tracking procedure is justifiable.
For example, in the DFG shown in Fig. 2 , if S r is x 56 , the three paths in TraPaths are {v 1 , x 15 , v 5 , x 56 }, {v 2 , x 24 , v 4 , x 45 , v 5 , x 56 } and {v 3 , x 34 , v 4 , x 45 , v 5 , x 56 }, respectively. Among the three paths, the first path has the smallest logic depth L min = 2. Then, v 2 and v 3 are set to one, and the back-tracking process solves (14) - (16) to determine (μ, σ, ρ) for signal x 15 , which is assigned to v 1 .
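The path selection and constant assignment steps for the non-reconvergent case can be sketched as follows. This is not Algorithm 1 itself but a simplified illustration: the DFG connectivity follows the signals x 15 , x 24 , x 34 , x 45 and x 56 named for Fig. 2, while the operator types (v 5 as an adder, v 4 as a multiplier) and all function names are assumptions made here. Only σ is back-tracked, using σ in = σ out for an adder and σ in = σ out / |C| for a multiplier whose other input is the constant C.

```python
import math

# Predecessor map of the DFG: v1, v2, v3 are PIs, v4 and v5 are operators.
PRED = {"v4": ["v2", "v3"], "v5": ["v1", "v4"], "v6": ["v5"]}
# Hypothetical operator types, assumed for illustration only.
OPS = {"v4": "mul", "v5": "add"}


def paths_to(pred, vertex):
    """All PI-to-vertex paths, found by depth-first search from `vertex`."""
    preds = pred.get(vertex, [])
    if not preds:
        return [[vertex]]
    return [p + [vertex] for u in preds for p in paths_to(pred, u)]


def const_value(pred, ops, consts, v):
    """Value on the output of v when every PI feeding it is a constant."""
    if v in consts:
        return consts[v]
    vals = [const_value(pred, ops, consts, u) for u in pred[v]]
    return sum(vals) if ops[v] == "add" else math.prod(vals)


def backtrack_sigma(pred, ops, s_r_source, sigma_target, constant=1.0):
    """Pick the shortest path to S_r, fix the other PIs to `constant`,
    and back-track the sigma required at the PI of the shortest path."""
    tra_paths = paths_to(pred, s_r_source)
    shortest = min(tra_paths, key=len)
    consts = {p[0]: constant for p in tra_paths if p[0] != shortest[0]}
    sigma = sigma_target
    # Walk from the vertex producing S_r back towards the PI.
    for i in range(len(shortest) - 1, 0, -1):
        v, on_path = shortest[i], shortest[i - 1]
        side = [u for u in pred[v] if u != on_path]
        c = const_value(pred, ops, consts, side[0]) if side else 1.0
        if ops[v] == "mul":
            sigma /= abs(c)
        # For "add", sigma is unchanged by the constant input.
    return shortest, consts, sigma


if __name__ == "__main__":
    print(backtrack_sigma(PRED, OPS, "v5", sigma_target=100.0))
```

For S r = x 56 the sketch selects the path through v 1 with logic depth 2 and assigns the constant one to v 2 and v 3 , matching the worked example in the text.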
Reconvergent structure:
Reconvergence is defined to originate on a vertex v i , if branches of multiple fanouts of v i join later at a vertex v j . Reconvergence may lead to contradiction in assignments during the back-tracking process so that Conditions I and II cannot be satisfied. In addition, constant assignment in reconvergent paths may block the propagation of the HT's payload to the POs, i.e. the HT is activated but the malfunction cannot be observed. Therefore, the back-tracking process is elaborated for the DFGs with reconvergence as below.
For a DFG with reconvergence as shown in Fig. 5a , a set of paths TraPaths starting from PIs to S r is generated by breadth-first search (BFS), and the reconvergent structures can be found using the approach in [34] . For example, in Fig. 5 if S r is x 89 , then the three paths in TraPaths are shown in Figs. 5b-d, respectively. Among them, the second and the third paths contain reconvergence.
To facilitate the description of the proposed approach, three types of PI are further defined. A PI is defined as type-I if the paths from the PI to each of the POs do not contain reconvergence, e.g. v 1 in Fig. 5a . Note that the path here is not from the PI to S r , but to the POs instead. A type-II PI is one for which the paths from the PI to each of the POs have reconvergence and all reconvergent paths join before S r , e.g. v 2 and v 3 in Fig. 5a . A type-III PI is one for which the paths from the PI to each of the POs have reconvergence and some reconvergent paths join after S r . When needed, constants should be assigned to type-III PIs with care, to prevent the propagation of the HT's payload from being blocked by the enforced assignment.
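The three PI types can be illustrated with a small classifier. This is a sketch under simplifying assumptions, not the reconvergence detection method of [34]: a join vertex is taken to be any vertex in the fan-out cone of the PI that has at least two predecessors inside that cone, and a join counts as being before S r only if it is the source vertex of S r or one of its ancestors. The example graph is hypothetical and is not the DFG of Fig. 5.

```python
from collections import defaultdict


def reachable(adj, start):
    """Vertices reachable from `start` (inclusive) following `adj`."""
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def classify_pi(edges, pi, s_r):
    """Classify `pi` as type-I, type-II or type-III.
    `s_r` is the edge (source, sink) carrying the signal S_r."""
    succ, pred = defaultdict(list), defaultdict(list)
    for u, w in edges:
        succ[u].append(w)
        pred[w].append(u)
    cone = reachable(succ, pi)
    # Join vertices: two or more incoming branches originate from this PI.
    joins = {v for v in cone if sum(p in cone for p in pred[v]) >= 2}
    if not joins:
        return "type-I"
    upstream = reachable(pred, s_r[0])  # source of S_r and its ancestors
    return "type-II" if joins <= upstream else "type-III"


# Hypothetical DFG: S_r is the signal on edge (op3, op4).
EDGES = [
    ("pi_a", "op3"),
    ("pi_b", "op1"), ("pi_b", "op2"), ("op1", "op3"), ("op2", "op3"),
    ("op3", "op4"),
    ("pi_c", "op4"), ("pi_c", "op5"), ("op4", "op6"), ("op5", "op6"),
    ("op6", "po"),
]
S_R = ("op3", "op4")

if __name__ == "__main__":
    for pi in ("pi_a", "pi_b", "pi_c"):
        print(pi, classify_pi(EDGES, pi, S_R))
```

In this hypothetical graph, the branches of pi_b rejoin at op3, before S r , while the branches of pi_c rejoin at op6, after S r , so a constant forced on pi_c could mask the payload on its way to the PO.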
The back-tracking algorithm for DFGs with reconvergent structures is shown in Algorithm 2 (see Fig. 6 ). The procedure is as follows. First, paths in TraPaths are divided into two sets TraPathsI and TraPathsII (line 1). TraPathsI contains paths with type-I PIs (e.g. the path in Fig. 5b) , and TraPathsII contains paths with type-II PIs (e.g. the paths in Figs. 5c and d) and type-III PIs.
If set TraPathsI is not empty, type-II PIs of the paths in TraPathsII will be assigned constants and type-III PIs keep the original input signals (lines 2-5). Note that the constant assignment on the paths with the type-II PI starts from the PIs and propagates the constant inputs forward to the vertices in the paths in TraPathsI, and all constants propagated on the paths should be positive. Then, based on the propagation results in the paths in TraPathsII, Algorithm 1 (Fig. 4) is applied to the paths in set TraPathsI to determine the signal properties of type-I PIs. For example, in Fig. 5 , TraPathsI contains one path shown in Fig. 5b , and TraPathsII contains two paths shown in Figs. 5c and d. In this case, type-II PIs v 2 and v 3 are given one and the constant forward propagation leads to x 25 = 1, x 46 = 1 and x 78 = 1. Afterwards, the statistical property of type-I PI v 1 is derived by back-tracking and is used to increase the transition activity of the rare bit in signal x 89 . Also, in this example the probability p c of the control signal of the multiplexer v 8 is kept at its original value, to prevent signal propagation from being blocked.
If set TraPathsI is empty, then the back-tracking process has to be applied along the paths with reconvergence (lines 6-12). The process needs to solve a group of equations and inequalities representing the definitions and general monotonous conditions of each operator in the paths, to find a solution ensuring all operators satisfy Conditions I or II. For DSP operators, the constant assignment solution exists in practice, because the conditions are semidefinite and a wide range of signals can be selected with the analysis in [18] . Fig. 7 shows an example of a five-tap FIR circuit. Assuming the output signal contains the suspicious node, the path from the PI to the PO contains reconvergence. As shown in the figure, the back-propagation still works following the property propagation requirements for each operator, and the signal property for the PI is selected as the largest one from the different branches. In addition, according to [18] , Condition II (monotonous decrease of ρ) is much easier to satisfy than Condition I (monotonous increase of σ). Therefore, one can select Condition II as the target during back-tracking. However, increasing σ is more effective in improving transition activity, as can be seen in Section 5.
Comparison with standard test generation:
Compared to the classical standard test generation problem, the above backtracking algorithm reduces the algorithm complexity by doing the following:
• Operates at the word level and significantly reduces the number of nodes for signal propagation.
• For a non-reconvergent path, exploits the definition of logic depth of the path, and chooses the shortest path to back-track. At the same time, the constant assignment makes all vertices in the shortest path have single fan-out, which significantly simplifies the back-tracking process.
• Considers mainly the controllability needed to enhance the transition activity of the internal nodes in certain propagation paths, while keeping the original status of the other paths. Therefore, except for the constant assignment to reconvergent paths, the proposed statistical activation approach does not need to derive specific inputs which deterministically propagate the intermediate results to a primary output (path sensitising), which is the most difficult part of standard test generation.
So far, we have presented our proposed approach for transition activity enhancement of internal rare bits. Next, we evaluate the approach in terms of its efficiency in increasing the number of transitions, reducing HT activation time and detecting HTs in infected DSP circuits.
Experimental results
To evaluate the proposed approach, a number of DSP circuits are used, including finite/infinite impulse response (FIR/IIR) filters, a multiplication-addition (MA) circuit, a multiplication accumulation (MAC) circuit and a CORDIC circuit [2]. All the benchmark circuits are implemented at the RTL level. For FIR and IIR, the word length is 16 bits. For MAC, MA and CORDIC, the implementation uses a 32-bit word length. Thirteen autoregressive moving average (ARMA) signals [29] with different statistical properties are used as input signals, which are shown in Table 1. We vary σ and ρ to see how they affect the transition activity, while keeping μ at zero. However, signals with any value of μ can be used.
The experimental results presented next include three parts. The first part demonstrates the relationship between the word-level statistical properties and the bit-level transition activity and also evaluates the efficiency of the proposed approach in enhancing the transition activity of internal rare bits in the benchmark circuits. The transition activity is measured as the number of 0-1 transitions over 1024 clock cycles. In the second part, various HTs are designed and inserted into the DSP circuits at the RTL level following [35], to evaluate the capability of the proposed approach to activate HTs in realistic applications. Finally, we make comparisons with existing works. The closest works to the proposed approach, namely multiple excitation of rare occurrence (MERO) [12] and the HT design and detection method of [2], are compared with our work directly. Both MERO and our approach are implemented in Python. The HT designed for DSP applications in [2] is implemented and can be detected by our approach without the need for a golden reference.
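As an illustration of how an input signal with prescribed μ, σ and ρ might be produced, the following minimal sketch generates a first-order autoregressive (AR(1)) sequence, which is a special case of an ARMA process. The function names `ar1_signal` and `sample_stats` are hypothetical, and the Gaussian AR(1) model is an assumption made here; the actual signals of Table 1 may have been generated differently.

```python
import math
import random


def ar1_signal(n, mu=0.0, sigma=10.0, rho=0.99, seed=0):
    """Return n samples of a Gaussian AR(1) process (assumed model).

    The process has mean mu, standard deviation sigma and
    lag-1 autocorrelation rho.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    rng = random.Random(seed)
    # Innovation scale chosen so that the stationary variance is sigma**2.
    innovation_std = sigma * math.sqrt(1.0 - rho * rho)
    x = rng.gauss(0.0, sigma)  # start in the stationary distribution
    samples = []
    for _ in range(n):
        samples.append(mu + x)
        x = rho * x + rng.gauss(0.0, innovation_std)
    return samples


def sample_stats(xs):
    """Estimate (mean, standard deviation, lag-1 autocorrelation)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / n
    cov1 = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1)) / (n - 1)
    return mean, math.sqrt(var), cov1 / var


# Hypothetical usage: a 1024-sample signal with sigma = 10 and rho = 0.4.
signal = ar1_signal(1024, mu=0.0, sigma=10.0, rho=0.4)
```

Before such a sequence drives a fixed-point circuit it would still have to be rounded to integers of the chosen word length, which slightly perturbs the measured statistics.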
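The transition-activity metric described above can be sketched as follows. This is only an illustrative reading of the measurement: it assumes two's-complement words, counts rising (0 to 1) edges of a single bit between consecutive clock cycles, and uses the hypothetical function name `count_rising_transitions`.

```python
def count_rising_transitions(samples, bit, word_length=16):
    """Count 0->1 transitions of one bit over consecutive integer samples.

    samples     -- iterable of Python ints, one word per clock cycle
    bit         -- bit position, 0 being the least significant bit
    word_length -- assumed two's-complement word length
    """
    mask = (1 << word_length) - 1
    count = 0
    previous = None
    for word in samples:
        # Masking a negative Python int yields its two's-complement pattern.
        current = ((word & mask) >> bit) & 1
        if previous == 0 and current == 1:
            count += 1
        previous = current
    return count


# Hypothetical usage on a short trace of 16-bit words.
trace = [0, 32, 31, 40, -1, 0]
rising_edges_bit5 = count_rising_transitions(trace, bit=5)
```

For the experiments in this section the trace would hold 1024 words, one per clock cycle.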
Results of transition activity enhancement
The signals listed in Table 1 have different values for the three statistical parameters μ, σ and ρ. These differences lead to different transition activities of the bits in the signal word. The last column of Table 1 reports the BP 1 position of each signal word. Recall from Section 4 that the bits at positions greater than or equal to BP 1 are rare bits. Therefore, the lower the BP 1 position is, the more bits in a word have low transition activity. In Table 1, the property that changes the BP 1 position is σ. As σ increases, the BP 1 position moves towards the most significant bit.
We start with a five-tap FIR filter to illustrate how the proposed approach is applied and its efficiency in increasing the transition activities of rare bits in signal words. The DFG of the FIR filter is shown in Fig. 7 , where the coefficients are randomly set as C 0 = C 4 = 2, C 1 = C 3 = − 1 and C 2 = 1.
Let us select bit 5 of signal Output and enhance its transition activity with our approach. With SIG5 as the input to the FIR, the bit has only 19 transitions in 1024 clock cycles. We now want to move the bit to the high transition activity region. Substituting i = 5 and ρ = 0.99 into (7) yields the expected standard deviation of signal Output, σ out = 342. By using the proposed back-tracking process for reconvergent structures, we can derive the relation between signal Output and the PI in the FIR: μ out = 3μ in, σ out = 3σ in and ρ out = ρ in. Note that this relation is deduced from the FIR architecture in Fig. 7. From this relation, we obtain the properties of the PI signal: μ in = 0, σ in = 114 and ρ in = 0.99. Hence, by selecting SIG9 from Table 1 and feeding it to the FIR, the number of transitions of bit 5 of signal Output becomes 461 in 1024 clock cycles, which is a 24.3 times enhancement.
Another way to increase the number of transitions of bit 5 is to decrease ρ out of signal Output. As shown in (3), the lower the word-level autocorrelation is, the higher the bit transition activity is. Therefore, we keep μ in = 0 and σ in = 10 for the PI signal and make ρ in as small as possible. We choose SIG6 and SIG7 from Table 1 as inputs to the FIR. The results are also shown in Fig. 8. The number of transitions of bit 5 grows from 19 to 93 under SIG6 and to 101 under SIG7, as ρ in decreases from 0.99 to 0.4 and 0.1, respectively.
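The factor of 3 in this example can be cross-checked numerically. The sketch below is not the propagation rule of the proposed approach; it assumes, purely for illustration, that the input autocorrelation at lag k equals ρ^|k| (an AR(1)-like signal) and evaluates the standard linear-filter variance formula for the coefficients of Fig. 7. The function name `fir_output_stats` is hypothetical.

```python
import math


def fir_output_stats(coeffs, mu_in, sigma_in, rho):
    """Mean and standard deviation at the output of an FIR filter.

    Assumes the input autocorrelation at lag k is rho**abs(k).
    """
    # The mean is scaled by the DC gain, i.e. the sum of the coefficients.
    mu_out = sum(coeffs) * mu_in
    # Output variance: sigma_in^2 * sum_i sum_j c_i * c_j * rho^|i-j|.
    gain_sq = sum(
        ci * cj * rho ** abs(i - j)
        for i, ci in enumerate(coeffs)
        for j, cj in enumerate(coeffs)
    )
    return mu_out, sigma_in * math.sqrt(gain_sq)


COEFFS = [2, -1, 1, -1, 2]  # C0..C4 of the five-tap FIR example

# Limit of a fully correlated input: the gain equals |sum of coefficients| = 3.
_, sigma_limit = fir_output_stats(COEFFS, 0.0, 114.0, 1.0)

# Strongly correlated input (rho = 0.99): the gain stays close to 3.
_, sigma_out = fir_output_stats(COEFFS, 0.0, 114.0, 0.99)
```

Under this assumption the gain for ρ = 0.99 is slightly below 3, so σ in = 114 gives an output standard deviation close to the target of 342, which is consistent with the relation σ out = 3σ in used above.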
HT activation capability
In the previous subsection, transition activity enhancement results on rare nodes are demonstrated. Here we demonstrate the activation capability of the proposed approach on real HTs. Various HTs in the structure shown in Fig. 1 are designed and inserted into the DSP circuits.
In our experiment, SIG5 from Table 1 is randomly selected as the input to collect toggle-rate statistics for each node of the DSP circuits. Note that other random signals with different properties can also be used. Given a threshold value, nodes with a transition probability lower than the threshold are selected as rare nodes. Then, trigger signals T 1 to T q are randomly selected from the rare nodes. For example, to design a six-input Trojan, six nodes are randomly selected from the rare nodes as the trigger inputs of the Trojan. Four six-input SP-triggered HTs (HT-1 to HT-4) are first inserted into each of the FIR, IIR and MAC circuits. Each HT randomly selects a different set of six rare nodes of the circuit. The proposed approach is then applied to activate the HTs. Experimental results are shown in Table 2, in which the values denote the number of clock cycles needed to activate the corresponding Trojans. The results show that with our approach all HTs are activated within 700 clock cycles, whereas with the random input SIG5 only HT-2 and HT-4 in the FIR circuit are activated within the simulation limit of 10^5 clock cycles, and only after a much longer time.
We also conduct a series of experiments on CP, FSM and counter-based HTs inserted into the FIR circuit. The FSM HT inserted into the FIR circuit is an FSM with four states, and the input length of the state transition is 6 bits. The counter-based HT is triggered when the counter value is 8'hFF. Experimental results are shown in Table 2. We can make the following observations. First, our approach is able to accelerate the activation of all HTs. Second, among the three types of HTs, CP-triggered HTs are the easiest to activate and counter-based HTs are the hardest.
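The rare-node selection and trigger construction described above can be sketched as follows. The data layout (a dictionary from node name to transition count), the node names and the function names `select_rare_nodes` and `pick_trigger_inputs` are hypothetical and only serve to illustrate the procedure.

```python
import random


def select_rare_nodes(transition_counts, cycles, threshold):
    """Return the names of nodes whose transition probability is below threshold.

    transition_counts -- dict mapping node name to number of transitions
    cycles            -- number of simulated clock cycles
    """
    return sorted(
        name
        for name, count in transition_counts.items()
        if count / cycles < threshold
    )


def pick_trigger_inputs(rare_nodes, q, seed=None):
    """Randomly choose q distinct rare nodes as trigger inputs T1..Tq."""
    if q > len(rare_nodes):
        raise ValueError("not enough rare nodes for a q-input trigger")
    return random.Random(seed).sample(rare_nodes, q)


# Hypothetical toggle statistics collected over 1024 clock cycles.
counts = {"n0": 500, "n1": 3, "n2": 0, "n3": 12, "n4": 7, "n5": 1, "n6": 2, "n7": 480}
rare = select_rare_nodes(counts, cycles=1024, threshold=0.02)
triggers = pick_trigger_inputs(rare, q=6, seed=0)
```

With these illustrative numbers, six of the eight nodes fall below the threshold, so a six-input trigger uses all of them; a larger circuit would leave many possible trigger sets.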
Comparison with existing works
To demonstrate the superiority of the proposed approach over other related works, we make comparisons against the statistical approach MERO [12] and the HT detection method for DSP applications [2] .
We implemented the same CORDIC benchmark for sine and cosine calculation, into which the HT designed in [2] is inserted to cause a DoS attack. A randomly selected statistical signal, SIG5 in Table 1, is fed into the CORDIC circuit. The results show that the DoS effect is not observed within 10^5 clock cycles. Using our approach, a signal with (μ = 0, σ = 536870912, ρ = 0.99) is determined, and it takes only seven clock cycles to activate the HT and identify the DoS effect, without the need for a golden reference.
MERO also has an improved version (iMERO) [13]. In this paper, we mainly compare our approach with MERO. Both MERO and our approach attempt to develop directed input test vectors that increase the probability of HT activation, based on the fact that HT detection coverage is strongly correlated with the activation probability, while iMERO explicitly targets improving both HT activation and detection rates. To further increase the HT detection coverage of MERO and our approach, existing low-cost design-for-testability approaches such as observable test point insertion and scan insertion, which are widely used in today's ICs, could be exploited [12]. The main difference between MERO and our approach is that MERO works at the bit level and uses a heuristic procedure to derive a compact set of test vectors, while our approach works at the word level. To show the impact of this difference, we use several benchmark circuits to evaluate both approaches in terms of the runtime overhead for generating the appropriate input test vectors for HT activation, as well as the HT activation time, which reflects the quality of the generated input test vectors.
The test vector generation time of both MERO and our approach is shown in Table 3; the time is measured on a PC with a dual-core Intel Pentium CPU running at 3 GHz. In Table 3, #FF denotes the number of flip-flops and #Component denotes the number of logic elements. From the results, we make the following observations. First, the runtime overhead of our approach is lower than that of MERO, with up to nine times reduction. Second, the runtime of MERO varies noticeably across the tested circuits, while that of our approach does not. These results demonstrate the advantage of operating at the word level.
In MERO [12], the authors inserted SP-triggered HTs into circuits. In this paper, we insert the same type of HTs into a 33-tap FIR circuit to make a comparison with MERO. In addition, only six-input SP HTs were inserted into the benchmarks in the previous experiments. To evaluate the effect of the number of inputs q of a q-input HT, we also conduct experiments on SP HTs with q = 4, 5, 6 and 7. The comparison results of activation time are shown in Table 4. We can see that the activation time of an HT tends to increase with a larger q, and that our approach activates the HTs faster than MERO, with a speedup ranging from 2 to 66 times.
Conclusion
We present a novel approach to accelerate the activation of HTs in DSP circuits. The approach exploits the relation between the word-level signal statistical properties and the bit-level transition activity to increase the transitions of rare bits. A back-tracking procedure with the analysis of different circuit structures is presented to find the statistical properties of the primary input signals based on a set of conditions, which ensure the monotonous propagation of the signal statistical properties through the circuits. The experimental results demonstrate that the proposed approach can effectively activate the HTs inserted in the DSP circuits with reduced time.
The proposed approach can also be applied to circuits containing bitwise operators, such as OR, XOR and AND, as long as the operations are performed on a bounded array of bits and word-level signal properties can be extracted. The present approach and the existing approaches can provide complementary capabilities in detecting Trojans. For example, the proposed approach can accelerate HT activation so that the effects of an HT can be detected by side-channel analysis-based approaches; the proposed approach can also quickly find the majority of HTs and leave the corner cases to formal analysis-based approaches.
