Abstract-This paper presents a procedure that checks whether a hardware implementation of a cryptographic algorithm is vulnerable to M safe-error attacks. It takes a register-transfer level (RTL) description of a design as an input and reports the exact timing and the memory element that are possible targets of the attack. As a proof of concept, the presented procedure is applied to a hardware implementation of the Montgomery Powering Ladder, an exponentiation algorithm commonly used in public-key cryptography.
I. INTRODUCTION
A safe-error in a hardware or software implementation of a cryptographic algorithm is an error that does not lead to a faulty result. If such an error is key-dependent, it can be exploited by adversaries to extract the secret key. There are two types of safe-error attacks: the computational (C) safe-error attack [8] and the memory (M) safe-error attack [3]. The C type of attack is mounted by inducing temporary random computational fault(s) inside the Arithmetic Logic Unit (ALU) of a device. It targets dummy operations, which are usually introduced to achieve Simple Power Analysis (SPA) resistance [2]. An adversary can induce temporary faults in the ALU or memory during a dummy operation and, depending on a key bit, obtain a faulty or a fault-free result at the output. While the C safe-error attack exploits a weakness of an algorithm, the M safe-error attack exploits a possible safe-error in the actual implementation. The basic idea of an M safe-error attack is that, depending on a key bit, faults in some memory blocks are cleared.
Error detection, which verifies the correctness of a computation and typically discards the results if errors are detected, cannot prevent safe-error attacks, since the adversaries do not need the faulty output. In fact, an error detection mechanism helps the adversaries to check whether the fault is effective or not. A possible way to thwart these attacks is to construct an algorithm which does not allow the safe behavior of the errors. However, existing countermeasures are typically algorithm specific and do not include implementation information. Since M safe-error attacks target memory units, e.g., registers, protection on the algorithmic level is not sufficient. A low-level representation of a design which includes enough implementation details, such as an RTL description, should be analyzed to expose possible M safe-errors in a design. This paper presents a procedure which detects M safe-errors in a hardware implementation of a cryptographic algorithm. It takes the RTL description of a design as an input and exposes all possible M safe-errors by pointing out the exact timing and the memory element that can be a target for the attack. Such precise information about the vulnerability helps a designer to remove it efficiently. We consider in this paper a simplified model, i.e. a key-related branch, where adversaries use an M safe-error to learn which branch is taken. However, the methodology can be extended to more general settings.
II. BACKGROUND

A. M Safe-error Attacks
An M safe-error is defined as an error that is injected in a memory element, but is cleared during the algorithm execution and hence does not propagate to the output. It can be illustrated on the example of the Montgomery powering ladder [7] (Algorithm 1), an algorithm commonly used for exponentiation in public-key cryptography.
Algorithm 1 Montgomery Powering Ladder [7]
Require: [...]
[...]
end if
8: end for
9: Return R_0.
The field multiplications in the algorithm are performed in an interleaved manner, as depicted in Algorithm 2. The multiplier R_1 is represented in 2^T-ary form as

$$R_1 = \sum_{i=0}^{m-1} (R_1)_i \, 2^{iT},$$

where T is the digit size and m is the number of digits. The main idea of the M safe-error attack is based on the fact that when k_j = 0 the result of the multiplication (R_1 ← R_1 × R_0) is still correct even if some blocks (R_1)_i are modified after being used for the multiplication (Algorithm 2). Since the result is written back to the multiplier R_1, all the errors in it are overwritten, which makes them safe-errors. In the case of k_j = 1, the result is written to the multiplicand (R_0 ← R_1 × R_0), which enables the propagation of the errors injected in R_1 further through the algorithm. Just by checking the correctness of the final result of the exponentiation, an attacker can reveal a bit of the secret exponent k. Joye and Yen showed that a software implementation of the Montgomery powering ladder is vulnerable to the M safe-error attack. In addition, Kim et al. [4] proved that the M safe-error attack is a threat to a hardware implementation of the same algorithm.
Algorithm 2 Interleaved Modular Multiplication
[...]
Return R
Fig. 1. (a) A fault injected after a register is read. (b) A fault injected before a register is read.
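The key dependence described above can be reproduced with a small simulation. The following is a minimal sketch under stated assumptions, not the implementation analyzed in [3] or [4]: the function name `ladder`, the `fault` tuple, and the toy parameters are hypothetical, and the squaring steps (R_0 ← R_0^2 for k_j = 0, R_1 ← R_1^2 for k_j = 1) are assumed from the usual form of the ladder. The multiplication scans the digits of R_1 from the most significant one, and the fault flips bits of a digit of R_1 right after that digit has been used.

```python
def ladder(g, k_bits, n, T=4, m=2, fault=None):
    """Toy Montgomery ladder computing g^k mod n (k_bits: MSB first).

    fault = (round_index, digit_index, xor_mask) flips bits of one T-bit
    digit of R1 right after that digit was consumed by the multiplication.
    All names and the fault format are illustrative assumptions.
    """
    R = [1, g % n]                      # R[0] = R0, R[1] = R1
    for rnd, kj in enumerate(k_bits):
        # Interleaved multiplication acc = R1 * R0 mod n, digit by digit.
        acc = 0
        for i in reversed(range(m)):
            digit = (R[1] >> (i * T)) & ((1 << T) - 1)
            acc = (acc * (1 << T) + digit * R[0]) % n
            if fault is not None and fault[0] == rnd and fault[1] == i:
                R[1] ^= fault[2] << (i * T)   # digit modified after use
        if kj == 0:
            R[1] = acc                  # faulty R1 is overwritten: safe-error
            R[0] = R[0] * R[0] % n
        else:
            R[0] = acc
            R[1] = R[1] * R[1] % n      # faulty R1 is squared: error propagates
    return R[0]


n, g, bits = 251, 3, [1, 0, 1]          # k = 5
good = ladder(g, bits, n)
safe = ladder(g, bits, n, fault=(1, 1, 1))   # round with k_j = 0
bad = ladder(g, bits, n, fault=(0, 1, 1))    # round with k_j = 1
print(good, safe, bad)
```

In this toy run the fault in the round with k_j = 0 leaves the result unchanged, while the same fault in a round with k_j = 1 changes it. Note that in the sketch a propagating error reaches the output only if R_1 is read again in a later round.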
B. The Existing Countermeasures
The existing countermeasures against M safe-error attacks are mainly algorithm specific. In public-key cryptography, they focus on protecting the exponentiation algorithm only. In [3], Joye and Yen presented a tweak for the Montgomery powering ladder which protects it against M safe-error attacks. Even though this countermeasure disables the M safe-error explained in Section II-A, it does not take into account the possible M safe-errors that are enabled when the algorithm is translated into the actual hardware implementation. Further, Boscher et al. presented a blinded fault-resistant exponentiation algorithm in [1]. Since the check at the end of the algorithm detects the errors in any of the registers, not only in the one that keeps the final result, security against safe-error attacks is claimed. However, Section II-A shows that safe-errors do not necessarily leave any faults in the registers at the end of the algorithm. Since all the registers can be fault-free, the proposed check could miss possible safe-errors.
III. M SAFE-ERROR DETECTION PROCEDURE
An efficient way to ensure that a design is resistant to M safe-error attacks is to evaluate its security at the RTL. Analysis at this abstraction level captures possible M safe-errors introduced by both the algorithm itself and the process of hardware implementation. This section describes a procedure that reports the registers that are possible targets for the M safe-error attacks and the clock cycle(s) in which a fault can be injected in order to provoke a safe-error. Besides helping a designer to spot and fix a security flaw, the presented procedure could be useful for an attacker, too. If she/he has access to the RTL description of a device under attack, the described procedure can expose a location and a timing for a fault injection, thus shortening the preparation time for the attack.
We start the analysis by specifying the fault models covered by the procedure. Then, we explain the detection procedure in two parts: first, we explain the method which detects all M safe-errors without analyzing their impact on security. Then, we explore whether the detected errors can be exploited to leak secret information out of a device.
A. Fault Model
The M safe-error attacks require a transient fault in a memory location. However, if we consider the exact timing of the fault injection and its effect, it is possible to distinguish between two types of faults that can cause a safe-error. Using a simple example of exchanging the values between two registers, we define and explain two fault models commonly used in safe-error attacks.
Let us first identify two events in a clock cycle. The first one occurs when the value of a register is being read (Read bar in Fig. 1), i.e. when its value is available on a data bus. The second event occurs when a register is written, i.e. on the rising edge of the next clock cycle. The data exchange is performed as follows: registers R_0 and R_1 are first read, which makes the values (R_0)_data and (R_1)_data available on the data buses. Then, the registers get each other's values on the rising edge of the next clock cycle by capturing the data from the corresponding data bus.
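The exchange described above, together with the timing of a fault relative to the Read event, can be sketched in a few lines. This is an illustrative model only: the function `exchange`, its `when` parameter, and the XOR fault mask are assumptions made for the sketch, and the fault is always placed in register R_0, not on a data bus.

```python
def exchange(r0, r1, fault_mask=0, when=None):
    """One exchange R0 <-> R1 with an optional transient fault in R0.

    when is None (no fault), "before_read" or "after_read".
    Hypothetical helper; not taken from the analyzed design.
    """
    if when == "before_read":
        r0 ^= fault_mask            # register is corrupted before it is read
    bus0, bus1 = r0, r1             # Read event: values appear on the buses
    if when == "after_read":
        r0 ^= fault_mask            # register is corrupted, buses are clean
    r0, r1 = bus1, bus0             # rising edge: registers capture the buses
    return r0, r1


clean = exchange(0xA, 0x5)
after = exchange(0xA, 0x5, fault_mask=1, when="after_read")
before = exchange(0xA, 0x5, fault_mask=1, when="before_read")
print(clean, after, before)
```

A fault placed after the Read event is overwritten on the rising edge, so the result equals the fault-free one. A fault placed before the Read event is carried over the bus into R_1.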
• A fault injected after a register is read: This fault model is introduced in [4] and assumes that a fault is injected in a register after the Read event (Fig. 1a).¹ In that case, the value that is read from the register R_0, (R_0)_data, is fault-free. On the rising edge of the next clock cycle, R_0 gets the value of (R_1)_data and R_1 takes the value of (R_0)_data. As both registers are updated with fault-free values, a fault in the register R_0 is overwritten, which makes it a safe-error.
• A fault injected before a register is read: Fig. 1b shows the same process of exchanging the register values, but the fault model assumes that a fault is injected in the register before it is read. (R_0)_data is faulty in this case, causing the erroneous value to be written in R_1 on the rising edge of the next clock cycle. Even though such a fault is not a safe-error immediately, it can be cleared afterwards, causing the same effect. Fig. 1b shows a simple case where a faulty register R_1 is overwritten by the fault-free R_0 (Overwrite bar) after a certain number of clock cycles. Since both registers are fault-free after the Overwrite operation (Fig. 1b), a fault injected in R_0 becomes a safe-error. This example shows a safe-error which is caused by a "sloppy" management of a register file.
The procedure we present aims at covering both types of faults. While detection of the first type of M safe-errors is relatively easy, the second type requires a detailed analysis of the register file management. Furthermore, the detection procedure is adjustable according to the spatial precision of an attacker: if he/she is able to inject a fault that affects only a part of a memory location [3], the procedure can be tuned to evaluate security against the M safe-error attacks caused by such faults.
¹ Please note that we assume that a fault is always injected in a register and not in a data bus.
B. Detection of M Safe-errors
A register file, which consists of a set of registers R_j, j ∈ {1 .. M}, is updated on the rising edge of the clock signal. Its new values are dependent on the previous state and the combinational logic. Let N denote the total number of clock cycles needed to execute an algorithm scheduled on a device under analysis and let a register R_j in a clock cycle i, i ∈ {1 .. N}, be denoted as R(i, j). For each R(i, j) we define a Boolean parameter A which specifies whether the register is assigned in the cycle i: A = True if it is; otherwise, A = False. Further, for every R(i, j) we define a set S ⊂ {R_1, R_2, ..., R_M} that contains the registers whose values in the next clock cycle (i + 1) are dependent on R(i, j):

$$R(i,j).S = \{\, R_l \mid R(i+1, l) \text{ depends on } R(i, j) \,\}.$$
Finally, for each register R_j and a clock cycle i we define a tag, masked, which specifies whether an error in R(i, j) is masked, i.e. whether it does not propagate to the output. The tag is a Boolean variable and takes the value True if an error is a safe-error. Otherwise, its value is False. Hence, to provide all the required attributes, a register is represented as the structure defined in Table I. Now, the RTL description of a design can be represented in an unrolled format as an N × M matrix whose entries are the structures defined in Table I. Each row depicts a clock cycle, while the columns refer to the corresponding registers.
The first step in the safe-error detection procedure is the evaluation of all the attributes for every entry in the RTL matrix. The values of i, j, A and S are directly extracted from the RTL description of a design, which is given in one of the hardware description languages (HDL). On the other hand, the value of the masked field has to be calculated based on the values of the other attributes. The initial value of all R(i, j).masked is set to True, except for the last clock cycle, where the registers that keep the final result are tagged as unmasked.
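Let us now define a masked error. A fault in R(i, j) is tagged as masked if its value does not propagate to the next clock cycle or if all its destinations (R(i, j).S) are masked in the next clock cycle. This condition can be formally written as:

$$R(i,j).\mathit{masked} = \bigl(R(i,j).S = \emptyset\bigr) \;\lor \bigwedge_{R_l \in R(i,j).S} R(i+1,l).\mathit{masked}. \qquad (1)$$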
Based on this equation, we define a procedure which iterates through all the registers R_j and evaluates whether a fault in the cycle i is masked, thus assigning True or False to the field R(i, j).masked. In order to make it easier to follow, the procedure is hierarchically decomposed into different functions which are given as pseudo-code. The top-level function TagAll (Algorithm 3) takes an RTL description of a design as an input (matrix RTL_{M,N}) and iterates through all the registers and clock cycles by calling the function IsMasked. IsMasked returns True or False, which is assigned to the masked field of the examined register. The procedure starts from the last clock cycle, i.e., when the final result of the algorithm is output.
Algorithm 3 Function TagAll
Require: RTL_{M,N}
for each clock cycle i, starting from the last one do
  for each register j do
    RTL(i, j).masked = IsMasked(RTL(i, j), i, j);
  end for
end for
The function IsMasked (Algorithm 4) determines whether an error in R_j in the clock cycle i is masked. It evaluates Eq. (1) by checking whether the register is overwritten in the current clock cycle or whether the masked fields of all its destinations (R(i, j).S) are True. If at least one of the destination registers is not masked in the next clock cycle, the masked field of the register being examined is set to False.
Algorithm 4 Function IsMasked
The function AllDestinationsMasked (Algorithm 5), which is called from the function IsMasked, iterates through the set S of the register currently being examined and performs the logical conjunction of the masked fields of its elements. Hence, if at least one destination is unmasked, the function returns False.
The procedure is completed when all R(i, j).masked fields are evaluated.
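The tagging pass described in this section can be sketched as follows. This is a hedged illustration, not the authors' Algorithms 3 to 5: the class `Reg`, the 0-based indices, the `result_regs` argument, and the example matrix are assumptions. In particular, the sketch assumes that S holds register indices, that a register which keeps its value lists itself in S, and that an empty S means the value does not propagate.

```python
from dataclasses import dataclass


@dataclass
class Reg:
    A: bool                       # is the register assigned in this cycle?
    S: frozenset = frozenset()    # indices of registers depending on it in cycle i+1
    masked: bool = True           # initial tag, refined by tag_all


def all_destinations_masked(rtl, i, j):
    # Conjunction of the destination tags in the next cycle (True if S is empty).
    return all(rtl[i + 1][d].masked for d in rtl[i][j].S)


def is_masked(rtl, i, j):
    # An empty S covers the "does not propagate" case, so one call suffices.
    return all_destinations_masked(rtl, i, j)


def tag_all(rtl, result_regs):
    last = len(rtl) - 1
    for j, reg in enumerate(rtl[last]):
        reg.masked = j not in result_regs      # result registers are unmasked
    for i in range(last - 1, -1, -1):          # walk backwards from the last cycle
        for j in range(len(rtl[i])):
            rtl[i][j].masked = is_masked(rtl, i, j)
    return rtl


# Hypothetical 3-cycle, 2-register design in the spirit of Fig. 1b.
rtl = [
    [Reg(True, frozenset({1})), Reg(True, frozenset({0}))],   # cycle 0: exchange
    [Reg(False, frozenset({0, 1})), Reg(True)],               # cycle 1: R1 <- R0
    [Reg(False), Reg(False)],                                 # cycle 2: R0 is the result
]
tag_all(rtl, result_regs={0})
tags = [[r.masked for r in row] for row in rtl]
print(tags)
```

In this example a fault in register 0 before the exchange in cycle 0 is tagged as masked, because its only destination is overwritten one cycle later. Each entry inspects at most M destinations, which is consistent with an overall cost that grows with M^2 × N.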
C. Security Analysis
The procedure presented in the previous section exposes all the targets (registers) and the time moments of the fault injection that can cause a safe-error. However, the fact that an injected error is safe does not necessarily mean that a design is vulnerable to M safe-error attacks. An error has to be related to key-dependent information in order to be exploitable by the attack. Using a model of a public-key cryptographic algorithm, we hereby explain how the presented procedure can be used to check whether a design is vulnerable to M safe-error attacks. Fig. 3 depicts a typical flow diagram of a public-key cryptographic algorithm, e.g. an exponentiation algorithm. For simplicity, we assume that the branching decision is made depending on only one bit of the key information (k_r). If we assume that the RTL description of the diagram in Fig. 3 is already available, we start analyzing one of its rounds by dividing it into two parts which correspond to the two key-related branches. Then, considering them as two separate designs, we apply the M safe-error detection procedure to each. As a result, two matrices, RTL_0 and RTL_1, are obtained, whose entries are the structures defined in Table I.
Let us now check the security of a design against the fault models defined in Section III-A.
• The fault model depicted in Fig. 1a: In this case, a fault is injected after the register is read for the computation. Since an error in R_j in the clock cycle i is cleared if the register gets a new value in the analyzed clock cycle (R(i, j).A = 1), but not otherwise (R(i, j).A = 0), it is sufficient to compare the A values of the corresponding registers. If RTL_0(i, j).A differs from RTL_1(i, j).A, it implies that a fault injected in R_j in clock cycle i leaks information about the secret key.
• The fault model depicted in Fig. 1b: A similar analysis can be performed for the second fault model: a fault is injected in a register before it is read for the computations. If RTL_0(i, j).masked differs from RTL_1(i, j).masked, an error in R_j in clock cycle i is cleared only for k_r = 0 or only for k_r = 1, but not for both. As a result, such a fault leaks one key bit.
Algorithm 6 iterates through both matrices and reports all the locations and clock cycles where the safe-errors are possible.
Algorithm 6 Function Compare
The presented M safe-error detection procedure captures the basic nature of safe-errors: a key-dependent behavior of an injected fault. Although the case study explained in this section assumes that a key-related branching decision is made depending on only one key bit, which is typical for public-key cryptographic algorithms, the presented procedure could be extended and applied to symmetric-key algorithms, too. The main difference is that a design needs to be divided into 2^m parts, where m is the number of key bits used in a key-related decision². By analyzing these parts separately and comparing the results, the errors that are masked only in some cases could be identified.
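The comparison step for the single-bit case can be sketched as below. This is an assumption-laden illustration of the idea behind Algorithm 6, not its original listing: the tuple type `Reg`, the function `compare`, the report format, and the two small matrices are hypothetical and do not correspond to Table III.

```python
from collections import namedtuple

# Only the two fields needed for the comparison are modeled here.
Reg = namedtuple("Reg", ["A", "masked"])


def compare(rtl0, rtl1):
    """Report (cycle, register, fault model) where the two branches differ."""
    findings = []
    for i, (row0, row1) in enumerate(zip(rtl0, rtl1)):
        for j, (a, b) in enumerate(zip(row0, row1)):
            if a.A != b.A:                 # fault model of Fig. 1a
                findings.append((i, j, "fault after read"))
            if a.masked != b.masked:       # fault model of Fig. 1b
                findings.append((i, j, "fault before read"))
    return findings


# Hypothetical tagged matrices for k_r = 0 and k_r = 1 (2 cycles, 2 registers).
rtl0 = [[Reg(True, True), Reg(False, False)],
        [Reg(True, True), Reg(True, False)]]
rtl1 = [[Reg(True, False), Reg(False, False)],
        [Reg(True, True), Reg(False, False)]]
report = compare(rtl0, rtl1)
print(report)
```

Every reported entry names a register and a clock cycle in which the effect of a fault depends on k_r, which is the condition for an exploitable M safe-error stated above.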
IV. CASE STUDY: THE MONTGOMERY POWERING LADDER
This section describes the application of the M safe-error detection procedure to an implementation of the Montgomery Powering Ladder (MPL), given in Algorithm 1. Let us assume that the multiplications in the algorithm are implemented in a digit-serial manner³, as depicted in Algorithm 2. Without loss of generality, we assume that the multiplier R_1 is composed of only two digits (m = 2 in Algorithm 2). Since the fault model we use allows faults that affect only one digit, we represent R_1 as two separate registers: R_1^1 and R_1^0. Tables IIa and IIb show the register management of one round of the algorithm for k_r = 0 and k_r = 1, respectively. Register R_2 is used as a temporary storage for the intermediate result in the interleaved multiplication. Every row in the tables lists the operations that are executed in parallel in one clock cycle. Following the steps from Section II-A, we construct two matrices, RTL_0 and RTL_1 (Table III), where every register is represented as the structure from Table I.
² The key-related decisions in symmetric-key algorithms are usually dependent on a group of key bits, e.g. a byte.
³ The actual implementation of the squaring operation is not considered in this analysis.
Finally, RTL_0 and RTL_1 are compared in order to expose a register and a clock cycle that can be the target of the M safe-error attack. Since RTL_0 and RTL_1 have identical A values, it can be concluded that the fault model from Figure 1a does not apply to this design. On the other hand, the masked fields differ for certain registers. We hereby analyze all such cases:
• A fault in the higher part of R_1: If an error is injected in the higher part of the register R_1 in the specified clock cycle, it is overwritten in CLK#3 when k_r = 0 (R_1 ← R_2). In the case of k_r = 1, the error propagates further. Indeed, as shown in [3], such an error leaks information about the secret exponent k.
• A fault in R_1 in CLK#3: In this case a fault is overwritten by the assignment operation R_1 ← R_2 when k_r = 0. When k_r = 1 an error propagates further since R_1 is squared in that case⁴.
• A fault in R_0 in CLK#3: Similarly, if a fault is injected in R_0 in the third clock cycle, it is cleared when k_r = 0. Otherwise, it propagates further through the algorithm.
The first detected M safe-error is caused by the interleaved modular multiplication used in the algorithm. On the other hand, the other two vulnerabilities are the result of the actual implementation, which justifies our assumption that the M safe-error detection procedure must be applied at the RTL. In addition, this example depicts the flexibility of the presented method with respect to the strength of an attacker. If he/she is able to inject a fault in a part of a variable stored in a register, the detection method can be tuned to process such a fault. The registers are then divided into sub-registers which correspond to the maximal precision of the attacker. For instance, if an attacker can flip a single bit of a register, all flip-flops in a design must be considered as separate registers.
The flexibility comes at the cost of increased computational time, which is discussed in the next section.
V. DISCUSSION
This section discusses the complexity and the applicability of the presented M safe-error detection procedure. Since it is based on an exhaustive check of all the registers in a design through all clock cycles, its time complexity is O(M^2 × N), where M is the number of registers and N is the total number of clock cycles required to execute an implemented algorithm. The complexity of the procedure is not necessarily a limitation, considering the following facts about its application. Firstly, it targets implementations of cryptographic algorithms, which are usually only a part of a bigger design. In addition, the implementations of public-key cryptographic algorithms usually aim at a small number of registers. For instance, typical hardware implementations of Elliptic Curve Cryptography (ECC) [6] and RSA [5] contain up to 10 registers. Finally, since the application of the M safe-error detection method to a public-key algorithm assumes processing of only one of its rounds, the number of clock cycles required for the analysis is significantly reduced.
⁴ A fault can be injected in either R_1^1 or R_1^0.
VI. CONCLUSION AND FUTURE WORK
This paper presents a procedure that checks whether a hardware implementation of a cryptographic algorithm is vulnerable to M safe-error attacks. It takes an RTL description of a design and exposes the registers and the clock cycles that are possible targets for the attack. Such a procedure enables the security check of a device during the design process, hence giving a hardware designer the possibility to "fix" it before the device is taped out.
Our future research will focus on improving the time efficiency of the presented procedure and on its application to hardware implementations of several different cryptographic algorithms, e.g. AES and RSA. 
