Abstract-The published works on differential fault attack (DFA) against the Grain family require a large number of faults (hundreds) and make several assumptions about the locations and timings of the injected faults. In this paper, we present a significantly improved scenario, from the adversarial point of view, for DFA against the Grain family of stream ciphers. Our model is the most realistic one so far, as it assumes that the cipher has to be re-keyed only a few times and that faults can be injected at any random location and at any random point of time, i.e., no precise control is needed over the location and timing of fault injections. We construct equations based on the algebraic description of the cipher, introducing new variables so that the degrees of the equations do not increase. In line with algebraic cryptanalysis, we accumulate such equations from the fault-free and faulty key-stream bits and solve them using the SAT solver Cryptominisat-2.9.5 installed with SAGE 5.7. In a few minutes we can recover the state of Grain v1, Grain-128 and Grain-128a with as few as 10, 4 and 10 faults respectively.
INTRODUCTION

FAULT attacks study the robustness of a cryptosystem in a setting that is weaker than its original or expected mode of operation. Though optimistic, this model of attack can successfully be employed against a number of cryptographic primitives [8]. In a practical setting, it is indeed possible to mount such an attack when the number of faults is very low and precise control over fault injection, in terms of both exact register location and timing, is not required. In this paper, we achieve these goals and present the most efficient differential fault attack (DFA), as of now, against the Grain family. Our main contribution is to generate a large number of equations from each fault and, further, to introduce new variables at each stage to keep the degree of each equation as low as possible. We use the SAT solver Cryptominisat-2.9.5 [34] installed in SAGE [36] to solve the equations and obtain the complete secret key.
Fault attacks have received serious attention in literature for quite some time [12] , [14] . Such attacks on stream ciphers have been studied in [24] where a typical attack scenario consists of an adversary who can inject a random fault (using laser shots/clock glitches [32] , [33] ) in a cryptographic device as a result of which one or more bits of its internal state get altered. The faulty output from this altered device is then used to deduce information about its internal state/secret key. Here the adversary requires certain privileges like the ability to re-key the device, control the timing of the fault etc.
The more privileges the adversary is granted, the more the attack becomes impractical and unrealistic.
In [13] , differential cryptanalysis for stream ciphers has been studied. The authors showed that the key difference or initial value difference can be used to predict key-stream difference. In differential fault attack, the attacker is allowed to inject faults in the internal state during the key-stream generation too. Then by analyzing the difference between faulty and the fault-free key-streams, she should be able to obtain some information about the internal state.
The Grain family of stream ciphers [1], [2] , [22] , [23] has received a lot of attention and in particular, Grain v1 is in the hardware profile of eStream [20] . In most of the fault attacks reported so far [4] , [10] , [28] on this cipher, the adversary is granted far too many privileges to make the attacks practical. In particular, these works consider reproducing the faults in the same location more than once. This particular assumption has been relaxed in the work [6] , where faulting the same location more than once could be avoided. Further the work [6] considered some restricted cases to accommodate situations when a fault affects more than one register location. For detailed cryptanalytic results related to Grain family, the reader may refer to [3] , [9] , [11] , [16] , [17] , [18] , [19] , [21] , [29] , [30] , [35] , [37] , [38] .
Recently, in [27], Trivium has been cryptanalysed under a fault model. The work [27] assumed that the fault positions as well as the injection times of the faults are random. Considering random fault positions but a fixed fault injection time, a DFA on Trivium using SAT solvers has been presented in [31] (see also [25], [26] for other DFAs on Trivium). For more than a decade, there has been seminal research in the area of algebraic cryptanalysis. One may have a look at [7] for details on algebraic cryptanalysis using SAT solvers. The main idea here is to solve multivariate polynomial systems that describe a cipher. For a very brief introduction to this, one may refer to [31, Section 5]. The DFA on Trivium [31] requires only two faults, and this is far fewer than the fault requirements against the Grain family. This motivates us to see how this kind of algebraic cryptanalysis can be exploited towards DFA against the Grain family.
S. Sarkar is with the Chennai Mathematical Institute, Chennai 603103, India. E-mail: sarkar.santanu.bir@gmail.com. S. Banik
Our contribution. This paper improves our series of earlier works [4], [5], [6] and in the process obtains the best parameters to date among all the published works on DFA against the Grain family. The idea of [6] considers introducing faults in all the LFSR locations. Given an n-bit LFSR, this requires an expected $n \ln n$ random fault injections, which is around $2^{8.5}$ for $n = 80$ as in Grain v1. Our preliminary intuition was that the number of faults used in the existing works [4], [6], [10], [28] is considerably higher than necessary. This is because, for each fault, one can generate a significant number of equations using the faulty key-stream bits, and these might be enough to solve the system. This convinced us that one actually requires only a few faults. We generated the equations carefully, introducing new variables at each stage, so that although the number of variables increases, the degrees of the equations do not. Since the degrees of the equations are low, they can be fed into a SAT solver to obtain the solutions, in the direction of [31]. We found that this strategy indeed succeeds and that it is possible to recover the secret key from only a few faults (not more than 10). Furthermore, all the published literature on DFAs against the Grain family requires the adversary to be able to inject faults at a precise stage of operation, i.e., at the beginning of the PRGA. In this work we show that in Grain v1 and Grain-128, the adversary does not need such precise control over fault injections. If the adversary is able to inject a fault at some PRGA round $t \in [0, t_{\max} - 1]$ (the value of $t_{\max}$ varies for Grain v1 and Grain-128 and is related to the algebraic structure of each cipher), then with high probability she is able to deduce the value of t and also the register location that the fault has affected. The same cannot be reproduced for Grain-128a because the cipher does not make all the key-stream bits directly available to the attacker.
Thus, we provide the following improvements.
1) We require very few faults, so the actual hardware will be minimally stressed and the DFA can be implemented in practice. There are certain kinds of fault attacks where the device may get damaged [8], and in such cases fewer faults mean less stress.
2) So far, all the DFAs on the Grain family considered injecting faults either in the LFSR or in the NFSR. Here we can tackle the case where the fault is injected randomly in either the LFSR or the NFSR.
3) We also improve upon the technique of fault location identification proposed in [6], so that the probability of success in exact fault location identification improves.
4) For Grain v1 and Grain-128, we outline techniques that allow the adversary to relax requirements related to the timing of fault injections.
Table 1 compares the contribution of our work with the previous works on this topic. One should note that the main focus of this work is on single-bit faults. We consider how one can handle multiple-bit faults in some restricted cases in Section 4.4. Multiple-bit faults are handled as follows: we try to distinguish the multiple-bit faults from the single-bit faults and then discard the key-stream bits produced due to a multiple-bit fault. As the number of bits affected in a multiple-bit fault increases, the number of different kinds of Signatures increases, and the analysis becomes harder and more tedious. In this paper we have studied the cases of at most three consecutive bits.
ALGEBRAIC DESCRIPTION OF THE GRAIN FAMILY
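Consider $a_i, b_i, c_i \in \{0, 1\}$ for $i \in [0, \ldots, n-1]$. Any cipher in the Grain family consists of an n-bit LFSR and an n-bit NFSR (see Fig. 1 and Table 2). The update function of the LFSR is given by the equation
$$y_{t+n} = f(Y_t),$$
where $Y_t = [y_t, y_{t+1}, \ldots, y_{t+n-1}]$ is an n-bit vector that denotes the LFSR state at the tth clock interval and f is a linear function on the LFSR state bits obtained from a primitive polynomial over GF(2) of degree n.
The NFSR state is updated as
$$x_{t+n} = y_t \oplus g(X_t),$$
where $X_t = [x_t, x_{t+1}, \ldots, x_{t+n-1}]$ denotes the NFSR state at the tth clock interval and g is a non-linear function of the NFSR state bits. The output key-stream bit is
$$z_t = \bigoplus_{i=0}^{n-1} a_i x_{t+i} \oplus h(y_t, \ldots, y_{t+n-1}, x_t, \ldots, x_{t+n-1}),$$
where h is a non-linear Boolean function, and may be degenerate on some of the variables.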
Key scheduling algorithm (KSA). The Grain family uses an n-bit key K and an m-bit initialization vector IV, with $m < n$. The key is loaded in the NFSR and the IV is loaded in the 0th to the $(m-1)$th bits of the LFSR. The remaining mth to $(n-1)$th bits of the LFSR are loaded with some fixed pad $P \in \{0,1\}^{n-m}$. Then, for the first 2n clocks, the key-stream bit $z_t$ is XOR-ed to both the LFSR and NFSR update functions.
Pseudo-random key-stream generation algorithm (PRGA). After the KSA, $z_t$ is no longer XOR-ed to the LFSR and the NFSR but is used as the pseudo-random key-stream bit. Thus, during this phase, the LFSR and NFSR are updated as $y_{t+n} = f(Y_t)$, $x_{t+n} = y_t \oplus g(X_t)$.
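The KSA and PRGA described above can be sketched on a toy register pair. The following Python sketch is illustrative only: the sizes n = 8 and m = 4, all feedback and output taps, and the all-ones pad are hypothetical choices made for brevity, not the parameters of any Grain version.

```python
N, M = 8, 4  # hypothetical toy sizes (n-bit registers, m-bit IV), not real Grain parameters

def f(Y):
    # toy linear LFSR feedback (hypothetical taps)
    return Y[0] ^ Y[2] ^ Y[3] ^ Y[4]

def g(X):
    # toy non-linear NFSR feedback (hypothetical taps)
    return X[0] ^ X[5] ^ (X[2] & X[6])

def output_bit(Y, X):
    # linear mask on NFSR bits XOR-ed with a toy non-linear function h
    return X[1] ^ X[7] ^ Y[1] ^ (Y[3] & Y[6]) ^ (Y[6] & X[4])

def clock(Y, X, feedback):
    z = output_bit(Y, X)
    new_y = f(Y)
    new_x = Y[0] ^ g(X)
    if feedback:              # KSA: z_t is XOR-ed into both update functions
        new_y ^= z
        new_x ^= z
    return z, Y[1:] + [new_y], X[1:] + [new_x]

def ksa(key, iv):
    X = list(key)                     # the key fills the NFSR
    Y = list(iv) + [1] * (N - M)      # IV followed by a fixed pad (assumed all ones here)
    for _ in range(2 * N):            # 2n initialization clocks
        _, Y, X = clock(Y, X, feedback=True)
    return Y, X

def prga(Y, X, length):
    stream = []
    for _ in range(length):
        z, Y, X = clock(Y, X, feedback=False)
        stream.append(z)              # z_t is now released as key-stream
    return stream
```

Re-keying with the same key and IV reproduces the same state after the KSA, which is what allows a fault-free and a faulty key-stream to be compared.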
MAC generation algorithm in Grain-128a. Grain-128a [1], [2] also supports the generation of a MAC. Here we follow the description given in [2]. Let $z_0, z_1, z_2, \ldots$ denote the key-stream bits produced by the cipher. Assume that we have a message of length L defined by the bits $m_0, \ldots, m_{L-1}$. Set $m_L = 1$ as padding. To provide authentication, two registers, called accumulator and shift register, of size 32 bits each, are used. The contents of the accumulator and the shift register at time t are denoted by $a_t^0, \ldots, a_t^{31}$ and $r_t, \ldots, r_{t+31}$ respectively. The accumulator is initialized through $a_0^j = z_j$, $0 \le j \le 31$, and the shift register is initialized through $r_j = z_{32+j}$, $0 \le j \le 31$. The shift register is updated as $r_{t+32} = z_{64+2t+1}$. The accumulator is updated as $a_{t+1}^j = a_t^j \oplus m_t r_{t+j}$ for $0 \le j \le 31$ and $0 \le t \le L$. The final content of the accumulator, $a_{L+1}^0, \ldots, a_{L+1}^{31}$, is used for authentication. For the fault attack, we need to consider the key-stream bits, and thus it is important to note that we cannot use the first 64 bits of key-stream, nor every alternate key-stream bit thereafter, as those are used in the MAC for Grain-128a.
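The register bookkeeping of the MAC can be made concrete with a short sketch. It assumes the accumulator update $a_{t+1}^j = a_t^j \oplus m_t r_{t+j}$ of the Grain-128a specification [2]; the function names are hypothetical, and the key-stream is simply passed in as a list of bits rather than generated by the cipher.

```python
def grain128a_mac(z, message):
    """z: key-stream bits z_0, z_1, ...; message: bits m_0 .. m_{L-1}. Returns the 32-bit tag."""
    L = len(message)
    m = list(message) + [1]                       # padding bit m_L = 1
    if len(z) < 64 + 2 * L:
        raise ValueError("not enough key-stream bits")
    acc = list(z[0:32])                           # a_0^j = z_j
    r = list(z[32:64])                            # r_j = z_{32+j}
    r += [z[64 + 2 * t + 1] for t in range(L)]    # r_{t+32} = z_{64+2t+1}
    for t in range(L + 1):
        if m[t]:
            for j in range(32):
                acc[j] ^= r[t + j]                # assumed update a_{t+1}^j = a_t^j XOR m_t r_{t+j}
    return acc

def visible_keystream(z):
    # bits z_64, z_66, ... are the only ones released for encryption
    return z[64::2]
```

The second helper shows which bits remain for the attacker: everything before index 64 and every odd-indexed bit afterwards is consumed by the MAC.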
DESCRIPTION OF THE ATTACK USING SAT SOLVER
Let us first consider that the exact location (only a single register) of the fault is known. We will later (in Section 4) discuss how the adversary can obtain the exact location of the fault after injecting it at a random location, how she can reject the cases where the fault has disturbed more than one location, and how she can guess the time of injection of a randomly timed fault. To explain our idea for exploiting the SAT solver more precisely, let us now consider that the fault is injected after the KSA, i.e., just before the PRGA starts.
Populating the Bank of Equations for Grain v1 and Grain-128
We will now explain the method of obtaining a large number of equations that will be used for algebraic cryptanalysis. For the time being we consider the case that works for Grain v1 or Grain-128. The case of Grain-128a is a little different and will be discussed later.
Equations from Fault-Free Key-Stream
Consider the equations from the $\ell$-bit fault-free key-stream $z_0, \ldots, z_{\ell-1}$. As discussed, the LFSR state just after the KSA (at the beginning of the 0th clock) is $Y_0 = [y_0, y_1, \ldots, y_{n-1}]$ and the NFSR state is $X_0 = [x_0, x_1, \ldots, x_{n-1}]$. In general, the value of $\ell$ required to complete the attack is more than 160 for all the three versions of Grain. But it is not feasible to compute the algebraic normal form (ANF) of $z_{159}$ on any standard PC, for any version of Grain. For example, using a workstation with a 1.83 GHz processor, 3 GB RAM and 2 MB system cache, computing the ANF of any $z_\ell$ in Grain v1 for $\ell > 44$ is infeasible. The ANF of $z_{44}$ in Grain v1 has algebraic degree 17 and consists of 80,643 monomials. To enable the SAT solver to solve the system efficiently, the degrees of the expressions in the equation system must also be controlled. At each PRGA round $t > 0$, we introduce two new variables $y_{t+n}, x_{t+n}$ to update the LFSR and NFSR state respectively. To illustrate the technique, let us denote the states at the beginning of the tth ($t \ge 0$) PRGA round as $Y_t = [y_t, y_{t+1}, \ldots, y_{t+n-1}]$, $X_t = [x_t, x_{t+1}, \ldots, x_{t+n-1}]$. Given these, we formulate the following equations.
1) LFSR equation: $y_{t+n} = f(Y_t)$.
2) NFSR equation: $x_{t+n} = y_t \oplus g(X_t)$.
3) Key-stream equation: $z_t = \bigoplus_{i=0}^{n-1} a_i x_{t+i} \oplus h(y_t, \ldots, y_{t+n-1}, x_t, \ldots, x_{t+n-1})$.
In the Grain family, while the first equation is linear, the degrees of the other two equations are also not very high. We initially start with 2n variables, $y_0, y_1, \ldots, y_{n-1}$ and $x_0, x_1, \ldots, x_{n-1}$. Then, corresponding to each key-stream bit $z_t$, we introduce two new variables $y_{t+n}, x_{t+n}$ and obtain three more equations. Thus we have in total $2n + 2\ell$ variables and $3\ell$ equations. The advantages of using such a technique are as follows. First of all, it allows us to formulate the expression for $z_\ell$ (via a series of equations) for values of $\ell \ge 159$. If instead, at each round $t > 0$, the variables $y_{t+n}, x_{t+n}$ were replaced by their equivalent algebraic expressions in $y_0, y_1, \ldots, y_{n-1}$ and $x_0, x_1, \ldots, x_{n-1}$, this would never have been possible on an ordinary PC. Since the expressions in the LFSR and NFSR cells always stay linear, this allows us to control the algebraic degree and the number of monomials in each of the $3\ell$ equations so obtained.
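The effect of introducing fresh variables can be seen on a toy example. The sketch below represents GF(2) polynomials as sets of monomials and compares the degree of the key-stream expression under direct substitution with the degree obtained when every new register cell is a fresh variable. The register length 8 and all taps are hypothetical and far smaller than any Grain version; the sketch only illustrates the trend, not the figures quoted above.

```python
N = 8  # toy register length (hypothetical)

def var(name):
    return {frozenset([name])}          # polynomial consisting of one degree-1 monomial

def add(*polys):
    out = set()
    for p in polys:
        out = out ^ p                   # XOR of polynomials = symmetric difference of monomial sets
    return out

def mul(p, q):
    out = set()
    for a in p:
        for b in q:
            out ^= {a | b}              # x*x = x over GF(2), so a monomial is a set of variables
    return out

def degree(p):
    return max((len(mono) for mono in p), default=0)

def round_polys(Y, X):
    # toy output and update functions (hypothetical taps)
    z = add(X[1], X[7], Y[1], mul(Y[3], Y[6]), mul(Y[6], X[4]))
    new_y = add(Y[0], Y[2], Y[3], Y[4])
    new_x = add(Y[0], X[0], X[5], mul(X[2], X[6]))
    return z, new_y, new_x

def degrees_by_substitution(rounds):
    Y = [var(f"y{i}") for i in range(N)]
    X = [var(f"x{i}") for i in range(N)]
    degs = []
    for _ in range(rounds):
        z, new_y, new_x = round_polys(Y, X)
        degs.append(degree(z))
        Y, X = Y[1:] + [new_y], X[1:] + [new_x]     # cells hold full expressions
    return degs

def degrees_with_fresh_variables(rounds):
    Y = [var(f"y{i}") for i in range(N)]
    X = [var(f"x{i}") for i in range(N)]
    degs = []
    for t in range(rounds):
        z, new_y, new_x = round_polys(Y, X)
        degs.append(max(degree(z), degree(new_y), degree(new_x)))
        # the three expressions become equations; the cells receive fresh variables
        Y, X = Y[1:] + [var(f"y{t + N}")], X[1:] + [var(f"x{t + N}")]
    return degs
```

With fresh variables the degree of every equation is bounded by the degree of the round functions themselves, at the cost of two extra variables per round.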
Equations from Faulty Key-Streams
We use a similar technique to extract equations from faulty key-streams. Let us assume that a fault is injected in the LFSR location f at PRGA round 0. The same method works if the fault is injected in the NFSR. Since we re-key the cipher with the same Key-IV before injecting a fault, after fault injection we obtain the state $y_0, y_1, \ldots, y_{f-1}, 1 \oplus y_f, y_{f+1}, \ldots, y_{n-1}$ and $x_0, x_1, \ldots, x_{n-1}$ at the start of the PRGA. Then, as described previously, corresponding to each faulty key-stream bit, we introduce two new variables $y^{(f)}_{t+n}, x^{(f)}_{t+n}$ and obtain three more equations. Thus we have an additional $2\ell$ variables and $3\ell$ equations.
Total Number of Variables and Equations
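Suppose that we inject $n_f$ faults after an equal number of re-keyings (we write $n_f$ for the number of faults to keep it distinct from the register length n). The total number of variables is then $2n + 2(n_f + 1)\ell$ and the total number of equations is $3(n_f + 1)\ell$.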
Populating the Bank of Equations for Grain-128a
We will now explain the formation of equations for Grain-128a. Here the first 64 key-stream bits $z_0, \ldots, z_{63}$ and every alternate key-stream bit thereafter are used to construct the MAC. Hence these bits are unavailable to the attacker.
Equations from Fault-Free Key-Stream
Consider the equations from the $\ell$-bit fault-free key-stream $z_{64}, z_{66}, \ldots, z_{64+2\ell-2}$, as only alternate key-stream bits are used for encryption. Hence, similar to the above, we have the following equations.
1) Two LFSR equations: $y_{t+n} = f(Y_t)$ and $y_{t+n+1} = f(Y_{t+1})$.
2) Two NFSR equations: $x_{t+n} = y_t \oplus g(X_t)$ and $x_{t+n+1} = y_{t+1} \oplus g(X_{t+1})$.
3) One key-stream equation: $z_t = \bigoplus_{i=0}^{n-1} a_i x_{t+i} \oplus h(y_t, \ldots, y_{t+n-1}, x_t, \ldots, x_{t+n-1})$.
We initially start with 2n variables, $y_0, y_1, \ldots, y_{n-1}$ and $x_0, x_1, \ldots, x_{n-1}$. Then, corresponding to each key-stream bit $z_t$, we introduce four new variables $y_{t+n}, y_{t+n+1}, x_{t+n}, x_{t+n+1}$ and obtain five more equations. Thus we have in total $2n + 4\ell$ variables and $5\ell$ equations.
Equations from Faulty Key-Streams
Let us consider that a fault is injected in the LFSR location f at the beginning of the PRGA. Again, the method works similarly if the fault is injected in the NFSR. Since we will re-key the cipher with the same Key-IV, in such a case we will obtain the state $y_0, y_1, \ldots, y_{f-1}, 1 \oplus y_f, y_{f+1}, \ldots, y_{n-1}$ and $x_0, x_1, \ldots, x_{n-1}$. Then, corresponding to each faulty key-stream bit, we introduce four new variables $y^{(f)}_{t+n}, y^{(f)}_{t+n+1}, x^{(f)}_{t+n}, x^{(f)}_{t+n+1}$ and obtain five more equations. Thus we have an additional $4\ell$ variables and $5\ell$ equations.
[Table: the functions g(·) and h(·) of Grain v1, Grain-128 and Grain-128a]
Total Number of Variables and Equations
Suppose that we inject $n_f$ faults after as many re-keyings (we write $n_f$ for the number of faults to keep it distinct from the register length n). The total number of variables is then $2n + 4(n_f + 1)\ell$ and the total number of equations is $5(n_f + 1)\ell$. All these equations are fed to the SAT solver to obtain $y_0, y_1, \ldots, y_{n-1}$ and $x_0, x_1, \ldots, x_{n-1}$. This completes the attack. As the ciphers in the Grain family are invertible in both the KSA and the PRGA, one can also recover the secret key efficiently.
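The variable and equation counts for both cases follow one pattern, which the following sketch makes explicit. The function name and the parameter `clocks_per_visible_bit` are hypothetical; the key-stream length 200 used in the usage note is only an illustrative value above the bound of 160 mentioned earlier, not a figure from the experiments.

```python
def system_size(n, ell, num_faults, clocks_per_visible_bit=1):
    """Count variables and equations for one fault-free and num_faults faulty key-streams.

    clocks_per_visible_bit is 1 for Grain v1 / Grain-128 and 2 for Grain-128a,
    where only every alternate key-stream bit is visible.
    """
    new_vars_per_bit = 2 * clocks_per_visible_bit     # one LFSR and one NFSR variable per clock
    eqs_per_bit = 2 * clocks_per_visible_bit + 1      # register updates plus one key-stream equation
    streams = num_faults + 1                          # faulty streams plus the fault-free one
    variables = 2 * n + new_vars_per_bit * streams * ell
    equations = eqs_per_bit * streams * ell
    return variables, equations
```

For instance, `system_size(80, 200, 10)` gives 4560 variables and 6600 equations, and `system_size(128, 200, 10, clocks_per_visible_bit=2)` gives 9056 variables and 11000 equations.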
HOW TO IDENTIFY THE LOCATION OF A RANDOM FAULT
Initially, we will assume that the adversary is able to inject faults at the beginning of the PRGA. In the next section, we will relax this requirement and show how to deduce the injection time of a randomly timed fault (for Grain v1 and Grain-128). Now so far, in all the published literature on fault analysis of Grain [4] , [6] , [10] , [28] , either the LFSR or the NFSR has been chosen for fault injection. Techniques have been proposed in all of the above works, to identify the location of a randomly applied bit fault in the internal state, provided the adversary knows a priori whether it is the LFSR or the NFSR she is injecting faults in. In the attack we propose, the adversary does not need a priori knowledge of this information, i.e., she injects a fault in a random bit location of the internal state without knowing whether the fault has affected a location in the LFSR or the NFSR. We will propose a technique (along the lines of [6] ) that will enable the adversary to not only identify the location of an injected fault but also help her to determine whether the fault was injected in the LFSR or NFSR. The idea of determining the location of a randomly applied fault in the LFSR of the Grain family by comparing the difference of the faultless and faulty key-stream sequence with certain pre-computed Signature vectors was first introduced in [4] . This technique however required the adversary to be able to fault the same LFSR location more than once to conclusively determine the fault location. This idea was further developed in [6] where the differential key-stream was compared with two sets of vectors called the First and Second Signature vectors. Using this technique it was no longer necessary to fault the same location more than once and it enabled the adversary to exercise less control over fault injections.
First and Second Signature Vectors
Here we summarize the basic ideas of our earlier works [4], [6]. Consider two initial PRGA states S and $S_f$ which differ only in the LFSR location f, i.e., $S_f$ is produced when a random fault toggles the LFSR location f of S. Let $Z = [z_0, z_1, z_2, \ldots]$ and $Z^f = [z_0^f, z_1^f, z_2^f, \ldots]$ be the faultless and faulty key-stream sequences produced by S and $S_f$ respectively.
a) At certain PRGA rounds i, $z_i$ and $z_i^f$ are guaranteed to be equal for all values of S. This is because at each PRGA round i, only a few bits of the internal state are used to produce $z_i$. Therefore at all rounds i when the faulty and faultless internal states differ in bit locations which have no contribution towards $z_i$, the faulty and faultless key-stream bits are guaranteed to be equal.
b) Furthermore, at certain other PRGA rounds j, $z_j$ and $z_j^f$ are guaranteed to be unequal for all values of S. Again, certain bits of the internal state are linearly XOR-ed to the output function h to produce the output key-stream bit. It is when an induced fault causes a deterministic difference between S and $S_f$ in an odd number of these bits that the output bits are guaranteed to be unequal.
c) For every fault location f ($0 \le f < n$) in the LFSR, one can define [6] two 2n-length vectors $Q_f^1$ and $Q_f^2$, the First and Second Signature vectors, which record the rounds described in (a) and (b) respectively. The Signature vectors can be efficiently computed by performing analysis of the differential trail of the Grain PRGA following the methods described in [6].
d) The location identification algorithm consists of comparing the differential vector $E_f$, computed from Z and $Z^f$, with $Q_f^1$ and $Q_f^2$ for all $f \in [0, n-1]$ and finding a match. For any element $V \in \{0,1\}^{2n}$, define the support of V as $B_V = \{i : 0 \le i < 2n, V(i) = 1\}$. Now define a relation $\preceq$ in $\{0,1\}^{2n}$ such that for two elements $V_1, V_2$, we have $V_1 \preceq V_2$ if $B_{V_1} \subseteq B_{V_2}$.
So the strategy is to formulate the first candidate set $C_{0,f} = \{c : 0 \le c \le n-1, Q_c^1 \preceq E_f\}$. If $|C_{0,f}|$ is 1, then the single element in $C_{0,f}$ will give us the fault location f. If not, we then formulate the second candidate set $C_{1,f} \subseteq C_{0,f}$ by testing the Second Signature vectors in the same manner. If $|C_{1,f}|$ is 1, then the single element in $C_{1,f}$ will give us the fault location f. If $C_{1,f}$ has more than one element, then the strategy fails.
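The two-stage candidate filtering can be sketched as follows. The encoding is an assumption made for this sketch: `first_sigs[c][i] = 1` marks a round where the two key-streams are guaranteed equal for a fault at location c, `second_sigs[c][i] = 1` marks a round where they are guaranteed unequal, and `diff[i]` is the raw differential $z_i \oplus z_i^f$. The signature vectors are taken as given inputs; computing them is the subject of [6].

```python
def support(v):
    # indices at which the vector is 1
    return {i for i, bit in enumerate(v) if bit == 1}

def identify_location(diff, first_sigs, second_sigs):
    """Return the unique candidate location, or None if no unique candidate remains."""
    equal_rounds = {i for i, d in enumerate(diff) if d == 0}
    unequal_rounds = {i for i, d in enumerate(diff) if d == 1}
    # first candidate set: every guaranteed-equal round must indeed show no difference
    c0 = [c for c in range(len(first_sigs))
          if support(first_sigs[c]) <= equal_rounds]
    if len(c0) == 1:
        return c0[0]
    # second candidate set: every guaranteed-unequal round must indeed show a difference
    c1 = [c for c in c0 if support(second_sigs[c]) <= unequal_rounds]
    if len(c1) == 1:
        return c1[0]
    return None
```

The subset tests are exactly the support-inclusion relation: a candidate survives only if the observed differential contradicts none of its guaranteed rounds.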
Our Results
We use the basic technique outlined in [6], but with a few tweaks. First of all, the work in [6] considers faults in LFSR locations only. We have extended the technique to determine the location of a fault introduced in either the LFSR or the NFSR. For this we increase the number of pre-computed First and Second Signature vectors to 2n, i.e., one for each register location in the NFSR and the LFSR. We now compare the differential vector $E_f$ with the First and Second Signature vectors of all the 2n register locations using the strategy outlined in step (d). After the comparison with the Signature vectors the algorithm outputs either
1) the LFSR or NFSR location f of the induced fault, or
2) a failure message, if $|C_{1,f}| > 1$.
Further, we also introduce two additional Signature vectors over the two described in [6].
We performed computer experiments by simulating random single-bit faults for $2^{20}$ randomly chosen Key-IVs. The probability that the new algorithm identifies the correct fault location in the LFSR or the NFSR, i.e., $\Pr(|C_{1,f}| = 1)$, is around 1.00 for Grain v1, 1.00 for Grain-128 and 0.81 for Grain-128a (improving the success probabilities of [6], which were around 0.99 for Grain v1, 1.00 for Grain-128 and 0.79 for Grain-128a).
Improving the Success Probabilities: Third and Fourth Signature Vectors
While the probabilities of success of fault location identification are very high (close to 1) for both Grain v1 and Grain-128, it is around 0.79 for Grain-128a. One of the reasons why the success probability is relatively low for Grain-128a is that the cipher does not make each and every key-stream bit directly available to the adversary. As has been explained, the key-stream bits of the first 64 rounds and every alternate round thereafter contribute to the computation of the MAC and are not directly available to the adversary. This limits the information available to the location identification algorithm and hence lowers the success probability, leaving room for improving it towards 1. We already know that, given a single-bit fault in the internal state of the cipher, the faulty and the faultless key-streams at certain PRGA rounds are guaranteed to be equal, and they are also guaranteed to be different at certain other rounds. However, there may be situations when the difference of the faulty and faultless key-stream bits at a certain PRGA round i, i.e., $z_i \oplus z_i^f$, is deterministically equal or unequal to the difference of the faulty and faultless key-stream bits at some other PRGA round j, even though the difference at either round i or round j by itself is not guaranteed to be 0 or 1.
That is to say, $z_i \oplus z_i^f = z_j \oplus z_j^f$.
Alternatively, for some other values of i, j, f, we may get $z_i \oplus z_i^f \neq z_j \oplus z_j^f$.
One may refer to Examples 1, 2 later for more details.
Generation of Third and Fourth Signature Vectors
Let $S, S_f$ be two internal states in Grain that differ only in the LFSR location f at the beginning of the PRGA. Also let […] for integers $t_0, t_1$. We will first use the tool D-GRAIN(f, r) proposed in [6, Section 2.1], which can be used to analyze all the three versions of Grain. Briefly recalling, D-GRAIN(f, r) is an algorithm that performs simple truncated differential analysis of the Grain cipher. It takes two inputs: (a) the difference location $f \in [0, n-1]$ of the LFSR, and (b) the number of PRGA rounds r for which the analysis is to be performed. The algorithm initializes a differential engine $D_f$-GRAIN, which consists of an n-integer LFSR and NFSR with the same taps as a given version of Grain, but with different update functions. Table 3 presents a comparison. Here $L_t = [u_t, u_{t+1}, \ldots, u_{t+n-1}]$ and $N_t = [v_t, v_{t+1}, \ldots, v_{t+n-1}]$ denote respectively the LFSR and NFSR states of $D_f$-GRAIN at the PRGA round t, and OR is a map from $\mathbb{Z}^{b+1} \to \{0, 1\}$ which roughly represents the logical 'or' operation and is defined as […]. The key-stream element $\Delta z_t$ output from the engine $D_f$-GRAIN is given as
$$\Delta z_t = \begin{cases} 0, & \text{if } \Upsilon_t = 0 \text{ AND } \chi_t \preceq 1 \text{ AND } |\chi_t| \text{ is even}, \\ 1, & \text{if } \Upsilon_t = 0 \text{ AND } \chi_t \preceq 1 \text{ AND } |\chi_t| \text{ is odd}, \\ 2, & \text{otherwise.} \end{cases}$$
Here $V \preceq a$ implies that all elements of V are less than or equal to a, and $|V|$ denotes the sum of the elements of a vector V. The algorithm D-GRAIN(f, r) initializes the LFSR and NFSR of $D_f$-GRAIN to all 0's except the fth LFSR element, which is initialized to 1. It then runs the engine for r PRGA rounds. For each t ($0 \le t < r$) it returns the 3-tuple $[\chi_t, \Upsilon_t, \Delta z_t]$, where $\chi_t$ and $\Upsilon_t$ contain elements from $\{0, 1, 2, 3\}$ and $\Delta z_t$ is an integer from the set $\{0, 1, 2\}$.
Let $S_t = X_t \| Y_t$, and consider the bits of $S_t$ that contribute to the output key-stream bit as a linear mask and as input to the function h respectively. Then it has been proven in [6] that if the ith element of $\chi_t$ ($\Upsilon_t$) is […]
Consider the situation when, for some particular value of f, the output in the $t_0$th PRGA round of D-GRAIN(f, r), i.e., $[\chi_{t_0}, \Upsilon_{t_0}, \Delta z_{t_0}]$, is such that (i) $\Upsilon_{t_0} = 0$ and (ii) $\chi_{t_0}$ has all but one element equal to 0, and this non-zero element is strictly greater than 1, i.e., $v_{t_0 + l_w} > 1$ for some $w \le c$ and all other $v_{t_0 + l_k}, u_{t_0 + i_k}$ equal 0. Then, following (1)-(3), we have […]
Consider the output of D-GRAIN(f, r) at the PRGA round $t_1 = t_0 - l_e + l_w$ for some $l_e < l_w$. Due to the evolution of the LFSR of $D_f$-GRAIN, the eth element of $\chi_{t_1}$ must be equal to the wth element of $\chi_{t_0}$. Now if (iii) all the remaining elements of $\chi_{t_1}$ and all of $\Upsilon_{t_1}$ are 0's, then following the previous argument we have […]
Thus at PRGA rounds $t_0, t_1$ we have […]
Experimental results have shown that for all the three versions of Grain, taking $r = 2n$, there exist such pairs $t_0, t_1$ for many values of f.
Similarly, consider some other PRGA round $t_1 = t_0 - l_{e'} + l_w$ for some $l_{e'} < l_w$. Then the $e'$th element of $\chi_{t_1}$ must be equal to the wth element of $\chi_{t_0}$. Now if (iv) $\Upsilon_{t_1} = 0$ and (v) there exists some $w'$ such that $\chi_{t_1}[w'] = 1$ and all the remaining elements of $\chi_{t_1}$ are 0, this implies that (vi) all elements of […]
g) Else, formulate the set $C_{3,f} = \{f : f \in C_{2,f}$ and […]$\}$.
If $|C_{3,f}| = 1$, then output the only element in $C_{3,f}$. If $|C_{3,f}| > 1$, then our strategy fails. All tuples to generate the Third and Fourth Signature vectors of Grain v1 when the locations of faults are in the LFSR are presented in Table 4.
We again performed computer experiments by simulating random single-bit faults for $2^{20}$ randomly chosen Key-IVs. The probability that the new algorithm identifies the correct fault location in the LFSR or the NFSR, i.e., $\Pr(|C_{3,f}| = 1)$, is around 1.00 for Grain v1, 1.00 for Grain-128 and 0.81 for Grain-128a.
Implication of the Success Probabilities
In Table 5, the number of faults required to determine the internal states of Grain v1, Grain-128 and Grain-128a is given. As can be seen, the attack may be carried out in very little time by employing around 10, 4 and 10 faults for Grain v1, Grain-128 and Grain-128a respectively. While the probabilities of success of fault location identification are very high (close to 1) for both Grain v1 and Grain-128, it is around 0.81 for Grain-128a. Since the success probabilities in Grain v1 and Grain-128 are very high, it is expected that for any set of 10 (for Grain v1) and 4 (for Grain-128) randomly applied faults in the internal state, the algorithm will succeed in finding the locations of all the faults with very high probability and hence help complete the attack. But this is not the case for Grain-128a. For Grain-128a, the location identification algorithm is expected to succeed with probability 0.81, and so if the adversary wants to complete the attack, she has to apply around $10 \cdot \frac{1}{0.81} \approx 12.3$ faults to succeed.
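The estimate for Grain-128a is a simple expectation: if each injected fault is usable with probability p, about 1/p injections are needed per usable fault. A minimal sketch of this arithmetic, with a hypothetical function name:

```python
def expected_injections(faults_needed, identification_probability):
    # each injection yields a usable fault with the given probability,
    # so on average 1 / probability injections are spent per usable fault
    return faults_needed / identification_probability
```

With 10 required faults and success probability 0.81 this evaluates to roughly 12.35, in line with the figure of about 12.3 given above; with probability 1 the count stays at the number of required faults.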
Identifying Multiple-Bit Faults
In [6], a preliminary study was made of the situation when a single fault injection affects the value of up to three consecutive locations in the LFSR. It gave rise to $4n - 5$ possible cases of faults, out of which n were due to single-bit faults and the other $3n - 5$ due to double or triple bit faults. The same fault identification routine is used to determine the fault location of faulty streams arising due to double or triple bit faults. In [6], it was shown that if the faults are restricted to the LFSR, then the location identification routine is able, with a very high probability (close to 1 for Grain v1, Grain-128 and Grain-128a), to identify that a faulty stream produced due to a double or triple bit fault could not have been produced due to a single-bit fault (this happens when $C_{3,f} = \emptyset$), and in all such cases the algorithm outputs a null message. In our experiments we have explored this situation with respect to faults in both the LFSR and the NFSR. After experimenting with randomly chosen single, double and triple bit faults for around $2^{20}$ Key-IV pairs, it was found that the probability that the algorithm successfully rejects a faulty stream produced due to a multiple-bit fault, i.e., $\Pr(C_{3,f} = \emptyset)$, is 0.94 for Grain v1, 0.99 for Grain-128 and 0.86 for Grain-128a.
IDENTIFYING FAULT LOCATIONS FOR INJECTIONS AT RANDOM TIME
Until now we have assumed that the adversary is able to inject all faults at the beginning of a fixed PRGA round. This is usually practical, as fault injections are usually synchronized with the power consumption curves of the device implementing the cryptosystem [15]. In this section we show that it is possible to attack Grain even if this requirement is relaxed. We will show that if the adversary injects a fault at a PRGA round t, where $t \in [0, t_{\max} - 1]$, then it is possible for her to determine, with high probability, the values of the fault location f and the injection time t. Before we get into further details, let us recap a few things and look at a definition that we will be using extensively.
The location identification algorithm presented so far (call it FLI($E_f$)) takes the difference vector $E_f = Z \oplus Z^f$, performs the seven steps (a)-(g) and returns one of the following: the fault location f, if the set $C_{3,f}$ has cardinality 1; the $\emptyset$ message, if the set $C_{3,f}$ has cardinality 0, which indicates that $Z^f$ was generated due to a multiple-bit fault; or a failure message, if the set $C_{3,f}$ has cardinality strictly greater than 1. The last case may arise for both single and multiple-bit faults.
Definition 1. Two distinct fault location and injection time pairs $(f, t)$ and $(f', t')$ are said to be equivalent if they produce the same faulty key-stream.
For example, in Grain v1, faulting NFSR location 70 at PRGA round 0 produces the same faulty key-stream as faulting NFSR location 69 at PRGA round 1. This is because the difference induced in location 70 at PRGA round 0 shifts to location 69 in PRGA round 1 anyway. Thus $(70, 0)$ and $(69, 1)$ are equivalent pairs. However, $(62, 0)$ and $(61, 1)$ are not equivalent, since 62 is a tap for the update function of the NFSR of Grain v1. A difference induced in PRGA round 0 in location 62 travels to both locations 61 and 79 in the next round, whereas a fault at location 61 in round 1 affects only this location and not location 79.
Suppose the faulty key-stream $Z_f$ has been produced by a fault injection at some LFSR or NFSR location $f$ at time $t$, where $0 \le t \le t_{\max} - 1$. To identify $(f, t)$, the adversary runs the routine FLI($E_f^i$) for all $i \in [0, t_{\max} - 1]$. As a result, the adversary could obtain the following.
3) The failure message for all values of $i$. In this event she rejects the key-stream. However, the probability of this outcome is quite low.
4) If she obtains $\emptyset$ for some value of $i$, she deduces a multiple-bit injection and rejects the key-stream.
5) If she obtains the outputs $f_1$ for $i = i_1$ and $f_2$ for $i = i_2$ such that $(f_1, i_1)$ and $(f_2, i_2)$ are not equivalent, then the attacker deduces that the algorithm has failed.
In experiments performed over around $2^{20}$ random Key-IV pairs, the probability of Case 5 occurring is only about 0.089 for Grain v1 if we take $t_{\max} = 10$. For Grain-128, taking $t_{\max} = 15$, the failure probability comes to 0.079. For higher values of $t_{\max}$ the failure probability becomes non-negligible.
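The notion of equivalent pairs in Definition 1 can be illustrated on a toy shift register. The sketch below is not Grain: the register length, the feedback taps and the output tap are hypothetical values chosen for brevity, and the update is linear so that the behaviour does not depend on the starting state. It only shows that a fault at a non-tap location $p$ in round 0 matches a fault at location $p - 1$ in round 1, while a fault at a tap location does not.

```python
N = 8                    # toy register length (hypothetical, not a Grain value)
FEEDBACK_TAPS = (0, 5)   # locations XORed to form the incoming bit
OUTPUT_TAPS = (1,)       # locations XORed to form each key-stream bit
STATE = [1, 0, 1, 1, 0, 0, 1, 0]  # arbitrary starting state of length N


def keystream(state, length, fault=None):
    """Clock the toy register `length` times and return the key-stream.

    `fault` is an optional (location, round) pair; the bit at that
    location is flipped at the start of that round.
    """
    s = list(state)
    out = []
    for t in range(length):
        if fault is not None and fault[1] == t:
            s[fault[0]] ^= 1
        z = 0
        for p in OUTPUT_TAPS:
            z ^= s[p]
        out.append(z)
        new = 0
        for p in FEEDBACK_TAPS:
            new ^= s[p]
        s = s[1:] + [new]  # location p moves to p - 1, new bit enters at N - 1
    return out


def equivalent(pair_a, pair_b, state=STATE, length=16):
    """Check Definition 1 on the toy register for one starting state."""
    return keystream(state, length, pair_a) == keystream(state, length, pair_b)


print(equivalent((7, 0), (6, 1)))  # True: location 7 is not a tap
print(equivalent((5, 0), (4, 1)))  # False: location 5 feeds the update function
print(equivalent((1, 0), (0, 1)))  # False: location 1 feeds the output
```

In the second case the fault at location 5 also flips the incoming bit at location 7, which is the toy analogue of the difference reaching both locations 61 and 79 in the Grain v1 example.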
This approach, however, fails for Grain-128a. Recall that every alternate key-stream bit in Grain-128a is used for the computation of the MAC and is therefore not directly available to the attacker. It is easy to see that the given approach fails whenever the injection time is odd.
EXPERIMENTAL RESULTS
In this section we present the experimental results in detail. After the fault location and injection time of a particular faulty key-stream vector have been identified using the Signature vectors, a system of equations is formulated using the steps outlined in Section 3, and the equations are then fed into a SAT solver. There are several issues to be considered.
The number of faults is the most significant figure that we minimize using the SAT solver. This implies that we also reduce the number of re-keyings of the cipher. We deduce the fault location and the injection time using the idea of the four Signature vectors. Note that in [6] faults were introduced only in the LFSR, whereas here we can handle faults introduced in the LFSR, the NFSR, or both. The number of faulty key-stream bits required to solve the system is also important. In our experiments, we have used $2n$ key-stream bits corresponding to each fault, i.e., $2 \cdot 80 = 160$ for Grain v1 and $2 \cdot 128 = 256$ for Grain-128 and Grain-128a. In fact, for Grain-128a, we use even fewer key-stream bits, as we obtain more equations per key-stream bit. We have solved the equations using the SAT solver Cryptominisat-2.9.5 [34] installed with SAGE 5.7 on Linux Ubuntu 2.6. The hardware platform is an HP Z800 workstation with a 3 GHz Intel(R) Xeon(R) CPU. We have considered three different cases: (i) the faults are introduced in the LFSR only, (ii) the faults are introduced in the NFSR only, and (iii) the faults are introduced in both the LFSR and the NFSR (here we assume that, in expectation, half of the faults are injected in the LFSR and the other half in the NFSR). The results are presented in Table 5. We report the time required for the SAT solver part only, as the time for identifying the location of the fault using Signature vectors is negligible. For each row, we consider a set of ten (10) experiments. As it is not easy to count the exact number of computational steps required in the SAT solver, we report the amount of time required in seconds.
If $R$ is the number of key-stream bits used to populate the equation bank for each fault, and $N$ is the total number of faults used, then the total key-stream requirement is $(N + 1)R$, as the number of fault-free key-stream bits is $R$ and the total number of faulty key-stream bits is $NR$. For example, in the case of Grain v1, we need 10 faults considering both the LFSR and the NFSR, and we need 160 key-stream bits for the fault-free case and for each fault. Thus the total number of key-stream bits used is $(10 + 1) \cdot 160 = 1{,}760$.
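The key-stream budget above is simple to tabulate. The helper below is a minimal sketch; the name `total_keystream_bits` is hypothetical, and the only figures used are those of the Grain v1 example in the text ($N = 10$, $R = 2n = 160$).

```python
def total_keystream_bits(num_faults, bits_per_stream):
    """Total key-stream needed: one fault-free stream plus one per fault."""
    return (num_faults + 1) * bits_per_stream


n = 80                # register length of Grain v1
R = 2 * n             # key-stream bits collected per stream
N = 10                # faults used for Grain v1 (LFSR and NFSR case)
print(total_keystream_bits(N, R))  # 1760
```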
Our method requires significantly fewer faults than previously reported for the Grain family in the literature (of the order of hundreds) [4], [6], [10], [28]. Several aspects of the experiments may be optimized. We note that with a very small amount of key-stream, the attack takes longer. The number of faults may be reduced further with more computational effort.
CONCLUSION
The differential fault attack (DFA) against the Grain family of stream ciphers has been a fairly well researched topic [4], [6], [10], [28] and has been studied under various fault models, some more restrictive and some more relaxed. In this work, we propose a DFA of the Grain family that requires the adversary to have the least control over fault injections, i.e., the same as in [6], but requires far fewer faults than [6]. Furthermore, the adversary need not restrict the fault injections to either the LFSR or the NFSR, a stipulation that has been imposed in all the previous fault attacks on the Grain family. For Grain v1 and Grain-128, the adversary need not even exercise precise control over the timing of fault injection.
The algorithm we propose first finds the location and injection time of a randomly applied bit fault (it rejects the faulty stream if it infers that it was produced by a multiple-bit fault) and then populates a bank of equations in the internal state variables of the cipher at the start of the PRGA. The algorithm then tries to solve the equations using the Cryptominisat-2.9.5 SAT solver [34]. For all three ciphers the solver is able to recover the entire internal state in a few minutes using equations generated by at most 10 random faults. This is, to the best of our knowledge, the best fault analysis that has been reported against the Grain family.
As we have pointed out, the number of faults may be reduced further with more computational effort. Dedicated hardware and parallel computation may be exploited in this direction. However, this is not in the scope of this work, as we are interested in the proof-of-concept that can be achieved in a few minutes through a simple implementation. Estimating the minimum number of faults given some high-end hardware is an important open question for future research.
