Abstract
RSA-CRT fault attacks have been an active research area since their discovery by Boneh, DeMillo and Lipton in 1997. We present alternative key-recovery attacks on RSA-CRT signatures: instead of targeting one of the subexponentiations in RSA-CRT, we inject faults into the public modulus before CRT interpolation, which makes a number of countermeasures against Boneh et al.'s attack ineffective. Our attacks are based on orthogonal lattice techniques and are very efficient in practice: depending on the fault model, between 5 and 45 faults suffice to recover the RSA factorization within a few seconds. Our simplest attack requires that the adversary knows the faulty moduli, but more sophisticated variants work even if the moduli are unknown, under reasonable fault models. All our attacks have been fully validated experimentally with fault-injection laser techniques.
Introduction

RSA-CRT signatures
RSA [26] is the most widely used signature scheme. To sign a message m, the signer first applies an encoding function μ to m and then computes the signature σ = μ(m)^d mod N. To verify the signature σ, the receiver checks that σ^e = μ(m) mod N. The Chinese Remainder Theorem (CRT) is often used to speed up signature generation by a factor of about 4. This is done by computing:
$$\sigma_p = \mu(m)^{d \bmod (p-1)} \bmod p \quad\text{and}\quad \sigma_q = \mu(m)^{d \bmod (q-1)} \bmod q$$
and deriving σ from (σ_p, σ_q) using the CRT.
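As an illustration, here is a minimal Python sketch of RSA-CRT signing. The toy primes, the public exponent and the padded message value are assumptions chosen for readability, and the function name `sign_crt` is hypothetical; the recombination uses the usual CRT coefficients.

```python
# Toy RSA-CRT signing; all parameters below are illustrative assumptions.
p, q, e = 61, 53, 17
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))       # private exponent (Python >= 3.8)

def sign_crt(mu):
    """Return the RSA signature of the padded message mu, computed via the CRT."""
    sigma_p = pow(mu, d % (p - 1), p)   # half-size exponentiation modulo p
    sigma_q = pow(mu, d % (q - 1), q)   # half-size exponentiation modulo q
    alpha = q * pow(q, -1, p)           # alpha = 1 mod p and 0 mod q
    beta = p * pow(p, -1, q)            # beta  = 0 mod p and 1 mod q
    return (sigma_p * alpha + sigma_q * beta) % N

mu = 65
sigma = sign_crt(mu)
```

The result agrees with the direct computation μ(m)^d mod N, and raising it to the power e modulo N gives back the padded message.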
Fault attacks on RSA-CRT signatures
Back in 1997, Boneh et al. [7] showed that RSA-CRT implementations are vulnerable to fault attacks. Assuming that the attacker can induce a fault when σ_q is computed while keeping the computation of σ_p correct, one gets
$$\sigma^e \equiv \mu(m) \pmod{p} \quad\text{and}\quad \sigma^e \not\equiv \mu(m) \pmod{q},$$
so that N can be factored by computing gcd(σ^e − μ(m) mod N, N) = p.
This attack applies to any deterministic padding function μ, such as the Full-Domain Hash [3], or to probabilistic signatures where the randomizer used to generate the signature is sent along with the signature, such as PFDH [14]. Only probabilistic signature schemes in which the randomness remains unknown to the attacker may be safe, though some particular cases have been attacked as well [13].
In 2005, Seifert [27] introduced a new type of RSA fault attack, based on inducing faults in the RSA public modulus. The initial attack [27] only made it possible to bypass RSA verification, but key-recovery attacks were later discovered by Brier et al. [9] and improved or extended in [4, 5, 6, 19]. These key-recovery attacks only apply to RSA without CRT, and they require significantly more faults than the attack by Boneh et al., at least on the order of 1,000 faulty signatures.
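The attack of Boneh et al. recalled at the beginning of this section can be illustrated with a short Python sketch. The toy parameters are assumptions, and the fault is simulated by simply adding 1 to σ_q; a real fault would produce an arbitrary wrong value.

```python
from math import gcd

# Toy parameters (assumptions chosen for illustration only).
p, q, e = 61, 53, 17
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))
mu = 65                                        # padded message

sigma_p = pow(mu, d % (p - 1), p)              # computed correctly
sigma_q = (pow(mu, d % (q - 1), q) + 1) % q    # simulated fault: wrong value modulo q
alpha, beta = q * pow(q, -1, p), p * pow(p, -1, q)
faulty_sigma = (sigma_p * alpha + sigma_q * beta) % N

# The faulty signature is still correct modulo p but not modulo q.
recovered = gcd((pow(faulty_sigma, e, N) - mu) % N, N)
```

A single faulty signature on a known padded message is enough here: `recovered` is the prime p.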
Our contribution
We present new key-recovery attacks on RSA-CRT signatures: instead of targeting one of the RSA-CRT subexponentiations, we inject faults into the public modulus, as in Seifert's attack. This makes typical countermeasures against Boneh et al.'s attack ineffective against the new attacks.
Our attacks are based on the orthogonal lattice techniques introduced by Nguyen and Stern [21] . They are very effective in practice: they disclose the RSA factorization within a few seconds using only between 5 and 45 faulty signatures. The exact running time and number of faulty signatures depend on the fault model.
For instance, in our simplest attack, the running time is a fraction of a second using only five faulty signatures, but the attacker is assumed to know the faulted moduli for the five different messages. However, our attack can be extended to the case where the attacker no longer knows the faulted moduli, using at most 45 faulty signatures, under the following two fault models: either the faulted moduli only differ from the public modulus on a single byte of unknown position and unknown value, or the faulted moduli may differ from the public modulus by many bytes, but the differences are restricted to the least significant bits, up to half of the modulus size.
All our attacks have been fully validated with physical experiments with laser shots on a RISC microcontroller.
Related work
Many countermeasures have been proposed to protect against Boneh et al.'s attack and its numerous generalizations, but they often focus on the exponentiation process. The previously mentioned fault attacks [4, 5, 6, 9, 19] on RSA using faulty moduli only apply to standard RSA without CRT, and they use non-lattice techniques. Our attack seems to be the first attack on RSA-CRT with faulted moduli.
It should be pointed out, however, that a number of protected RSA-CRT implementations also protect the CRT recombination. This is, for example, the case of [1, 8, 12, 16, 25, 30] .
More generally, as we observe in Sect. 6, using the technique known as Garner's formula for CRT recombination does thwart the attack introduced in this paper. Since this formula is often used in practice, typical implementations conforming to RSA standards like PKCS#1 and IEEE P1363 should in principle be immune to this attack.
Roadmap
In Sect. 2, we describe the basic attack where the faulty moduli are assumed to be known to the attacker. In Sect. 3, we extend the attack to realistic fault models in which the faulty moduli are no longer known to the attacker. In Sect. 4, we describe physical experiments with laser shots on a RISC microcontroller to validate the attack. As a side contribution, in Sect. 5, we describe a conceptually different, more elementary attack in a simpler model. Finally, in Sect. 6, we suggest possible countermeasures against these attacks.
The new attack
Overview
Consider again the generation of RSA-CRT signatures. To obtain the signature σ of a message m padded as μ(m), the signer computes the mod-p and mod-q parts:
$$\sigma_p = \mu(m)^d \bmod p, \qquad \sigma_q = \mu(m)^d \bmod q$$
and returns the signature:
$$\sigma = (\sigma_p \cdot \alpha + \sigma_q \cdot \beta) \bmod N \qquad (1)$$
where α, β are the pre-computed Chinese Remainder coefficients α = q · (q^{−1} mod p) and β = p · (p^{−1} mod q).
Assume that an adversary can obtain the correct signature σ, and also a signature σ′ of the same padded message μ(m) after corrupting the modulus N before the CRT step (1). In other words, the attacker gets σ as before but also σ′ defined as
$$\sigma' = (\sigma_p \cdot \alpha + \sigma_q \cdot \beta) \bmod N'$$
where N′ denotes the faulty modulus.
Suppose further, for the moment, that the adversary is able to recover the faulty modulus N′: we will see in Sect. 3 how this not-so-realistic hypothesis can be lifted in a more practical setting. Then, by applying the Chinese Remainder Theorem to σ and σ′ (assuming that N and N′ are coprime), the adversary can compute
$$v = (\sigma_p \cdot \alpha + \sigma_q \cdot \beta) \bmod N \cdot N'.$$
But if we denote the bit length of N by n, then N · N′ is a 2n-bit integer, whereas α, β are of length n and σ_p, σ_q of length n/2, so v is really a linear combination of α and β in Z:
$$v = \sigma_p \cdot \alpha + \sigma_q \cdot \beta.$$
That alone does not suffice to factor N, but several such pairs (σ, σ′) provide multiple linear combinations of the (unknown) integers α, β with relatively small coefficients. Then lattice reduction techniques allow us to recover the coefficients σ_p and σ_q, and hence obtain the factorization of N by GCDs. The following sections describe this process in detail.
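The recovery of the unreduced value v from one correct and one faulty signature can be sketched in Python as follows. The toy parameters, the particular fault (one flipped bit of N) and the helper name `crt_pair` are assumptions made for this illustration.

```python
# Toy parameters (assumptions); the attacker knows N, N_faulty and both signatures.
p, q, e = 61, 53, 17
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))
alpha, beta = q * pow(q, -1, p), p * pow(p, -1, q)

def crt_pair(r1, m1, r2, m2):
    """Return the unique x in [0, m1*m2) with x = r1 mod m1 and x = r2 mod m2."""
    t = ((r2 - r1) * pow(m1, -1, m2)) % m2
    return r1 + m1 * t

mu = 65
sigma_p, sigma_q = pow(mu, d % (p - 1), p), pow(mu, d % (q - 1), q)
v_true = sigma_p * alpha + sigma_q * beta    # unreduced value, hidden from the attacker

N_faulty = N ^ 0x04                          # hypothetical fault: one flipped bit of N
sigma = v_true % N                           # correct signature
sigma_faulty = v_true % N_faulty             # same value reduced modulo the faulty modulus

v = crt_pair(sigma, N, sigma_faulty, N_faulty)
```

Because v_true is smaller than N · N_faulty, the CRT returns it exactly, and it reduces to σ_p modulo p and to σ_q modulo q.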
Applying orthogonal lattice techniques
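We assume that the reader is familiar with cryptanalysis based on lattices (see [20, 23]), and in particular with orthogonal lattices [21]: for a lattice L ⊂ Z^ℓ, we write L^⊥ for the lattice of all vectors of Z^ℓ orthogonal to every vector of L. Let ℓ be the number of available pairs (σ, σ′). By the previous section, each pair yields an integer v_i = α x_i + β y_i, where x_i and y_i denote the corresponding values of σ_p and σ_q. In vector form:
$$\mathbf{v} = \alpha\,\mathbf{x} + \beta\,\mathbf{y} \qquad (2)$$
where x, y are unknown vectors with n/2-bit components and α, β are the (unknown) CRT coefficients relative to p and q. Lattice reduction can exploit such a hidden linear relationship as follows. Using standard techniques [21, 22], it is possible to compute a reduced basis {b_1, …, b_{ℓ−1}} of the lattice v^⊥ ⊂ Z^ℓ of vectors orthogonal to v in Z^ℓ. In particular we get
$$\alpha \langle \mathbf{b}_j, \mathbf{x} \rangle + \beta \langle \mathbf{b}_j, \mathbf{y} \rangle = 0 \qquad \text{for } j = 1, \ldots, \ell - 1.$$
The integer solutions (s, t) of α · s + β · t = 0 are the multiples of (β, −α)/g, where g = gcd(α, β) is heuristically expected to be small. For each j, there are thus two possibilities:
Case 1: ⟨b_j, x⟩ = ⟨b_j, y⟩ = 0, in which case b_j belongs to the lattice L = {x, y}^⊥ of vectors of Z^ℓ orthogonal to both x and y.
Case 2: ⟨b_j, x⟩ and ⟨b_j, y⟩ are both nonzero, with absolute values heuristically of order N. Since x and y have norm at most about √(ℓN), the Cauchy-Schwarz inequality then gives ‖b_j‖ ≳ √(N/ℓ).
Since the lattice L = {x, y}^⊥ is of rank ℓ − 2, Case 1 cannot hold for all ℓ − 1 linearly independent vectors b_j, so that the longest one b_{ℓ−1} should be in Case 2 and hence
$$\|\mathbf{b}_{\ell-1}\| \gtrsim \sqrt{N/\ell}.$$
On the other hand, the other vectors form a lattice of rank ℓ − 2 which can heuristically be expected to behave like a random lattice. The volume of this lattice is
$$V = \mathrm{vol}(\mathbb{Z}\mathbf{b}_1 \oplus \cdots \oplus \mathbb{Z}\mathbf{b}_{\ell-2}) \approx \frac{\mathrm{vol}(\mathbf{v}^\perp)}{\|\mathbf{b}_{\ell-1}\|} \approx \frac{\|\mathbf{v}\|}{\|\mathbf{b}_{\ell-1}\|}$$
and hence satisfies
$$V \lesssim \frac{\sqrt{\ell}\, N^{3/2}}{\sqrt{N/\ell}} = \ell N.$$
As a result, we should have
$$\|\mathbf{b}_j\| \approx \sqrt{\ell - 2} \cdot V^{1/(\ell-2)} \lesssim \sqrt{\ell} \cdot (\ell N)^{1/(\ell-2)} \qquad \text{for } j \le \ell - 2,$$
which is much smaller than √(N/ℓ) as soon as ℓ ≥ 5. The vectors b_1, …, b_{ℓ−2} should therefore all be in Case 1, so that they generate a sublattice L′ = Zb_1 ⊕ ⋯ ⊕ Zb_{ℓ−2} of full rank in L.
Taking orthogonal lattices, we get (L′)^⊥ ⊃ L^⊥ = Zx ⊕ Zy. Therefore, x and y belong to the orthogonal lattice (L′)^⊥ of L′. Let {x′, y′} be a reduced basis of that lattice. We can enumerate all the lattice vectors in (L′)^⊥ of length at most √(ℓN) as linear combinations of x′ and y′. The Gaussian heuristic suggests that there should be roughly
$$\frac{\pi \ell N}{\mathrm{vol}((L')^\perp)}$$
such vectors, so this is certainly feasible. For all those vectors z, we can compute gcd(v − z, N). We will thus quickly find gcd(v − x, N) among them, since x is a vector of length at most √(ℓN) in (L′)^⊥.
But by definition of v we have v = x mod p and v = y mod q, so gcd(v − x, N) = p, which reveals the factorization of N.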
Attack summary
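Assume that, for ℓ ≥ 5 padded messages μ(m_i), we know a correct signature σ_i and a signature σ′_i computed with a faulty modulus N′_i. Then, we can heuristically recover the factorization of N as follows:
1. For each i, compute the integer v_i by applying the Chinese Remainder Theorem to σ_i and σ′_i modulo N and N′_i, and form the vector v = (v_1, …, v_ℓ) ∈ Z^ℓ.
2. Using the LLL algorithm [17], compute a reduced basis {b_1, …, b_{ℓ−1}} of the lattice v^⊥ ⊂ Z^ℓ of vectors in Z^ℓ orthogonal to v. This is done by applying LLL to the lattice in Z^{1+ℓ} generated by the rows of the following matrix:
$$\begin{pmatrix} \kappa v_1 & 1 & & 0 \\ \vdots & & \ddots & \\ \kappa v_\ell & 0 & & 1 \end{pmatrix}$$
where κ is a suitably large constant, and removing the first component of each resulting vector [21].
3. The first ℓ − 2 vectors b_1, …, b_{ℓ−2} generate a lattice L′ ⊂ Z^ℓ of rank ℓ − 2. Compute an LLL-reduced basis {x′, y′} of the orthogonal lattice (L′)^⊥ to that lattice. Again, this is done by applying LLL to the lattice in Z^{ℓ−2+ℓ} generated by the rows of
$$\begin{pmatrix} \kappa b_{1,1} & \cdots & \kappa b_{\ell-2,1} & 1 & & 0 \\ \vdots & & \vdots & & \ddots & \\ \kappa b_{1,\ell} & \cdots & \kappa b_{\ell-2,\ell} & 0 & & 1 \end{pmatrix}$$
and keeping the last ℓ components of each resulting vector.
4. Enumerate the vectors z = a x′ + b y′ ∈ (L′)^⊥ of length at most √(ℓN), and for each such vector z, compute gcd(v − z, N) using all components, and return any nontrivial factor of N.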
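The steps above can be sketched in pure Python on a toy instance. This is only an illustration under several assumptions: the primes are tiny, the values x_i and y_i are random stand-ins for σ_p and σ_q, the vector v is computed directly instead of through the CRT step, the LLL routine is a slow textbook version with exact rationals, and the final enumeration uses a small coefficient box (hypothetical parameter `radius`) rather than a length bound. All function names are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd
from random import Random

def lll(rows, delta=Fraction(3, 4)):
    """Textbook LLL on linearly independent integer rows (exact but slow)."""
    b = [list(r) for r in rows]
    n = len(b)

    def dot(u, w):
        return sum(s * t for s, t in zip(u, w))

    def gram_schmidt():
        ortho, norms = [], []
        mu = [[Fraction(0)] * n for _ in range(n)]
        for i in range(n):
            w = [Fraction(c) for c in b[i]]
            for j in range(i):
                mu[i][j] = dot(b[i], ortho[j]) / norms[j]
                w = [wc - mu[i][j] * oc for wc, oc in zip(w, ortho[j])]
            ortho.append(w)
            norms.append(dot(w, w))
        return norms, mu

    norms, mu = gram_schmidt()
    k = 1
    while k < n:
        for j in range(k - 1, -1, -1):             # size-reduce b[k] against b[j]
            c = round(mu[k][j])
            if c:
                b[k] = [s - c * t for s, t in zip(b[k], b[j])]
                for i in range(j):
                    mu[k][i] -= c * mu[j][i]
                mu[k][j] -= c
        if norms[k] >= (delta - mu[k][k - 1] ** 2) * norms[k - 1]:
            k += 1                                  # Lovász condition holds
        else:
            b[k], b[k - 1] = b[k - 1], b[k]         # swap and step back
            norms, mu = gram_schmidt()
            k = max(k - 1, 1)
    return b

def orthogonal_lattice(vectors, dim, kappa):
    """Reduced basis of the vectors of Z^dim orthogonal to all the given vectors."""
    rows = [[kappa * w[i] for w in vectors] + [int(i == j) for j in range(dim)]
            for i in range(dim)]
    reduced = lll(rows)[:dim - len(vectors)]
    return [r[len(vectors):] for r in reduced]

def find_factor(v, N, x1, y1, radius=8):
    """Try small combinations z = a*x1 + c*y1 and test gcd(v - z, N)."""
    for a in range(-radius, radius + 1):
        for c in range(-radius, radius + 1):
            z = [a * s + c * t for s, t in zip(x1, y1)]
            g = reduce(gcd, (vi - zi for vi, zi in zip(v, z)), N)
            if 1 < g < N:
                return g
    return None

# Simulated data: toy primes and random stand-ins for the sigma_p, sigma_q values.
p, q, ell = 524287, 1000003, 5
N = p * q
alpha, beta = q * pow(q, -1, p), p * pow(p, -1, q)
rng = Random(2011)
x = [rng.randrange(1, p) for _ in range(ell)]
y = [rng.randrange(1, q) for _ in range(ell)]
v = [alpha * s + beta * t for s, t in zip(x, y)]      # what the CRT step would return

kappa = max(v) << ell                                  # a "suitably large" constant
b = orthogonal_lattice([v], ell, kappa)                # reduced basis of the lattice v-perp
x1, y1 = orthogonal_lattice(b[:ell - 2], ell, kappa)   # reduced basis of (L')-perp
factor = find_factor(v, N, x1, y1)
```

The choice κ = 2^ℓ · max v_i is one way to make the constant large enough for the LLL length bound to force a zero first component; the attack itself remains heuristic, and on such small moduli the margin between the two cases is narrower than for real key sizes.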
Simulation results
Since the attack is heuristic, it is important to evaluate its experimental performance. To do so, we have implemented a simulation of the attack in SAGE [29]: for a given modulus N, we compute the vector v corresponding to a series of signatures on random messages and apply the lattice attack, attempting to recover a factor of N. Table 1 shows the measured success probabilities for various values of ℓ and modulus sizes. It confirms the heuristic prediction that 5 faulty signatures should always suffice to factor N. It turns out that even 4 signatures are enough in almost half the cases.
Experimental running times are given in Table 2. The whole attack takes a few dozen milliseconds on a standard PC. The number of vectors to test as part of the final exhaustive search step is about 20 in practice, which is done very quickly.
Tables 1 and 2: each parameter set was tested with random faults on 500 random moduli of the given size. Table 2: timings for a SAGE implementation, on a single 2.4 GHz Core2 CPU core.
Extending the attack to unknown faulty moduli
As mentioned in Sect. 2.1, in its basic form, the attack requires the recovery of the faulty moduli N′_i in addition to the corresponding faulty signatures σ′_i. This is not a very realistic assumption, since a typical implementation does not output the public modulus along with each signature. To work around this limitation, we would like to reconstruct the vector v of integer values needed to run the attack from signatures alone, without the knowledge of the faulty moduli, possibly at the cost of requiring a few more faulty signatures.
This can actually be achieved in various ways depending on the precise form of the faults inflicted to the modulus. We propose solutions for the following two realistic fault models:
1. The faulty moduli N′_i differ from N on a single (unknown) byte. This is known to be possible using power glitches or laser shots.
2. The differences between the faulty moduli N′_i and N are located on the least significant half: the errors on the least significant bits can be up to half of the modulus size. It is easy to obtain such faults with a laser or a cold boot attack.
Single byte faults
In this model, the attacker is able to obtain a certain number ℓ ≥ 5 of pairs (σ_i, σ′_i) where σ_i = αx_i + βy_i mod N is a valid signature and σ′_i = αx_i + βy_i mod N′_i is the same signature computed with a faulty modulus. The faulty moduli N′_i are not known, but they only differ from N on a single byte whose position and value are unknown. This type of fault can, for example, occur when attacking the transfer of the modulus to memory on a smart card with an 8-bit processor or when using a laser attack with a sufficiently focused beam.
For a 1024-bit modulus N, for example, there are 128 × 255 ≈ 2^15 possible faulty moduli. It can thus seem like a reasonable approach to try and run the attack with all possible faults. However, since this should be done with 5 signatures, this results in a search space of size ≈ (2^15)^5 = 2^75, which is prohibitive. This kind of exhaustive search can be made practical, though, if we take into account the fact that the CRT value v_i = αx_i + βy_i, obtained by applying the Chinese Remainder Theorem to σ_i and σ′_i modulo N and N′_i, is only about 3n/2 bits long, which is much smaller than N · N′_i. Now, for a given pair (σ_i, σ′_i), there are only very few possible target moduli N*_i differing from N on a single byte such that the corresponding CRT value v*_i, computed from σ_i and σ′_i modulo N and N*_i, is that small: often only one or two and almost never more than 20. We only need to run the attack with those target v*_i's until we find a factor. Experimentally, for a 1024-bit modulus, the average base 2 logarithm of the number of possible v*_i's is about 2.5, so if an attacker has 5 pairs (σ_i, σ′_i) in this model, they can expect to try all vectors v in a search space of about 12.5 bits, i.e. run the attack a few thousand times, for a total running time of under 2 min. This is already quite practical.
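The candidate search described above can be sketched in Python on a toy modulus. The primes, the stand-in values x and y, the simulated fault and the size threshold `bound` (taken here as 2^(3n/2 + 2)) are assumptions made for this illustration, and the function names are hypothetical.

```python
from math import gcd

# Toy parameters (assumptions); p and q are unknown to the attacker.
p, q = 524287, 1000003
N = p * q
n_bits = N.bit_length()
alpha, beta = q * pow(q, -1, p), p * pow(p, -1, q)

def crt_pair(r1, m1, r2, m2):
    """Return the unique x in [0, m1*m2) with x = r1 mod m1 and x = r2 mod m2."""
    t = ((r2 - r1) * pow(m1, -1, m2)) % m2
    return r1 + m1 * t

def candidates(sigma, sigma_faulty):
    """List (N*, v*) for single-byte modifications N* of N whose CRT value is small."""
    bound = 1 << (3 * n_bits // 2 + 2)        # assumed threshold, about 3n/2 bits
    found = []
    for pos in range((n_bits + 7) // 8):      # byte position of the fault
        original = (N >> (8 * pos)) & 0xFF
        for value in range(256):              # new value of that byte
            if value == original:
                continue
            N_star = N ^ ((original ^ value) << (8 * pos))
            if N_star <= sigma_faulty or gcd(N, N_star) != 1:
                continue                      # impossible or unusable candidate
            v_star = crt_pair(sigma, N, sigma_faulty, N_star)
            if v_star < bound:
                found.append((N_star, v_star))
    return found

# Simulate one signature pair with a fault on the second least significant byte of N.
x, y = 123456, 654321                         # stand-ins for sigma_p and sigma_q
v = alpha * x + beta * y
N_faulty = N ^ (0x5A << 8)
result = candidates(v % N, v % N_faulty)
```

The true faulty modulus and the true CRT value always pass the size filter; the few other survivors are the extra candidates that the exhaustive search over vectors v has to go through.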
If more pairs are available, the attacker can keep the 5 pairs for which the number of possible v*_i's is the smallest. This reduces the search space accordingly. In Table 3, we show how the exhaustive search space size and the expected running time evolve with the number of signatures in a typical example.
Faults on many least significant bits
In this model, the attacker is able to obtain ℓ = 5 signature families of the form (σ_i, σ′_{i,1}, …, σ′_{i,k}), where the σ_i's are correct signatures:
$$\sigma_i = \alpha x_i + \beta y_i \bmod N$$
and the σ′_{i,j}'s are faulty signatures of the form
$$\sigma'_{i,j} = \alpha x_i + \beta y_i \bmod N'_{i,j}, \qquad 1 \le j \le k.$$
In other words, for each one of the ℓ different messages, the attacker learns the reduction of the CRT value v_i = αx_i + βy_i modulo N, as well as modulo k different unknown faulty moduli N′_{i,j}. Additionally, it is assumed that all N′_{i,j} differ from N only on the least significant bits, but the number of distinct bits can be as large as half of the modulus size: we assume that |N − N′_{i,j}| < N^δ for a certain constant δ < 1/2. This is a reasonable fault model for a laser attack: it suffices to target a laser beam on the least significant bits of N to produce this type of faults.
Table 4 note: tested with random 1024-bit moduli. In the simulation, errors ε_j are modeled as uniformly random signed integers of the given size, and 10,000 of them were generated for each parameter set.
To run the attack successfully, the attacker needs to recover the CRT values v i . This can be done with high probability when the number of available faults k for a given message is large enough. The simplest approach is based on a GCD computation.
Indeed, fix an index i ∈ {1, …, ℓ}, and write N′_{i,j} = N + ε_j, v_i = u, σ_i = u_0 and σ′_{i,j} = u_j. The attacker knows the u_j's and wants to recover u. Now, observe that there are integers t_j such that u satisfies u = u_0 + t_0 · N and u = u_j + t_j · (N + ε_j). In particular, for j = 1, …, k we can write:
$$u_0 - u_j = (t_j - t_0) \cdot N + t_j \cdot \varepsilon_j \qquad (3)$$
This implies that u_0 − u_j ≡ t_j · ε_j (mod N). However, we have |t_j · ε_j| < N^{1/2+δ} ≪ N, so that the congruence is really an equality in Z. In view of (3), this implies that all t_j's are in fact equal and hence
$$u_0 - u_j = t_0 \cdot \varepsilon_j \qquad \text{for } j = 1, \ldots, k.$$
If the errors ε_j on the modulus are co-prime, which we expect to happen with probability ≈ 1/ζ(k), we can then deduce t_0 as the GCD of all values u_0 − u_j, and this gives
$$u = u_0 + t_0 \cdot N.$$
As seen in Table 4 , the success probability is in practice very close to 1/ζ (k) regardless of the size of errors.
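The GCD-based recovery of a CRT value can be sketched in Python as follows. The toy primes, the stand-in values x and y and the specific error values in `eps` (chosen small and coprime) are assumptions made for this illustration.

```python
from functools import reduce
from math import gcd

# Toy parameters (assumptions); p and q are unknown to the attacker.
p, q = 2147483647, 4294967291
N = p * q
alpha, beta = q * pow(q, -1, p), p * pow(p, -1, q)

x, y = 123456789, 987654321               # stand-ins for sigma_p < p and sigma_q < q
u = alpha * x + beta * y                  # unreduced CRT value the attacker wants

eps = [1021, -1024, 999, 2047]            # hypothetical small, coprime modulus errors
u0 = u % N                                # correct signature
faulty = [u % (N + e_j) for e_j in eps]   # signatures under the unknown faulty moduli

# Each difference u0 - u_j equals t0 * eps_j, so their GCD is t0 when the eps_j are coprime.
t0 = reduce(gcd, (abs(u0 - u_j) for u_j in faulty))
u_recovered = u0 + t0 * N
```

Only the signatures are used: neither the faulty moduli nor the errors ε_j enter the recovery step.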
It is probably possible to further improve the success probability by trying to remove small factors from the computed GCD g = gcd(u_0 − u_1, …, u_0 − u_k) to find t_0 when g > √N, but we find that the number of required faults is already reasonable without this computational refinement.
Indeed, recall that ℓ = 5 CRT values are required to run the attack. If k faults are obtained for each of the ℓ messages, the probability that these CRT values can be successfully recovered with this GCD approach is ζ(k)^{−ℓ}. This is greater than 95% for k = 7, and 99% for k = 9.
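These probabilities can be checked numerically with a few lines of Python. The partial-sum approximation of ζ and the function names are choices made for this sketch.

```python
def zeta(k, terms=100_000):
    """Approximate the Riemann zeta function at an integer k >= 3 by a partial sum."""
    return sum(n ** -k for n in range(1, terms + 1))

def success_probability(k, ell=5):
    """Heuristic probability that all ell CRT values are recovered from k faults each."""
    return zeta(k) ** -ell

prob_7 = success_probability(7)
prob_9 = success_probability(9)
```

With ℓ = 5, the values for k = 7 and k = 9 come out just above 0.95 and 0.99 respectively.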
We can also mention an alternate, lattice-based approach to recovering the CRT value u. The relation between the different quantities above can be written in vector form as u_0 · 1 = u + t_0 · e, where 1 = (1, …, 1), u now denotes the vector (u_1, …, u_k) and e = (ε_1, …, ε_k).
Then, since u_0 ≈ N is much larger than ‖t_0 · e‖ ≈ N^{1/2+δ}, short vectors orthogonal to u will be orthogonal to both 1 and e. More precisely, we can heuristically expect that when k is large enough (k ≳ 2/(1 − 2δ)), the first k − 2 vectors of a reduced basis of u^⊥ will be orthogonal to 1 and e.
Taking orthogonal lattices again, we can thus obtain a reduced basis {x, y} of a two-dimensional lattice containing 1 and e (and of course u). Since 1 is really short, we always find that x = 1 in practice. Then, it happens quite often that y can be written as λ1 ± e, in which case t_0 is readily recovered as the absolute value of the second coordinate of u in the basis {x, y}.
However, this fails when Z1 ⊕ Ze is a proper sublattice of Zx ⊕ Zy = Z^k ∩ (Q1 ⊕ Qe), namely, when there is some integer d > 1 such that all errors ε_j are congruent mod d. Thus, we expect the success probability of this alternate approach to be 1/ζ(k − 1), which is slightly less than with the GCD approach.
Practical experiments
Practical experiments validating the new attack were conducted on an 8-bit 0.35 µm RISC microcontroller with no countermeasures. Since the microprocessor had no arithmetic coprocessor, the values σ_p and σ_q were pre-computed by an external program for each fault injection experiment and fed into the attacked device. The target combined σ_p and σ_q using multiplications and additions (using Formula 1) and performed the final modular reduction.
We conducted several practical experiments corresponding to three different scenarios, roughly corresponding to the fault models considered in Sects. 2.1, 3.1 and 3.2, respectively. Let us describe these experiments in order. A description of the physical setting (common to the experiments reported in [18] ) follows in Sect. 4.4.
First scenario: known modulus
In this case, we considered 5 messages for a random 1024-bit RSA modulus N. For each message m_i, we obtained a correct signature σ_i, as well as a faulty-modulus signature σ′_i where the faulty modulus N′_i was also read back from the microcontroller.
Therefore, we were exactly in the setting described in Sect. 2.1, and could apply the algorithm from Sect. 2.3 directly: apply the Chinese Remainder Theorem to construct the vector v of CRT values and run the lattice-based attack to recover a factor of N .
The implementation of the attack used the same SAGE code as the simulation from Sect. 2.4. In our experimental case, the ball of radius √(ℓN) contained only about 10 vectors of the double orthogonal lattice, and the whole attack revealed a factor of N in less than 20 ms.
Second scenario: unknown single byte fault
In this case, we tried to replicate a setting similar to the one considered in Sect. 3.1. We considered 20 messages and a random 1024-bit RSA modulus N. For each message m_i, we obtained a correct signature σ_i, as well as a faulty-modulus signature σ′_i with undisclosed faulty modulus N′_i generated by targeting a single byte of N with the laser.
We had to eliminate some signatures, however, because in some cases, errors on the modulus turned out to exceed 8 bits. After discarding those, we had 12 pairs (σ_i, σ′_i) left to carry out the approach described in Sect. 3.1.
The first step in this approach is to find, for each i, the CRT values v*_i (computed from σ_i and σ′_i with candidate moduli N*_i differing from N only on one byte) that are small enough to be correct candidate CRT values. Unlike the setting of Sect. 3.1, we could not assume that bit-differences were aligned on byte boundaries: we had to test a whole 1016 × 255 candidate moduli N*_i for each i. Therefore, this search step was a bit costly, taking a total of 11 min and 13 s. Additionally, due to the higher number of candidate moduli, the number of candidate CRT values v*_i was also somewhat larger than in Sect. 3.1, namely 7, 17, 3, 9, 15, 5, 14, 44, 44, 17, 10, 55 for our 12 pairs, respectively. Keeping only the 5 indices with the smallest number of candidates, we obtained 3 × 5 × 7 × 9 × 10 = 9450 possible CRT value vectors v*.
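The size of this search space follows directly from the candidate counts listed above, as the following short Python computation shows (variable names are illustrative):

```python
from math import log2, prod

# Number of candidate CRT values for each of the 12 remaining signature pairs.
counts = [7, 17, 3, 9, 15, 5, 14, 44, 44, 17, 10, 55]

best_five = sorted(counts)[:5]        # keep the 5 pairs with the fewest candidates
search_space = prod(best_five)        # number of candidate vectors to try
search_bits = log2(search_space)      # same quantity on a base 2 logarithmic scale
```

The search space is thus a little over 13 bits, in line with the 12.5-bit average estimate of Sect. 3.1.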
We then ran the lattice-based attack on each of these vectors in order until a factor of N was found. The factor was found at iteration number 2120, after a total computation time of 43 s.
Third scenario: unknown least significant bytes faults
In this case, we considered 10 messages for a random 1024-bit N. For each message m_i, we obtained a correct signature σ_i, as well as 10 faulty-modulus signatures σ′_{i,j} with undisclosed faulty moduli N′_{i,j}. The laser beam targeted the lower order bytes of N but with a large aperture, generating multiple faults stretching over as much as 448 modulus bits.
In practice, we only used the data (σ_i, σ′_{i,1}, …, σ′_{i,10}) for the first 5 messages, discarding the rest. Then, we reconstructed the CRT values v_i using the GCD technique of Sect. 3.2:
$$v_i = \sigma_i + N \cdot \gcd(\sigma_i - \sigma'_{i,1}, \ldots, \sigma_i - \sigma'_{i,10})$$
and applied the lattice-based attack on the resulting vector v. This revealed a factor of N in 16 ms.
We also tried the same attack using fewer of the σ′_{i,j}'s and found that it still worked when taking only 4 of those values in the computation of v_i:
$$v_i = \sigma_i + N \cdot \gcd(\sigma_i - \sigma'_{i,1}, \ldots, \sigma_i - \sigma'_{i,4})$$
but failed if we took 3 instead. In view of the fact that 1/ζ(3)^5 ≈ 0.40 and 1/ζ(4)^5 ≈ 0.67, this is quite in line with expectations.
Details on laser fault injection
Laser (Light Amplification by Stimulated Emission of Radiation) is a stimulated-emission electromagnetic radiation in the visible or the invisible domain. Laser light is monochromatic, unidirectional, coherent and artificial (i.e. laser does not spontaneously exist in nature). Laser light can be generated as a beam of very small diameter (a few µm). The beam can pass through various material obstacles before impacting a target during a very short duration.
Laser impacts on electronic circuits are known to alter functioning. Current chip manufacturing technologies are in the nanometers range. This, and the brief and precise reaction time of laser, makes this technique particularly suitable for fault injection.
Photoelectric effects of laser on silicon
Static Random Access Memory (SRAM) laser exposure is known to cause bit-flips [2, 11, 15, 28] , a phenomenon called Single Event Upset (SEU). By tuning the beam energy level below a destructive threshold, the target will not suffer any permanent damage.
A conventional one-bit SRAM cell (Fig. 1) consists of two cross-coupled inverters. Every cell has two additional transistors controlling content access during write and read. As every inverter is made of two transistors, an SRAM cell contains six MOS transistors.
In each cell, the states of four transistors encode the stored value. By design, the cell admits only two stable states: a "0" or a "1". In each stable state, two transistors are at an ON state and two others are OFF.
If a laser beam hits the drain/bulk reverse-biased PN junction of a blocked transistor, the energy of the beam may create electron-hole pairs as the beam passes through the silicon. The charge carriers induced in the collection volume of the drain-substrate junction of the blocked transistor are collected and create a transient current that inverts the output voltage logically. This voltage inversion is in turn applied to the second inverter that switches to its opposite state: all in all, a bit flip happens [2, 15].
From the opponent's perspective, an additional advantage of laser fault injection is reproducibility. Identical faults can be repeated by carefully tuning the parameters of the laser and the operating conditions of the target.
Different parameters in a fault attack by laser
In a laser attack, the opponent usually controls beam diameter, wavelength, amount of emitted energy, impact coordinates (attacked circuit part) and exposure duration. Sometimes, the opponent may also control the timing of the impact (i.e. its synchrony with a given clock cycle of the target), as well as the target's clock frequency, V_cc and temperature. Finally, laser attacks may target either the front side or back side of the chip.
However, the front and back sides have different characteristics when exposed to a laser beam.
Front side attacks are particularly suited to green wavelength (∼532 nm). The visibility of chip components makes positioning very easy in comparison to backside attacks. But because of the metallic interconnects' reflective effect, it is difficult to target a component with enough accuracy. In addition, progress in manufacturing technologies results in both a proliferation of metal interconnects and much smaller chips. All in all, it becomes increasingly difficult to hit a target area.
Backside attacks are more successful at the infrared wavelength (∼1064 nm) as the laser needs to deeply enter the silicon. Positioning is more difficult for lack of visibility. Nevertheless, backside attacks make it possible to circumvent the reflective problem of metallic surfaces.
Practical CRT fault injection
After decapsulating the chip and mapping its components, we selected a large target area, given our knowledge of the implementation. Using automated search on the front side of the chip, we modified the coordinates of the impact, the beam parameters and timing until a reproducible fault area was obtained.
The target is an 8-bit 0.35 µm 16 MHz RISC microcontroller with an integrated 4 kB SRAM and no countermeasures. The device runs SOSSE (Simple Operating System for Smartcard Education [10]) to which we added some commands, most notably for feeding in the values {N, p, q, σ_p, σ_q, p^{−1} mod q, q^{−1} mod p}. Upon reception, all these parameters are stored in SRAM. The laser, shown in Figs. 6 and 7, is equipped with a YAG laser emitter in three different wavelengths: green, infrared and ultraviolet.
The diameter of the spot can be set between 0 and 2,500 µm. As the beam passes through a lens, it gets reduced by the lens' zoom factor and loses a big part of its energy. Our experiments were conducted with a 20× Mitutoyo lens, a green (532 nm) beam of ∅4 µm and 15 pJ per shot at the laser source emitter (before passing through the lens). The circuit is installed on a programmable Prior Scientific X-Y positioning table (see footnote 3). The X-Y table, card reader, laser and an FPGA trigger board were connected via RS-232 to a control PC. The FPGA trigger board receives an activation signal from the reader and sends a trigger signal to the laser after a delay defined by the control PC.
Experiments were conducted in ambient temperature and at V cc = 5 V. These parameters are within the normal operating conditions of the device: 2.7 V ≤ V cc ≤ 5.5 V.
The chip was decapsulated by chemical etching using a Nisene JetEtch automated acid decapsulator. The decapsulator can be programmed for the chemical opening of different chip types using different ratios of nitric acid (HNO_3) and sulfuric acid (H_2SO_4), at a desired temperature and during a specified time. For opening our chip, we used only nitric acid at 80 °C for 40 s. The etched chip (Fig. 2) successfully passed functional tests before and during fault injection.
As it is very difficult to target the ALU (Arithmetic Logic Unit) of the chip during a very specific calculation step, we decided to hit the SRAM to corrupt data rather than modify calculations. Finding the SRAM area containing each data element and properly tuning the parameters of the laser is very time consuming. The number of faults in the read back data, their position and their contents indicate which element has been hit.
Figure 5 compares a 1 µm laser spot and SRAM cells in different technology sizes. As technology advances, transistor density per µm² grows. With several transistors packed into a 1 µm area, single-bit fault injection will require much more precise equipment and is likely to become infeasible using cheap lasers.
Figure 4 shows how we explored the SRAM space of the target. The method consisted in searching the precise storage area of the modulus N, shooting into it and reading data back. Figure 4 is just a schematic model of the real SRAM (Fig. 3) to describe our technique and does not correspond to real address allocation. We could successfully inject multiple byte faults into selected parts of N and iterate the process for different moduli and signatures. This sufficed to implement our scenario.
Footnote 3: Motorized stepper stage for upright microscopes with 0.1 µm steps.
Using dichotomy in the absence of padding
As a side contribution, we describe a different, more elementary modulus fault attack in a restricted setting.
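Consider again the setting of Sect. 2.1, in which an adversary is able to obtain both a correct signature σ on a message m, and a signature σ′ on the same message m computed with a faulty modulus, allowing him to deduce the non-reduced value v = σ_p · α + σ_q · β ∈ Z. Since σ_p = σ mod p and σ_q = σ mod q, we can write
$$v = \left(\sigma - p \left\lfloor \frac{\sigma}{p} \right\rfloor\right) \cdot \alpha + \left(\sigma - q \left\lfloor \frac{\sigma}{q} \right\rfloor\right) \cdot \beta.$$
Moreover, observe that α + β = N + 1 (as is easily seen by reducing α + β modulo p and q). Therefore, we have
$$v = \sigma \cdot (N + 1) - N \cdot \left( \left\lfloor \frac{\sigma}{p} \right\rfloor \cdot (q^{-1} \bmod p) + \left\lfloor \frac{\sigma}{q} \right\rfloor \cdot (p^{-1} \bmod q) \right).$$
Hence, if we let ω = (σ · (N + 1) − v)/N, we get
$$\omega = \left\lfloor \frac{\sigma}{p} \right\rfloor \cdot (q^{-1} \bmod p) + \left\lfloor \frac{\sigma}{q} \right\rfloor \cdot (p^{-1} \bmod q) \qquad (4)$$
and this value ω is an integer since v ≡ σ (mod N). Now assume further that the adversary can request signatures on messages m such that σ is small. This is the case, for example, when signatures are computed without padding and the physical device under consideration will answer arbitrary signature queries: then, the adversary can simply request signatures on messages of the form σ^e for small values σ of his choice.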
In such a setting, the adversary can pick a σ close to N^{1/2}, carry out the fault attack and compute the integer ω. By (4), he gets ω = 0 if σ < min(p, q) and ω > 0 otherwise. Repeating this process several times, the smallest prime factor of N can be recovered by dichotomy.
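The dichotomy can be sketched in Python as a binary search on σ. The toy primes are assumptions, the faulted device is simulated by the hypothetical function `omega`, and the search interval is taken as [1, N − 1] for simplicity instead of starting near N^{1/2}.

```python
# Toy parameters (assumptions); p and q are unknown to the attacker.
p, q = 2147483647, 4294967291
N = p * q
alpha, beta = q * pow(q, -1, p), p * pow(p, -1, q)

def omega(sigma):
    """Simulate one faulted query on the unpadded message sigma^e and return omega."""
    v = (sigma % p) * alpha + (sigma % q) * beta    # unreduced value leaked by the fault
    return (sigma * (N + 1) - v) // N               # exact division, as v = sigma mod N

# Invariant: omega(lo) == 0 and omega(hi) > 0, so lo < min(p, q) <= hi.
lo, hi = 1, N - 1
while hi - lo > 1:
    mid = (lo + hi) // 2
    if omega(mid) == 0:
        lo = mid
    else:
        hi = mid
smallest_factor = hi
```

Each iteration costs one pair of signature queries, so the number of faults is about the bit length of the search interval.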
Countermeasures and further research
Probabilistic and stateful signature schemes are usually secure against the attack of Sect. 2, since they make it difficult to obtain two signatures on the same padded message. However, all deterministic schemes are typically vulnerable, including those in which the attacker doesn't have full access to the signed message, provided that the target device can be forced to compute the same signature twice.
A natural countermeasure is to use a CRT interpolation formula that does not require N, such as Garner's formula, computed as follows:
$$t \leftarrow \sigma_p - \sigma_q; \quad \text{if } t < 0 \text{ then } t \leftarrow t + p; \quad \sigma \leftarrow \sigma_q + (t \cdot \gamma \bmod p) \cdot q$$
where we assume that p > q, and γ is the usual CRT coefficient q^{−1} mod p. Note that the evaluation of σ does not require a modular reduction because σ = σ_q + (t · γ mod p) · q ≤ q − 1 + (p − 1)q < N.
Besides the obvious countermeasure consisting in checking signatures before release, it would be interesting to devise specific countermeasures for protecting Formula (1) (or Garner's formula) taking into account the possible corruption of all data involved.
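Garner's recombination can be sketched in Python as follows. The toy parameters are assumptions and the function name `garner_sign` is hypothetical; the point of the sketch is that the modulus N is never used during recombination.

```python
# Toy parameters (assumptions), chosen with p > q.
p, q, e = 61, 53, 17
N = p * q                        # only used for key generation and for checking the result
d = pow(e, -1, (p - 1) * (q - 1))
gamma = pow(q, -1, p)            # precomputed CRT coefficient q^{-1} mod p

def garner_sign(mu):
    """RSA-CRT signature of mu recombined with Garner's formula (no reduction mod N)."""
    sigma_p = pow(mu, d % (p - 1), p)
    sigma_q = pow(mu, d % (q - 1), q)
    t = sigma_p - sigma_q
    if t < 0:
        t += p                   # one correction suffices because sigma_q < q < p
    return sigma_q + (t * gamma % p) * q

sigma = garner_sign(65)
```

Since the result is already below N, a corrupted copy of N has no effect on this recombination step.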
Finally, in a number of special cases and particular settings (e.g. Sect. 5) other fault attacks on the CRT recombination phase can be devised. A thorough analysis of such scenarios is also an interesting research direction.
