826 research outputs found

    Secure and Efficient RNS Approach for Elliptic Curve Cryptography

    Get PDF
    Scalar multiplication, the main operation in elliptic curve cryptographic protocols, is vulnerable to side-channel (SCA) and fault injection (FA) attacks. An efficient countermeasure for scalar multiplication can be provided by using alternative number systems like the Residue Number System (RNS). In RNS, a number is represented as a set of smaller numbers, where each one is the result of the modular reduction with a given moduli basis. Under certain requirements, a number can be uniquely transformed from the integers to the RNS domain (and vice versa) and all arithmetic operations can be performed in RNS. This representation provides an inherent SCA and FA resistance to many attacks and can be further enhanced by RNS arithmetic manipulation or more traditional algorithmic countermeasures. In this paper, extending our previous work, we explore the potentials of RNS as an SCA and FA countermeasure and provide an description of RNS based SCA and FA resistance means. We propose a secure and efficient Montgomery Power Ladder based scalar multiplication algorithm on RNS and discuss its SCAFA resistance. The proposed algorithm is implemented on an ARM Cortex A7 processor and its SCA-FA resistance is evaluated by collecting preliminary leakage trace results that validate our initial assumptions

    Design and implementation of robust embedded processor for cryptographic applications

    Get PDF
    Practical implementations of cryptographic algorithms are vulnerable to side-channel analysis and fault attacks. Thus, some masking and fault detection algorithms must be incorporated into these implementations. These additions further increase the complexity of the cryptographic devices which already need to perform computationally-intensive operations. Therefore, the general-purpose processors are usually supported by coprocessors/hardware accelerators to protect as well as to accelerate cryptographic applications. Using a configurable processor is just another solution. This work designs and implements robust execution units as an extension to a configurable processor, which detect the data faults (adversarial or otherwise) while performing the arithmetic operations. Assuming a capable adversary who can injects faults to the cryptographic computation with high precision, a nonlinear error detection code with high error detection capability is used. The designed units are tightly integrated to the datapath of the configurable processor using its tool chain. For different configurations, we report the increase in the space and time complexities of the configurable processor. Also, we present performance evaluations of the software implementations using the robust execution units. Implementation results show that it is feasible to implement robust arithmetic units with relatively low overhead in an embedded processor

    Tamper-Resistant Arithmetic for Public-Key Cryptography

    Get PDF
    Cryptographic hardware has found many uses in many ubiquitous and pervasive security devices with a small form factor, e.g. SIM cards, smart cards, electronic security tokens, and soon even RFIDs. With applications in banking, telecommunication, healthcare, e-commerce and entertainment, these devices use cryptography to provide security services like authentication, identification and confidentiality to the user. However, the widespread adoption of these devices into the mass market, and the lack of a physical security perimeter have increased the risk of theft, reverse engineering, and cloning. Despite the use of strong cryptographic algorithms, these devices often succumb to powerful side-channel attacks. These attacks provide a motivated third party with access to the inner workings of the device and therefore the opportunity to circumvent the protection of the cryptographic envelope. Apart from passive side-channel analysis, which has been the subject of intense research for over a decade, active tampering attacks like fault analysis have recently gained increased attention from the academic and industrial research community. In this dissertation we address the question of how to protect cryptographic devices against this kind of attacks. More specifically, we focus our attention on public key algorithms like elliptic curve cryptography and their underlying arithmetic structure. In our research we address challenges such as the cost of implementation, the level of protection, and the error model in an adversarial situation. The approaches that we investigated all apply concepts from coding theory, in particular the theory of cyclic codes. This seems intuitive, since both public key cryptography and cyclic codes share finite field arithmetic as a common foundation. The major contributions of our research are (a) a generalization of cyclic codes that allow embedding of finite fields into redundant rings under a ring homomorphism, (b) a new family of non-linear arithmetic residue codes with very high error detection probability, (c) a set of new low-cost arithmetic primitives for optimal extension field arithmetic based on robust codes, and (d) design techniques for tamper resilient finite state machines

    Multiple bit error correcting architectures over finite fields

    Get PDF
    This thesis proposes techniques to mitigate multiple bit errors in GF arithmetic circuits. As GF arithmetic circuits such as multipliers constitute the complex and important functional unit of a crypto-processor, making them fault tolerant will improve the reliability of circuits that are employed in safety applications and the errors may cause catastrophe if not mitigated. Firstly, a thorough literature review has been carried out. The merits of efficient schemes are carefully analyzed to study the space for improvement in error correction, area and power consumption. Proposed error correction schemes include bit parallel ones using optimized BCH codes that are useful in applications where power and area are not prime concerns. The scheme is also extended to dynamically correcting scheme to reduce decoder delay. Other method that suits low power and area applications such as RFIDs and smart cards using cross parity codes is also proposed. The experimental evaluation shows that the proposed techniques can mitigate single and multiple bit errors with wider error coverage compared to existing methods with lesser area and power consumption. The proposed scheme is used to mask the errors appearing at the output of the circuit irrespective of their cause. This thesis also investigates the error mitigation schemes in emerging technologies (QCA, CNTFET) to compare area, power and delay with existing CMOS equivalent. Though the proposed novel multiple error correcting techniques can not ensure 100% error mitigation, inclusion of these techniques to actual design can improve the reliability of the circuits or increase the difficulty in hacking crypto-devices. Proposed schemes can also be extended to non GF digital circuits

    On Fault-based Attacks and Countermeasures for Elliptic Curve Cryptosystems

    Get PDF
    For some applications, elliptic curve cryptography (ECC) is an attractive choice because it achieves the same level of security with a much smaller key size in comparison with other schemes such as those that are based on integer factorization or discrete logarithm. Unfortunately, cryptosystems including those based on elliptic curves have been subject to attacks. For example, fault-based attacks have been shown to be a real threat in today’s cryptographic implementations. In this thesis, we consider fault-based attacks and countermeasures for ECC. We propose a new fault-based attack against the Montgomery ladder elliptic curve scalar multiplication (ECSM) algorithm. For security reasons, especially to provide resistance against fault-based attacks, it is very important to verify the correctness of computations in ECC applications. We deal with protections to fault attacks against ECSM at two levels: module and algorithm. For protections at the module level, where the underlying scalar multiplication algorithm is not changed, a number of schemes and hardware structures are presented based on re-computation or parallel computation. It is shown that these structures can be used for detecting errors with a very high probability during the computation of ECSM. For protections at the algorithm level, we use the concepts of point verification (PV) and coherency check (CC). We investigate the error detection coverage of PV and CC for the Montgomery ladder ECSM algorithm. Additionally, we propose two algorithms based on the double-and-add-always method that are resistant to the safe error (SE) attack. We demonstrate that one of these algorithms also resists the sign change fault (SCF) attack

    Fault attacks and countermeasures for elliptic curve cryptosystems

    Get PDF
    In this thesis we have developed a new algorithmic countermeasures that protect elliptic curve computation by protecting computation of the finite binary extension field, against fault attacks. Firstly, we have proposed schemes, i.e., a Chinese Remainder Theorem based fault tolerant computation in finite field for use in ECCs, as well as Lagrange Interpolation based fault tolerant computation. Our approach is based on the error correcting codes, i.e., redundant residue polynomial codes and the use of first original approach of Reed-Solomon codes. Computation of the field elements is decomposed into parallel, mutually independent, modular/identical channels, so that in case of faults at one channel, errors will not distribute to other channels. Based on these schemes we have developed new algorithms, namely fault tolerant residue representation modular multiplication algorithm and fault tolerant Lagrange representation modular multiplication algorithm, which are immune against error propagation under the fault models that we propose: Random Fault Model, Arbitrary Fault Model, and Single Bit Fault Model. These algorithms provide fault tolerant computation in GF (2k) for use in ECCs. Our new developed algorithms where inputs, i.e., field elements, are represented by the redundant residue representation/ redundant lagrange representation enables us to overcome the problem if during computation one, or both coordinates x, y GF (2k) of the point P E/GF (2k) /Fk are corrupted. We assume that during each run of an attacked algorithm, in one single attack, an adversary can apply any of the proposed fault models, i.e., either Random Fault Model, or Arbitrary Fault Model, or Single Bit Fault Model. In this way more channels can be targeted, i.e., different fault models can be used on different channels. Also, our proposed algorithms can have masked errors and will not be immune against attacks which can create those kind of errors, but it is a difficult problem to counter masked errors, since any anti-fault attack scheme will have some masked errors. Moreover, we have derived conditions that inflicted error needs to have in order to yield undetectable faulty point on non-supersingular elliptic curve over GF(2k). Our algorithmic countermeasures can be applied to any public key cryptosystem that performs computation over the finite field GF (2k)

    On Error Detection and Recovery in Elliptic Curve Cryptosystems

    Get PDF
    Fault analysis attacks represent a serious threat to a wide range of cryptosystems including those based on elliptic curves. With the variety and demonstrated practicality of these attacks, it is essential for cryptographic implementations to handle different types of errors properly and securely. In this work, we address some aspects of error detection and recovery in elliptic curve cryptosystems. In particular, we discuss the problem of wasteful computations performed between the occurrence of an error and its detection and propose solutions based on frequent validation to reduce that waste. We begin by presenting ways to select the validation frequency in order to minimize various performance criteria including the average and worst-case costs and the reliability threshold. We also provide solutions to reduce the sensitivity of the validation frequency to variations in the statistical error model and its parameters. Then, we present and discuss adaptive error recovery and illustrate its advantages in terms of low sensitivity to the error model and reduced variance of the resulting overhead especially in the presence of burst errors. Moreover, we use statistical inference to evaluate and fine-tune the selection of the adaptive policy. We also address the issue of validation testing cost and present a collection of coherency-based, cost-effective tests. We evaluate variations of these tests in terms of cost and error detection effectiveness and provide infective and reduced-cost, repeated-validation variants. Moreover, we use coherency-based tests to construct a combined-curve countermeasure that avoids the weaknesses of earlier related proposals and provides a flexible trade-off between cost and effectiveness

    Frequency Domain Finite Field Arithmetic for Elliptic Curve Cryptography

    Get PDF
    Efficient implementation of the number theoretic transform(NTT), also known as the discrete Fourier transform(DFT) over a finite field, has been studied actively for decades and found many applications in digital signal processing. In 1971 Schonhage and Strassen proposed an NTT based asymptotically fast multiplication method with the asymptotic complexity O(m log m log log m) for multiplication of mm-bit integers or (m-1)st degree polynomials. Schonhage and Strassen\u27s algorithm was known to be the asymptotically fastest multiplication algorithm until Furer improved upon it in 2007. However, unfortunately, both algorithms bear significant overhead due to the conversions between the time and frequency domains which makes them impractical for small operands, e.g. less than 1000 bits in length as used in many applications. With this work we investigate for the first time the practical application of the NTT, which found applications in digital signal processing, to finite field multiplication with an emphasis on elliptic curve cryptography(ECC). We present efficient parameters for practical application of NTT based finite field multiplication to ECC which requires key and operand sizes as short as 160 bits in length. With this work, for the first time, the use of NTT based finite field arithmetic is proposed for ECC and shown to be efficient. We introduce an efficient algorithm, named DFT modular multiplication, for computing Montgomery products of polynomials in the frequency domain which facilitates efficient multiplication in GF(p^m). Our algorithm performs the entire modular multiplication, including modular reduction, in the frequency domain, and thus eliminates costly back and forth conversions between the frequency and time domains. We show that, especially in computationally constrained platforms, multiplication of finite field elements may be achieved more efficiently in the frequency domain than in the time domain for operand sizes relevant to ECC. This work presents the first hardware implementation of a frequency domain multiplier suitable for ECC and the first hardware implementation of ECC in the frequency domain. We introduce a novel area/time efficient ECC processor architecture which performs all finite field arithmetic operations in the frequency domain utilizing DFT modular multiplication over a class of Optimal Extension Fields(OEF). The proposed architecture achieves extension field modular multiplication in the frequency domain with only a linear number of base field GF(p) multiplications in addition to a quadratic number of simpler operations such as addition and bitwise rotation. With its low area and high speed, the proposed architecture is well suited for ECC in small device environments such as smart cards and wireless sensor networks nodes. Finally, we propose an adaptation of the Itoh-Tsujii algorithm to the frequency domain which can achieve efficient inversion in a class of OEFs relevant to ECC. This is the first time a frequency domain finite field inversion algorithm is proposed for ECC and we believe our algorithm will be well suited for efficient constrained hardware implementations of ECC in affine coordinates

    High-speed VLSI implementation of Digit-serial Gaussian normal basis Multiplication over GF(2m)

    Get PDF
    In this paper, by employing the logical effort technique an efficient and high-speed VLSI implementation of the digit-serial Gaussian normal basis multiplier is presented. It is constructed by using AND, XOR and XOR tree components. To have a low-cost implementation with low number of transistors, the block of AND gates are implemented by using NAND gates based on the property of the XOR gates in the XOR tree. To optimally decrease the delay and increase the drive ability of the circuit the logical effort method as an efficient method for sizing the transistors is employed. By using this method and also a 4-input XOR gate structure, the circuit is designed for minimum delay. The digit-serial Gaussian normal basis multiplier is implemented over two binary finite fields GF(2163) and GF(2233) in 0.18μm CMOS technology for three different digit sizes. The results show that the proposed structures, compared to previous structures, have been improved in terms of delay and area parameters

    On the Analysis of Public-Key Cryptologic Algorithms

    Get PDF
    The RSA cryptosystem introduced in 1977 by Ron Rivest, Adi Shamir and Len Adleman is the most commonly deployed public-key cryptosystem. Elliptic curve cryptography (ECC) introduced in the mid 80's by Neal Koblitz and Victor Miller is becoming an increasingly popular alternative to RSA offering competitive performance due the use of smaller key sizes. Most recently hyperelliptic curve cryptography (HECC) has been demonstrated to have comparable and in some cases better performance than ECC. The security of RSA relies on the integer factorization problem whereas the security of (H)ECC is based on the (hyper)elliptic curve discrete logarithm problem ((H)ECDLP). In this thesis the practical performance of the best methods to solve these problems is analyzed and a method to generate secure ephemeral ECC parameters is presented. The best publicly known algorithm to solve the integer factorization problem is the number field sieve (NFS). Its most time consuming step is the relation collection step. We investigate the use of graphics processing units (GPUs) as accelerators for this step. In this context, methods to efficiently implement modular arithmetic and several factoring algorithms on GPUs are presented and their performance is analyzed in practice. In conclusion, it is shown that integrating state-of-the-art NFS software packages with our GPU software can lead to a speed-up of 50%. In the case of elliptic and hyperelliptic curves for cryptographic use, the best published method to solve the (H)ECDLP is the Pollard rho algorithm. This method can be made faster using classes of equivalence induced by curve automorphisms like the negation map. We present a practical analysis of their use to speed up Pollard rho for elliptic curves and genus 2 hyperelliptic curves defined over prime fields. As a case study, 4 curves at the 128-bit theoretical security level are analyzed in our software framework for Pollard rho to estimate their practical security level. In addition, we present a novel many-core architecture to solve the ECDLP using the Pollard rho algorithm with the negation map on FPGAs. This architecture is used to estimate the cost of solving the Certicom ECCp-131 challenge with a cluster of FPGAs. Our design achieves a speed-up factor of about 4 compared to the state-of-the-art. Finally, we present an efficient method to generate unique, secure and unpredictable ephemeral ECC parameters to be shared by a pair of authenticated users for a single communication. It provides an alternative to the customary use of fixed ECC parameters obtained from publicly available standards designed by untrusted third parties. The effectiveness of our method is demonstrated with a portable implementation for regular PCs and Android smartphones. On a Samsung Galaxy S4 smartphone our implementation generates unique 128-bit secure ECC parameters in 50 milliseconds on average
    corecore