20 research outputs found

    Efficient Implementation of Elliptic Curve Cryptography on FPGAs

    Get PDF
    This work presents the design strategies of an FPGA-based elliptic curve co-processor. Elliptic curve cryptography is an important topic in cryptography due to its relatively short key length and higher efficiency as compared to other well-known public key crypto-systems like RSA. The most important contributions of this work are: - Analyzing how different representations of finite fields and points on elliptic curves effect the performance of an elliptic curve co-processor and implementing a high performance co-processor. - Proposing a novel dynamic programming approach to find the optimum combination of different recursive polynomial multiplication methods. Here optimum means the method which has the smallest number of bit operations. - Designing a new normal-basis multiplier which is based on polynomial multipliers. The most important part of this multiplier is a circuit of size O(nlogn)O(n \log n) for changing the representation between polynomial and normal basis

    Reconfigurable Architecture for Elliptic Curve Cryptography Using FPGA

    Get PDF
    The high performance of an elliptic curve (EC) crypto system depends efficiently on the arithmetic in the underlying finite field. We have to propose and compare three levels of Galois Field , , and . The proposed architecture is based on Lopez-Dahab elliptic curve point multiplication algorithm, which uses Gaussian normal basis for field arithmetic. The proposed is based on an efficient Montgomery add and double algorithm, also the Karatsuba-Ofman multiplier and Itoh-Tsujii algorithm are used as the inverse component. The hardware design is based on optimized finite state machine (FSM), with a single cycle 193 bits multiplier, field adder, and field squarer. The another proposed architecture is based on applications for which compactness is more important than speed. The FPGA’s dedicated multipliers and carry-chain logic are used to obtain the small data path. The different optimization at the hardware level improves the acceleration of the ECC scalar multiplication, increases frequency and the speed of operation such as key generation, encryption, and decryption. Finally, we have to implement our design using Xilinx XC4VLX200 FPGA device

    Secure architectures for pairing based public key cryptography

    Get PDF
    Along with the growing demand for cryptosystems in systems ranging from large servers to mobile devices, suitable cryptogrophic protocols for use under certain constraints are becoming more and more important. Constraints such as calculation time, area, efficiency and security, must be considered by the designer. Elliptic curves, since their introduction to public key cryptography in 1985 have challenged established public key and signature generation schemes such as RSA, offering more security per bit. Amongst Elliptic curve based systems, pairing based cryptographies are thoroughly researched and can be used in many public key protocols such as identity based schemes. For hardware implementions of pairing based protocols, all components which calculate operations over Elliptic curves can be considered. Designers of the pairing algorithms must choose calculation blocks and arrange the basic operations carefully so that the implementation can meet the constraints of time and hardware resource area. This thesis deals with different hardware architectures to accelerate the pairing based cryptosystems in the field of characteristic two. Using different top-level architectures the hardware efficiency of operations that run at different times is first considered in this thesis. Security is another important aspect of pairing based cryptography to be considered in practically Side Channel Analysis (SCA) attacks. The naively implemented hardware accelerators for pairing based cryptographies can be vulnerable when taking the physical analysis attacks into consideration. This thesis considered the weaknesses in pairing based public key cryptography and addresses the particular calculations in the systems that are insecure. In this case, countermeasures should be applied to protect the weak link of the implementation to improve and perfect the pairing based algorithms. Some important rules that the designers must obey to improve the security of the cryptosystems are proposed. According to these rules, three countermeasures that protect the pairing based cryptosystems against SCA attacks are applied. The implementations of the countermeasures are presented and their performances are investigated

    Implementation of ECC on FPGA using Scalable Architecture With equal Data and Key for WSN

    Get PDF
    Security of data transferred on the Wireless Sensor Network is of vital importance. In public key cryptography RSA algorithm has been used for a long time, but it does not meet the constraints of WSNs. Elliptic Curve Cryptography(ECC) has been employed recently because of its highest security for same length bit. ECC point multiplication operation is time consuming which affects the speed of encryption and decryption of data. Security in WSNs is addressed in our work, where a modified ECC is designed by performing the point multiplication using Montgomery multiplication technique that achieves considerable speed and with reduced area utilization. The ECC is first simulated on different FPGA devices, with key length 11, 112, 131 and 163 bits and the area-speed tradeoff is compared. ECC algorithm is implemented with software and hardware choosing Artix 7 XC7a100t-3csg324 FPGA which supports key lengths of 11, 112, 131 and 163 bits. When implemented on a Artix 7 FPGA, it completes 163 bit data encryption operation over GF(2163 ) in 1ms with the maximum frequency of 229MHz. The ECC algorithm is reconfigurable with low level to high level security with different bit key sizes. The proposed ECC algorithm modeled using VHDL and synthesized on Spartan 3 and 6, Virtex 4, 5 and 6 and Artix7 before the hardware implementation on Atrix 7. The design satisfies the needs of resource constrained devices by decreasing the encryption and decryption time to 1 ms with equal keylength and datasize, while device utilization is within 13%

    Hardware Architectures for Post-Quantum Cryptography

    Get PDF
    The rapid development of quantum computers poses severe threats to many commonly-used cryptographic algorithms that are embedded in different hardware devices to ensure the security and privacy of data and communication. Seeking for new solutions that are potentially resistant against attacks from quantum computers, a new research field called Post-Quantum Cryptography (PQC) has emerged, that is, cryptosystems deployed in classical computers conjectured to be secure against attacks utilizing large-scale quantum computers. In order to secure data during storage or communication, and many other applications in the future, this dissertation focuses on the design, implementation, and evaluation of efficient PQC schemes in hardware. Four PQC algorithms, each from a different family, are studied in this dissertation. The first hardware architecture presented in this dissertation is focused on the code-based scheme Classic McEliece. The research presented in this dissertation is the first that builds the hardware architecture for the Classic McEliece cryptosystem. This research successfully demonstrated that complex code-based PQC algorithm can be run efficiently on hardware. Furthermore, this dissertation shows that implementation of this scheme on hardware can be easily tuned to different configurations by implementing support for flexible choices of security parameters as well as configurable hardware performance parameters. The successful prototype of the Classic McEliece scheme on hardware increased confidence in this scheme, and helped Classic McEliece to get recognized as one of seven finalists in the third round of the NIST PQC standardization process. While Classic McEliece serves as a ready-to-use candidate for many high-end applications, PQC solutions are also needed for low-end embedded devices. Embedded devices play an important role in our daily life. Despite their typically constrained resources, these devices require strong security measures to protect them against cyber attacks. Towards securing this type of devices, the second research presented in this dissertation focuses on the hash-based digital signature scheme XMSS. This research is the first that explores and presents practical hardware based XMSS solution for low-end embedded devices. In the design of XMSS hardware, a heterogenous software-hardware co-design approach was adopted, which combined the flexibility of the soft core with the acceleration from the hard core. The practicability and efficiency of the XMSS software-hardware co-design is further demonstrated by providing a hardware prototype on an open-source RISC-V based System-on-a-Chip (SoC) platform. The third research direction covered in this dissertation focuses on lattice-based cryptography, which represents one of the most promising and popular alternatives to today\u27s widely adopted public key solutions. Prior research has presented hardware designs targeting the computing blocks that are necessary for the implementation of lattice-based systems. However, a recurrent issue in most existing designs is that these hardware designs are not fully scalable or parameterized, hence limited to specific cryptographic primitives and security parameter sets. The research presented in this dissertation is the first that develops hardware accelerators that are designed to be fully parameterized to support different lattice-based schemes and parameters. Further, these accelerators are utilized to realize the first software-harware co-design of provably-secure instances of qTESLA, which is a lattice-based digital signature scheme. This dissertation demonstrates that even demanding, provably-secure schemes can be realized efficiently with proper use of software-hardware co-design. The final research presented in this dissertation is focused on the isogeny-based scheme SIKE, which recently made it to the final round of the PQC standardization process. This research shows that hardware accelerators can be designed to offload compute-intensive elliptic curve and isogeny computations to hardware in a versatile fashion. These hardware accelerators are designed to be fully parameterized to support different security parameter sets of SIKE as well as flexible hardware configurations targeting different user applications. This research is the first that presents versatile hardware accelerators for SIKE that can be mapped efficiently to both FPGA and ASIC platforms. Based on these accelerators, an efficient software-hardwareco-design is constructed for speeding up SIKE. In the end, this dissertation demonstrates that, despite being embedded with expensive arithmetic, the isogeny-based SIKE scheme can be run efficiently by exploiting specialized hardware. These four research directions combined demonstrate the practicability of building efficient hardware architectures for complex PQC algorithms. The exploration of efficient PQC solutions for different hardware platforms will eventually help migrate high-end servers and low-end embedded devices towards the post-quantum era

    A microcoded elliptic curve cryptographic processor.

    Get PDF
    Leung Ka Ho.Thesis (M.Phil.)--Chinese University of Hong Kong, 2001.Includes bibliographical references (leaves [85]-90).Abstracts in English and Chinese.Abstract --- p.iAcknowledgments --- p.iiiList of Figures --- p.ixList of Tables --- p.xiChapter 1 --- Introduction --- p.1Chapter 1.1 --- Motivation --- p.1Chapter 1.2 --- Aims --- p.3Chapter 1.3 --- Contributions --- p.3Chapter 1.4 --- Thesis Outline --- p.4Chapter 2 --- Cryptography --- p.6Chapter 2.1 --- Introduction --- p.6Chapter 2.2 --- Foundations --- p.6Chapter 2.3 --- Secret Key Cryptosystems --- p.8Chapter 2.4 --- Public Key Cryptosystems --- p.9Chapter 2.4.1 --- One-way Function --- p.10Chapter 2.4.2 --- Certification Authority --- p.10Chapter 2.4.3 --- Discrete Logarithm Problem --- p.11Chapter 2.4.4 --- RSA vs. ECC --- p.12Chapter 2.4.5 --- Key Exchange Protocol --- p.13Chapter 2.4.6 --- Digital Signature --- p.14Chapter 2.5 --- Secret Key vs. Public Key Cryptography --- p.16Chapter 2.6 --- Summary --- p.18Chapter 3 --- Mathematical Background --- p.19Chapter 3.1 --- Introduction --- p.19Chapter 3.2 --- Groups and Fields --- p.19Chapter 3.3 --- Finite Fields --- p.21Chapter 3.4 --- Modular Arithmetic --- p.21Chapter 3.5 --- Polynomial Basis --- p.21Chapter 3.6 --- Optimal Normal Basis --- p.22Chapter 3.6.1 --- Addition --- p.23Chapter 3.6.2 --- Squaring --- p.24Chapter 3.6.3 --- Multiplication --- p.24Chapter 3.6.4 --- Inversion --- p.30Chapter 3.7 --- Summary --- p.33Chapter 4 --- Literature Review --- p.34Chapter 4.1 --- Introduction --- p.34Chapter 4.2 --- Hardware Elliptic Curve Implementation --- p.34Chapter 4.2.1 --- Field Processors --- p.34Chapter 4.2.2 --- Curve Processors --- p.36Chapter 4.3 --- Software Elliptic Curve Implementation --- p.36Chapter 4.4 --- Summary --- p.38Chapter 5 --- Introduction to Elliptic Curves --- p.39Chapter 5.1 --- Introduction --- p.39Chapter 5.2 --- Historical Background --- p.39Chapter 5.3 --- Elliptic Curves over R2 --- p.40Chapter 5.3.1 --- Curve Addition and Doubling --- p.41Chapter 5.4 --- Elliptic Curves over Finite Fields --- p.44Chapter 5.4.1 --- Elliptic Curves over Fp with p>〉3 --- p.44Chapter 5.4.2 --- Elliptic Curves over F2n --- p.45Chapter 5.4.3 --- Operations of Elliptic Curves over F2n --- p.46Chapter 5.4.4 --- Curve Multiplication --- p.49Chapter 5.5 --- Elliptic Curve Discrete Logarithm Problem --- p.51Chapter 5.6 --- Public Key Cryptography --- p.52Chapter 5.7 --- Elliptic Curve Diffie-Hellman Key Exchange --- p.54Chapter 5.8 --- Summary --- p.55Chapter 6 --- Design Methodology --- p.56Chapter 6.1 --- Introduction --- p.56Chapter 6.2 --- CAD Tools --- p.56Chapter 6.3 --- Hardware Platform --- p.59Chapter 6.3.1 --- FPGA --- p.59Chapter 6.3.2 --- Reconfigurable Hardware Computing --- p.62Chapter 6.4 --- Elliptic Curve Processor Architecture --- p.63Chapter 6.4.1 --- Arithmetic Logic Unit (ALU) --- p.64Chapter 6.4.2 --- Register File --- p.68Chapter 6.4.3 --- Microcode --- p.69Chapter 6.5 --- Parameterized Module Generator --- p.72Chapter 6.6 --- Microcode Toolkit --- p.73Chapter 6.7 --- Initialization by Bitstream Reconfiguration --- p.74Chapter 6.8 --- Summary --- p.75Chapter 7 --- Results --- p.76Chapter 7.1 --- Introduction --- p.76Chapter 7.2 --- Elliptic Curve Processor with Serial Multiplier (p = 1) --- p.76Chapter 7.3 --- Projective verses Affine Coordinates --- p.78Chapter 7.4 --- Elliptic Curve Processor with Parallel Multiplier (p > 1) --- p.79Chapter 7.5 --- Summary --- p.80Chapter 8 --- Conclusion --- p.82Chapter 8.1 --- Recommendations for Future Research --- p.83Bibliography --- p.85Chapter A --- Elliptic Curves in Characteristics 2 and3 --- p.91Chapter A.1 --- Introduction --- p.91Chapter A.2 --- Derivations --- p.91Chapter A.3 --- "Elliptic Curves over Finite Fields of Characteristic ≠ 2,3" --- p.92Chapter A.4 --- Elliptic Curves over Finite Fields of Characteristic = 2 --- p.94Chapter B --- Examples of Curve Multiplication --- p.95Chapter B.1 --- Introduction --- p.95Chapter B.2 --- Numerical Results --- p.9

    Studies on high-speed hardware implementation of cryptographic algorithms

    Get PDF
    Cryptographic algorithms are ubiquitous in modern communication systems where they have a central role in ensuring information security. This thesis studies efficient implementation of certain widely-used cryptographic algorithms. Cryptographic algorithms are computationally demanding and software-based implementations are often too slow or power consuming which yields a need for hardware implementation. Field Programmable Gate Arrays (FPGAs) are programmable logic devices which have proven to be highly feasible implementation platforms for cryptographic algorithms because they provide both speed and programmability. Hence, the use of FPGAs for cryptography has been intensively studied in the research community and FPGAs are also the primary implementation platforms in this thesis. This thesis presents techniques allowing faster implementations than existing ones. Such techniques are necessary in order to use high-security cryptographic algorithms in applications requiring high data rates, for example, in heavily loaded network servers. The focus is on Advanced Encryption Standard (AES), the most commonly used secret-key cryptographic algorithm, and Elliptic Curve Cryptography (ECC), public-key cryptographic algorithms which have gained popularity in the recent years and are replacing traditional public-key cryptosystems, such as RSA. Because these algorithms are well-defined and widely-used, the results of this thesis can be directly applied in practice. The contributions of this thesis include improvements to both algorithms and techniques for implementing them. Algorithms are modified in order to make them more suitable for hardware implementation, especially, focusing on increasing parallelism. Several FPGA implementations exploiting these modifications are presented in the thesis including some of the fastest implementations available in the literature. The most important contributions of this thesis relate to ECC and, specifically, to a family of elliptic curves providing faster computations called Koblitz curves. The results of this thesis can, in their part, enable increasing use of cryptographic algorithms in various practical applications where high computation speed is an issue

    On Large Polynomial Multiplication in Certain Rings

    Get PDF
    Multiplication of polynomials with large integer coefficients and very high degree is used in cryptography. Residue number system (RNS) helps distribute a very large integer over a set of smaller integers, which makes the computations faster. In this thesis, multiplication of polynomials in ring Z_p/(x^n + 1) where n is a power of two is analyzed using the Schoolbook method, Karatsuba algorithm, Toeplitz matrix vector product (TMVP) method and Number Theoretic Transform (NTT) method. All coefficients are residues of p which is a 30-bit integer that has been selected from the set of 30-bit moduli for RNS in NFLlib. NTT has a computational complexity of O(n log n) and hence, it has the best performance among all these methods for the multiplication of large polynomials. NTT method limits applications in ring Z_p/(x^n + 1). This restricts size of the polynomials to only powers of two. We consider multiplication in other cyclotomic rings using TMVP method which has a subquadratic complexity of O(n log_2 3). An attempt is made to improve the performance of TMVP method by designing a hybrid method that switches to schoolbook method when n reaches a certain low value. It is first implemented in Zp/(x^n + 1) to improve the performance of TMVP for large polynomials. This method performs almost as good as NTT for polynomials of size 2^(10). TMVP method is then exploited to design multipliers in other rings Z_p/Φ_k where Φ_k is a cyclotomic trinomial. Similar hybrid designs are analyzed to improve performance in the trinomial rings. This allows a wider range of polynomials in terms of size to work with and helps avoid unnecessary use of larger key size that might slow down computations

    Automated Design Space Exploration and Datapath Synthesis for Finite Field Arithmetic with Applications to Lightweight Cryptography

    Get PDF
    Today, emerging technologies are reaching astronomical proportions. For example, the Internet of Things has numerous applications and consists of countless different devices using different technologies with different capabilities. But the one invariant is their connectivity. Consequently, secure communications, and cryptographic hardware as a means of providing them, are faced with new challenges. Cryptographic algorithms intended for hardware implementations must be designed with a good trade-off between implementation efficiency and sufficient cryptographic strength. Finite fields are widely used in cryptography. Examples of algorithm design choices related to finite field arithmetic are the field size, which arithmetic operations to use, how to represent the field elements, etc. As there are many parameters to be considered and analyzed, an automation framework is needed. This thesis proposes a framework for automated design, implementation and verification of finite field arithmetic hardware. The underlying motif throughout this work is “math meets hardware”. The automation framework is designed to bring the awareness of underlying mathematical structures to the hardware design flow. It is implemented in GAP, an open source computer algebra system that can work with finite fields and has symbolic computation capabilities. The framework is roughly divided into two phases, the architectural decisions and the automated design genera- tion. The architectural decisions phase supports parameter search and produces a list of candidates. The automated design generation phase is invoked for each candidate, and the generated VHDL files are passed on to conventional synthesis tools. The candidates and their implementation results form the design space, and the framework allows rapid design space exploration in a systematic way. In this thesis, design space exploration is focused on finite field arithmetic. Three distinctive features of the proposed framework are the structure of finite fields, tower field support, and on the fly submodule generation. Each finite field used in the design is represented as both a field and its corresponding vector space. It is easy for a designer to switch between fields and vector spaces, but strict distinction of the two is necessary for hierarchical designs. When an expression is defined over an extension field, the top-level module contains element signals and submodules for arithmetic operations on those signals. The submodules are generated with corresponding vector signals and the arithmetic operations are now performed on the coordinates. For tower fields, the submodules are generated for the subfield operations, and the design is generated in a top-down fashion. The binding of expressions to the appropriate finite fields or vector spaces and a set of customized methods allow the on the fly generation of expressions for implementation of arithmetic operations, and hence submodule generation. In the light of NIST Lightweight Cryptography Project (LWC), this work focuses mainly on small finite fields. The thesis illustrates the impact of hardware implementation results during the design process of WAGE, a Round 2 candidate in the NIST LWC standardization competition. WAGE is a hardware oriented authenticated encryption scheme. The parameter selection for WAGE was aimed at balancing the security and hardware implementation area, using hardware implementation results for many design decisions, for example field size, representation of field elements, etc. In the proposed framework, the components of WAGE are used as an example to illustrate different automation flows and demonstrate the design space exploration on a real-world algorithm

    Novel algorithms and hardware architectures for Montgomery Multiplication over GF(p)

    Get PDF
    This report describes the design and implementation results in FPGAs of a scalable hardware architecture for computing modular multiplication in prime fields GF(pp), based on the Montgomery multiplication (MM) algorithm. Starting from an existing digit-serial version of the MM algorithm, a novel {\it digit-digit} based MM algorithm is derived and two hardware architectures that compute that algorithm are described. In the proposed approach, the input operands (multiplicand, multiplier and modulus) are represented using as radix β=2k\beta = 2^k. Operands of arbitrary size can be multiplied with modular reduction using almost the same hardware since the multiplier\u27s kernel module that performs the modular multiplication depends only on kk. The novel hardware architectures proposed in this paper were verified by modeling them using VHDL and implementing them in the Xilinx FPGAs Spartan and Virtex5. Design trade-offs are analyzed considering different operand sizes commonly used in cryptography and different values for kk. The proposed designs for MM are well suited to be implemented in modern FPGAs, making use of available dedicated multiplier and memory blocks reducing drastically the FPGA\u27s standard logic while keeping an acceptable performance compared with other implementation approaches. From the Virtex5 implementation, the proposed MM multiplier reaches a throughput of 242Mbps using only 219 FPGA slices and achieving a 1024-bit modular multiplication in 4.21μ\musecs
    corecore