96 research outputs found

    Novel Single and Hybrid Finite Field Multipliers over GF(2m) for Emerging Cryptographic Systems

    Get PDF
    With the rapid development of economic and technical progress, designers and users of various kinds of ICs and emerging embedded systems like body-embedded chips and wearable devices are increasingly facing security issues. All of these demands from customers push the cryptographic systems to be faster, more efficient, more reliable and safer. On the other hand, multiplier over GF(2m) as the most important part of these emerging cryptographic systems, is expected to be high-throughput, low-complexity, and low-latency. Fortunately, very large scale integration (VLSI) digital signal processing techniques offer great facilities to design efficient multipliers over GF(2m). This dissertation focuses on designing novel VLSI implementation of high-throughput low-latency and low-complexity single and hybrid finite field multipliers over GF(2m) for emerging cryptographic systems. Low-latency (latency can be chosen without any restriction) high-speed pentanomial basis multipliers are presented. For the first time, the dissertation also develops three high-throughput digit-serial multipliers based on pentanomials. Then a novel realization of digit-level implementation of multipliers based on redundant basis is introduced. Finally, single and hybrid reordered normal basis bit-level and digit-level high-throughput multipliers are presented. To the authors knowledge, this is the first time ever reported on multipliers with multiple throughput rate choices. All the proposed designs are simple and modular, therefore suitable for VLSI implementation for various emerging cryptographic systems

    Design and analysis of an FPGA-based, multi-processor HW-SW system for SCC applications

    Get PDF
    The last 30 years have seen an increase in the complexity of embedded systems from a collection of simple circuits to systems consisting of multiple processors managing a wide variety of devices. This ever increasing complexity frequently requires that high assurance, fail-safe and secure design techniques be applied to protect against possible failures and breaches. To facilitate the implementation of these embedded systems in an efficient way, the FPGA industry recently created new families of devices. New features added to these devices include anti-tamper monitoring, bit stream encryption, and optimized routing architectures for physical and functional logic partition isolation. These devices have high capacities and are capable of implementing processors using their reprogrammable logic structures. This allows for an unprecedented level of hardware and software interaction within a single FPGA chip. High assurance and fail-safe systems can now be implemented within the reconfigurable hardware fabric of an FPGA, enabling these systems to maintain flexibility and achieve high performance while providing a high level of data security. The objective of this thesis was to design and analyze an FPGA-based system containing two isolated, softcore Nios processors that share data through two crypto-engines. FPGA-based single-chip cryptographic (SCC) techniques were employed to ensure proper component isolation when the design is placed on a device supporting the appropriate security primitives. Each crypto-engine is an implementation of the Advanced Encryption Standard (AES), operating in Galois/Counter Mode (GCM) for both encryption and authentication. The features of the microprocessors and architectures of the AES crypto-engines were varied with the goal of determining combinations which best target high performance, minimal hardware usage, or a combination of the two

    Efficient implementation of elliptic curve cryptography.

    Get PDF
    Elliptic Curve Cryptosystems (ECC) were introduced in 1985 by Neal Koblitz and Victor Miller. Small key size made elliptic curve attractive for public key cryptosystem implementation. This thesis introduces solutions of efficient implementation of ECC in algorithmic level and in computation level. In algorithmic level, a fast parallel elliptic curve scalar multiplication algorithm based on a dual-processor hardware system is developed. The method has an average computation time of n3 Elliptic Curve Point Addition on an n-bit scalar. The improvement is n Elliptic Curve Point Doubling compared to conventional methods. When a proper coordinate system and binary representation for the scalar k is used the average execution time will be as low as n Elliptic Curve Point Doubling, which makes this method about two times faster than conventional single processor multipliers using the same coordinate system. In computation level, a high performance elliptic curve processor (ECP) architecture is presented. The processor uses parallelism in finite field calculation to achieve high speed execution of scalar multiplication algorithm. The architecture relies on compile-time detection rather than of run-time detection of parallelism which results in less hardware. Implemented on FPGA, the proposed processor operates at 66MHz in GF(2 167) and performs scalar multiplication in 100muSec, which is considerably faster than recent implementations.Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .A57. Source: Masters Abstracts International, Volume: 44-03, page: 1446. Thesis (M.A.Sc.)--University of Windsor (Canada), 2005

    Studies on high-speed hardware implementation of cryptographic algorithms

    Get PDF
    Cryptographic algorithms are ubiquitous in modern communication systems where they have a central role in ensuring information security. This thesis studies efficient implementation of certain widely-used cryptographic algorithms. Cryptographic algorithms are computationally demanding and software-based implementations are often too slow or power consuming which yields a need for hardware implementation. Field Programmable Gate Arrays (FPGAs) are programmable logic devices which have proven to be highly feasible implementation platforms for cryptographic algorithms because they provide both speed and programmability. Hence, the use of FPGAs for cryptography has been intensively studied in the research community and FPGAs are also the primary implementation platforms in this thesis. This thesis presents techniques allowing faster implementations than existing ones. Such techniques are necessary in order to use high-security cryptographic algorithms in applications requiring high data rates, for example, in heavily loaded network servers. The focus is on Advanced Encryption Standard (AES), the most commonly used secret-key cryptographic algorithm, and Elliptic Curve Cryptography (ECC), public-key cryptographic algorithms which have gained popularity in the recent years and are replacing traditional public-key cryptosystems, such as RSA. Because these algorithms are well-defined and widely-used, the results of this thesis can be directly applied in practice. The contributions of this thesis include improvements to both algorithms and techniques for implementing them. Algorithms are modified in order to make them more suitable for hardware implementation, especially, focusing on increasing parallelism. Several FPGA implementations exploiting these modifications are presented in the thesis including some of the fastest implementations available in the literature. The most important contributions of this thesis relate to ECC and, specifically, to a family of elliptic curves providing faster computations called Koblitz curves. The results of this thesis can, in their part, enable increasing use of cryptographic algorithms in various practical applications where high computation speed is an issue

    A new approach in building parallel finite field multipliers

    Get PDF
    A new method for building bit-parallel polynomial basis finite field multipliers is proposed in this thesis. Among the different approaches to build such multipliers, Mastrovito multipliers based on a trinomial, an all-one-polynomial, or an equally-spacedpolynomial have the lowest complexities. The next best in this category is a conventional multiplier based on a pentanomial. Any newly presented method should have complexity results which are at least better than those of a pentanomial based multiplier. By applying our method to certain classes of finite fields we have gained a space complexity as n2 + H - 4 and a time complexity as TA + ([ log2(n-l) ]+3)rx which are better than the lowest space and time complexities of a pentanomial based multiplier found in literature. Therefore this multiplier can serve as an alternative in those finite fields in which no trinomial, all-one-polynomial or equally-spaced-polynomial exists

    Efficient Implementation of Elliptic Curve Cryptography on FPGAs

    Get PDF
    This work presents the design strategies of an FPGA-based elliptic curve co-processor. Elliptic curve cryptography is an important topic in cryptography due to its relatively short key length and higher efficiency as compared to other well-known public key crypto-systems like RSA. The most important contributions of this work are: - Analyzing how different representations of finite fields and points on elliptic curves effect the performance of an elliptic curve co-processor and implementing a high performance co-processor. - Proposing a novel dynamic programming approach to find the optimum combination of different recursive polynomial multiplication methods. Here optimum means the method which has the smallest number of bit operations. - Designing a new normal-basis multiplier which is based on polynomial multipliers. The most important part of this multiplier is a circuit of size O(nlogn)O(n \log n) for changing the representation between polynomial and normal basis

    Design and analysis of efficient and secure elliptic curve cryptoprocessors

    Get PDF
    Elliptic Curve Cryptosystems have attracted many researchers and have been included in many standards such as IEEE, ANSI, NIST, SEC and WTLS. The ability to use smaller keys and computationally more efficient algorithms compared with earlier public key cryptosystems such as RSA and ElGamal are two main reasons why elliptic curve cryptosystems are becoming more popular. They are considered to be particularly suitable for implementation on smart cards or mobile devices. Power Analysis Attacks on such devices are considered serious threat due to the physical characteristics of these devices and their use in potentially hostile environments. This dissertation investigates elliptic curve cryptoprocessor architectures for curves defined over GF(2m) fields. In this dissertation, new architectures that are suitable for efficient computation of scalar multiplications with resistance against power analysis attacks are proposed and their performance evaluated. This is achieved by exploiting parallelism and randomized processing techniques. Parallelism and randomization are controlled at different levels to provide more efficiency and security. Furthermore, the proposed architectures are flexible enough to allow designers tailor performance and hardware requirements according to their performance and cost objectives. The proposed architectures have been modeled using VHDL and implemented on FPGA platform

    Hardware Implementations of Scalable and Unified Elliptic Curve Cryptosystem Processors

    Get PDF
    As the amount of information exchanged through the network grows, so does the demand for increased security over the transmission of this information. As the growth of computers increased in the past few decades, more sophisticated methods of cryptography have been developed. One method of transmitting data securely over the network is by using symmetric-key cryptography. However, a drawback of symmetric-key cryptography is the need to exchange the shared key securely. One of the solutions is to use public-key cryptography. One of the modern public-key cryptography algorithms is called Elliptic Curve Cryptography (ECC). The advantage of ECC over some older algorithms is the smaller number of key sizes to provide a similar level of security. As a result, implementations of ECC are much faster and consume fewer resources. In order to achieve better performance, ECC operations are often offloaded onto hardware to alleviate the workload from the servers' processors. The most important and complex operation in ECC schemes is the elliptic curve point multiplication (ECPM). This thesis explores the implementation of hardware accelerators that offload the ECPM operation to hardware. These processors are referred to as ECC processors, or simply ECPs. This thesis targets the efficient hardware implementation of ECPs specifically for the 15 elliptic curves recommended by the National Institute of Standards and Technology (NIST). The main contribution of this thesis is the implementation of highly efficient hardware for scalable and unified finite field arithmetic units that are used in the design of ECPs. In this thesis, scalability refers to the processor's ability to support multiple key sizes without the need to reconfigure the hardware. By doing so, the hardware does not need to be redesigned for the server to handle different levels of security. Unified refers to the ability of the ECP to handle both prime and binary fields. The resultant designs are valuable to the research community and industry, as a single hardware device is able to handle a wide range of ECC operations efficiently and at high speeds. Thus, improving the ability of network servers to handle secure transaction more quickly and improve productivity at lower costs

    Fast Modular Reduction for Large-Integer Multiplication

    Get PDF
    The work contained in this thesis is a representation of the successful attempt to speed-up the modular reduction as an independent step of modular multiplication, which is the central operation in public-key cryptosystems. Based on the properties of Mersenne and Quasi-Mersenne primes, four distinct sets of moduli have been described, which are responsible for converting the single-precision multiplication prevalent in many of today\u27s techniques into an addition operation and a few simple shift operations. A novel algorithm has been proposed for modular folding. With the backing of the special moduli sets, the proposed algorithm is shown to outperform (speed-wise) the Modified Barrett algorithm by 80% for operands of length 700 bits, the least speed-up being around 70% for smaller operands, in the range of around 100 bits

    Hardware Architectures for Post-Quantum Cryptography

    Get PDF
    The rapid development of quantum computers poses severe threats to many commonly-used cryptographic algorithms that are embedded in different hardware devices to ensure the security and privacy of data and communication. Seeking for new solutions that are potentially resistant against attacks from quantum computers, a new research field called Post-Quantum Cryptography (PQC) has emerged, that is, cryptosystems deployed in classical computers conjectured to be secure against attacks utilizing large-scale quantum computers. In order to secure data during storage or communication, and many other applications in the future, this dissertation focuses on the design, implementation, and evaluation of efficient PQC schemes in hardware. Four PQC algorithms, each from a different family, are studied in this dissertation. The first hardware architecture presented in this dissertation is focused on the code-based scheme Classic McEliece. The research presented in this dissertation is the first that builds the hardware architecture for the Classic McEliece cryptosystem. This research successfully demonstrated that complex code-based PQC algorithm can be run efficiently on hardware. Furthermore, this dissertation shows that implementation of this scheme on hardware can be easily tuned to different configurations by implementing support for flexible choices of security parameters as well as configurable hardware performance parameters. The successful prototype of the Classic McEliece scheme on hardware increased confidence in this scheme, and helped Classic McEliece to get recognized as one of seven finalists in the third round of the NIST PQC standardization process. While Classic McEliece serves as a ready-to-use candidate for many high-end applications, PQC solutions are also needed for low-end embedded devices. Embedded devices play an important role in our daily life. Despite their typically constrained resources, these devices require strong security measures to protect them against cyber attacks. Towards securing this type of devices, the second research presented in this dissertation focuses on the hash-based digital signature scheme XMSS. This research is the first that explores and presents practical hardware based XMSS solution for low-end embedded devices. In the design of XMSS hardware, a heterogenous software-hardware co-design approach was adopted, which combined the flexibility of the soft core with the acceleration from the hard core. The practicability and efficiency of the XMSS software-hardware co-design is further demonstrated by providing a hardware prototype on an open-source RISC-V based System-on-a-Chip (SoC) platform. The third research direction covered in this dissertation focuses on lattice-based cryptography, which represents one of the most promising and popular alternatives to today\u27s widely adopted public key solutions. Prior research has presented hardware designs targeting the computing blocks that are necessary for the implementation of lattice-based systems. However, a recurrent issue in most existing designs is that these hardware designs are not fully scalable or parameterized, hence limited to specific cryptographic primitives and security parameter sets. The research presented in this dissertation is the first that develops hardware accelerators that are designed to be fully parameterized to support different lattice-based schemes and parameters. Further, these accelerators are utilized to realize the first software-harware co-design of provably-secure instances of qTESLA, which is a lattice-based digital signature scheme. This dissertation demonstrates that even demanding, provably-secure schemes can be realized efficiently with proper use of software-hardware co-design. The final research presented in this dissertation is focused on the isogeny-based scheme SIKE, which recently made it to the final round of the PQC standardization process. This research shows that hardware accelerators can be designed to offload compute-intensive elliptic curve and isogeny computations to hardware in a versatile fashion. These hardware accelerators are designed to be fully parameterized to support different security parameter sets of SIKE as well as flexible hardware configurations targeting different user applications. This research is the first that presents versatile hardware accelerators for SIKE that can be mapped efficiently to both FPGA and ASIC platforms. Based on these accelerators, an efficient software-hardwareco-design is constructed for speeding up SIKE. In the end, this dissertation demonstrates that, despite being embedded with expensive arithmetic, the isogeny-based SIKE scheme can be run efficiently by exploiting specialized hardware. These four research directions combined demonstrate the practicability of building efficient hardware architectures for complex PQC algorithms. The exploration of efficient PQC solutions for different hardware platforms will eventually help migrate high-end servers and low-end embedded devices towards the post-quantum era
    corecore