Search CORE

162 research outputs found

A versatile Montgomery multiplier architecture with characteristic three support

Author: Ozturk Erdinc
Savas Erkay
Savaş Erkay
Sunar Berk
Öztürk Erdinç
Publication venue: 'Elsevier BV'
Publication date: 08/05/2008
Field of study

We present a novel unified core design which is extended to realize Montgomery multiplication in the fields GF(2n), GF(3m), and GF(p). Our unified design supports RSA and elliptic curve schemes, as well as the identity-based encryption which requires a pairing computation on an elliptic curve. The architecture is pipelined and is highly scalable. The unified core utilizes the redundant signed digit representation to reduce the critical path delay. While the carry-save representation used in classical unified architectures is only good for addition and multiplication operations, the redundant signed digit representation also facilitates efficient computation of comparison and subtraction operations besides addition and multiplication. Thus, there is no need for a transformation between the redundant and the non-redundant representations of field elements, which would be required in the classical unified architectures to realize the subtraction and comparison operations. We also quantify the benefits of the unified architectures in terms of area and critical path delay. We provide detailed implementation results. The metric shows that the new unified architecture provides an improvement over a hypothetical non-unified architecture of at least 24.88%, while the improvement over a classical unified architecture is at least 32.07%

Sabanci University Research Database

Hardware Implementations of Scalable and Unified Elliptic Curve Cryptosystem Processors

Author: Loi Kung Chi Cinnati 1985-
Publication venue: 'University of Saskatchewan Library'
Publication date: 11/02/2020
Field of study

As the amount of information exchanged through the network grows, so does the demand for increased security over the transmission of this information. As the growth of computers increased in the past few decades, more sophisticated methods of cryptography have been developed. One method of transmitting data securely over the network is by using symmetric-key cryptography. However, a drawback of symmetric-key cryptography is the need to exchange the shared key securely. One of the solutions is to use public-key cryptography. One of the modern public-key cryptography algorithms is called Elliptic Curve Cryptography (ECC). The advantage of ECC over some older algorithms is the smaller number of key sizes to provide a similar level of security. As a result, implementations of ECC are much faster and consume fewer resources. In order to achieve better performance, ECC operations are often offloaded onto hardware to alleviate the workload from the servers' processors. The most important and complex operation in ECC schemes is the elliptic curve point multiplication (ECPM). This thesis explores the implementation of hardware accelerators that offload the ECPM operation to hardware. These processors are referred to as ECC processors, or simply ECPs. This thesis targets the efficient hardware implementation of ECPs specifically for the 15 elliptic curves recommended by the National Institute of Standards and Technology (NIST). The main contribution of this thesis is the implementation of highly efficient hardware for scalable and unified finite field arithmetic units that are used in the design of ECPs. In this thesis, scalability refers to the processor's ability to support multiple key sizes without the need to reconfigure the hardware. By doing so, the hardware does not need to be redesigned for the server to handle different levels of security. Unified refers to the ability of the ECP to handle both prime and binary fields. The resultant designs are valuable to the research community and industry, as a single hardware device is able to handle a wide range of ECC operations efficiently and at high speeds. Thus, improving the ability of network servers to handle secure transaction more quickly and improve productivity at lower costs

University of Saskatchewan Research Archive

Customisable arithmetic hardware designs

Author: Cheung Chak-Chung Ray
Cheung Chak-Chung Ray
Publication venue
Publication date: 01/01/2007
Field of study

Imperial Users onl

Spiral - Imperial College Digital Repository

Recommended from our members

Side channel attack resistant elliptic curves cryptosystem on multi-cores for power efficiency

Author: Yoo Jaewon
Publication venue: 'Oregon State University'
Publication date
Field of study

The Advent of multi-cores allows programs to be executed much faster than before. Cryptoalgorithms use long-bit words thus parallelizing these operations on multi-cores will achieve significant performance improvement. However, not all long-bit word operations in cryptosystems are suitable for parallel execution on multi-cores. In particular, long-bit words used in Elliptic Curves Cryptography (ECC) do not efficiently divide by the system word size. This causes some of the cores to be idle, which makes it vulnerable for attackers to guess how many operations occurred and thus what field size is being used. Multiplication is the most important part of public key cryptosystems. Long-bit word multiplication operations are needed for encryption and decryption. J. Fan et al. proposed using Montgomery multiplication on multi-cores using GF(2²⁵⁶) [25, 26], which is suitable for comput-er systems with 16-bit or 32-bit word size. Fan‟s Montgomery multiplication is suitable for most RSA. However, in ECC, some GFs will cause idle cores. For example, suppose GF(2¹³¹) is used (which is one of the recommended word size by NIST) on a quad-core with a 32-bit word size, which requires [132/32] =5 iterations with the last iteration requiring just a 3-bit operation. This cause three of the cores to be idle during this time causing needless power consumption. The most general and the easiest way to make side channel attacks difficult is to insert dummy instructions to cover the idle processors. However, dummy instructions result in extra workloads that lead to performance degradation and increases in power consumption. In this thesis, we will present a multiplier adjuster technique to improve the execution time and the power consumption for the last unbalanced iteration. By appropriately applying dummy instructions between point-addition and point-doubling operations, a balanced point operation can be achieved in ECC. The performance and power-efficiency of the proposed method on multi-cores are analyzed for each GF used in ECC

ScholarsArchive@OSU

Efficient Elliptic Curve Cryptography Software Implementation on Embedded Platforms

Author: albahri Mohamed
Publication venue: 'University of Sheffield Conference Proceedings'
Publication date: 05/09/2019
Field of study

White Rose E-theses Online

Hardware Architectures for Post-Quantum Cryptography

Author: Wang Wen
Publication venue: EliScholar – A Digital Platform for Scholarly Publishing at Yale
Publication date: 01/04/2021
Field of study

The rapid development of quantum computers poses severe threats to many commonly-used cryptographic algorithms that are embedded in different hardware devices to ensure the security and privacy of data and communication. Seeking for new solutions that are potentially resistant against attacks from quantum computers, a new research field called Post-Quantum Cryptography (PQC) has emerged, that is, cryptosystems deployed in classical computers conjectured to be secure against attacks utilizing large-scale quantum computers. In order to secure data during storage or communication, and many other applications in the future, this dissertation focuses on the design, implementation, and evaluation of efficient PQC schemes in hardware. Four PQC algorithms, each from a different family, are studied in this dissertation. The first hardware architecture presented in this dissertation is focused on the code-based scheme Classic McEliece. The research presented in this dissertation is the first that builds the hardware architecture for the Classic McEliece cryptosystem. This research successfully demonstrated that complex code-based PQC algorithm can be run efficiently on hardware. Furthermore, this dissertation shows that implementation of this scheme on hardware can be easily tuned to different configurations by implementing support for flexible choices of security parameters as well as configurable hardware performance parameters. The successful prototype of the Classic McEliece scheme on hardware increased confidence in this scheme, and helped Classic McEliece to get recognized as one of seven finalists in the third round of the NIST PQC standardization process. While Classic McEliece serves as a ready-to-use candidate for many high-end applications, PQC solutions are also needed for low-end embedded devices. Embedded devices play an important role in our daily life. Despite their typically constrained resources, these devices require strong security measures to protect them against cyber attacks. Towards securing this type of devices, the second research presented in this dissertation focuses on the hash-based digital signature scheme XMSS. This research is the first that explores and presents practical hardware based XMSS solution for low-end embedded devices. In the design of XMSS hardware, a heterogenous software-hardware co-design approach was adopted, which combined the flexibility of the soft core with the acceleration from the hard core. The practicability and efficiency of the XMSS software-hardware co-design is further demonstrated by providing a hardware prototype on an open-source RISC-V based System-on-a-Chip (SoC) platform. The third research direction covered in this dissertation focuses on lattice-based cryptography, which represents one of the most promising and popular alternatives to today\u27s widely adopted public key solutions. Prior research has presented hardware designs targeting the computing blocks that are necessary for the implementation of lattice-based systems. However, a recurrent issue in most existing designs is that these hardware designs are not fully scalable or parameterized, hence limited to specific cryptographic primitives and security parameter sets. The research presented in this dissertation is the first that develops hardware accelerators that are designed to be fully parameterized to support different lattice-based schemes and parameters. Further, these accelerators are utilized to realize the first software-harware co-design of provably-secure instances of qTESLA, which is a lattice-based digital signature scheme. This dissertation demonstrates that even demanding, provably-secure schemes can be realized efficiently with proper use of software-hardware co-design. The final research presented in this dissertation is focused on the isogeny-based scheme SIKE, which recently made it to the final round of the PQC standardization process. This research shows that hardware accelerators can be designed to offload compute-intensive elliptic curve and isogeny computations to hardware in a versatile fashion. These hardware accelerators are designed to be fully parameterized to support different security parameter sets of SIKE as well as flexible hardware configurations targeting different user applications. This research is the first that presents versatile hardware accelerators for SIKE that can be mapped efficiently to both FPGA and ASIC platforms. Based on these accelerators, an efficient software-hardwareco-design is constructed for speeding up SIKE. In the end, this dissertation demonstrates that, despite being embedded with expensive arithmetic, the isogeny-based SIKE scheme can be run efficiently by exploiting specialized hardware. These four research directions combined demonstrate the practicability of building efficient hardware architectures for complex PQC algorithms. The exploration of efficient PQC solutions for different hardware platforms will eventually help migrate high-end servers and low-end embedded devices towards the post-quantum era

Yale University

High-Speed and Unified ECC Processor for Generic Weierstrass Curves over GF(p) on FPGA

Author: Asep Muhamad Awaludin
Harashta Tatimma Larasati
Howon Kim
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 18/01/2022
Field of study

In this paper, we present a high-speed, unified elliptic curve cryptography (ECC) processor for arbitrary Weierstrass curves over GF(p), which to the best of our knowledge, outperforms other similar works in terms of execution time. Our approach employs the combination of the schoolbook long and Karatsuba multiplication algorithm for the elliptic curve point multiplication (ECPM) to achieve better parallelization while retaining low complexity. In the hardware implementation, the substantial gain in speed is also contributed by our n-bit pipelined Montgomery Modular Multiplier (pMMM), which is constructed from our n-bit pipelined multiplier-accumulators that utilizes digital signal processor (DSP) primitives as digit multipliers. Additionally, we also introduce our unified, pipelined modular adder-subtractor (pMAS) for the underlying field arithmetic, and leverage a more efficient yet compact scheduling of the Montgomery ladder algorithm. The implementation for 256-bit modulus size on the 7-series FPGA: Virtex-7, Kintex-7, and XC7Z020 yields 0.139, 0.138, and 0.206 ms of execution time, respectively. Furthermore, since our pMMM module is generic for any curve in Weierstrass form, we support multi-curve parameters, resulting in a unified ECC architecture. Lastly, our method also works in constant time, making it suitable for applications requiring high speed and SCA-resistant characteristics

Cryptology ePrint Archive

Speed reading in the dark : Accelerating functional encryption for quadratic functions with reprogrammable hardware

Author: Bahadori Milad
Järvinen Kimmo
Marc Tilen
Stopar Miha
Publication venue: 'Universitatsbibliothek der Ruhr-Universitat Bochum'
Publication date: 01/01/2021
Field of study

Functional encryption is a new paradigm for encryption where decryption does not give the entire plaintext but only some function of it. Functional encryption has great potential in privacy-enhancing technologies but suffers from excessive computational overheads. We introduce the first hardware accelerator that supports functional encryption for quadratic functions. Our accelerator is implemented on a reprogrammable system-on-chip following the hardware/software codesign methogol-ogy. We benchmark our implementation for two privacy-preserving machine learning applications: (1) classification of handwritten digits from the MNIST database and (2) classification of clothes images from the Fashion MNIST database. In both cases, classification is performed with encrypted images. We show that our implementation offers speedups of over 200 times compared to a published software implementation and permits applications which are unfeasible with software-only solutions.Peer reviewe

Helsingin yliopiston digitaalinen arkisto

Speed reading in the dark : Accelerating functional encryption for quadratic functions with reprogrammable hardware

Author: Bahadori Milad
Järvinen Kimmo
Marc Tilen
Stopar Miha
Publication venue
Publication date: 01/01/2021
Field of study

Helsingin yliopiston digitaalinen arkisto

Ruhr-Universität Bochum (RUB): Open Journal Systems