2,700 research outputs found
Quantum resource estimates for computing elliptic curve discrete logarithms
We give precise quantum resource estimates for Shor's algorithm to compute
discrete logarithms on elliptic curves over prime fields. The estimates are
derived from a simulation of a Toffoli gate network for controlled elliptic
curve point addition, implemented within the framework of the quantum computing
software tool suite LIQ. We determine circuit implementations for
reversible modular arithmetic, including modular addition, multiplication and
inversion, as well as reversible elliptic curve point addition. We conclude
that elliptic curve discrete logarithms on an elliptic curve defined over an
-bit prime field can be computed on a quantum computer with at most qubits using a quantum circuit of at most Toffoli gates. We are able to classically simulate the
Toffoli networks corresponding to the controlled elliptic curve point addition
as the core piece of Shor's algorithm for the NIST standard curves P-192,
P-224, P-256, P-384 and P-521. Our approach allows gate-level comparisons to
recent resource estimates for Shor's factoring algorithm. The results also
support estimates given earlier by Proos and Zalka and indicate that, for
current parameters at comparable classical security levels, the number of
qubits required to tackle elliptic curves is less than for attacking RSA,
suggesting that indeed ECC is an easier target than RSA.Comment: 24 pages, 2 tables, 11 figures. v2: typos fixed and reference added.
ASIACRYPT 201
Hardware implementation of elliptic curve Diffie-Hellman key agreement scheme in GF(p)
With the advent of technology there are many applications that require secure communication. Elliptic Curve Public-key Cryptosystems are increasingly becoming popular due to their small key size and efficient algorithm. Elliptic curves are widely used in various key exchange techniques including Diffie-Hellman Key Agreement scheme. Modular multiplication and modular division are one of the basic operations in elliptic curve cryptography. Much effort has been made in developing efficient modular multiplication designs, however few works has been proposed for the modular division. Nevertheless, these operations are needed in various cryptographic systems. This thesis examines various scalable implementations of elliptic curve scalar multiplication employing multiplicative inverse or field division in GF(p) focussing mainly on modular divison architectures. Next, this thesis presents a new architecture for modular division based on the variant of Extended Binary GCD algorithm. The main contribution at system level architecture to the modular division unit is use of counters in place of shift registers that are basis of the algorithm and modifying the algorithm to introduce a modular correction unit for the output logic. This results in 62% increase in speed with respect to a prototype design. Finally, using the modular division architecture an Elliptic Curve ALU in GF(p) was implemented which can be used as the core arithmetic unit of an elliptic curve processor. The resulting architecture was targeted to Xilinx Vertex2v6000-bf957 FPGA device and can be implemented for different elliptic curves for almost all practical values of field p. The frequency of the ALU is 58.8 MHz for 128-bits utilizing 20% of the device at 27712 gates which is 30% faster than a prototype implementation with a 2% increase in area utilization. The ALU was tested to perform Diffie-Hellman Key Agreement Scheme and is suitable for other public-key cryptographic algorithms
Analysis of Parallel Montgomery Multiplication in CUDA
For a given level of security, elliptic curve cryptography (ECC) offers improved efficiency over classic public key implementations. Point multiplication is the most common operation in ECC and, consequently, any significant improvement in perfor- mance will likely require accelerating point multiplication. In ECC, the Montgomery algorithm is widely used for point multiplication. The primary purpose of this project is to implement and analyze a parallel implementation of the Montgomery algorithm as it is used in ECC. Specifically, the performance of CPU-based Montgomery multiplication and a GPU-based implementation in CUDA are compared
Efficient unified Montgomery inversion with multibit shifting
Computation of multiplicative inverses in finite fields GF(p) and GF(2/sup n/) is the most time-consuming operation in elliptic curve cryptography, especially when affine co-ordinates are used. Since the existing algorithms based on the extended Euclidean algorithm do not permit a fast software implementation, projective co-ordinates, which eliminate almost all of the inversion operations from the curve arithmetic, are preferred. In the paper, the authors demonstrate that affine co-ordinate implementation provides a comparable speed to that of projective co-ordinates with careful hardware realisation of existing algorithms for calculating inverses in both fields without utilising special moduli or irreducible polynomials. They present two inversion algorithms for binary extension and prime fields, which are slightly modified versions of the Montgomery inversion algorithm. The similarity of the two algorithms allows the design of a single unified hardware architecture that performs the computation of inversion in both fields. They also propose a hardware structure where the field elements are represented using a multi-word format. This feature allows a scalable architecture able to operate in a broad range of precision, which has certain advantages in cryptographic applications. In addition, they include statistical comparison of four inversion algorithms in order to help choose the best one amongst them for implementation onto hardware
Hardware and Software Multi-precision Implementations of Cryptographic Algorithms
The software implementations of cryptographic algorithms are considered to be very slow, when there are requirements of multi-precision arithmetic operations on very long integers. These arithmetic operations may include addition, subtraction, multiplication, division and exponentiation. Several research papers have been published providing different solutions to make these operations faster. Digital Signature Algorithm (DSA) is a cryptographic application that requires multi-precision arithmetic operations. These arithmetic operations are mostly based upon modular multiplication and exponentiation on integers of the size of 1024 bits. The use of such numbers is an essential part of providing high security against the cryptanalytic attacks on the authenticated messages. When these operations are implemented in software, performance in terms of speed becomes very low. The major focus of the thesis is the study of various arithmetic operations for public key cryptography and selecting the fast multi-precision arithmetic algorithms for hardware implementation. These selected algorithms are implemented in hardware and software for performance comparison and they are used to implement Digital Signature Algorithm for performance analysis
Aspects of hardware methodologies for the NTRU public-key cryptosystem
Cryptographic algorithms which take into account requirements for varying levels of security and reduced power consumption in embedded devices are now receiving additional attention. The NTRUEncrypt algorithm has been shown to provide certain advantages when designing low power and resource constrained systems, while still providing comparable security levels to higher complexity algorithms. The research presented in this thesis starts with an examination of the general NTRUEncrypt system, followed by a more practical examination with respect to the IEEE 1363.1 draft standard. In contrast to previous research, the focus is shifted away from specific optimizations but rather provides a study of many of the recommended practices and suggested optimizations with particular emphasis on polynomial arithmetic and parameter selection. Various methods are examined for storing, inverting and multiplying polynomials used in the system. Recommendations for algorithm and parameter selection are made regarding implementation in software and hardware with respect to the resources available. Although the underlying mathematical principles have not been significantly questioned, stable recommended practices are still being developed for the NTRUEncrypt system. As a further complication, recommended optimizations have come from various researchers and have been split between hardware and software implementations. In this thesis, a generic VHDL model is presented, based on the IEEE 1363.1 draft standard, which is designed for adaptation to software or hardware implementation while providing flexibility for changes in recommended practices
- …