103 research outputs found

    RESOURCE EFFICIENT DESIGN OF QUANTUM CIRCUITS FOR CRYPTANALYSIS AND SCIENTIFIC COMPUTING APPLICATIONS

    Get PDF
    Quantum computers offer the potential to extend our abilities to tackle computational problems in fields such as number theory, encryption, search and scientific computation. Up to a superpolynomial speedup has been reported for quantum algorithms in these areas. Motivated by the promise of faster computations, the development of quantum machines has caught the attention of both academics and industry researchers. Quantum machines are now at sizes where implementations of quantum algorithms or their components are now becoming possible. In order to implement quantum algorithms on quantum machines, resource efficient circuits and functional blocks must be designed. In this work, we propose quantum circuits for Galois and integer arithmetic. These quantum circuits are necessary building blocks to realize quantum algorithms. The design of resource efficient quantum circuits requires the designer takes into account the gate cost, quantum bit (qubit) cost, depth and garbage outputs of a quantum circuit. Existing quantum machines do not have many qubits meaning that circuits with high qubit cost cannot be implemented. In addition, quantum circuits are more prone to errors and garbage output removal adds to overall cost. As more gates are used, a quantum circuit sees an increased rate of failure. Failures and error rates can be countered by using quantum error correcting codes and fault tolerant implementations of universal gate sets (such as Clifford+T gates). However, Clifford+T gates are costly to implement with the T gate being significantly more costly than the Clifford gates. As a result, designers working with Clifford+T gates seek to minimize the number of T gates (T-count) and the depth of T gates (T-depth). In this work, we propose quantum circuits for Galois and integer arithmetic with lower T-count, T-depth and qubit cost than existing work. This work presents novel quantum circuits for squaring and exponentiation over binary extension fields (Galois fields of form GF(2 m )). The proposed circuits are shown to have lower depth, qubit and gate cost to existing work. We also present quantum circuits for the core operations of multiplication and division which enjoy lower T-count, T-depth and qubit costs compared to existing work. This work also illustrates the design of a T-count and qubit cost efficient design for the square root. This work concludes with an illustration of how the arithmetic circuits can be combined into a functional block to implement quantum image processing algorithms

    Tamper-Resistant Arithmetic for Public-Key Cryptography

    Get PDF
    Cryptographic hardware has found many uses in many ubiquitous and pervasive security devices with a small form factor, e.g. SIM cards, smart cards, electronic security tokens, and soon even RFIDs. With applications in banking, telecommunication, healthcare, e-commerce and entertainment, these devices use cryptography to provide security services like authentication, identification and confidentiality to the user. However, the widespread adoption of these devices into the mass market, and the lack of a physical security perimeter have increased the risk of theft, reverse engineering, and cloning. Despite the use of strong cryptographic algorithms, these devices often succumb to powerful side-channel attacks. These attacks provide a motivated third party with access to the inner workings of the device and therefore the opportunity to circumvent the protection of the cryptographic envelope. Apart from passive side-channel analysis, which has been the subject of intense research for over a decade, active tampering attacks like fault analysis have recently gained increased attention from the academic and industrial research community. In this dissertation we address the question of how to protect cryptographic devices against this kind of attacks. More specifically, we focus our attention on public key algorithms like elliptic curve cryptography and their underlying arithmetic structure. In our research we address challenges such as the cost of implementation, the level of protection, and the error model in an adversarial situation. The approaches that we investigated all apply concepts from coding theory, in particular the theory of cyclic codes. This seems intuitive, since both public key cryptography and cyclic codes share finite field arithmetic as a common foundation. The major contributions of our research are (a) a generalization of cyclic codes that allow embedding of finite fields into redundant rings under a ring homomorphism, (b) a new family of non-linear arithmetic residue codes with very high error detection probability, (c) a set of new low-cost arithmetic primitives for optimal extension field arithmetic based on robust codes, and (d) design techniques for tamper resilient finite state machines

    Concurrent Error Detection in Finite Field Arithmetic Operations

    Get PDF
    With significant advances in wired and wireless technologies and also increased shrinking in the size of VLSI circuits, many devices have become very large because they need to contain several large units. This large number of gates and in turn large number of transistors causes the devices to be more prone to faults. These faults specially in sensitive and critical applications may cause serious failures and hence should be avoided. On the other hand, some critical applications such as cryptosystems may also be prone to deliberately injected faults by malicious attackers. Some of these faults can produce erroneous results that can reveal some important secret information of the cryptosystems. Furthermore, yield factor improvement is always an important issue in VLSI design and fabrication processes. Digital systems such as cryptosystems and digital signal processors usually contain finite field operations. Therefore, error detection and correction of such operations have become an important issue recently. In most of the work reported so far, error detection and correction are applied using redundancies in space (hardware), time, and/or information (coding theory). In this work, schemes based on these redundancies are presented to detect errors in important finite field arithmetic operations resulting from hardware faults. Finite fields are used in a number of practical cryptosystems and channel encoders/decoders. The schemes presented here can detect errors in arithmetic operations of finite fields represented in different bases, including polynomial, dual and/or normal basis, and implemented in various architectures, including bit-serial, bit-parallel and/or systolic arrays

    Private and Public-Key Side-Channel Threats Against Hardware Accelerated Cryptosystems

    Get PDF
    Modern side-channel attacks (SCA) have the ability to reveal sensitive data from non-protected hardware implementations of cryptographic accelerators whether they be private or public-key systems. These protocols include but are not limited to symmetric, private-key encryption using AES-128, 192, 256, or public-key cryptosystems using elliptic curve cryptography (ECC). Traditionally, scalar point (SP) operations are compelled to be high-speed at any cost to reduce point multiplication latency. The majority of high-speed architectures of contemporary elliptic curve protocols rely on non-secure SP algorithms. This thesis delivers a novel design, analysis, and successful results from a custom differential power analysis attack on AES-128. The resulting SCA can break any 16-byte master key the sophisticated cipher uses and it\u27s direct applications towards public-key cryptosystems will become clear. Further, the architecture of a SCA resistant scalar point algorithm accompanied by an implementation of an optimized serial multiplier will be constructed. The optimized hardware design of the multiplier is highly modular and can use either NIST approved 233 & 283-bit Kobliz curves utilizing a polynomial basis. The proposed architecture will be implemented on Kintex-7 FPGA to later be integrated with the ARM Cortex-A9 processor on the Zynq-7000 AP SoC (XC7Z045) for seamless data transfer and analysis of the vulnerabilities SCAs can exploit

    Reducing the Depth of Quantum FLT-Based Inversion Circuit

    Get PDF
    Works on quantum computing and cryptanalysis has increased significantly in the past few years. Various constructions of quantum arithmetic circuits, as one of the essential components in the field, has also been proposed. However, there has only been a few studies on finite field inversion despite its essential use in realizing quantum algorithms, such as in Shor's algorithm for Elliptic Curve Discrete Logarith Problem (ECDLP). In this study, we propose to reduce the depth of the existing quantum Fermat's Little Theorem (FLT)-based inversion circuit for binary finite field. In particular, we propose follow a complete waterfall approach to translate the Itoh-Tsujii's variant of FLT to the corresponding quantum circuit and remove the inverse squaring operations employed in the previous work by Banegas et al., lowering the number of CNOT gates (CNOT count), which contributes to reduced overall depth and gate count. Furthermore, compare the cost by firstly constructing our method and previous work's in Qiskit quantum computer simulator and perform the resource analysis. Our approach can serve as an alternative for a time-efficient implementation.Comment: version 0.

    High-speed VLSI implementation of Digit-serial Gaussian normal basis Multiplication over GF(2m)

    Get PDF
    In this paper, by employing the logical effort technique an efficient and high-speed VLSI implementation of the digit-serial Gaussian normal basis multiplier is presented. It is constructed by using AND, XOR and XOR tree components. To have a low-cost implementation with low number of transistors, the block of AND gates are implemented by using NAND gates based on the property of the XOR gates in the XOR tree. To optimally decrease the delay and increase the drive ability of the circuit the logical effort method as an efficient method for sizing the transistors is employed. By using this method and also a 4-input XOR gate structure, the circuit is designed for minimum delay. The digit-serial Gaussian normal basis multiplier is implemented over two binary finite fields GF(2163) and GF(2233) in 0.18μm CMOS technology for three different digit sizes. The results show that the proposed structures, compared to previous structures, have been improved in terms of delay and area parameters

    Bit Serial Systolic Architectures for Multiplicative Inversion and Division over GF(2<sup>m</sup>)

    Get PDF
    Systolic architectures are capable of achieving high throughput by maximizing pipelining and by eliminating global data interconnects. Recursive algorithms with regular data flows are suitable for systolization. The computation of multiplicative inversion using algorithms based on EEA (Extended Euclidean Algorithm) are particularly suitable for systolization. Implementations based on EEA present a high degree of parallelism and pipelinability at bit level which can be easily optimized to achieve local data flow and to eliminate the global interconnects which represent most important bottleneck in todays sub-micron design process. The net result is to have high clock rate and performance based on efficient systolic architectures. This thesis examines high performance but also scalable implementations of multiplicative inversion or field division over Galois fields GF(2m) in the specific case of cryptographic applications where field dimension m may be very large (greater than 400) and either m or defining irreducible polynomial may vary. For this purpose, many inversion schemes with different basis representation are studied and most importantly variants of EEA and binary (Stein's) GCD computation implementations are reviewed. A set of common as well as contrasting characteristics of these variants are discussed. As a result a generalized and optimized variant of EEA is proposed which can compute division, and multiplicative inversion as its subset, with divisor in either polynomial or triangular basis representation. Further results regarding Hankel matrix formation for double-basis inversion is provided. The validity of using the same architecture to compute field division with polynomial or triangular basis representation is proved. Next, a scalable unidirectional bit serial systolic array implementation of this proposed variant of EEA is implemented. Its complexity measures are defined and these are compared against the best known architectures. It is shown that assuming the requirements specified above, this proposed architecture may achieve a higher clock rate performance w. r. t. other designs while being more flexible, reliable and with minimum number of inter-cell interconnects. The main contribution at system level architecture is the substitution of all counter or adder/subtractor elements with a simpler distributed and free of carry propagation delays structure. Further a novel restoring mechanism for result sequences of EEA is proposed using a double delay element implementation. Finally, using this systolic architecture a CMD (Combined Multiplier Divider) datapath is designed which is used as the core of a novel systolic elliptic curve processor. This EC processor uses affine coordinates to compute scalar point multiplication which results in having a very small control unit and negligible with respect to the datapath for all practical values of m. The throughput of this EC based on this bit serial systolic architecture is comparable with designs many times larger than itself reported previously

    Quantum resource estimates for computing elliptic curve discrete logarithms

    Get PDF
    We give precise quantum resource estimates for Shor's algorithm to compute discrete logarithms on elliptic curves over prime fields. The estimates are derived from a simulation of a Toffoli gate network for controlled elliptic curve point addition, implemented within the framework of the quantum computing software tool suite LIQUiUi|\rangle. We determine circuit implementations for reversible modular arithmetic, including modular addition, multiplication and inversion, as well as reversible elliptic curve point addition. We conclude that elliptic curve discrete logarithms on an elliptic curve defined over an nn-bit prime field can be computed on a quantum computer with at most 9n+2log2(n)+109n + 2\lceil\log_2(n)\rceil+10 qubits using a quantum circuit of at most 448n3log2(n)+4090n3448 n^3 \log_2(n) + 4090 n^3 Toffoli gates. We are able to classically simulate the Toffoli networks corresponding to the controlled elliptic curve point addition as the core piece of Shor's algorithm for the NIST standard curves P-192, P-224, P-256, P-384 and P-521. Our approach allows gate-level comparisons to recent resource estimates for Shor's factoring algorithm. The results also support estimates given earlier by Proos and Zalka and indicate that, for current parameters at comparable classical security levels, the number of qubits required to tackle elliptic curves is less than for attacking RSA, suggesting that indeed ECC is an easier target than RSA.Comment: 24 pages, 2 tables, 11 figures. v2: typos fixed and reference added. ASIACRYPT 201
    corecore