22 research outputs found

    DEVELOPMENT OF THE SEARCH METHOD FOR NON-LINEAR SHIFT REGISTERS USING HARDWARE, IMPLEMENTED ON FIELD PROGRAMMABLE GATE ARRAYS

    Get PDF
    The nonlinear feedback shift registers of the second order inare considered, because based on them it can be developed a generator of stream ciphers with enhanced cryptographic strength. Feasibility of nonlinear feedback shift register search is analyzed. These registers form a maximal length sequence, using programmable logic devices. Performance evaluation of programmable logic devices in the generation of pseudo-random sequence by nonlinear feedback shift registers is given. Recommendations to increase this performance are given. The dependence of the maximum generation rate (clock frequency), programmable logic devices on the number of concurrent nonlinear registers is analyzed. A comparison of the generation rate of the sequences that are generated by nonlinear feedback shift registers is done using hardware and software. The author suggests, describes and explores the search method of nonlinear feedback shift registers, generating a sequence with a maximum period. As the main result are found non-linear 26, 27, 28 and 29 degrees polynomials

    Scalable method of searching for full-period Nonlinear Feedback Shift Registers with GPGPU. New List of Maximum Period NLFSRs.

    Get PDF
    This paper addresses the problem of efficient searching for Nonlinear Feedback Shift Registers (NLFSRs) with a guaranteed full period. The maximum possible period for an nn-bit NLFSR is 2n12^n-1 (all-zero state is omitted). %but omitting all-0 state makes the period 2n12^n-1 in their longest cycle of states. A multi-stages hybrid algorithm which utilizes Graphics Processor Units (GPU) power was developed for processing data-parallel throughput computation.Usage of abovementioned algorithm allows to give an extended list of n-bit NLFSR with maximum period for 7 cryptographically applicable types of feedback functions

    A Scalable Method for Constructing Galois NLFSRs with Period 2n12^n-1 using Cross-Join Pairs

    Get PDF
    This paper presents a method for constructing nn-stage Galois NLFSRs with period 2n12^n-1 from nn-stage maximum length LFSRs. We introduce nonlinearity into state cycles by adding a nonlinear Boolean function to the feedback polynomial of the LFSR. Each assignment of variables for which this function evaluates to 1 acts as a crossing point for the LFSR state cycle. By adding a copy of the same function to a later stage of the register, we cancel the effect of nonlinearity and join the state cycles back. The presented method requires no extra time steps and it has a smaller area overhead compared to the previous approaches based on cross-join pairs. It is feasible for large nn. However, it has a number of limitations. One is that the resulting NLFSRs can have at most n/2\lfloor n/2 \rfloor-1 stages with a nonlinear update. Another is that feedback functions depend only on state variables which are updated linearly. The latter implies that sequences generated by the presented method can also be generated using a nonlinear filter generator

    Design and Analysis of Cryptographic Pseudorandom Number/Sequence Generators with Applications in RFID

    Get PDF
    This thesis is concerned with the design and analysis of strong de Bruijn sequences and span n sequences, and nonlinear feedback shift register (NLFSR) based pseudorandom number generators for radio frequency identification (RFID) tags. We study the generation of span n sequences using structured searching in which an NLFSR with a class of feedback functions is employed to find span n sequences. Some properties of the recurrence relation for the structured search are discovered. We use five classes of functions in this structured search, and present the number of span n sequences for 6 <= n <= 20. The linear span of a new span n sequence lies between near-optimal and optimal. According to our empirical studies, a span n sequence can be found in the structured search with a better probability of success. Newly found span n sequences can be used in the composited construction and in designing lightweight pseudorandom number generators. We first refine the composited construction based on a span n sequence for generating long de Bruijn sequences. A de Bruijn sequence produced by the composited construction is referred to as a composited de Bruijn sequence. The linear complexity of a composited de Bruijn sequence is determined. We analyze the feedback function of the composited construction from an approximation point of view for producing strong de Bruijn sequences. The cycle structure of an approximated feedback function and the linear complexity of a sequence produced by an approximated feedback function are determined. A few examples of strong de Bruijn sequences with the implementation issues of the feedback functions of an (n+16)-stage NLFSR are presented. We propose a new lightweight pseudorandom number generator family, named Warbler family based on NLFSRs for smart devices. Warbler family is comprised of a combination of modified de Bruijn blocks (CMDB) and a nonlinear feedback Welch-Gong (WG) generator. We derive the randomness properties such as period and linear complexity of an output sequence produced by the Warbler family. Two instances, Warbler-I and Warbler-II, of the Warbler family are proposed for passive RFID tags. The CMDBs of both Warbler-I and Warbler-II contain span n sequences that are produced by the structured search. We analyze the security properties of Warbler-I and Warbler-II by considering the statistical tests and several cryptanalytic attacks. Hardware implementations of both instances in VHDL show that Warbler-I and Warbler-II require 46 slices and 58 slices, respectively. Warbler-I can be used to generate 16-bit random numbers in the tag identification protocol of the EPC Class 1 Generation 2 standard, and Warbler-II can be employed as a random number generator in the tag identification as well as an authentication protocol for RFID systems.1 yea

    Secure and Unclonable Integrated Circuits

    Get PDF
    Semiconductor manufacturing is increasingly reliant in offshore foundries, which has raised concerns with counterfeiting, piracy, and unauthorized overproduction by the contract foundry. The recent shortage of semiconductors has aggravated such problems, with the electronic components market being flooded by recycled, remarked, or even out-of-spec, and defective parts. Moreover, modern internet connected applications require mechanisms that enable secure communication, which must be protected by security countermeasures to mitigate various types of attacks. In this thesis, we describe techniques to aid counterfeit prevention, and mitigate secret extraction attacks that exploit power consumption information. Counterfeit prevention requires simple and trustworthy identification. Physical unclonable functions (PUFs) harvest process variation to create a unique and unclonable digital fingerprint of an IC. However, learning attacks can model the PUF behavior, invalidating its unclonability claims. In this thesis, we research circuits and architectures to make PUFs more resilient to learning attacks. First, we propose the concept of non-monotonic response quantization, where responses not always encode the best performing circuit structure. Then, we explore the design space of PUF compositions, assessing the trade-off between stability and resilience to learning attacks. Finally, we introduce a lightweight key based challenge obfuscation technique that uses a chip unique secret to construct PUFs which are more resilient to learning attacks. Modern internet protocols demand message integrity, confidentiality, and (often) non-repudiation. Adding support for such mechanisms requires on-chip storage of a secret key. Even if the key is produced by a PUF, it will be subject to key extraction attacks that use power consumption information. Secure integrated circuits must address power analysis attacks with appropriate countermeasures. Traditional mitigation techniques have limited scope of protection, and impose several restrictions on how sensitive data must be manipulated. We demonstrate a bit-serial RISC-V microprocessor implementation with no plain-text data in the clear, where all values are protected using Boolean masking and differential domino logic. Software can run with little to no countermeasures, reducing code size and performance overheads. Our methodology is fully automated and can be applied to designs of arbitrary size or complexity. We also provide details on other key components such as clock randomizer, memory protection, and random number generator

    Automated Design Space Exploration and Datapath Synthesis for Finite Field Arithmetic with Applications to Lightweight Cryptography

    Get PDF
    Today, emerging technologies are reaching astronomical proportions. For example, the Internet of Things has numerous applications and consists of countless different devices using different technologies with different capabilities. But the one invariant is their connectivity. Consequently, secure communications, and cryptographic hardware as a means of providing them, are faced with new challenges. Cryptographic algorithms intended for hardware implementations must be designed with a good trade-off between implementation efficiency and sufficient cryptographic strength. Finite fields are widely used in cryptography. Examples of algorithm design choices related to finite field arithmetic are the field size, which arithmetic operations to use, how to represent the field elements, etc. As there are many parameters to be considered and analyzed, an automation framework is needed. This thesis proposes a framework for automated design, implementation and verification of finite field arithmetic hardware. The underlying motif throughout this work is “math meets hardware”. The automation framework is designed to bring the awareness of underlying mathematical structures to the hardware design flow. It is implemented in GAP, an open source computer algebra system that can work with finite fields and has symbolic computation capabilities. The framework is roughly divided into two phases, the architectural decisions and the automated design genera- tion. The architectural decisions phase supports parameter search and produces a list of candidates. The automated design generation phase is invoked for each candidate, and the generated VHDL files are passed on to conventional synthesis tools. The candidates and their implementation results form the design space, and the framework allows rapid design space exploration in a systematic way. In this thesis, design space exploration is focused on finite field arithmetic. Three distinctive features of the proposed framework are the structure of finite fields, tower field support, and on the fly submodule generation. Each finite field used in the design is represented as both a field and its corresponding vector space. It is easy for a designer to switch between fields and vector spaces, but strict distinction of the two is necessary for hierarchical designs. When an expression is defined over an extension field, the top-level module contains element signals and submodules for arithmetic operations on those signals. The submodules are generated with corresponding vector signals and the arithmetic operations are now performed on the coordinates. For tower fields, the submodules are generated for the subfield operations, and the design is generated in a top-down fashion. The binding of expressions to the appropriate finite fields or vector spaces and a set of customized methods allow the on the fly generation of expressions for implementation of arithmetic operations, and hence submodule generation. In the light of NIST Lightweight Cryptography Project (LWC), this work focuses mainly on small finite fields. The thesis illustrates the impact of hardware implementation results during the design process of WAGE, a Round 2 candidate in the NIST LWC standardization competition. WAGE is a hardware oriented authenticated encryption scheme. The parameter selection for WAGE was aimed at balancing the security and hardware implementation area, using hardware implementation results for many design decisions, for example field size, representation of field elements, etc. In the proposed framework, the components of WAGE are used as an example to illustrate different automation flows and demonstrate the design space exploration on a real-world algorithm

    Optimizations and Hardware Implementations for Composited de Bruijn Sequence Generators

    Get PDF
    A binary de Bruijn sequence with period 2^n is a sequence in which every length-n sub-sequence occurs exactly once. de Bruijn sequences have randomness properties that make them attractive for pseudorandom number generators. Unfortunately, it is very difficult to find de Bruijn sequence generators with large periods (e.g., 2^{64}) and most known de Bruijn sequence construction techniques are computationally quite expensive. In this thesis we present a set of optimizations that reduces the computational complexity of the de Bruijn sequence generators constructed by the composited construction technique, which is the most effective one we know. We call optimized composited de Bruijn sequence generators "OcDeb". An original (k, n)-composited de Bruijn sequence generator generates a sequence with period 2^{n+k} and uses O(k^2 + nk) bit operations. Our optimizations reduce this to O(klog (k) + log (n)) operations, allow retiming, and enable parallel implementations that produce multiple bits per clock cycle while reusing some combinational hardware. Our optimizations are formulated in lemmas and theorems with proofs. The benefits of OcDeb-k-n over (k, n)-composited de Bruijn sequence generators are demonstrate by comprehensive results in a 65nm CMOS ASIC library. For example, before place-and-route, an instance of OcDeb-32-32 has a period of 2^{64}, an area of 656 GE and a maximum performance of 1.67 Gbps, representing 1.7X and 29.4X improvement on area and performance respectively over the previous implementation method presented by Mandal and Gong; with parallelization, this instance can achieve 8.30 Gbps with an area of 1229 GE. An instance of OcDeb-512-32 has a period of 2^{544}, an area of 7949 GE, and a maximum performance of 1.43 Gbps

    Randomness Generation for Secure Hardware Masking - Unrolled Trivium to the Rescue

    Get PDF
    Masking is a prominent strategy to protect cryptographic implementations against side-channel analysis. Its popularity arises from the exponential security gains that can be achieved for (approximately) quadratic resource utilization. Many variants of the countermeasure tailored for different optimization goals have been proposed over the past decades. The common denominator among all of them is the implicit demand for robust and high entropy randomness. Simply assuming that uniformly distributed random bits are available, without taking the cost of their generation into account, leads to a poor understanding of the efficiency and performance of secure implementations. This is especially relevant in case of hardware masking schemes which are known to consume large amounts of random bits per cycle due to parallelism. Currently, there seems to be no consensus on how to most efficiently derive many pseudo-random bits per clock cycle from an initial seed and with properties suitable for masked hardware implementations. In this work, we evaluate a number of building blocks for this purpose and find that hardware-oriented stream ciphers like Trivium and its reduced-security variant Bivium B outperform all competitors when implemented in an unrolled fashion. Unrolled implementations of these primitives enable the flexible generation of many bits per cycle while maintaining high performance, which is crucial for satisfying the large randomness demands of state-of-the-art masking schemes. According to our analysis, only Linear Feedback Shift Registers (LFSRs), when also unrolled, are capable of producing long non-repetitive sequences of random-looking bits at a high rate per cycle even more efficiently than Trivium and Bivium B. Yet, these instances do not provide black-box security as they generate only linear outputs. We experimentally demonstrate that using multiple output bits from an LFSR in the same masked implementation can violate probing security and even lead to harmful randomness cancellations. Circumventing these problems, and enabling an independent analysis of randomness generation and masking scheme, requires the use of cryptographically stronger primitives like stream ciphers. As a result of our studies, we provide an evidence-based estimate for the cost of securely generating n fresh random bits per cycle. Depending on the desired level of black-box security and operating frequency, this cost can be as low as 20n to 30n ASIC gate equivalents (GE) or 3n to 4n FPGA look-up tables (LUTs), where n is the number of random bits required. Our results demonstrate that the cost per bit is (sometimes significantly) lower than estimated in previous works, incentivizing parallelism whenever exploitable and potentially moving low randomness usage in hardware masking research from a primary to secondary design goal
    corecore