22 research outputs found

    Implementation of a Generic Modular Cryptosystem for the RSA on Reconfigurable Hardware

    Get PDF
    This report summarizes the work that was initiated from the summer of 2008, on the study and analysis of cryptographic design techniques and their implementation on an FPGA board,i.e. the Virtex II pro. The study began with the understanding of a popular HDL language, namely, Verilog. Based on the study an implementation of a modular cryptosystem based on the RSA and generic upto a 256 bit modulus was realized. Optimal techniques for developing a high speed RSA cryptosystem is presented in this work. Through out the thesis the primary tool was the Xilinx based ISE toolkit. However for validation purposes other simulators such as ModelSim was also used. However, the simulations presented in this work utilizes the Xilinx ISE 10.1 Simulator environment. The Xilinx XST 10.1 was used in the synthesis of the implementation. The division technique utilized a modified non-restoring division scheme. The multiplication scheme used the Karatsuba-Ofman technique. The exponentiation scheme used was the Montgomery Modular exponentiation. The inversion scheme used a modified form of the Extended Euclidean Algorithm which involves no division or multiplication as suggested by Laszlo Hars. The thesis concludes with suggestions on extending the present implementation of RSA on FPGA

    PROTEUS: A Tool to generate pipelined Number Theoretic Transform Architectures for FHE and ZKP applications

    Get PDF
    Emerging cryptographic algorithms such as fully homomorphic encryption (FHE) and zero-knowledge proof (ZKP) perform arithmetic involving very large polynomials. One fundamental and time-consuming polynomial operation is the Number theoretic transform (NTT) which is a generalization of the fast Fourier transform. Hardware platforms such as FPGAs could be used to accelerate the NTTs in FHE and ZKP protocols. One major problem is that the FHE and ZKP protocols require different parameter sets, e.g., polynomial degree and coefficient size, depending on their applications. Therefore, a basic research question is: How to design scalable hardware architectures for accelerating NTTs in the FHE and ZKP protocols? In this paper, we present ‘PROTEUS’, an open-source and parametric tool that generates synthesizable bandwidth-efficient NTT architectures for user-specified parameter sets. The architectures can be tuned to utilize different memory bandwidths and parameters which is a very important design requirement in both FHE and ZKP protocols. The generated NTT architectures show a significant performance speedup compared to similar NTT architectures on FPGA. Further comparisons with state-of-the-art show a reduction of up to 23% and 35% in terms of DSP and BRAM utilization

    VLSI Implementation of RSA Cryptosystem

    Get PDF
    In the age of information, security issues play a crucial role. Security comes with three points’ confidentiality, integrity and availability. The entire above said thing will come from an efficient cryptographic algorithm. We need special hardware to implement these cryptographic algorithms to provide higher throughput. This hardware should have high flexibility since the cryptographic algorithms are constantly changing. To achieve this goal VLSI implementation of these cryptosystem is the best solution. Our work is mainly based on designing architectures for Rivest–Shamir–Adleman (RSA), one of the most well-known public key cryptosystem. This work started with the understanding of basics of cryptography and modular arithmetic which is essential to understand cryptographic algorithm. Then in the cryptographic module RSA, a public key cryptographic module is chosen as the algorithm for implementation. The design goal is to increase the speed or in other words the throughput of RSA cryptosystem

    Design of a novel hybrid cryptographic processor

    Get PDF
    viii, 87 leaves : ill. (some col.) ; 28 cm.A new multiplier that supports fields GF(p) and GF (2n) for the public-key cryptography, and fields GF (28) for the secret-key cryptography is proposed in this thesis. Based on the core multiplier and other extracted common operations, a novel hybrid crypto-processor is built which processes both public-key and secret-key cryptosystems. The corresponding instruction set is also presented. Three cryptographic algorithms: the Elliptic Curve Cryptography (ECC), AES and RC5 are focused to run in the processor. To compute scalar multiplication kP efficiently, a blend of efficient algorthms on elliptic curves and coordinates selections and of hardware architecture that supports arithmetic operations on finite fields is requried. The Nonadjacent Form (NAF) of k is used in Jacobian projective coordinates over GF(p); Montgomery scalar multiplication is utilized in projective coordinates over GF(2n). The dual-field multiplier is used to support multiplications over GF(p) and GF(2n) according to multiple-precision Montgomery multiplications algorithms. The design ideas of AES and RC5 are also described. The proposed hybrid crypto-processor increases the flexibility of security schemes and reduces the total cost of cryptosystems

    Hardware Architectures for Post-Quantum Cryptography

    Get PDF
    The rapid development of quantum computers poses severe threats to many commonly-used cryptographic algorithms that are embedded in different hardware devices to ensure the security and privacy of data and communication. Seeking for new solutions that are potentially resistant against attacks from quantum computers, a new research field called Post-Quantum Cryptography (PQC) has emerged, that is, cryptosystems deployed in classical computers conjectured to be secure against attacks utilizing large-scale quantum computers. In order to secure data during storage or communication, and many other applications in the future, this dissertation focuses on the design, implementation, and evaluation of efficient PQC schemes in hardware. Four PQC algorithms, each from a different family, are studied in this dissertation. The first hardware architecture presented in this dissertation is focused on the code-based scheme Classic McEliece. The research presented in this dissertation is the first that builds the hardware architecture for the Classic McEliece cryptosystem. This research successfully demonstrated that complex code-based PQC algorithm can be run efficiently on hardware. Furthermore, this dissertation shows that implementation of this scheme on hardware can be easily tuned to different configurations by implementing support for flexible choices of security parameters as well as configurable hardware performance parameters. The successful prototype of the Classic McEliece scheme on hardware increased confidence in this scheme, and helped Classic McEliece to get recognized as one of seven finalists in the third round of the NIST PQC standardization process. While Classic McEliece serves as a ready-to-use candidate for many high-end applications, PQC solutions are also needed for low-end embedded devices. Embedded devices play an important role in our daily life. Despite their typically constrained resources, these devices require strong security measures to protect them against cyber attacks. Towards securing this type of devices, the second research presented in this dissertation focuses on the hash-based digital signature scheme XMSS. This research is the first that explores and presents practical hardware based XMSS solution for low-end embedded devices. In the design of XMSS hardware, a heterogenous software-hardware co-design approach was adopted, which combined the flexibility of the soft core with the acceleration from the hard core. The practicability and efficiency of the XMSS software-hardware co-design is further demonstrated by providing a hardware prototype on an open-source RISC-V based System-on-a-Chip (SoC) platform. The third research direction covered in this dissertation focuses on lattice-based cryptography, which represents one of the most promising and popular alternatives to today\u27s widely adopted public key solutions. Prior research has presented hardware designs targeting the computing blocks that are necessary for the implementation of lattice-based systems. However, a recurrent issue in most existing designs is that these hardware designs are not fully scalable or parameterized, hence limited to specific cryptographic primitives and security parameter sets. The research presented in this dissertation is the first that develops hardware accelerators that are designed to be fully parameterized to support different lattice-based schemes and parameters. Further, these accelerators are utilized to realize the first software-harware co-design of provably-secure instances of qTESLA, which is a lattice-based digital signature scheme. This dissertation demonstrates that even demanding, provably-secure schemes can be realized efficiently with proper use of software-hardware co-design. The final research presented in this dissertation is focused on the isogeny-based scheme SIKE, which recently made it to the final round of the PQC standardization process. This research shows that hardware accelerators can be designed to offload compute-intensive elliptic curve and isogeny computations to hardware in a versatile fashion. These hardware accelerators are designed to be fully parameterized to support different security parameter sets of SIKE as well as flexible hardware configurations targeting different user applications. This research is the first that presents versatile hardware accelerators for SIKE that can be mapped efficiently to both FPGA and ASIC platforms. Based on these accelerators, an efficient software-hardwareco-design is constructed for speeding up SIKE. In the end, this dissertation demonstrates that, despite being embedded with expensive arithmetic, the isogeny-based SIKE scheme can be run efficiently by exploiting specialized hardware. These four research directions combined demonstrate the practicability of building efficient hardware architectures for complex PQC algorithms. The exploration of efficient PQC solutions for different hardware platforms will eventually help migrate high-end servers and low-end embedded devices towards the post-quantum era

    Efficient Multiplication Architectures for Truncated Polynomial Ring

    Get PDF
    In this thesis, four efficient multiplication architectures, named as Multipliers I, II, III, and IV, respectively, for truncated polynomial ring are proposed. Their FPGA implementation results are presented. All of the four proposed multipliers can be used for implementation of NTRUEncrypt public key system. All new multiplication architectures are based on certain extensions to Linear Feedback Shift Register (LFSR). Multiplier I uses x^2-net structure for LFSR, which scans two consecutive coefficients in the control input polynomial r(x) during one clock cycle. In Multiplier II, three consecutive zeros in the control input polynomial r(x) can be processed during one clock cycle. Multiplier III takes advantage of consecutive zeros in the control input polynomial r(x). Multiplier IV is resistant to certain side-channel attacks through controlling the operations for each clock cycle. An FPGA complexity comparison among the proposed multipliers and the existing similar works is made, including number of adaptive logic modules (ALMs), number of registers, number of cycles, maximum operating frequency (FMax) and latency. The FPGA comparison results are given as follows. Multiplier I has smaller latency than any existing works when the first set of parameters from every security level is used (ees401ep1, ees449ep1, ees677ep1, ees1087ep2). Multiplier II is the second best in speed compared to existing works, but has better area-latency product compared to the fastest existing work for the first set of parameters at security level 112-bit, 128-bit and 192-bit. As an enhanced version of Multiplier II, Multiplier III is faster than any existing works in comparison for all IEEE recommended parameter sets. Multiplier IV, designed to be resistant to side channel attacks, also has high speed property that it outperforms all the existing works in terms of latency for all three parameter sets to which it is applicable

    Energy Efficient Neocortex-Inspired Systems with On-Device Learning

    Get PDF
    Shifting the compute workloads from cloud toward edge devices can significantly improve the overall latency for inference and learning. On the contrary this paradigm shift exacerbates the resource constraints on the edge devices. Neuromorphic computing architectures, inspired by the neural processes, are natural substrates for edge devices. They offer co-located memory, in-situ training, energy efficiency, high memory density, and compute capacity in a small form factor. Owing to these features, in the recent past, there has been a rapid proliferation of hybrid CMOS/Memristor neuromorphic computing systems. However, most of these systems offer limited plasticity, target either spatial or temporal input streams, and are not demonstrated on large scale heterogeneous tasks. There is a critical knowledge gap in designing scalable neuromorphic systems that can support hybrid plasticity for spatio-temporal input streams on edge devices. This research proposes Pyragrid, a low latency and energy efficient neuromorphic computing system for processing spatio-temporal information natively on the edge. Pyragrid is a full-scale custom hybrid CMOS/Memristor architecture with analog computational modules and an underlying digital communication scheme. Pyragrid is designed for hierarchical temporal memory, a biomimetic sequence memory algorithm inspired by the neocortex. It features a novel synthetic synapses representation that enables dynamic synaptic pathways with reduced memory usage and interconnects. The dynamic growth in the synaptic pathways is emulated in the memristor device physical behavior, while the synaptic modulation is enabled through a custom training scheme optimized for area and power. Pyragrid features data reuse, in-memory computing, and event-driven sparse local computing to reduce data movement by ~44x and maximize system throughput and power efficiency by ~3x and ~161x over custom CMOS digital design. The innate sparsity in Pyragrid results in overall robustness to noise and device failure, particularly when processing visual input and predicting time series sequences. Porting the proposed system on edge devices can enhance their computational capability, response time, and battery life

    Simulation and implementation of novel deep learning hardware architectures for resource constrained devices

    Get PDF
    Corey Lammie designed mixed signal memristive-complementary metal–oxide–semiconductor (CMOS) and field programmable gate arrays (FPGA) hardware architectures, which were used to reduce the power and resource requirements of Deep Learning (DL) systems; both during inference and training. Disruptive design methodologies, such as those explored in this thesis, can be used to facilitate the design of next-generation DL systems

    The Integrated Realization of Materials, Products and Associated Manufacturing Processes

    Get PDF
    Problem: A materials design revolution is underway in the recent past where the focus is to design (not select) the material microstructure and processing paths to achieve multiple property or performance requirements that are often in conflict. The advancements in computer simulations have resulted in the speeding up of the process of discovering new materials and has paved way for rapid assessment of process-structure-property-performance relationships of materials, products, and processes. This has led to the simulation-based design of material microstructure (microstructure-mediated design) to satisfy multiple property or performance goals of the product/process/system thereby replacing the classical material design and selection approaches. The foundational premise for this dissertation is that systems-based materials design techniques offer the potential for tailoring materials, their processing paths and the end products that employ these materials in an integrated fashion for challenging applications to satisfy conflicting product and process level property and performance requirements. The primary goal in this dissertation is to establish some of the scientific foundations and tools that are needed for the integrated realization of materials, products and manufacturing processes using simulation models that are typically incomplete, inaccurate and not of equal fidelity by managing the uncertainty associated. Accordingly, the interest in this dissertation lies in establishing a systems-based design architecture that includes system-level synthesis methods and tools that are required for the integrated design of complex materials, products and associated manufacturing processes starting from the end requirements. Hence the primary research question: What are the theoretical, mathematical and computational foundations needed for establishing a comprehensive systems-based design architecture to realize the integrated design of the product, its environment, manufacturing processes and material as a system? Major challenges to be addressed here are: a) integration of models (material, process and product) to establish processing-structure-property-performance relationships, b) goal-oriented inverse design of material microstructures and processing paths to meet multiple conflicting performance/property requirements, c) robust concept exploration by managing uncertainty across process chains and d) systematic, domain-independent, modular, reconfigurable, reusable, computer interpretable, archivable, and multi-objective decision support in the early stages of design to different users. Approach: In order to address these challenges, the primary hypothesis in this dissertation is to establish the theoretical, mathematical and computational foundations for: 1) forward material, product and process workflows through systematic identification and integration of models to define the processing-structure-property-performance relationships; 2) a concept exploration framework supporting systematic formulation of design problems facilitating robust design exploration by bringing together robust design principles and multi-objective decision making protocols; 3) a generic, goal-oriented, inverse decision-based design method that uses 1) and 2) to facilitate the systems-based inverse design of material microstructures and processing paths to meet multiple product level performance/property requirements, thereby generating the problem-specific inverse decision workflow; and 4) integrating the workflows with a knowledge-based platform anchored in modeling decision-related knowledge facilitating capture, execution and reuse of the knowledge associated with 1), 2) and 3). This establishes a comprehensive systems-based design architecture to realize the integrated design of the product, its environment, manufacturing processes and material as a system. Validation: The systems-based design architecture for the integrated realization of materials, products and associated manufacturing processes is validated using the validation-square approach that consists of theoretical and empirical validation. Empirical validation of the design architecture is carried out using an industry driven problem namely the ‘Integrated Design of Steel (Material), Manufacturing Processes (Rolling and Cooling) and Hot Rolled Rods (Product) for Automotive Gears’. Specific sub-problems are formulated within this problem domain to address various research questions identified in this dissertation. Contributions: The contributions from the dissertation are categorized into new knowledge in four research domains: a) systematic model integration (vertical and horizontal) for integrated material and product workflows, b) goal-oriented, inverse decision support, c) robust concept exploration of process chains with multiple conflicting goals and d) knowledge-based decision support for rapid and robust design exploration in simulation-based integrated material, product and process design. The creation of new knowledge in this dissertation is associated with the development of a systems-based design architecture involving systematic function-based approach of formulating forward material workflows, a concept exploration framework for systematic design exploration, an inverse decision-based design method, and robust design metrics, all integrated with a knowledge-based platform for decision support. The theoretical, mathematical and computational foundations for the design architecture are proposed in this dissertation to facilitate rapid and robust exploration of the design and solution spaces to identify material microstructures and processing paths that satisfy conflicting property and performance for complex materials, products and processes by managing uncertainty
    corecore