66 research outputs found

    FACCT: FAst, Compact, and Constant-Time Discrete Gaussian Sampler over Integers

    Get PDF
    The discrete Gaussian sampler is one of the fundamental tools in implementing lattice-based cryptosystems. However, a naive discrete Gaussian sampling implementation suffers from side-channel vulnerabilities, and the existing countermeasures usually introduce significant overhead in either the running speed or the memory consumption. In this paper, we propose a fast, compact, and constant-time implementation of the binary sampling algorithm, originally introduced in the BLISS signature scheme. Our implementation adapts the Rényi divergence and the transcendental function polynomial approximation techniques. The efficiency of our scheme is independent of the standard deviation, and we show evidence that our implementations are either faster or more compact than several existing constant-time samplers. In addition, we show the performance of our implementation techniques applied to and integrated with two existing signature schemes: qTesla and Falcon. On the other hand, the convolution theorems are typically adapted to sample from larger standard deviations, by combining samples with much smaller standard deviations. As an additional contribution, we show better parameters for the convolution theorems

    Isochronous Gaussian Sampling: From Inception to Implementation

    Get PDF
    Gaussian sampling over the integers is a crucial tool in lattice-based cryptography, but has proven over the recent years to be surprisingly challenging to perform in a generic, efficient and provable secure manner. In this work, we present a modular framework for generating discrete Gaussians with arbitrary center and standard deviation. Our framework is extremely simple, and it is precisely this simplicity that allowed us to make it easy to implement, provably secure, portable, efficient, and provably resistant against timing attacks. Our sampler is a good candidate for any trapdoor sampling and it is actually the one that has been recently implemented in the Falcon signature scheme. Our second contribution aims at systematizing the detection of implementation errors in Gaussian samplers. We provide a statistical testing suite for discrete Gaussians called SAGA (Statistically Acceptable GAussian). In a nutshell, our two contributions take a step towards trustable and robust Gaussian sampling real-world implementations

    Generic, Efficient and Isochronous Gaussian Sampling over the Integers

    Get PDF
    Gaussian sampling over the integers is one of the fundamental building blocks of lattice-based cryptography. Among the extensively used trapdoor sampling algorithms, it\u27s ineluctable until now. Under the influence of numerous side-channel attacks, it\u27s still challenging to construct a Gaussian sampler that is generic, efficient, and resistant to timing attacks. In this paper, our contribution is three-fold. First, we propose a secure, efficient exponential Bernoulli sampling algorithm. It can be applied to Gaussian samplers based on rejection samplings. We apply it to FALCON, a candidate of round 3 of the NIST post-quantum cryptography standardization project, and reduce its signature generation time by 13%-14%. Second, we develop an isochronous Gaussian sampler based on rejection sampling. Our Algorithm can securely sample from Gaussian distributions with different standard deviations and arbitrary centers. We apply it to PALISADE (S&P 2018), an open-source lattice cryptography library. During the online phase of trapdoor sampling, the running time of the G-lattice sampling algorithm is reduced by 44.12% while resisting timing attacks. Third, we improve the efficiency of the COSAC sampler (PQC 2020). The new COSAC sampler is 1.46x-1.63x faster than the original and has the lowest expected number of trials among all Gaussian samplers based on rejection samplings. But it needs a more efficient algorithm sampling from the normal distribution to improve its performance

    Envisioning the Future of Cyber Security in Post-Quantum Era: A Survey on PQ Standardization, Applications, Challenges and Opportunities

    Full text link
    The rise of quantum computers exposes vulnerabilities in current public key cryptographic protocols, necessitating the development of secure post-quantum (PQ) schemes. Hence, we conduct a comprehensive study on various PQ approaches, covering the constructional design, structural vulnerabilities, and offer security assessments, implementation evaluations, and a particular focus on side-channel attacks. We analyze global standardization processes, evaluate their metrics in relation to real-world applications, and primarily focus on standardized PQ schemes, selected additional signature competition candidates, and PQ-secure cutting-edge schemes beyond standardization. Finally, we present visions and potential future directions for a seamless transition to the PQ era

    How to Sample a Discrete Gaussian (and more) from a Random Oracle

    Get PDF
    The random oracle methodology is central to the design of many practical cryptosystems. A common challenge faced in several systems is the need to have a random oracle that outputs from a structured distribution D\mathcal{D}, even though most heuristic implementations such as SHA-3 are best suited for outputting bitstrings. Our work explores the problem of sampling from discrete Gaussian (and related) distributions in a manner that they can be programmed into random oracles. We make the following contributions: -We provide a definitional framework for our results. We say that a sampling algorithm Sample\mathsf{Sample} for a distribution is explainable if there exists an algorithm Explain\mathsf{Explain} where, for a xx in the domain, we have that Explain(x)→r∈{0,1}n\mathsf{Explain}(x) \rightarrow r \in \{0,1\}^n such that Sample(r)=x\mathsf{Sample}(r)=x. Moreover, if xx is sampled from D\mathcal{D} the explained distribution is statistically close to choosing rr uniformly at random. We consider a variant of this definition that allows the statistical closeness to be a precision parameter\u27\u27 given to the Explain\mathsf{Explain} algorithm. We show that sampling algorithms which satisfy our `explainability\u27 property can be programmed as a random oracle. -We provide a simple algorithm for explaining \emph{any} sampling algorithm that works over distributions with polynomial sized ranges. This includes discrete Gaussians with small standard deviations. -We show how to transform a (not necessarily explainable) sampling algorithm Sample\mathsf{Sample} for a distribution into a new Sample2˘7\mathsf{Sample}\u27 that is explainable. The requirements for doing this is that (1) the probability density function is efficiently computable (2) it is possible to efficiently uniformly sample from all elements that have a probability density above a given threshold pp, showing the equivalence of random oracles to these distributions and random oracles to uniform bitstrings. This includes a large class of distributions, including all discrete Gaussians. -A potential drawback of the previous approach is that the transformation requires an additional computation of the density function. We provide a more customized approach that shows the Miccancio-Walter discrete Gaussian sampler is explainable as is. This suggests that other discrete Gaussian samplers in a similar vein might also be explainable as is

    Hardware Architectures for Post-Quantum Cryptography

    Get PDF
    The rapid development of quantum computers poses severe threats to many commonly-used cryptographic algorithms that are embedded in different hardware devices to ensure the security and privacy of data and communication. Seeking for new solutions that are potentially resistant against attacks from quantum computers, a new research field called Post-Quantum Cryptography (PQC) has emerged, that is, cryptosystems deployed in classical computers conjectured to be secure against attacks utilizing large-scale quantum computers. In order to secure data during storage or communication, and many other applications in the future, this dissertation focuses on the design, implementation, and evaluation of efficient PQC schemes in hardware. Four PQC algorithms, each from a different family, are studied in this dissertation. The first hardware architecture presented in this dissertation is focused on the code-based scheme Classic McEliece. The research presented in this dissertation is the first that builds the hardware architecture for the Classic McEliece cryptosystem. This research successfully demonstrated that complex code-based PQC algorithm can be run efficiently on hardware. Furthermore, this dissertation shows that implementation of this scheme on hardware can be easily tuned to different configurations by implementing support for flexible choices of security parameters as well as configurable hardware performance parameters. The successful prototype of the Classic McEliece scheme on hardware increased confidence in this scheme, and helped Classic McEliece to get recognized as one of seven finalists in the third round of the NIST PQC standardization process. While Classic McEliece serves as a ready-to-use candidate for many high-end applications, PQC solutions are also needed for low-end embedded devices. Embedded devices play an important role in our daily life. Despite their typically constrained resources, these devices require strong security measures to protect them against cyber attacks. Towards securing this type of devices, the second research presented in this dissertation focuses on the hash-based digital signature scheme XMSS. This research is the first that explores and presents practical hardware based XMSS solution for low-end embedded devices. In the design of XMSS hardware, a heterogenous software-hardware co-design approach was adopted, which combined the flexibility of the soft core with the acceleration from the hard core. The practicability and efficiency of the XMSS software-hardware co-design is further demonstrated by providing a hardware prototype on an open-source RISC-V based System-on-a-Chip (SoC) platform. The third research direction covered in this dissertation focuses on lattice-based cryptography, which represents one of the most promising and popular alternatives to today\u27s widely adopted public key solutions. Prior research has presented hardware designs targeting the computing blocks that are necessary for the implementation of lattice-based systems. However, a recurrent issue in most existing designs is that these hardware designs are not fully scalable or parameterized, hence limited to specific cryptographic primitives and security parameter sets. The research presented in this dissertation is the first that develops hardware accelerators that are designed to be fully parameterized to support different lattice-based schemes and parameters. Further, these accelerators are utilized to realize the first software-harware co-design of provably-secure instances of qTESLA, which is a lattice-based digital signature scheme. This dissertation demonstrates that even demanding, provably-secure schemes can be realized efficiently with proper use of software-hardware co-design. The final research presented in this dissertation is focused on the isogeny-based scheme SIKE, which recently made it to the final round of the PQC standardization process. This research shows that hardware accelerators can be designed to offload compute-intensive elliptic curve and isogeny computations to hardware in a versatile fashion. These hardware accelerators are designed to be fully parameterized to support different security parameter sets of SIKE as well as flexible hardware configurations targeting different user applications. This research is the first that presents versatile hardware accelerators for SIKE that can be mapped efficiently to both FPGA and ASIC platforms. Based on these accelerators, an efficient software-hardwareco-design is constructed for speeding up SIKE. In the end, this dissertation demonstrates that, despite being embedded with expensive arithmetic, the isogeny-based SIKE scheme can be run efficiently by exploiting specialized hardware. These four research directions combined demonstrate the practicability of building efficient hardware architectures for complex PQC algorithms. The exploration of efficient PQC solutions for different hardware platforms will eventually help migrate high-end servers and low-end embedded devices towards the post-quantum era

    Enhanced Lattice-Based Signatures on Reconfigurable Hardware

    Get PDF
    The recent Bimodal Lattice Signature Scheme (BLISS) showed that lattice-based constructions have evolved to practical alternatives to RSA or ECC. It offers small signatures of 5600 bits for a 128-bit level of security, and proved to be very fast in software. However, due to the complex sampling of Gaussian noise with high precision, it is not clear whether this scheme can be mapped efficiently to embedded devices. Even though the authors of BLISS also proposed a new sampling algorithm using Bernoulli variables this approach is more complex than previous methods using large precomputed tables. The clear disadvantage of using large tables for high performance is that they cannot be used on constrained computing environments, such as FPGAs, with limited memory. In this work we thus present techniques for an efficient Cumulative Distribution Table (CDT) based Gaussian sampler on reconfigurable hardware involving Peikert\u27s convolution lemma and the Kullback-Leibler divergence. Based on our enhanced sampler design, we provide a scalable implementation of BLISS signing and verification on a Xilinx Spartan-6 FPGA supporting either 128-bit, 160-bit, or 192-bit security. For high speed we integrate fast FFT/NTT-based polynomial multiplication, parallel sparse multiplication, Huffman compression of signatures, and Keccak as hash function. Additionally, we compare the CDT with the Bernoulli approach and show that for the particular BLISS-I parameter set the improved CDT approach is faster with lower area consumption. Our BLISS-I core uses 2,291 slices, 5.5 BRAMs, and 5 DSPs and performs a signing operation in 114.1 us on average. Verification is even faster with a latency of 61.2 us and 17,101 supported verification operations per second

    Design Techniques for High Performance Wireline Communication and Security Systems

    Full text link
    As the amount of data traffic grows exponentially on the internet, towards thousands of exabytes by 2020, high performance and high efficiency communication and security solutions are constantly in high demand, calling for innovative solutions. Within server communication dominates todays network data transfer, outweighing between-server and server-to-user data transfer by an order of magnitude. Solutions for within-server communication tend to be very wideband, i.e. on the order of tens of gigahertz, equalizers are widely deployed to provide extended bandwidth at reasonable cost. However, using equalizers typically costs the available signal-to-noise ratio (SNR) at the receiver side. What is worse is that the SNR available at the channel becomes worse as data rate increases, making it harder to meet the tight constraint on error rate, delay, and power consumption. In this thesis, two equalization solutions that address optimal equalizer implementations are discussed. One is a low-power high-speed maximum likelihood sequence detection (MLSD) that achieves record energy efficiency, below 10 pico-Joule per bit. The other one is a phase-shaping equalizer design that suppresses inter-symbol interference at almost zero cost of SNR. The growing amount of communication use also challenges the design of security subsystems, and the emerging need for post-quantum security adds to the difficulties. Most of currently deployed cryptographic primitives rely on the hardness of discrete logarithms that could potentially be solved efficiently with a powerful enough quantum computer. Efficient post-quantum encryption solutions have become of substantial value. In this thesis a fast and efficient lattice encryption application-specific integrated circuit is presented that surpasses the energy efficiency of embedded processors by 4 orders of magnitude.PHDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/146092/1/shisong_1.pd
    • …
    corecore