155 research outputs found

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 261, ICALP 2023, Complete Volum

    Architectures for Code-based Post-Quantum Cryptography

    Get PDF
    L'abstract è presente nell'allegato / the abstract is in the attachmen

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 274, ESA 2023, Complete Volum

    LIPIcs, Volume 258, SoCG 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 258, SoCG 2023, Complete Volum

    A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion

    Get PDF
    The extended GCD (XGCD) calculation, which computes Bézout coefficients ba, bb such that ba ∗ a0 + bb ∗ b0 = GCD(a0, b0), is a critical operation in many cryptographic applications. In particular, large-integer XGCD is computationally dominant for two applications of increasing interest: verifiable delay functions that square binary quadratic forms within a class group and constant-time modular inversion for elliptic curve cryptography. Most prior work has focused on fast software implementations. The few works investigating hardware acceleration build on variants of Euclid’s division-based algorithm, following the approach used in optimized software. We show that adopting variants of Stein’s subtraction-based algorithm instead leads to significantly faster hardware. We quantify this advantage by performing a large-integer XGCD accelerator design space exploration comparing Euclid- and Stein-based algorithms for various application requirements. This exploration leads us to an XGCD hardware accelerator that is flexible and efficient, supports fast average and constant-time evaluation, and is easily extensible for polynomial GCD. Our 16nm ASIC design calculates 1024-bit XGCD in 294ns (8x faster than the state-of-the-art ASIC) and constant-time 255-bit XGCD for inverses in the field of integers modulo the prime 2255−19 in 85ns (31× faster than state-of-the-art software). We believe our design is the first high-performance ASIC for the XGCD computation that is also capable of constant-time evaluation. Our work is publicly available at https://github.com/kavyasreedhar/sreedhar-xgcd-hardware-ches2022

    Side-Channel Analysis and Cryptography Engineering : Getting OpenSSL Closer to Constant-Time

    Get PDF
    As side-channel attacks reached general purpose PCs and started to be more practical for attackers to exploit, OpenSSL adopted in 2005 a flagging mechanism to protect against SCA. The opt-in mechanism allows to flag secret values, such as keys, with the BN_FLG_CONSTTIME flag. Whenever a flag is checked and detected, the library changes its execution flow to SCA-secure functions that are slower but safer, protecting these secret values from being leaked. This mechanism favors performance over security, it is error-prone, and is obscure for most library developers, increasing the potential for side-channel vulnerabilities. This dissertation presents an extensive side-channel analysis of OpenSSL and criticizes its fragile flagging mechanism. This analysis reveals several flaws affecting the library resulting in multiple side-channel attacks, improved cache-timing attack techniques, and a new side channel vector. The first part of this dissertation introduces the main topic and the necessary related work, including the microarchitecture, the cache hierarchy, and attack techniques; then it presents a brief troubled history of side-channel attacks and defenses in OpenSSL, setting the stage for the related publications. This dissertation includes seven original publications contributing to the area of side-channel analysis, microarchitecture timing attacks, and applied cryptography. From an SCA perspective, the results identify several vulnerabilities and flaws enabling protocol-level attacks on RSA, DSA, and ECDSA, in addition to full SCA of the SM2 cryptosystem. With respect to microarchitecture timing attacks, the dissertation presents a new side-channel vector due to port contention in the CPU execution units. And finally, on the applied cryptography front, OpenSSL now enjoys a revamped code base securing several cryptosystems against SCA, favoring a secure-by-default protection against side-channel attacks, instead of the insecure opt-in flagging mechanism provided by the fragile BN_FLG_CONSTTIME flag

    Unmanned Aerial Vehicle (UAV)-Enabled Wireless Communications and Networking

    Get PDF
    The emerging massive density of human-held and machine-type nodes implies larger traffic deviatiolns in the future than we are facing today. In the future, the network will be characterized by a high degree of flexibility, allowing it to adapt smoothly, autonomously, and efficiently to the quickly changing traffic demands both in time and space. This flexibility cannot be achieved when the network’s infrastructure remains static. To this end, the topic of UAVs (unmanned aerial vehicles) have enabled wireless communications, and networking has received increased attention. As mentioned above, the network must serve a massive density of nodes that can be either human-held (user devices) or machine-type nodes (sensors). If we wish to properly serve these nodes and optimize their data, a proper wireless connection is fundamental. This can be achieved by using UAV-enabled communication and networks. This Special Issue addresses the many existing issues that still exist to allow UAV-enabled wireless communications and networking to be properly rolled out

    Cryptography and Its Applications in Information Security

    Get PDF
    Nowadays, mankind is living in a cyber world. Modern technologies involve fast communication links between potentially billions of devices through complex networks (satellite, mobile phone, Internet, Internet of Things (IoT), etc.). The main concern posed by these entangled complex networks is their protection against passive and active attacks that could compromise public security (sabotage, espionage, cyber-terrorism) and privacy. This Special Issue “Cryptography and Its Applications in Information Security” addresses the range of problems related to the security of information in networks and multimedia communications and to bring together researchers, practitioners, and industrials interested by such questions. It consists of eight peer-reviewed papers, however easily understandable, that cover a range of subjects and applications related security of information

    Rethinking Cache Hierarchy And Interconnect Design For Next-Generation Gpus

    Get PDF
    To match the increasing computational demands of GPGPU applications and to improve peak compute throughput, the core counts in GPUs have been increasing with every generation. However, the famous memory wall is a major performance determinant in GPUs. In other words, in most cases, peak throughput in GPUs is ultimately dictated by memory bandwidth. Therefore, to serve the memory demands of thousands of concurrently executing threads, GPUs are equipped with several sources of bandwidth such as on-chip private/shared caching resources and off-chip high bandwidth memories. However, the existing sources of bandwidth are often not sufficient for achieving optimal GPU performance. Therefore, it is important to conserve and improve memory bandwidth utilization. To achieve the aforementioned goal, this dissertation focuses on improving on-chip cache bandwidth by managing cache line (data) replication across L1 caches via rethinking the cache hierarchy and the interconnect design. Such data replication stems from the private nature of the L1 caches and inter-core locality. Specifically, each GPU core can independently request and store a given cache line (in its local L1 cache) while being oblivious to the previous requests of other cores. This dissertation treats inter-core locality (i.e., data replication) as a double-edged sword, and proposes the following. First, this dissertation shows that efficient inter-core communication can exploit data replication across the L1 caches to unlock an additional potential source of on-chip bandwidth, which we call as remote-core bandwidth. We propose to efficiently coordinate the data movement across GPU cores to exploit this remote-core bandwidth by investigating: a) which data is replicated across cores, b) which cores have the replicated data, and c) how to fetch the replicated data as soon as possible. Second, this dissertation shows that if data replication is eliminated (or reduced), then the L1 caches can effectively cache more data, leading to higher hit rates and more on-chip bandwidth. We propose designing a shared L1 cache organization, which restricts each core to cache only a unique slice of the address range, eliminating data replication. We develop lightweight mechanisms to: a) reduce the inter-core communication overheads and b) to identify applications that prefer the private L1 organization and hence execute them accordingly. Third, to improve the performance, area, and energy efficiency of the shared L1 organization, this dissertation proposes DC-L1 (DeCoupled-L1) cache, an L1 cache separated from the GPU core. We show how the decoupled nature of the DC-L1 caches provides an opportunity to aggregate the L1 caches, and enables low-overhead efficient data placement designs. These optimizations reduce data replication across the L1s and increase their bandwidth utilization. Altogether, this dissertation develops several innovative techniques to improve the efficiency of the GPU on-chip memory system, which are necessary to address the memory wall problem. The future work will explore other designs and techniques to improve on-chip bandwidth utilization by considering other bandwidth sources (e.g., scratchpad and L2 cache)
    corecore