29 research outputs found

    A Masked Pure-Hardware Implementation of Kyber Cryptographic Algorithm

    Security against side-channel assisted attacks remains a focus and concern in the ongoing standardization process of quantum-computer-resistant cryptography algorithms. Hiding and masking techniques are currently under investigation to protect the Post-Quantum Cryptography (PQC) algorithms in the NIST PQC standardization process against sophisticated side-channel attacks. Of the two, masking is emerging as the more popular option because it is simpler and cheaper to implement than hiding, which often requires duplication of hardware resources and advanced analysis and design techniques to implement correctly. This work presents a pure hardware implementation of masked CCA2-secure Kyber-512, a candidate chosen by NIST to be standardized. A novel hiding technique that leverages the advantages of FPGAs over micro-controllers and is demonstrably secure against Simple Power Analysis (SPA) and Differential Power Analysis (DPA) side-channel attacks is presented. Finally, a novel hybrid hiding-masking approach is presented that achieves a reduced hardware resource and clock-cycle penalty compared with previously reported figures for similar PQC candidates. Test Vector Leakage Assessment (TVLA) is adopted to demonstrate the absence of side-channel leakage.

    Parameterized Hardware Design on Reconfigurable Computers: An Image Processing Case Study

    Reconfigurable Computers (RCs) with hardware (FPGA) co-processors can achieve significant performance improvement compared with traditional microprocessor (μP)-based computers for many scientific applications. The potential amount of speedup depends on the intrinsic parallelism of the target application as well as the characteristics of the target platform. In this work, we use image processing applications as a case study to demonstrate how hardware designs are parameterized by the co-processor architecture, particularly the data I/O, i.e., the local memory of the FPGA device and the interconnect between the FPGA and the μP. The local memory has to be used by applications that access data randomly. A typical case belonging to this category is image registration. On the other hand, an application such as edge detection can directly read data through the interconnect in a sequential fashion. Two different image registration algorithms, the exhaustive search algorithm and the Discrete Wavelet Transform (DWT)-based search algorithm, are implemented in hardware, i.e., on the Xilinx Virtex-II Pro 50 of the Cray XD1 reconfigurable computer. The performance improvements of the hardware implementations are 10× and 2×, respectively. For the category of applications that directly access the interconnect, the hardware implementation of Canny edge detection achieves a 544× speedup.

    Accelerating LSTM-based High-Rate Dynamic System Models

    In this paper, we evaluate the use of a trained Long Short-Term Memory (LSTM) network as a surrogate for a Euler-Bernoulli beam model, and then we describe and characterize an FPGA-based deployment of the model for use in real-time structural health monitoring applications. The focus of our efforts is the DROPBEAR (Dynamic Reproduction of Projectiles in Ballistic Environments for Advanced Research) dataset, which was generated as a benchmark for the study of real-time structural modeling applications. The purpose of DROPBEAR is to evaluate models that take vibration data as input and give the initial conditions of the cantilever beam on which the measurements were taken as output. DROPBEAR is meant to serve as an exemplar for emerging high-rate "active structures" that can be actively controlled with feedback latencies of less than one microsecond. Although the Euler-Bernoulli beam model is a well-known solution to this modeling problem, its computational cost is prohibitive for the time scales of interest. It has been previously shown that a properly structured LSTM network can achieve comparable accuracy with less workload, but achieving sub-microsecond model latency remains a challenge. Our approach is to deploy an LSTM optimized specifically for latency on an FPGA. We designed the model using both high-level synthesis (HLS) and a hardware description language (HDL). The lowest latency of 1.42 μs and the highest throughput of 7.87 Gops/s were achieved on the Alveo U55C platform for the HDL design. Comment: Accepted at 33rd International Conference on Field-Programmable Logic and Applications (FPL

    FPGA Processor In Memory Architectures (PIMs): Overlay or Overhaul ?

    The dominance of machine learning and the ending of Moore's law have renewed interest in Processor in Memory (PIM) architectures. This interest has produced several recent proposals to modify an FPGA's BRAM architecture to form a next-generation PIM reconfigurable fabric. PIM architectures can also be realized within today's FPGAs as overlays without the need to modify the underlying FPGA architecture. To date, there has been no study of the comparative advantages of the two approaches. In this paper, we present a study that explores the comparative advantages of two proposed custom architectures and a PIM overlay running on a commodity FPGA. We created PiCaSO, a Processor in/near Memory Scalable and Fast Overlay architecture, as a representative PIM overlay. The results of this study show that the PiCaSO overlay achieves up to 80% of the peak throughput of the custom designs with 2.56× shorter latency and 25%-43% better BRAM memory utilization efficiency. We then show how several key features of the PiCaSO overlay can be integrated into the custom PIM designs to further improve their throughput by 18%, latency by 19.5%, and memory efficiency by 6.2%. Comment: Accepted in 2023 33rd International Conference on Field-Programmable Logic and Applications (FPL

    Power-based Side Channel Attack Analysis on PQC Algorithms

    Power-based side channel attacks have been successfully conducted against proven cryptographic algorithms, including standardized algorithms such as AES and RSA. These algorithms are now supported by best practices in hardware and software to defend against malicious attacks. As NIST conducts the third round of the post-quantum cryptography (PQC) standardization process, a key task is to identify how secure the candidate algorithms are against side channel attacks, and the tradeoffs that must be made to obtain that level of protection. In this work, we document the development of a multi-target and multi-tool platform to conduct test vector leakage assessment of the candidate algorithms. The long-term goals of the platform are to 1) quantify test vector leakage of each of the primary and alternate candidates, 2) quantify test vector leakage of each of the candidates when adjustments and adaptations (e.g., masking) are applied, and 3) assess the equivalent security levels when tools of varying sophistication are used in the attack (e.g., commodity vs. specialized hardware). The goal of this work is to document the progress towards that standardized platform and to invite discussion on how to extend, refine, and distribute our tools.

    Identification of efferocytosis-related subtypes in gliomas and elucidating their characteristics and clinical significance

    Introduction: Gliomas, the most prevalent tumors of the central nervous system, are known for their aggressive nature and poor prognosis. The heterogeneity among gliomas leads to varying responses to the same treatments, even among similar glioma types. In our study, we identified efferocytosis-related subtypes and explored their characteristics in terms of immune landscape, intercellular communication, and metabolic processes, ultimately elucidating their potential clinical implications. Methods and Results: We first identified efferocytosis-related subtypes in bulk RNA-seq data using the NMF algorithm. We then preliminarily demonstrated the correlation of these subtypes with efferocytosis by examining enrichment scores of cell death pathways, macrophage infiltration, and the expression of immune ligands. Our analysis of single-cell RNA-seq data further supported the association of these subtypes with efferocytosis. Through enrichment analysis, we found that efferocytosis-related subtypes differ from other types of gliomas in terms of immune landscape, intercellular communication, and substance metabolism. Moreover, by calculating AUC values, we found that the efferocytosis-related classification is a prognostic factor with robust predictive performance. We also found that efferocytosis-related subtypes, when compared with other gliomas in drug sensitivity, survival, and TIDE scores, show a clear link to the effectiveness of chemotherapy, radiotherapy, and immunotherapy in glioma patients. Discussion: We identified efferocytosis-related subtypes in gliomas by analyzing the expression of 137 efferocytosis-associated genes, exploring their characteristics in immune landscape, intercellular communication, metabolic processes, and genomic variations. Moreover, we discovered that the classification of efferocytosis-related subtypes has strong prognostic predictive power and holds potential significance in guiding clinical treatment.

    The Promise of High-Performance Reconfigurable Computing


    A modified radix-2 Montgomery modular multiplication with new recoding method


    Análisis de la información medioambiental de las empresas españolas [Analysis of the Environmental Information of Spanish Companies]
