29 research outputs found

    Energy-Efficiency Evaluation of FPGAs for Floating-Point Intensive Workloads

    Get PDF
    In this work we describe a method to measure the computing performance and energy-efficiency to be expected of an FPGA device. The motivation of this work is given by their possible usage as accelerators in the context of floating-point intensive HPC workloads. In fact, FPGA devices in the past were not considered an efficient option to address floating-point intensive computations, but more recently, with the advent of dedicated DSP units and the increased amount of resources in each chip, the interest towards these devices raised. Another obstacle to a wide adoption of FPGAs in the HPC field has been the low level hardware knowledge commonly required to program them, using Hardware Description Languages (HDLs). Also this issue has been recently mitigated by the introduction of higher level programming framework, adopting so called High Level Synthesis approaches, reducing the development time and shortening the gap between the skills required to program FPGAs wrt the skills commonly owned by HPC software developers. In this work we apply the proposed method to estimate the maximum floating-point performance and energy-efficiency of the FPGA embedded in a Xilinx Zynq Ultrascale+ MPSoC hosted on a Trenz board

    Porting a Lattice Boltzmann Simulation to FPGAs Using OmpSs

    Get PDF
    Reconfigurable computing, exploiting Field Programmable Gate Arrays (FPGA), has become of great interest for both academia and industry research thanks to the possibility to greatly accelerate a variety of applications. The interest has been further boosted by recent developments of FPGA programming frameworks which allows to design applications at a higher-level of abstraction, for example using directive based approaches. In this work we describe our first experiences in porting to FPGAs an HPC application, used to simulate Rayleigh-Taylor instability of fluids with different density and temperature using Lattice Boltzmann Methods. This activity is done in the context of the FET HPC H2020 EuroEXA project which is developing an energyefficient HPC system, at exa-scale level, based on Arm processors and FPGAs. In this work we use the OmpSs directive based programming model, one of the models available within the EuroEXA project. OmpSs is developed by the Barcelona Supercomputing Center (BSC) and allows to target FPGA devices as accelerators, but also commodity CPUs and GPUs, enabling code portability across different architectures. In particular, we describe the initial porting of this application, evaluating the programming efforts required, and assessing the preliminary performances on a Trenz development board hosting a Xilinx Zynq UltraScale+ MPSoC embedding a 16nm FinFET+ programmable logic and a multi-core Arm CPU

    An Experimentally Verified Attack on Full Grain-128 Using Dedicated Reconfigurable Hardware

    Full text link
    In this paper we describe the first single-key attack which can recover the full key of the full version of Grain-128 for arbitrary keys by an algorithm which is significantly faster than exhaustive search (by a factor of about 238). It is based on a new version of a cube tester, which uses an improved choice of dynamic variables to eliminate the previously made assumption that ten particular key bits are zero. In addition, the new attack is much faster than the previous weak-key attack, and has a simpler key recovery process. Since it is extremely difficult to mathemat-ically analyze the expected behavior of such attacks, we implemented it on RIVYERA, which is a new massively parallel reconfigurable hardware, and tested its main components for dozens of random keys. These tests experimentally verified the correctness and expected complexity of the attack, by finding a very significant bias in our new cube tester for about 7.5 % of the keys we tested. This is the first time that the main compo-nents of a complex analytical attack are successfully realized against a full-size cipher with a special-purpose machine. Moreover, it is also the first attack that truly exploits the configurable nature of an FPGA-based cryptanalytical hardware

    Secure Algorithms for SAKA Protocol in the GSM Network

    Get PDF
    This paper deals with the security vulnerabilities of the cryptographic algorithms A3, A8, and A5 existing in the GSM network. We review these algorithms and propose new secure algorithms named NewA3, NewA8, and NewA5 algorithms with respect to the A3, A8, and A5 algorithms. Our NewA5 algorithm is based on block ciphers, but we also propose NewA5 algorithm with Cipher Feedback, Counter, and Output Feedback modes to convert block cipher into stream cipher. However, stream cipher algorithms are slower than the block cipher algorithm. These new algorithms are proposed to use with a secure and efficient authentication and key agreement (AKA) protocol in the GSM network. The proposed architecture is secure against partition attack, narrow pipe attack, collision attack, interleaving attack, and man-in-the-middle attack. The security analysis of the proposed algorithms are discussed with respect to the cryptanalysis, brute force analysis, and operational analysis. We choose the NewA3 and NewA8 algorithms for challenge-response and key generation, respectively. Furthermore, the NewA5 is suitable for encryption as it is efficient than the existing A5/1 and A5/2 algorithms. In case when stream cipher algorithms are required to use, our new algorithms, NewA5-CTR, NewA5-CFB, and NewA5-OFB can be used for specific applications. These algorithms are completely secure and better than the existing A5/1 and A5/2 in terms of resistant to attacks

    HIGH THROUGHPUT IMPLEMENTATION OF 64 BIT MODIFIED WALLANCE MAC USING MULTIOPERAND ADDERS

    Get PDF
    Although redundant addition is widely used to design parallel multioperand adders for ASIC implementations, the use of redundant adders on Field Programmable Gate Arrays (FPGAs) has generally been avoided. The main reasons are the efficient implementation of carry propagate adders (CPAs) on these devices (due to their specialized carry-chain resources) as well as the area overhead of the redundant adders when they are implemented on FPGAs. This project presents different approaches to the efficient implementation of generic carry-save compressor trees. In computing, especially digital signal processing, the multiply–accumulate operation is a common step that computes the product of two numbers and adds that product to an accumulator. The hardware unit that performs the operation is known as a multiplier–accumulator (MAC, or MAC unit); the operation itself is also often called a MAC or a MAC operation. Power dissipation is one of the most important design objectives in integrated circuit, after speed. Digital signal processing (DSP) circuits whose main building block is a Multiplier-Accumulator (MAC) unit. High speed and low power MAC unit is desirable for any DSP processor. This is because speed and throughput rate are always the concerns of DSP system. MAC unit consists of adder, multiplier, and an accumulator it preserves a unique mapping between input and output vector of the particular circuit. In this MAC operation is performed in two parts Partial Product Generation (PPG) circuit and Multi-Operand Addition (MOA) circui

    On the Security of the (F)HMQV Protocol

    No full text
    International audienceThe HMQV protocol is under consideration for IEEE P1363 standardization. We provide a complementary analysis of the HMQV protocol. Namely, we point a Key Compromise Impersonation (KCI) attack showing that the two and three pass HMQV protocols cannot achieve their security goals. Next, we revisit the FHMQV building blocks, design and security arguments; we clarify the security and efficiency separation between HMQV and FHMQV, showing the advantages of FH-MQV over HMQV

    Quantum Attacks on Modern Cryptography and Post-Quantum Cryptosystems

    Get PDF
    Cryptography is a critical technology in the modern computing industry, but the security of many cryptosystems relies on the difficulty of mathematical problems such as integer factorization and discrete logarithms. Large quantum computers can solve these problems efficiently, enabling the effective cryptanalysis of many common cryptosystems using such algorithms as Shor’s and Grover’s. If data integrity and security are to be preserved in the future, the algorithms that are vulnerable to quantum cryptanalytic techniques must be phased out in favor of quantum-proof cryptosystems. While quantum computer technology is still developing and is not yet capable of breaking commercial encryption, these steps can be taken immediately to ensure that the impending development of large quantum computers does not compromise sensitive data
    corecore