9 research outputs found

    Simple AEAD Hardware Interface (SÆHI) in a SoC: Implementing an On-Chip Keyak/WhirlBob Coprocessor

    Get PDF
    Simple AEAD Hardware Interface (SÆHI) is a hardware cryptographic interface aimed at CAESAR Authenticated Encryption with Associated Data (AEAD) algorithms. Cryptographic acceleration is typically achieved either with a coprocessor or via instruction set extensions. ISA modifications require re-engineering the CPU core, making the approach inapplicable outside the realm of open source processor cores. At minimum, we suggest implementing CAESAR AEADs as universal memory-mapped cryptographic coprocessors, synthesizable even on low end FPGA platforms. AEADs complying to SÆHI must also include C language API drivers targeting low-end MCUs that directly utilize the memory mapping in a ``bare metal\u27\u27 fashion. This can also be accommodated on MMU-equipped mid-range CPUs. Extended battery life and bandwidth resulting from dedicated cryptographic hardware is vital for currently dominant computing and communication devices: mobile phones, tablets, and Internet-of-Things (IoT) applications. We argue that these should be priority hardware optimization targets for AEAD algorithms with realistic payload profiles. We demonstrate a fully integrated implementation of WhirlBob and Keyak AEADs on the FPGA fabric of Xilinx Zynq 7010. This low-cost System-on-Chip (SoC) also houses a dual-core Cortex-A9 CPU, closely matching the architecture of many embedded devices. The on-chip coprocessor is accessible from user space with a Linux kernel driver. An integration path exists all the way to end-user applications

    NEON PQCryto: Fast and Parallel Ring-LWE Encryption on ARM NEON Architecture

    Get PDF
    Recently, ARM NEON architecture has occupied a significant share of tablet and smartphone markets due to its low cost and high performance. This paper studies efficient techniques of lattice-based cryptography on ARM processor and presents the first implementation of ring-LWE encryption on ARM NEON architecture. In particular, we propose a vectorized version of Iterative Number Theoretic Transform (NTT) for high-speed computation. We present a 32-bit variant of SAMS2 technique, original proposed in CHES’15, for fast reduction. A combination of proposed and previous optimizations results in a very efficient implementation. For 128-bit security level, our ring-LWE implementation requires only 145; 200 clock cycles for encryption and 32; 800 cycles for decryption. These result are more than 17:6 times faster than the fastest ECC implementation on ARM NEON with same security level

    Ohjelmistopohjainen etÀtodentaminen asioiden internetissÀ

    Get PDF
    When in the old days the Internet consisted mostly of workstations, servers, mainframes and networking devices, the rise of the Internet of Things has brought along smart embedded systems that are aware of their surroundings, make their own decisions and communicate with each other accordingly. These systems can be anything and anywhere from a lamp to a refrigerator. These systems require mutual trust and their integrity has to be monitored. One way to achieve this is to use attestation. Attestation is a process that is used for ensuring trust and integrity of a device. Another important factor in designing IoT devices is their cost-effectiveness. It is desirable for the devices to be cheap to manufacture so any extra hardware might become costly. One mechanism that helps to create attestation without extra hardware is to use software based attestation. The replacement of hardware attestation with software mechanisms enable faster provisioning of IoT devices to the network. One problem is that usually in IoT case the attestation traffic is communicated over insecure channels where an attacker might be listening. Another thing to be taken into consideration is the physical security, the theft of the device and its effects. One good thing of software-based attestation is the platform agnosticism

    Exponential S-Boxes: a Link Between the S-Boxes of BelT and Kuznyechik/Streebog

    Get PDF
    The block cipher Kuznyechik and the hash function Streebog were recently standardized by the Russian Federation. These primitives use a common 8-bit S-Box, denoted , which is given only as a look-up table. The rationale behind its design is, for all practical purposes, kept secret by its authors. In a paper presented at Eurocrypt 2016, Biryukov et al. reverse-engineered this S-Box and recovered an unusual Feistel-like structure relying on finite field multiplications. In this paper, we provide a new decomposition of this S-Box and describe how we obtained it. The first step was the analysis of the 8-bit S-Box of the current standard block cipher of Belarus, BelT. This S-Box is a variant of a so-called exponential substitution, a concept we generalize into pseudo-exponential substitution. We derive distinguishers for such permutations based on properties of their linear approximation tables and notice that shares some of them. We then show that indeed has a decomposition based on a pseudo-exponential substitution. More precisely, we obtain a simpler structure based on an 8-bit finite field exponentiation, one 4-bit S-Box, a linear layer and a few modular arithmetic operations. We also make several observations which may help cryptanalysts attempting to reverse-engineer other S-Boxes. For example, the visual pattern used in the previous work as a starting point to decompose is mathematically formalized and the use of differential patterns involving operations other than exclusive-or is explored

    Exponential S-Boxes: a Link Between the S-Boxes of BelT and Kuznyechik/Streebog

    Get PDF
    The block cipher Kuznyechik and the hash function Streebog were recently standardized by the Russian Federation. These primitives use a common 8-bit S-Box, denote

    Partitions in the S-Box of Streebog and Kuznyechik

    Get PDF
    Streebog and Kuznyechik are the latest symmetric cryptographic primitives standardized by the Russian GOST. They share the same S-Box, π, whose design process was not described by its authors. In previous works, Biryukov, Perrin and Udovenko recovered two completely different decompositions of this S-Box. We revisit their results and identify a third decomposition of π. It is an instance of a fairly small family of permutations operating on 2m bits which we call TKlog and which is closely related to finite field logarithms. Its simplicity and the small number of components it uses lead us to claim that it has to be the structure intentionally used by the designers of Streebog and Kuznyechik. The 2m-bit permutations of this type have a very strong algebraic structure: they map multiplicative cosets of the subfield GF(2m)* to additive cosets of GF(2m)*. Furthermore, the function relating each multiplicative coset to the corresponding additive coset is always essentially the same. To the best of our knowledge, we are the first to expose this very strong algebraic structure. We also investigate other properties of the TKlog and show in particular that it can always be decomposed in a fashion similar to the first decomposition of Biryukov et al., thus explaining the relation between the two previous decompositions. It also means that it is always possible to implement a TKlog efficiently in hardware and that it always exhibits a visual pattern in its LAT similar to the one present in π. While we could not find attacks based on these new results, we discuss the impact of our work on the security of Streebog and Kuznyechik. To this end, we provide a new simpler representation of the linear layer of Streebog as a matrix multiplication in the exact same field as the one used to define π. We deduce that this matrix interacts in a non-trivial way with the partitions preserved by π

    A multi-layer approach to designing secure systems: from circuit to software

    Get PDF
    In the last few years, security has become one of the key challenges in computing systems. Failures in the secure operations of these systems have led to massive information leaks and cyber-attacks. Case in point, the identity leaks from Equifax in 2016, Spectre and Meltdown attacks to Intel and AMD processors in 2017, Cyber-attacks on Facebook in 2018. These recent attacks have shown that the intruders attack different layers of the systems, from low-level hardware to software as a service(SaaS). To protect the systems, the defense mechanisms should confront the attacks in the different layers of the systems. In this work, we propose four security mechanisms for computing systems: (i ) using backside imaging to detect Hardware Trojans (HTs) in Application Specific Integrated Circuits (ASICs) chips, (ii ) developing energy-efficient reconfigurable cryptographic engines, (iii) examining the feasibility of malware detection using Hardware Performance Counters (HPC). Most of the threat models assume that the root of trust is the hardware running beneath the software stack. However, attackers can insert malicious hardware blocks, i.e. HTs, into the Integrated Circuits (ICs) that provide back-doors to the attackers or leak confidential information. HTs inserted during fabrication are extremely hard to detect since their overheads in performance and power are below the variations in the performance and power caused by manufacturing. In our work, we have developed an optical method that identifies modified or replaced gates in the ICs. We use the near-infrared light to image the ICs because silicon is transparent to near-infrared light and metal reflects infrared light. We leverage the near-infrared imaging to identify the locations of each gate, based on the signatures of metal structures reflected by the lowest metal layer. By comparing the imaged results to the pre-fabrication design, we can identify any modifications, shifts or replacements in the circuits to detect HTs. With the trust of the silicon, the computing system must use secure communication channels for its applications. The low-energy cost devices, such as the Internet of Things (IoT), leverage strong cryptographic algorithms (e.g. AES, RSA, and SHA) during communications. The cryptographic operations cause the IoT devices a significant amount of power. As a result, the power budget limits their applications. To mitigate the high power consumption, modern processors embed these cryptographic operations into hardware primitives. This also improves system performance. The hardware unit embedded into the processor provides high energy-efficiency, low energy cost. However, hardware implementations limit flexibility. The longevity of theIoTs can exceed the lifetime of the cryptographic algorithms. The replacement of the IoT devices is costly and sometimes prohibitive, e.g., monitors in nuclear reactors.In order to reconfigure cryptographic algorithms into hardware, we have developed a system with a reconfigurable encryption engine on the Zedboard platform. The hardware implementation of the engine ensures fast, energy-efficient cryptographic operations. With reliable hardware and secure communication channels in place, the computing systems should detect any malicious behaviors in the processes. We have explored the use of the Hardware Performance Counters (HPCs) in malware detection. HPCs are hardware units that count micro-architectural events, such as cache hits/misses and floating point operations. Anti-virus software is commonly used to detect malware but it also introduces performance overhead. To reduce anti-virus performance overhead, many researchers propose to use HPCs with machine learning models in malware detection. However, it is counter-intuitive that the high-level program behaviors can manifest themselves in low-level statics. We perform experiments using 2 ∌ 3 × larger program counts than the previous works and perform a rigorous analysis to determine whether HPCs can be used to detect malware. Our results show that the False Discovery Rate of malware detection can reach 20%. If we deploy this detection system on a fresh installed Windows 7 systems, among 1,323 binaries, 198 binaries would be flagged as malware

    Cryptanalysis, Reverse-Engineering and Design of Symmetric Cryptographic Algorithms

    Get PDF
    In this thesis, I present the research I did with my co-authors on several aspects of symmetric cryptography from May 2013 to December 2016, that is, when I was a PhD student at the university of Luxembourg under the supervision of Alex Biryukov. My research has spanned three different areas of symmetric cryptography. In Part I of this thesis, I present my work on lightweight cryptography. This field of study investigates the cryptographic algorithms that are suitable for very constrained devices with little computing power such as RFID tags and small embedded processors such as those used in sensor networks. Many such algorithms have been proposed recently, as evidenced by the survey I co-authored on this topic. I present this survey along with attacks against three of those algorithms, namely GLUON, PRINCE and TWINE. I also introduce a new lightweight block cipher called SPARX which was designed using a new method to justify its security: the Long Trail Strategy. Part II is devoted to S-Box reverse-engineering, a field of study investigating the methods recovering the hidden structure or the design criteria used to build an S-Box. I co-invented several such methods: a statistical analysis of the differential and linear properties which was applied successfully to the S-Box of the NSA block cipher Skipjack, a structural attack against Feistel networks called the yoyo game and the TU-decomposition. This last technique allowed us to decompose the S-Box of the last Russian standard block cipher and hash function as well as the only known solution to the APN problem, a long-standing open question in mathematics. Finally, Part III presents a unifying view of several fields of symmetric cryptography by interpreting them as purposefully hard. Indeed, several cryptographic algorithms are designed so as to maximize the code size, RAM consumption or time taken by their implementations. By providing a unique framework describing all such design goals, we could design modes of operations for building any symmetric primitive with any form of hardness by combining secure cryptographic building blocks with simple functions with the desired form of hardness called plugs. Alex Biryukov and I also showed that it is possible to build plugs with an asymmetric hardness whereby the knowledge of a secret key allows the privileged user to bypass the hardness of the primitive

    WHIRLBOB, the Whirlpool Based Variant of STRIBOB: Lighter, Faster, and Constant Time

    Get PDF
    WHIRLBOB, also known as STRIBOBr2, is an AEAD (Authenticated Encryption with Associated Data) algorithm derived from STRIBOBr1 and the Whirlpool hash algorithm. WHIRLBOB/STRIBOBr2 is a second round candidate in the CAESAR competition. As with STRIBOBr1, the reduced-size Sponge design has a strong provable security link with a standardized hash algorithm. The new design utilizes only the LPS or ρ\rho component of Whirlpool in flexibly domain-separated BLNK Sponge mode. The number of rounds is increased from 10 to 12 as a countermeasure against Rebound Distinguishing attacks. The 8×88 \times 8 - bit S-Box used by Whirlpool and WHIRLBOB is constructed from 4×44 \times 4 - bit ``MiniBoxes\u27\u27. We report on fast constant-time Intel SSSE3 and ARM NEON SIMD WHIRLBOB implementations that keep full miniboxes in registers and access them via SIMD shuffles. This is an efficient countermeasure against AES-style cache timing side-channel attacks. Another main advantage of WHIRLBOB over STRIBOBr1 (and most other AEADs) is its greatly reduced implementation footprint on lightweight platforms. On many lower-end microcontrollers the total software footprint of π\pi+BLNK = WHIRLBOB AEAD is less than half a kilobyte. We also report an FPGA implementation that requires 4,946 logic units for a single round of WHIRLBOB, which compares favorably to 7,972 required for Keccak / Keyak on the same target platform. The relatively small S-Box gate count also enables efficient 64-bit bitsliced straight-line implementations. We finally present some discussion and analysis on the relationships between WHIRLBOB, Whirlpool, the Russian GOST Streebog hash, and the recent draft Russian Encryption Standard Kuznyechik
    corecore