25 research outputs found

    On the Easiness of Turning Higher-Order Leakages into First-Order

    Get PDF
    Applying random and uniform masks to the processed intermediate values of cryptographic algorithms is arguably the most common countermeasure to thwart side-channel analysis attacks. So-called masking schemes exist in various shapes but are mostly used to prevent side-channel leakages up to a certain statistical order. Thus, to learn any information about the key-involving computations a side-channel adversary has to estimate the higher-order statistical moments of the leakage distributions. However, the complexity of this approach increases exponentially with the statistical order to be estimated and the precision of the estimation suffers from an enormous sensitivity to the noise level. In this work we present an alternative procedure to exploit higher-order leakages which captivates by its simplicity and effectiveness. Our approach, which focuses on (but is not limited to) univariate leakages of hardware masking schemes, is based on categorizing the power traces according to the distribution of leakage points. In particular, at each sample point an individual subset of traces is considered to mount ordinary first-order attacks. We present the theoretical concept of our approach based on simulation traces and examine its efficiency on noisy real-world measurements taken from a first-order secure threshold implementation of the block cipher PRESENT-80, implemented on a 150nm CMOS ASIC prototype chip. Our analyses verify that the proposed technique is indeed a worthy alternative to conventional higher-order attacks and suggest that it might be able to relax the sensitivity of higher-order evaluations to the noise level

    Low-Latency Hardware Masking of PRINCE

    Get PDF
    Efficient implementation of Boolean masking in terms of low latency has evolved into a hot topic due to the necessity of embedding a physically secure and at-the-same-time fast implementation of cryptographic primitives in e.g., the memory encryption of pervasive devices. Instead of fully minimizing the circuit\u27s area and randomness requirements at the cost of latency, the focus has changed into finding optimal tradeoffs between the circuit area and the execution time. The main latency bottleneck in hardware masking lies in the need for registers to stop the propagation of glitches and maintain non-completeness. Usually, an exponentially growing number of shares (hence an extremely large circuit), as well as a high demand for fresh randomness, are the result of avoiding registers in a securely masked hardware implementation of a block cipher. In this paper, we present several first-order secure and low-latency implementations of PRINCE. In particular, we show how to realize the masked variant of round-based PRINCE with only a single register stage per cipher round. We compare the resulting architectures, based on the popular TI and GLM masking scheme based on the area, latency, and randomness requirements and point out that both designs are suited for specific use cases

    Static Power Side-Channel Analysis of a Threshold Implementation Prototype Chip

    Get PDF
    The static power consumption of modern CMOS devices has become a substantial concern in the context of the side-channel security of cryptographic hardware. The continuous growth of the leakage power dissipation in nanometer-scaled CMOS technologies is not only inconvenient for effective low power designs, but does also create a new target for power analysis adversaries. In this paper, we present the first experimental results of a static power side-channel analysis targeting an ASIC implementation of a provably first-order secure hardware masking scheme. The investigated 150 nm CMOS prototype chip realizes the PRESENT-80 lightweight block cipher as a threshold implementation and allows us to draw a comparison between the information leakage through its dynamic and static power consumption. By employing a sophisticated measurement setup dedicated to static power analysis, including a very low-noise DC amplifier as well as a climate chamber, we are able to recover the key of our target implementation with significantly less traces compared to the corresponding dynamic power analysis attack. In particular, for a successful third-order attack exploiting the static currents, less than 200 thousand traces are needed. Whereas for the same attack in the dynamic power domain around 5 million measurements are required. Furthermore, we are able to show that only-first-order resistant approaches like the investigated threshold implementation do not significantly increase the complexity of a static power analysis. Therefore, we firmly believe that this side channel can actually become the target of choice for real-world adversaries against masking countermeasures implemented in advanced CMOS technologies

    Static Power Side-Channel Analysis - An Investigation of Measurement Factors

    Get PDF
    The static power consumption of modern CMOS devices has become a substantial concern in the context of the side-channel security of cryptographic hardware. Its continuous growth in nanometer-scaled technologies is not only inconvenient for effective low power designs, but does also create a new target for power analysis adversaries. Additionally, it has to be noted that several of the numerous sources of static power dissipation in CMOS circuits exhibit an exponential dependency on environmental factors which a classical power analysis adversary is in control of. These factors include the operating conditions temperature and supply voltage. Furthermore, in case of clock control, the measurement interval can be adjusted arbitrarily. Our experiments on a 150nm CMOS ASIC reveal that with respect to the signal-to-noise ratio in static power side-channel analyses, stretching the measurement interval decreases the noise exponentially and even more importantly that raising the working temperature increases the signal exponentially. Control over the supply voltage has a far smaller, but still noticeable, positive impact as well. In summary, a static power analysis adversary can physically force a device to leak more information by controlling its operating environment and furthermore measure these leakages with arbitrary precision by modifying the interval length

    BSPL: Balanced Static Power Logic

    Get PDF
    The down-scaling of circuit technology has led to stronger leakage currents in CMOS standard cells. This source of power consumption is data dependent and can be utilized to extract secrets from cryptographic devices. We propose Balanced Static Power Logic (BSPL), the first leakage-balancing approach that achieves optimal data-independence with respect to drain-source leakage. We re-design fundamental standard cells in such a way that their leakage current is essentially constant, irrespective of inputs and outputs, barring process variations. Even in presence of considerable intra-die variations, modeled by Monte Carlo simulations, BSPL gates still maintain a significantly reduced mutual information between the circuit’s input and conducted leakage current

    The Risk of Outsourcing: Hidden SCA Trojans in Third-Party IP-Cores Threaten Cryptographic ICs

    Get PDF
    Side-channel analysis (SCA) attacks – especially power analysis – are powerful ways to extract the secrets stored in and processed by cryptographic devices. In recent years, researchers have shown interest in utilizing on-chip measurement facilities to perform such SCA attacks remotely. It was shown that simple voltage-monitoring sensors can be constructed from digital elements and put on multi-tenant FPGAs to perform remote attacks on neighbouring cryptographic co-processors. A similar threat is the unsuspecting integration of third-party IPCores into an IC design. Even if the function of an acquired IP-Core is not security critical by itself, it may contain an onchip sensor as a Trojan that can eavesdrop on cryptographic operations across the whole device. In contrast to all FPGAbased investigations reported in the literature so far, we examine the efficiency of such on-chip sensors as a source of information leakage in an ASIC-based case study for the first time. To this end, in addition to a cryptographic core (lightweight block cipher PRESENT) we designed and implemented a voltage-monitoring sensor on an ASIC fabricated by a 40nm commercial standard cell library. Despite the physical distance between the sensor and the PRESENT core, we show the possibility of fully recovering the secret key of the PRESENT core by processing the sensor’s output. Our results imply that the hidden insertion of such a sensor – for example by a malicious third party IP-Core vendor – can endanger the security of embedded systems which deal with sensitive information, even if the device cannot be physically accessed by the adversary

    Effective and Efficient Masking with Low Noise using Small-Mersenne-Prime Ciphers

    Get PDF
    Embedded devices used in security applications are natural targets for physical attacks. Thus, enhancing their side-channel resistance is an important research challenge. A standard solution for this purpose is the use of Boolean masking schemes, as they are well adapted to current block ciphers with efficient bitslice representations. Boolean masking guarantees that the security of an implementation grows exponentially in the number of shares under the assumption that leakages are sufficiently noisy (and independent). Unfortunately, it has been shown that this noise assumption is hardly met on low-end devices. In this paper, we therefore investigate techniques to mask cryptographic algorithms in such a way that their resistance can survive an almost complete lack of noise. Building on seed theoretical results of Dziembowski et al., we put forward that arithmetic encodings in prime fields can reach this goal. We first exhibit the gains that such encodings lead to thanks to a simulated information theoretic analysis of their leakage (with up to six shares). We then provide figures showing that on platforms where optimized arithmetic adders and multipliers are readily available (i.e., most MCUs and FPGAs), performing masked operations in small to medium Mersenne-prime fields as opposed to binary extension fields will not lead to notable implementation overheads. We compile these observations into a new AES-like block cipher, called AES-prime, which is well-suited to illustrate the remarkable advantages of masking in prime fields. We also confirm the practical relevance of our findings by evaluating concrete software (ARM Cortex-M3) and hardware (Xilinx Spartan-6) implementations. Our experimental results show that security gains over Boolean masking (and, more generally, binary encodings) can reach orders of magnitude despite the same amount of information being leaked per share

    Effective and Efficient Masking with Low Noise Using Small-Mersenne-Prime Ciphers

    Get PDF
    Embedded devices used in security applications are natural targets for physical attacks. Thus, enhancing their side-channel resistance is an important research challenge. A standard solution for this purpose is the use of Boolean masking schemes, as they are well adapted to current block ciphers with efficient bitslice representations. Boolean masking guarantees that the security of an implementation grows exponentially in the number of shares under the assumption that leakages are sufficiently noisy (and independent). Unfortunately, it has been shown that this noise assumption is hardly met on low-end devices. In this paper, we therefore investigate techniques to mask cryptographic algorithms in such a way that their resistance can survive an almost complete lack of noise. Building on seed theoretical results of Dziembowski et al., we put forward that arithmetic encodings in prime fields can reach this goal. We first exhibit the gains that such encodings lead to thanks to a simulated information theoretic analysis of their leakage (with up to six shares). We then provide figures showing that on platforms where optimized arithmetic adders and multipliers are readily available (i.e., most MCUs and FPGAs), performing masked operations in small to medium Mersenne-prime fields as opposed to binary extension fields will not lead to notable implementation overheads. We compile these observations into a new AES-like block cipher, called AES-prime, which is well-suited to illustrate the remarkable advantages of masking in prime fields. We also confirm the practical relevance of our findings by evaluating concrete software (ARM Cortex-M3) and hardware (Xilinx Spartan-6) implementations. Our experimental results show that security gains over Boolean masking (and, more generally, binary encodings) can reach orders of magnitude despite the same amount of information being leaked per share

    Red Team vs. Blue Team: A Real-World Hardware Trojan Detection Case Study Across Four Modern CMOS Technology Generations

    Get PDF
    Verifying the absence of maliciously inserted Trojans in ICs is a crucial task – especially for security-enabled products. Depending on the concrete threat model, different techniques can be applied for this purpose. Assuming that the original IC layout is benign and free of backdoors, the primary security threats are usually identified as the outsourced manufacturing and transportation. To ensure the absence of Trojans in commissioned chips, one straightforward solution is to compare the received semiconductor devices to the design files that were initially submitted to the foundry. Clearly, conducting such a comparison requires advanced laboratory equipment and qualified experts. Nevertheless, the fundamental techniques to detect Trojans which require evident changes to the silicon layout are nowadays well-understood. Despite this, there is a glaring lack of public case studies describing the process in its entirety while making the underlying datasets publicly available. In this work, we aim to improve upon this state of the art by presenting a public and open hardware Trojan detection case study based on four different digital ICs using a Red Team vs. Blue Team approach. Hereby, the Red Team creates small changes acting as surrogates for inserted Trojans in the layouts of 90 nm, 65 nm, 40 nm, and 28 nm ICs. The quest of the Blue Team is to detect all differences between digital layout and manufactured device by means of a GDSII–vs–SEM-image comparison. Can the Blue Team perform this task efficiently? Our results spark optimism for the Trojan seekers and answer common questions about the efficiency of such techniques for relevant IC sizes. Further, they allow to draw conclusions about the impact of technology scaling on the detection performance

    Randomness Generation for Secure Hardware Masking - Unrolled Trivium to the Rescue

    Get PDF
    Masking is a prominent strategy to protect cryptographic implementations against side-channel analysis. Its popularity arises from the exponential security gains that can be achieved for (approximately) quadratic resource utilization. Many variants of the countermeasure tailored for different optimization goals have been proposed over the past decades. The common denominator among all of them is the implicit demand for robust and high entropy randomness. Simply assuming that uniformly distributed random bits are available, without taking the cost of their generation into account, leads to a poor understanding of the efficiency and performance of secure implementations. This is especially relevant in case of hardware masking schemes which are known to consume large amounts of random bits per cycle due to parallelism. Currently, there seems to be no consensus on how to most efficiently derive many pseudo-random bits per clock cycle from an initial seed and with properties suitable for masked hardware implementations. In this work, we evaluate a number of building blocks for this purpose and find that hardware-oriented stream ciphers like Trivium and its reduced-security variant Bivium B outperform all competitors when implemented in an unrolled fashion. Unrolled implementations of these primitives enable the flexible generation of many bits per cycle while maintaining high performance, which is crucial for satisfying the large randomness demands of state-of-the-art masking schemes. According to our analysis, only Linear Feedback Shift Registers (LFSRs), when also unrolled, are capable of producing long non-repetitive sequences of random-looking bits at a high rate per cycle even more efficiently than Trivium and Bivium B. Yet, these instances do not provide black-box security as they generate only linear outputs. We experimentally demonstrate that using multiple output bits from an LFSR in the same masked implementation can violate probing security and even lead to harmful randomness cancellations. Circumventing these problems, and enabling an independent analysis of randomness generation and masking scheme, requires the use of cryptographically stronger primitives like stream ciphers. As a result of our studies, we provide an evidence-based estimate for the cost of securely generating n fresh random bits per cycle. Depending on the desired level of black-box security and operating frequency, this cost can be as low as 20n to 30n ASIC gate equivalents (GE) or 3n to 4n FPGA look-up tables (LUTs), where n is the number of random bits required. Our results demonstrate that the cost per bit is (sometimes significantly) lower than estimated in previous works, incentivizing parallelism whenever exploitable and potentially moving low randomness usage in hardware masking research from a primary to secondary design goal
    corecore