281 research outputs found

    NASCTY: Neuroevolution to Attack Side-channel Leakages Yielding Convolutional Neural Networks

    Full text link
    Side-channel analysis (SCA) can obtain information related to the secret key by exploiting leakages produced by the device. Researchers recently found that neural networks (NNs) can execute a powerful profiling SCA, even on targets protected with countermeasures. This paper explores the effectiveness of Neuroevolution to Attack Side-channel Traces Yielding Convolutional Neural Networks (NASCTY-CNNs), a novel genetic algorithm approach that applies genetic operators on architectures' hyperparameters to produce CNNs for side-channel analysis automatically. The results indicate that we can achieve performance close to state-of-the-art approaches on desynchronized leakages with mask protection, demonstrating that similar neuroevolution methods provide a solid venue for further research. Finally, the commonalities among the constructed NNs provide information on how NASCTY builds effective architectures and deals with the applied countermeasures.Comment: 19 pages, 6 figures, 4 table

    Automated Side-Channel Attacks using Black-Box Neural Architecture Search

    Get PDF
    The usage of convolutional neural networks (CNNs) to break cryptographic systems through hardware side-channels has enabled fast and adaptable attacks on devices like smart cards and TPMs. Current literature proposes fixed CNN architectures designed by domain experts to break such systems, which is time-consuming and unsuitable for attacking a new system. Recently, an approach using neural architecture search (NAS), which is able to acquire a suitable architecture automatically, has been explored. These works use the secret key information in the attack dataset for optimization and only explore two different search strategies using one-dimensional CNNs. We propose a NAS approach that relies only on using the profiling dataset for optimization, making it fully black-box. Using a large-scale experimental parameter study, we explore which choices for NAS, such as 1-D or 2-D CNNs and search strategy, produce the best results on 10 state-of-the-art datasets for Hamming weight and identity leakage models. We show that applying the random search strategy on 1-D inputs results in a high success rate and retrieves the correct secret key using a single attack trace on two of the datasets. This combination matches the attack efficiency of fixed CNN architectures, outperforming them in 4 out of 10 datasets. Our experiments also point toward the need for repeated attack evaluations of machine learning-based solutions in order to avoid biased performance estimates

    Deep Learning based Side-Channel Attack: a New Profiling Methodology based on Multi-Label Classification

    Get PDF
    Deep Learning based Side-Channel Attacks (DL-SCA) are an emerging security assessment method increasingly being adopted by the majority of certification schemes and certification bodies to assess the resistance of cryptographic implementations. The related published investigations have demonstrated that DL-SCA are very efficient when targeting cryptographic designs protected with the common side-channel countermeasures. Furthermore, these attacks allow to streamline the evaluation process as the pre-processing of the traces (\emph{e.g.} alignment, dimensionality reduction, \dots) is no longer mandatory. In practice, the DL-SCA are applied following the divide-and-conquer strategy such that the target, for the training and the attack phases, only depends on 88 key bits at most (to avoid high time complexity especially during the training). Then, the same process is repeated to recover the remaining bits of the key. To mitigate this practical issue, we propose in this work a new profiling methodology for DL-SCA based on the so-called multi-label classification. We argue that our new profiling methodology allows applying DL-SCA to target a bigger chunk of the key (typically 1616 bits) without introducing a learning time overhead and while guaranteeing a similar attack efficiency compared to the commonly used training strategy. As a side benefit, we demonstrate that our leaning strategy can be applied as well to train several intermediate operations at once. Interestingly, we show that, in this context, our methodology is even faster than the classical training and leads to a more efficient key recovery phase. We validated the soundness of our proposal on simulated traces and experimental data-sets; amongst them, some are publicly available side-channel databases. The obtained results have proven that our profiling methodology is of great practical interest especially in the context of performing penetration tests with high attack potential (\emph{e.g.} Common Criteria, EMVCO) where the time required to perform the attack has an impact on its final rating

    Gambling for Success: The Lottery Ticket Hypothesis in Deep Learning-based SCA

    Get PDF
    Deep learning-based side-channel analysis (SCA) represents a strong approach for profiling attacks. Still, this does not mean it is trivial to find neural networks that perform well for any setting. Based on the developed neural network architectures, we can distinguish between small neural networks that are easier to tune and less prone to overfitting but could have insufficient capacity to model the data. On the other hand, large neural networks have sufficient capacity but can overfit and are more difficult to tune. This brings an interesting trade-off between simplicity and performance. This work proposes to use a pruning strategy and recently proposed Lottery Ticket Hypothesis (LTH) as an efficient method to tune deep neural networks for profiling SCA. Pruning provides a regularization effect on deep neural networks and reduces the overfitting posed by overparameterized models. We demonstrate that we can find pruned neural networks that perform on the level of larger networks, where we manage to reduce the number of weights by more than 90% on average. This way, pruning and LTH approaches become alternatives to costly and difficult hyperparameter tuning in profiling SCA. Our analysis is conducted over different masked AES datasets and for different neural network topologies. Our results indicate that pruning, and more specifically LTH, can result in competitive deep learning models

    Resolving the Doubts: On the Construction and Use of ResNets for Side-channel Analysis

    Get PDF
    The deep learning-based side-channel analysis gave some of the most prominent side-channel attacks against protected targets in the past few years. To this end, the research community\u27s focus has been on creating 1) powerful and 2) (if possible) minimal multilayer perceptron or convolutional neural network architectures. Currently, we see that computationally intensive hyperparameter tuning methods (e.g., Bayesian optimization or reinforcement learning) provide the best results. However, as targets with more complex countermeasures become available, these minimal architectures may be insufficient, and we will require novel deep learning approaches. This work explores how residual neural networks (ResNets) perform in side-channel analysis and how to construct deeper ResNets capable of working with larger input sizes and requiring minimal tuning. The resulting architectures obtained by following our guidelines are significantly deeper than commonly seen in side-channel analysis, require minimal hyperparameter tuning for specific datasets, and offer competitive performance with state-of-the-art methods across several datasets. Additionally, the results indicate that ResNets work especially well when the number of profiling traces and features in a trace is large

    Label Correlation in Deep Learning-based Side-channel Analysis

    Get PDF
    The efficiency of the profiling side-channel analysis can be significantly improved with machine learning techniques. Although powerful, a fundamental machine learning limitation of being data-hungry received little attention in the side-channel community. In practice, the maximum number of leakage traces that evaluators/attackers can obtain is constrained by the scheme requirements or the limited accessibility of the target. Even worse, various countermeasures in modern devices increase the conditions on the profiling size to break the target. This work demonstrates a practical approach to dealing with the lack of profiling traces. Instead of learning from a one-hot encoded label, transferring the labels to their distribution can significantly speed up the convergence of guessing entropy. By studying the relationship between all possible key candidates, we propose a new metric, denoted Label Correlation (LC), to evaluate the generalization ability of the profiling model. We validate LC with two common use cases: early stopping and network architecture search, and the results indicate its superior performance

    Shift-invariance Robustness of Convolutional Neural Networks in Side-channel Analysis

    Get PDF
    Convolutional neural networks (CNNs) offer unrivaled performance in profiling side-channel analysis. This claim is corroborated by numerous results where CNNs break targets protected with masking and hiding countermeasures. One hiding countermeasure is commonly investigated in related works - desynchronization (misalignment). The conclusions usually state that CNNs can break desynchronization as they are shift-invariant. This paper investigates that claim in more detail and reveals that the situation is more complex. While CNNs have certain shift-invariance, it is insufficient for commonly encountered scenarios in deep learning-based side-channel analysis. We propose to use data augmentation to improve the shift-invariance and, in a more powerful version, ensembles of data augmentation. Our results show the proposed techniques work very well and improve the attack significantly, even for an order of magnitude

    The Need for Speed: A Fast Guessing Entropy Calculation for Deep Learning-based SCA

    Get PDF
    The adoption of deep neural networks for profiling side-channel attacks (SCA) opened new perspectives for leakage detection. Recent publications showed that cryptographic implementations featuring different countermeasures could be broken without feature selection or trace preprocessing. This success comes with a high price: extensive hyperparameter search to find optimal deep learning models. As deep learning models usually suffer from overfitting due to their high fitting capacity, it is crucial to avoid over-training regimes, which require a correct number of epochs. For that, \textit{early stopping} is employed as an efficient regularization method that requires a consistent validation metric. Although guessing entropy is a highly informative metric for profiling SCA, it is time-consuming, especially if computed for all epochs during training and the number of validation traces is significantly large. This paper shows that guessing entropy can be efficiently computed during training by reducing the number of validation traces without affecting the efficiency of early stopping decisions. Our solution significantly speeds up the process, impacting hyperparameter search and overall profiling attack performances. Our fast guessing entropy calculation is up to 16×\times faster, resulting in more hyperparameter tuning experiments and allowing security evaluators to find more efficient deep learning model

    I Know What Your Layers Did: Layer-wise Explainability of Deep Learning Side-channel Analysis

    Get PDF
    Masked cryptographic implementations can be vulnerable to higher-order attacks. For instance, deep neural networks have proven effective for second-order profiling side-channel attacks even in a black-box setting (no prior knowledge of masks and implementation details). While such attacks have been successful, no explanations were provided for understanding why a variety of deep neural networks can (or cannot) learn high-order leakages and what the limitations are. In other words, we lack the explainability on neural network layers combining (or not) unknown and random secret shares, which is a necessary step to defeat, e.g., Boolean masking countermeasures. In this paper, we use information-theoretic metrics to explain the internal activities of deep neural network layers. We propose a novel methodology for the explainability of deep learning-based profiling side-channel analysis (denoted ExDL-SCA) to understand the processing of secret masks. Inspired by the Information Bottleneck theory, our explainability methodology uses perceived information to explain and detect the different phenomena that occur in deep neural networks, such as fitting, compression, and generalization. We provide experimental results on masked AES datasets showing where, what, and why deep neural networks learn relevant features from input trace sets while compressing irrelevant ones, including noise. This paper opens new perspectives for understanding the role of different neural network layers in profiling side-channel attacks
    corecore