18 research outputs found

    On the Importance of Pooling Layer Tuning for Profiling Side-Channel Analysis

    In recent years, the advent of deep neural networks opened new perspectives for security evaluations with side-channel analysis. Profiling attacks now benefit from capabilities offered by convolutional neural networks, such as dimensionality reduction and the inherent ability to reduce the effects of trace desynchronization. These neural networks contain at least three types of layers: convolutional, pooling, and dense layers. Although the definition of pooling layers has a large impact on neural network performance, a study of the effect of pooling hyperparameters on side-channel analysis has not yet been provided in the academic community. This paper provides extensive experimental results to demonstrate how pooling layer types and pooling stride and size affect the profiling attack performance with convolutional neural networks. Additionally, we demonstrate that pooling hyperparameters can be larger than usually used in related works and still keep good performance for profiling attacks on specific datasets. Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public. Cyber Securit
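
    To make the pooling hyperparameters named in this abstract concrete, the following is a minimal sketch of 1D pooling over a trace, written in plain Python. The function name `pool1d`, the toy trace, and the chosen sizes and strides are illustrative assumptions and are not taken from the paper.

```python
from statistics import mean

def pool1d(trace, size, stride, mode="average"):
    """Pool a 1D trace with the given window size and stride (no padding)."""
    reduce = mean if mode == "average" else max
    starts = range(0, len(trace) - size + 1, stride)
    return [reduce(trace[i:i + size]) for i in starts]

# Hypothetical 8-sample trace; real traces are far longer.
trace = [1, 3, 2, 8, 4, 6, 5, 7]
small = pool1d(trace, size=2, stride=2, mode="max")      # 4 features remain
large = pool1d(trace, size=4, stride=4, mode="average")  # 2 features remain
print(small, large)
```

    Larger size and stride values shrink the feature vector passed to later layers more aggressively, which is the trade-off the abstract says can be pushed further than is usual in related works.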

    Autoencoder-enabled model portability for reducing hyperparameter tuning efforts in side-channel analysis

    Hyperparameter tuning represents one of the main challenges in deep learning-based profiling side-channel analysis. For each different side-channel dataset, the typical procedure to find a profiling model is to apply hyperparameter tuning from scratch. The main reason is that side-channel measurements from various targets contain different underlying leakage distributions. Consequently, the same profiling model hyperparameters are usually not equally efficient for other targets. This paper considers autoencoders for dimensionality reduction to verify if encoded datasets from different targets enable the portability of profiling models and architectures. Successful portability reduces the hyperparameter tuning efforts as profiling model tuning is eliminated for the new dataset, and tuning autoencoders is simpler. We first search for the best autoencoder for each dataset and the best profiling model when the encoded dataset becomes the training set. Our results show no significant difference in tuning efforts using original and encoded traces, meaning that encoded data reliably represents the original data. Next, we verify how portable the best profiling model is across different datasets. Our results show that tuning autoencoders enables and improves portability while reducing the effort in hyperparameter search for profiling models. Lastly, we present a transfer learning case where dimensionality reduction might be necessary if the model is tuned for a dataset with fewer features than the new dataset. In this case, tuning of the profiling model is eliminated and training time is reduced.

    Gambling for Success: The Lottery Ticket Hypothesis in Deep Learning-Based Side-Channel Analysis

    Deep learning-based side-channel analysis (SCA) represents a strong approach for profiling attacks. Still, this does not mean it is trivial to find neural networks that perform well for any setting. Based on the developed neural network architectures, we can distinguish between small neural networks that are easier to tune and less prone to overfitting but could have insufficient capacity to model the data. On the other hand, large neural networks have sufficient capacity but can overfit and are more difficult to tune. This brings an interesting trade-off between simplicity and performance. This work proposes to use a pruning strategy and the recently proposed Lottery Ticket Hypothesis (LTH) as an efficient method to tune deep neural networks for profiling SCA. Pruning provides a regularization effect on deep neural networks and reduces the overfitting posed by overparameterized models. We demonstrate that we can find pruned neural networks that perform on the level of larger networks, where we manage to reduce the number of weights by more than 90% on average. This way, pruning and LTH approaches become alternatives to costly and difficult hyperparameter tuning in profiling SCA. Our analysis is conducted over different masked AES datasets and for different neural network topologies. Our results indicate that pruning, and more specifically LTH, can result in competitive deep learning models.
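
    As a rough illustration of the pruning idea in this abstract, the sketch below applies magnitude pruning to a flat weight list and then rewinds the surviving weights to their initial values, which is the usual reading of the Lottery Ticket Hypothesis. The function names, the toy weights, and the sparsity level are assumptions made for illustration; the abstract does not specify the exact procedure the authors use.

```python
def magnitude_mask(trained, sparsity):
    """Return a 0/1 mask that drops the smallest-magnitude weights."""
    n_prune = int(len(trained) * sparsity)
    order = sorted(range(len(trained)), key=lambda i: abs(trained[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(trained))]

def lottery_ticket(initial, trained, sparsity):
    """Keep the weights that survive pruning, reset to their initial values."""
    mask = magnitude_mask(trained, sparsity)
    return [w0 * m for w0, m in zip(initial, mask)], mask

# Hypothetical weights before and after training.
initial = [0.1, -0.2, 0.3, -0.4, 0.5]
trained = [0.9, -0.01, 0.02, -1.2, 0.6]
ticket, mask = lottery_ticket(initial, trained, sparsity=0.4)
print(mask, ticket)
```

    The resulting sparse network would then be retrained with the mask held fixed; that retraining step is omitted here.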

    Strength in Numbers: Improving Generalization with Ensembles in Machine Learning-based Profiled Side-channel Analysis

    The adoption of deep neural networks for profiled side-channel attacks provides powerful options for leakage detection and key retrieval of secure products. When training a neural network for side-channel analysis, it is expected that the trained model can implement an approximation function that can detect leaking side-channel samples and, at the same time, be insensitive to noisy (or non-leaking) samples. This outlines a generalization situation where the model can identify the main representations learned from the training set in a separate test set. This paper discusses how output class probabilities represent a strong metric when conducting side-channel analysis. Further, we observe that these output probabilities are sensitive to small changes, like selecting specific test traces or weight initialization for a neural network. Next, we discuss hyperparameter tuning, where one commonly uses only a single model out of dozens of trained models, each of which results in different output probabilities. We show how ensembles of machine learning models based on averaged class probabilities can improve generalization. Our results emphasize that ensembles increase a profiled side-channel attack’s performance and reduce the variance of results stemming from different hyperparameters, regardless of the selected dataset or leakage model.
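
    The averaging of class probabilities mentioned in this abstract can be sketched as follows. The function name `ensemble_probabilities`, the nested-list layout, and the toy probabilities are assumptions for illustration only; the paper may combine model outputs differently in detail.

```python
def ensemble_probabilities(model_outputs):
    """Average class probabilities over models.

    model_outputs[m][t][c] is the probability that model m assigns
    to class c for attack trace t.
    """
    n_models = len(model_outputs)
    return [
        [sum(col) / n_models for col in zip(*per_trace)]
        for per_trace in zip(*model_outputs)
    ]

# Two hypothetical models, two attack traces, two classes.
model_a = [[0.5, 0.5], [0.75, 0.25]]
model_b = [[0.25, 0.75], [0.25, 0.75]]
averaged = ensemble_probabilities([model_a, model_b])
print(averaged)
```

    Averaging smooths out the disagreement between individually trained models, which is the variance reduction the abstract refers to.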

    To Overfit, or Not to Overfit: Improving the Performance of Deep Learning-Based SCA

    Profiling side-channel analysis allows evaluators to estimate the worst-case security of a target. When security evaluations relax the assumptions about the adversary’s knowledge, profiling models may easily be sub-optimal due to the inability to extract the most informative points of interest from the side-channel measurements. When used for profiling attacks, deep neural networks can learn strong models without feature selection, with the drawback of expensive hyperparameter tuning. Unfortunately, due to very large search spaces, one usually finds very different model behaviors, and a common situation is overfitting with typically poor generalization capacity. Usually, overfitting or poor generalization would be mitigated by adding more measurements to the profiling phase to reduce estimation errors. This paper provides a detailed analysis of different deep learning model behaviors and shows that adding more profiling traces as a single solution does not necessarily help improve generalization. We recognize the main problem to be the sub-optimal selection of hyperparameters, which is then difficult to resolve by simply adding more measurements. Instead, we propose to use small hyperparameter tweaks or regularization as techniques to resolve the problem.

    Exploring Feature Selection Scenarios for Deep Learning-based Side-channel Analysis

    One of the main promoted advantages of deep learning in profiling side-channel analysis is the possibility of skipping the feature engineering process. Despite that, most recent publications consider feature selection, as the attacked interval from the side-channel measurements is pre-selected. This is similar to the worst-case security assumptions in security evaluations when the random secret shares (e.g., mask shares) are known during the profiling phase: an evaluator can identify points of interest locations and efficiently trim the trace interval. To broadly understand how feature selection impacts the performance of deep learning-based profiling attacks, this paper investigates three different feature selection scenarios that could be realistically used in practical security evaluations. The scenarios range from the minimum possible number of features (worst-case security assumptions) to the whole available traces. Our results emphasize that deep neural networks as profiling models show successful key recovery independently of explored feature selection scenarios against first-order masked software implementations of AES-128. First, we show that feature selection with the worst-case security assumptions results in optimal profiling models that are highly dependent on the number of features and signal-to-noise ratio levels. Second, we demonstrate that attacking raw side-channel measurements with small deep neural networks also provides optimal models, which shortens the gap between worst-case security evaluations and online (realistic) profiling attacks. In all explored feature selection scenarios, the hyperparameter search always indicates a successful model with up to eight hidden layers for MLPs and CNNs, suggesting that complex models are not required for the considered datasets. Our results demonstrate the key recovery with fewer than ten attack traces for all datasets for at least one of the feature selection scenarios. Additionally, in several cases, we can recover the target key with a single attack trace.

    On the Evaluation of Deep Learning-Based Side-Channel Analysis

    Deep learning-based side-channel analysis is rapidly positioning itself as a de-facto standard for the most powerful profiling side-channel analysis. The results from the last few years show that deep learning techniques can efficiently break targets that are even protected with countermeasures. While there are constant improvements in making the deep learning-based attacks more powerful, little is done on evaluating the attacks’ performance. Indeed, from the perspective of evaluation metrics, how the evaluation process is done today is not different from what was done more than a decade ago. This paper considers how to evaluate deep learning-based side-channel analysis and whether the commonly used approaches give the best results. To that end, we consider different summary statistics and the influence of algorithmic randomness on the stability of profiling models. Our results show that besides commonly used metrics like guessing entropy, one should also show the standard deviation results to assess the attack performance properly. Even more importantly, using the arithmetic mean for guessing entropy does not yield the best results, and instead, a median value should be used.
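
    The difference between the summary statistics discussed in this abstract can be seen on a small example. The key ranks below are made-up values (1 meaning the correct key is ranked first) from five hypothetical repetitions of an attack; they are not results from the paper.

```python
from statistics import mean, median, stdev

# Hypothetical rank of the correct key in five independent attack runs.
ranks = [1, 1, 2, 1, 120]

summary = {
    "mean": mean(ranks),      # arithmetic-mean guessing entropy
    "median": median(ranks),  # robust to the single failed run
    "stdev": stdev(ranks),    # spread across runs
}
print(summary)
```

    A single outlier run pulls the mean far from the typical outcome while the median stays put, which illustrates why reporting the median together with the standard deviation can describe attack performance more faithfully.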

    Reinforcement Learning-Based Design of Side-Channel Countermeasures

    Deep learning-based side-channel attacks are capable of breaking targets protected with countermeasures. The constant progress in the last few years makes the attacks more powerful, requiring fewer traces to break a target. Unfortunately, to protect against such attacks, we still rely solely on methods developed to protect against generic attacks. The works considering the protection perspective are few and usually based on the concept of adversarial examples, which is not always easy to translate to real-world hardware implementations. In this work, we ask whether we can develop combinations of countermeasures that protect against side-channel attacks. We consider several widely adopted hiding countermeasures and use the reinforcement learning paradigm to design specific countermeasures that show resilience against deep learning-based side-channel attacks. Our results show that it is possible to significantly enhance the target resilience to a point where deep learning-based attacks cannot obtain secret information. At the same time, we consider the cost of implementing such countermeasures to balance security and implementation costs. The optimal countermeasure combinations can serve as development guidelines for real-world hardware/software-based protection schemes.

    The Best of Two Worlds: Deep Learning-assisted Template Attack

    In the last decade, machine learning-based side-channel attacks have become a standard option when investigating profiling side-channel attacks. At the same time, the previous state-of-the-art technique, the template attack, started losing its importance and was increasingly considered a baseline to compare against. As such, most of the results reported that machine learning (and especially deep learning) could significantly outperform the template attack. Nevertheless, the template attack still has certain advantages even compared to deep learning. The most significant one is that it has only a few hyperparameters to tune, making it easier to use. We take another look at the template attack, and we devise a feature engineering phase allowing the template attack to compete with or even outperform state-of-the-art deep learning-based side-channel attacks. More precisely, with a novel distance metric customized for side-channel analysis, we show how a deep learning technique called similarity learning can be used to find highly efficient embeddings of input data with one-epoch training, which can then be fed into the template attack, resulting in powerful attacks.

    Profiled Side-Channel Analysis in the Efficient Attacker Framework

    Profiled side-channel attacks represent the most powerful category of side-channel attacks. There, the attacker has access to a clone device to profile its leaking behavior. Additionally, it is common to consider the attacker unbounded in power to allow the worst-case security analysis. This paper starts with a different premise where we are interested in the minimum power that the attacker requires to conduct a successful attack. We propose a new framework for profiled side-channel analysis that we call the Efficient Attacker Framework. With it, we require attacks to be as powerful as possible, but we also provide a setting that inherently allows a more objective analysis among attacks. To confirm our theoretical results, we provide an experimental evaluation of our framework in the context of deep learning-based side-channel analysis.