40 research outputs found

    Breaking Cryptographic Implementations Using Deep Learning Techniques

    The template attack is the most common and powerful profiled side-channel attack. It relies on a realistic assumption regarding the noise of the device under attack: the probability density function of the data is a multivariate Gaussian distribution. To relax this assumption, a recent line of research has investigated new profiling approaches, mainly by applying machine learning techniques. The results obtained are comparable to, and in some particular cases better than, those of the template attack. In this work, we propose to continue this recent line of research by applying more sophisticated profiling techniques based on deep learning. Our experimental results confirm the overwhelming advantages of the resulting new attacks when targeting both unprotected and protected cryptographic implementations.
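
    As a rough illustration of the baseline this abstract starts from, the sketch below builds classical Gaussian templates from labelled profiling traces and scores an attack trace by log-likelihood; the data shapes and the Hamming-weight-style labelling are illustrative assumptions, not the paper's setup.

```python
# Minimal sketch of a classical template attack on synthetic data (assumed
# shapes and labels). Each class of profiling traces is modelled as a
# multivariate Gaussian; an attack trace is matched by maximum log-likelihood.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
n_classes, n_profiling, n_samples = 9, 2000, 20     # e.g. Hamming-weight classes

# Hypothetical profiling set: traces plus the known intermediate class of each.
profiling_traces = rng.normal(size=(n_profiling, n_samples))
profiling_labels = rng.integers(0, n_classes, size=n_profiling)

# Profiling phase: one template (mean vector + covariance matrix) per class.
templates = []
for c in range(n_classes):
    traces_c = profiling_traces[profiling_labels == c]
    mean_c = traces_c.mean(axis=0)
    cov_c = np.cov(traces_c, rowvar=False) + 1e-6 * np.eye(n_samples)  # regularised
    templates.append(multivariate_normal(mean=mean_c, cov=cov_c))

# Attack phase: score a new trace against every template, keep the best match.
attack_trace = rng.normal(size=n_samples)
log_likelihoods = np.array([t.logpdf(attack_trace) for t in templates])
print("predicted class:", int(np.argmax(log_likelihoods)))
```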

    Classifiers of power patterns

    Over the last several years, side-channel analysis has emerged as a major threat to securing sensitive information in cryptographic devices. Several side channels have been discovered and used to break implementations of all major cryptographic algorithms (AES, DES, RSA). This thesis focuses on power analysis attacks. A variety of power analysis methods has been developed to perform these attacks, including simple power analysis (SPA), differential power analysis (DPA), and template attacks. This work provides a comprehensive survey of these methods and also investigates the application of machine learning techniques, namely neural networks and support vector machines, to power analysis. The final part of the thesis is dedicated to the implementation of an attack against the masked software AES implementation used in the DPA Contest.
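
    To picture the machine-learning side surveyed in the thesis, the sketch below trains an SVM classifier on labelled power traces; the synthetic traces, labels, and hyper-parameters are assumptions standing in for real measurements such as the DPA Contest traces.

```python
# Sketch of the machine-learning side of power analysis: an SVM classifier
# trained on labelled power traces. The random traces and labels stand in for
# real measurements such as the DPA Contest traces (an assumption, not the
# thesis code).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
n_traces, n_samples, n_classes = 3000, 50, 9

traces = rng.normal(size=(n_traces, n_samples))
labels = rng.integers(0, n_classes, size=n_traces)   # e.g. Hamming weight of an S-box output

X_train, X_test, y_train, y_test = train_test_split(
    traces, labels, test_size=0.2, random_state=0)

# Standardise each time sample, then fit an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```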

    GAP: Born to Break Hiding

    Recently, machine learning (ML) has been widely investigated in the side-channel analysis (SCA) community. Because an artificial neural network can extract features without preprocessing, ML-based SCA methods rely less on the attacker's expertise; consequently, they outperform traditional methods. Hiding is a countermeasure against SCA that randomizes the moments at which sensitive data are manipulated. Since hiding can disturb the neural network's learning, an attacker should design a proper architecture against hiding. In this paper, we propose an architecture that is inherently robust against every kind of desynchronization. We demonstrate the proposed method on a wide range of datasets, including open datasets. As a result, our method outperforms the state of the art on every dataset.
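
    The title suggests that global average pooling (GAP) is the ingredient used against hiding; under that assumption, the sketch below shows a 1-D CNN whose head is a global average pool, one common way to make class scores largely shift-invariant. It is not claimed to be the paper's exact architecture.

```python
# Assumed sketch: a 1-D CNN whose classification head is a global average pool
# (GAP) rather than flattened dense layers, one standard way to make the class
# scores largely invariant to shifts (desynchronization) in the trace. This is
# not claimed to be the paper's exact architecture.
import torch
import torch.nn as nn

class GapCnn(nn.Module):
    def __init__(self, n_classes: int = 256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=11, padding=5), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=11, padding=5), nn.ReLU(),
            nn.Conv1d(64, n_classes, kernel_size=1),     # per-position class scores
        )
        self.pool = nn.AdaptiveAvgPool1d(1)              # global average pooling

    def forward(self, x):                                # x: (batch, 1, trace_length)
        scores = self.features(x)                        # (batch, n_classes, trace_length)
        return self.pool(scores).squeeze(-1)             # (batch, n_classes)

model = GapCnn()
traces = torch.randn(8, 1, 700)                          # hypothetical trace length
print(model(traces).shape)                               # torch.Size([8, 256])
```

    Because the average over positions does not change when the informative samples move within the window, the final scores are less sensitive to random delays than a flattened dense head would be.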

    Methodology for Efficient CNN Architectures in Profiling Attacks -- Extended Version

    The side-channel community recently investigated a new approach, based on deep learning, to significantly improve profiled attacks against embedded systems. Previous works have shown the benefit of using convolutional neural networks (CNN) to limit the effect of some countermeasures such as desynchronization. Compared with template attacks, deep learning techniques can deal with trace misalignment and the high dimensionality of the data, and pre-processing is no longer mandatory. However, the performance of attacks depends to a great extent on the choice of each hyperparameter used to configure a CNN architecture. Hence, we cannot perfectly harness the potential of deep neural networks without a clear understanding of the network's inner workings. To reduce this gap, we propose to clearly explain the role of each hyperparameter during the feature selection phase using specific visualization techniques, including weight visualization, gradient visualization, and heatmaps. By highlighting which features are retained by the filters, heatmaps come in handy when a security evaluator tries to interpret and understand the efficiency of a CNN. We propose a methodology for building CNN architectures that are efficient in terms of both attack efficiency and network complexity, even in the presence of desynchronization. We evaluate our methodology on public datasets with and without desynchronization. In each case, our methodology outperforms the previous state-of-the-art CNN models while significantly reducing network complexity. Our networks are up to 25 times more efficient than the previous state of the art, while their complexity is up to 31810 times smaller. Our results show that CNNs do not need to be very complex to perform well in the side-channel context.
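
    As a small illustration of one of the visualization tools named above, the sketch below computes a gradient-based saliency over an input trace; the tiny CNN and the random trace are placeholders, not the authors' model or data.

```python
# Sketch of gradient visualization (saliency): the gradient of the top class
# score with respect to the input trace highlights which time samples drive
# the decision. The tiny CNN and the random trace are placeholders, not the
# authors' model or data.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=11, padding=5), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(8, 256),
)
model.eval()

trace = torch.randn(1, 1, 700, requires_grad=True)   # hypothetical power trace
scores = model(trace)
scores[0, scores.argmax()].backward()                # back-propagate the top class score

saliency = trace.grad.abs().squeeze()                # |d score / d sample| per time sample
print("most influential time sample:", int(saliency.argmax()))
```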

    Learning when to stop: a mutual information approach to fight overfitting in profiled side-channel analysis

    Today, deep neural networks are a common choice for conducting profiled side-channel analysis. Such techniques commonly do not require pre-processing, and yet they can break targets protected with countermeasures. Unfortunately, it is not trivial to find neural network hyper-parameters that result in such top-performing attacks. The hyper-parameter governing the training process is the number of epochs during which training happens. If the training is too short, the network does not reach its full capacity, while if the training is too long, the network overfits and is not able to generalize to unseen examples. Finding the right moment to stop the training process is particularly difficult for side-channel analysis, as there are no clear connections between the machine learning and side-channel metrics that govern the training and attack phases, respectively. In this paper, we tackle the problem of determining the correct epoch at which to stop the training in deep learning-based side-channel analysis. We explore how information is propagated through the hidden layers of a neural network, which allows us to monitor how training is evolving. We demonstrate that the amount of information, or more precisely the mutual information, transferred to the output layer can be measured and used as a reference metric to determine the epoch at which the network offers optimal generalization. To validate the proposed methodology, we provide extensive experimental results that confirm the effectiveness of our metric for avoiding overfitting in profiled side-channel analysis.
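
    A simplified sketch of the stopping rule described above: after each epoch, estimate the mutual information between the validation labels and the network's output, and keep the epoch where the estimate peaks. The discrete estimator on hard predictions and the toy data are assumptions, not necessarily the paper's estimator.

```python
# Simplified stand-in for the stopping rule: after every epoch, estimate the
# mutual information between validation labels and the network's output and
# remember the epoch where it peaks. The discrete estimator on hard
# predictions and the toy data are assumptions, not the paper's estimator.
import torch
import torch.nn as nn
from sklearn.metrics import mutual_info_score

n_samples, n_classes = 100, 9
X_prof = torch.randn(4000, n_samples)                  # hypothetical profiling traces
y_prof = torch.randint(0, n_classes, (4000,))
X_val = torch.randn(1000, n_samples)                   # hypothetical validation traces
y_val = torch.randint(0, n_classes, (1000,))

model = nn.Sequential(nn.Linear(n_samples, 64), nn.ReLU(), nn.Linear(64, n_classes))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

best_mi, best_epoch = -1.0, -1
for epoch in range(20):
    opt.zero_grad()
    loss_fn(model(X_prof), y_prof).backward()          # one full-batch training step
    opt.step()
    with torch.no_grad():
        preds = model(X_val).argmax(dim=1)
    mi = mutual_info_score(y_val.numpy(), preds.numpy())
    if mi > best_mi:                                   # a checkpoint would be saved here
        best_mi, best_epoch = mi, epoch
print("stop at epoch", best_epoch, "with estimated MI", round(best_mi, 4))
```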

    Not a Free Lunch but a Cheap Lunch: Experimental Results for Training Many Neural Nets Efficiently

    Neural networks have become a much-studied approach in the recent literature on profiled side-channel attacks: many articles examine their use and performance in profiled single-target DPA-style attacks. In this setting, a single neural net is tweaked and tuned based on a training data set. The effort for this is considerable, as there are many hyper-parameters that need to be adjusted. A straightforward but impractical extension of such an approach to multi-target DPA-style attacks requires deriving and tuning a network architecture for each individual target. Our contribution is to provide the first practical and efficient strategy for training many neural nets in the context of a multi-target attack. We show how to configure a network with a set of hyper-parameters for a specific intermediate (SubBytes) that generalises well to capture the leakage of other intermediates as well. This is interesting because, although we cannot beat the no-free-lunch theorem (i.e., we find that different profiling methods excel on different intermediates), we can still get "good value for money" (i.e., good classification results across many intermediates with reasonable profiling effort).
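
    The strategy can be pictured as follows: one hyper-parameter configuration, tuned once for SubBytes, is reused to instantiate and train a separate network per intermediate. The helper names, values, and the list of intermediates in the sketch below are illustrative assumptions.

```python
# Sketch of the strategy as described: one hyper-parameter set, tuned once for
# SubBytes, is reused to build and train a separate network per intermediate.
# The helper names, the hyper-parameter values, and the intermediate list are
# illustrative assumptions.
import torch
import torch.nn as nn

SHARED_HPARAMS = {"hidden": 128, "layers": 2, "lr": 1e-3, "epochs": 10}  # tuned on SubBytes

def make_net(n_samples: int, n_classes: int, hp: dict) -> nn.Module:
    blocks, width = [], n_samples
    for _ in range(hp["layers"]):
        blocks += [nn.Linear(width, hp["hidden"]), nn.ReLU()]
        width = hp["hidden"]
    blocks.append(nn.Linear(width, n_classes))
    return nn.Sequential(*blocks)

def train(net: nn.Module, X: torch.Tensor, y: torch.Tensor, hp: dict) -> nn.Module:
    opt = torch.optim.Adam(net.parameters(), lr=hp["lr"])
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(hp["epochs"]):                      # full-batch training, for brevity
        opt.zero_grad()
        loss_fn(net(X), y).backward()
        opt.step()
    return net

# One network per intermediate, all sharing the SubBytes-derived configuration.
nets = {}
for name in ["SubBytes", "AddRoundKey", "ShiftRows"]:  # illustrative targets
    X = torch.randn(2000, 100)                         # hypothetical profiling traces
    y = torch.randint(0, 256, (2000,))                 # labels for this intermediate
    nets[name] = train(make_net(100, 256, SHARED_HPARAMS), X, y, SHARED_HPARAMS)
print({k: sum(p.numel() for p in v.parameters()) for k, v in nets.items()})
```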