
    Bias-variance Decomposition in Machine Learning-based Side-channel Analysis

    Machine learning techniques are a powerful option in profiling side-channel analysis. Still, there are many settings where their performance falls short of expectations. On such occasions, it is important to understand the difficulty of the problem and the behavior of the machine learning algorithm. To that end, one needs not only to investigate the performance of machine learning but also to provide insights into its explainability. One tool enabling this is the bias-variance decomposition, which decomposes the predictive error into bias, variance, and noise. With this technique, we can analyze various scenarios, recognize the sources of problem difficulty, and see how additional measurements/features or more complex machine learning models can alleviate the problem. While such results are promising, there are still drawbacks, since it is often not easy to connect the performance of a side-channel attack with the performance of a machine learning classifier as given by the bias-variance decomposition. In this paper, we propose a new tool for analyzing the performance of machine learning-based side-channel attacks -- the Guessing Entropy Bias-Variance Decomposition. With it, we can better understand the performance of various machine learning techniques and how a change in a setting influences the performance of an attack. To validate our claims, we give extensive experimental results for a number of different settings.
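    The decomposition the abstract refers to can be sketched for squared loss, where the identity MSE = bias² + variance holds exactly when labels are noise-free. The data and "models" below are purely illustrative placeholders, not a side-channel dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: "true" leakage values at 5 trace points and
# predictions from 100 models, each trained on a resampled profiling set.
y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
preds = y_true + rng.normal(0.2, 0.5, size=(100, 5))  # biased, noisy models

mean_pred = preds.mean(axis=0)              # average prediction per point
bias_sq = (mean_pred - y_true) ** 2         # systematic error
variance = preds.var(axis=0)                # spread across resampled models
mse = ((preds - y_true) ** 2).mean(axis=0)  # expected squared error

# Exact identity for squared loss with noise-free labels:
assert np.allclose(mse, bias_sq + variance)
```

    In a side-channel setting, a persistently high bias term would suggest the model class is too simple for the leakage, while a high variance term suggests more profiling traces would help.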

    An Optimal Energy Efficient Design of Artificial Noise for Preventing Power Leakage based Side-Channel Attacks

    Side-channel attacks (SCAs), which infer secret information (for example, secret keys) by exploiting information that leaks from the implementation (such as power consumption), have been shown to be a non-negligible threat to modern cryptographic implementations and devices in recent years. Hence, how to prevent side-channel attacks on cryptographic devices has become an important problem. One widely used countermeasure against power SCAs is the injection of random noise sequences into the raw leakage traces. However, the indiscriminate injection of random noise can lead to significant increases in the device's energy consumption, and ways must be found to reduce the energy spent on noise generation while keeping the side channel invisible. In this paper, we propose an optimal energy-efficient design for artificial noise generation to prevent side-channel attacks. This approach exploits the sparsity among the leakage traces. We model the side channel as a communication channel, which allows us to use channel capacity to measure the mutual information between the secret and the leakage traces. For a given energy budget for noise generation, we obtain the optimal design of the artificial noise injection by solving the side channel's capacity minimization problem. The experimental results validate the effectiveness of our proposed scheme.
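    The capacity-minimizing allocation described above can be sketched with a greedy budget-spending loop over per-component Gaussian channel capacities. This is a simplified stand-in for the convex program the paper solves, and all variances and the budget are made-up values:

```python
import numpy as np

def allocate_noise(leak_var, base_noise, budget, steps=1000):
    """Greedy sketch: spread an artificial-noise energy budget over leakage
    components to minimize sum_i 0.5*log2(1 + leak_var_i/(base_noise_i + p_i)).
    Capacity is convex and decreasing in p_i, so chunk-wise greedy spending
    approaches the optimum (hypothetical interface, not the paper's solver)."""
    p = np.zeros_like(leak_var, dtype=float)
    chunk = budget / steps
    for _ in range(steps):
        # marginal capacity reduction of adding one chunk to each component
        cur = 0.5 * np.log2(1 + leak_var / (base_noise + p))
        new = 0.5 * np.log2(1 + leak_var / (base_noise + p + chunk))
        p[np.argmax(cur - new)] += chunk  # spend where it helps most
    return p

leak_var = np.array([4.0, 1.0, 0.25])  # hypothetical per-component leakage power
base_noise = np.full(3, 0.5)           # intrinsic measurement noise
p = allocate_noise(leak_var, base_noise, budget=3.0)
# Stronger leakage components receive more artificial noise.
assert p[0] > p[1] > p[2]
```

    The qualitative outcome matches the intuition in the abstract: because the leakage is sparse, most of the noise budget should be spent on the few components that leak the most.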

    Assessing malware detection using hardware performance counters

    Despite the use of modern anti-virus (AV) software, malware remains a prevailing threat to today's computing systems. AV software cannot cope with the increasing number of evasive malware samples, calling for more robust malware detection techniques. Among the many proposed methods for malware detection, researchers have suggested microarchitecture-based mechanisms for detecting malicious software in a system. For example, Intel embeds a shadow stack in its modern architectures that maintains the integrity between function calls and their returns by tracking each function's return address. Any malicious program that exploits an application to overflow the return addresses can be restrained using the shadow stack. Researchers have also proposed the use of Hardware Performance Counters (HPCs). HPCs are counters embedded in modern computing architectures that count the occurrence of architectural events, such as cache hits, clock cycles, and integer instructions. Malware detectors that leverage HPCs create a profile of an application by reading the counter values periodically. Subsequently, researchers use supervised machine learning (ML) classification techniques to differentiate malicious profiles from benign ones. It is important to note that HPCs count the occurrence of microarchitectural events during the execution of a program, whereas whether a program is malicious or benign is a property of its high-level behavior. Since HPCs do not surveil the high-level behavior of an application, we hypothesize that the counters may fail to capture the difference in the behavioral semantics of malicious and benign software. To investigate whether HPCs capture the behavioral semantics of a program, we recreate the experimental setup of the previously proposed systems. To this end, we leverage HPCs to profile applications such as MS-Office and Chrome as benign applications and known malware binaries as malicious applications.
Standard ML classifiers assume a normally distributed dataset, where the variance is independent of the mean of the data points. To transform the profiles into a more normal-like distribution and to avoid over-fitting the machine learning models, we apply a power transform to the profiles of the applications. Moreover, HPCs can monitor a broad range of hardware-based events. We use Principal Component Analysis (PCA) to select the top performance events that show maximum variation in the fewest features across all the applications profiled. Finally, we train twelve supervised machine learning classifiers, such as Support Vector Machines (SVMs) and Multilayer Perceptrons (MLPs), on the profiles of the applications. We model each classifier as a binary classifier, where the two classes are 'Benignware' and 'Malware.' Our results show that for the 'Malware' class, the average recall and F2-score across the twelve classifiers are 0.22 and 0.70, respectively. The low recall score shows that the ML classifiers tag malware as benignware. Even though we exercise a statistical approach for selecting our features, the classifiers are not able to distinguish between malware and benignware based on the hardware-based events monitored by the HPCs. The inability of the HPC profiles to capture the behavioral characteristics of an application forces us to question the use of HPCs as malware detectors.
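    The preprocessing and classification pipeline described above (power transform, PCA feature reduction, binary classifier, recall/F2 evaluation) can be sketched with scikit-learn. The dataset here is random stand-in data with a weak injected class signal, not real HPC profiles, so the scores carry no meaning beyond illustrating the pipeline:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PowerTransformer
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score, fbeta_score

rng = np.random.default_rng(0)
# Hypothetical HPC profiles: 200 samples x 20 counter features;
# label 1 = 'Malware', 0 = 'Benignware'. Counters are heavy-tailed,
# hence the log-normal draw and the power transform below.
X = rng.lognormal(mean=2.0, sigma=1.0, size=(200, 20))
y = rng.integers(0, 2, size=200)
X[y == 1] *= 1.1  # weak, HPC-like class signal

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = make_pipeline(PowerTransformer(),   # normalize skewed counter values
                    PCA(n_components=5),  # keep top-variance event directions
                    SVC())
clf.fit(X_tr, y_tr)
pred = clf.predict(X_te)
recall = recall_score(y_te, pred)       # fraction of malware caught
f2 = fbeta_score(y_te, pred, beta=2)    # F2 weights recall over precision
print(recall, f2)
```

    A low recall on the 'Malware' class, as the thesis reports, means exactly this kind of pipeline lets most malware through regardless of how carefully the features are transformed and selected.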

    Recommending with an Agenda: Active Learning of Private Attributes using Matrix Factorization

    Recommender systems leverage user demographic information, such as age and gender, to personalize recommendations and better place their targeted ads. Oftentimes, users do not volunteer this information due to privacy concerns or a lack of initiative in filling out their online profiles. We illustrate a new threat in which a recommender learns private attributes of users who do not voluntarily disclose them. We design both passive and active attacks that solicit ratings for strategically selected items and could thus be used by a recommender system to pursue this hidden agenda. Our methods are based on a novel usage of Bayesian matrix factorization in an active learning setting. Evaluations on multiple datasets illustrate that such attacks are indeed feasible and use significantly fewer rated items than static inference methods. Importantly, they succeed without sacrificing the quality of recommendations to users. Comment: This is the extended version of a paper that appeared in ACM RecSys 201
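    A minimal sketch of the active-selection idea: given item factors from a factorization model, greedily query the item whose predicted rating best separates the two private-attribute classes. The factors and class priors below are random placeholders, and the disagreement criterion is a simplified stand-in for the paper's Bayesian information-gain computation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical trained model: item latent factors V and class-conditional
# user-factor priors mu[c] (e.g. c = 0/1 for a binary private attribute).
n_items, k = 50, 4
V = rng.normal(size=(n_items, k))
mu = {0: rng.normal(size=k), 1: rng.normal(size=k)}

def next_query(asked):
    """Active step: request the rating of the item whose predicted rating
    differs most between the two attribute classes (greedy stand-in for
    an information-gain criterion)."""
    gap = np.abs(V @ mu[0] - V @ mu[1])  # per-item class disagreement
    gap[list(asked)] = -np.inf           # never re-ask rated items
    return int(np.argmax(gap))

asked = set()
for _ in range(5):        # solicit five strategically selected items
    asked.add(next_query(asked))
print(sorted(asked))
```

    The point of the active variant is that a handful of well-chosen queries like these can reveal the attribute, where a static method would need far more rated items.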

    Machine Learning Methodologies For Low-Level Hardware-Based Malware Detection

    Malicious software continues to be a pertinent threat to the security of critical infrastructures harboring sensitive information. The abundance of malware samples and the disclosure of ever newer vulnerability paths for exploitation necessitate intelligent machine learning techniques for effective and efficient malware detection and analysis. Software-based methods are suitable for in-depth forensic analysis, but their on-device implementations are slower and resource-hungry. Alternatively, hardware-based approaches are emerging as an alternative against malware threats because of their trustworthiness, difficulty of evasion, and lower implementation costs. Modern processors expose numerous hardware events, such as power domains, voltage, and frequency, through software interfaces for performance monitoring and debugging. However, information leakage from these events has not been explored for defenses against malware threats. This thesis demonstrates an approach to malware detection and analysis that leverages low-level hardware signatures. The proposed research aims to develop machine learning methodologies for detecting malware applications, classifying malware families, and detecting shellcode exploits from low-level power signatures and electromagnetic emissions. This includes 1) developing a signature-based detector by extracting features from DVFS states and using an ML model to distinguish malware applications from benign ones; 2) developing an ML model operating on frequency and wavelet features to classify malware behaviors using EM emissions; and 3) developing a Restricted Boltzmann Machine (RBM) model to detect anomalies in the energy telemetry register values of malware-infected applications resulting from shellcode exploits.
The evaluation of the proposed ML methodology on malware datasets indicates architecture-agnostic, pervasive, platform-independent detectors that distinguish malware from benign applications using DVFS signatures, classify detected malware into characteristic families using EM signatures, and detect shellcode exploits on browser applications by identifying anomalies in energy telemetry register values using an energy-based RBM model.
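    The RBM-based anomaly detection in part 3) can be sketched with scikit-learn's BernoulliRBM: train on benign telemetry only, then flag traces whose pseudo-log-likelihood falls well below the benign baseline. The telemetry values here are synthetic placeholders scaled to [0, 1], not real energy register reads:

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

rng = np.random.default_rng(0)
# Hypothetical benign energy-telemetry traces: 300 reads x 16 registers.
benign = rng.beta(2, 5, size=(300, 16))
rbm = BernoulliRBM(n_components=8, learning_rate=0.05,
                   n_iter=50, random_state=0)
rbm.fit(benign)

# Anomaly score: pseudo-log-likelihood relative to the benign baseline.
benign_scores = rbm.score_samples(benign)
threshold = benign_scores.mean() - 2 * benign_scores.std()

anomalous = rng.beta(5, 2, size=(20, 16))  # shifted, shellcode-like traces
scores = rbm.score_samples(anomalous)
flagged = scores < threshold               # True = reconstructed poorly
print(flagged.mean())
```

    Training only on benign data is the key design choice: the detector needs no labeled shellcode samples, only traces that look unlike anything the RBM has modeled.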

    A Novel Completeness Test and its Application to Side Channel Attacks and Simulators

    Today's side channel attack targets are often complex devices in which instructions are processed in parallel and work on 32-bit data words. Consequently, the state that is involved in producing leakage in these modern devices is large, and basing evaluations (i.e. worst-case attacks), simulators, and assumptions for (masking) countermeasures on a potentially incomplete state can lead to drastically wrong conclusions. We put forward a novel notion for the "completeness" of an assumed state, together with an efficient statistical test that is based on "collapsed models". Our novel test can be used to recover a state that contains multiple 32-bit variables in a grey-box setting. We illustrate how our novel test can help guide side channel attacks, and we reveal new attack vectors for existing implementations. We also show how the application of our statistical test reveals where even the most recent leakage simulators do not capture all available leakage of their respective target devices.
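    The flavor of test described above can be illustrated generically: fit a "full" leakage model over the assumed state and a "collapsed" model that omits part of it, then use a nested-model F-test to check whether the omitted part carries significant leakage. This is a generic statistical sketch on synthetic data, not the paper's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic leakage depending on two state bits; the collapsed model drops x2.
n = 500
x1 = rng.integers(0, 2, n).astype(float)
x2 = rng.integers(0, 2, n).astype(float)
leak = 1.0 * x1 + 0.4 * x2 + rng.normal(0, 0.5, n)

def rss(X, y):
    """Residual sum of squares of an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

ones = np.ones(n)
rss_full = rss(np.column_stack([ones, x1, x2]), leak)  # full assumed state
rss_coll = rss(np.column_stack([ones, x1]), leak)      # collapsed state

# F-statistic for the nested comparison (1 dropped parameter, n-3 dof).
F = ((rss_coll - rss_full) / 1) / (rss_full / (n - 3))
print(F)  # a large F indicates the collapsed state is incomplete
```

    A significant F-statistic means the dropped variable still explains leakage, i.e. any evaluation or simulator built on the collapsed state is missing part of the picture.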

    Deep Learning vs Template Attacks in front of fundamental targets: experimental study

    This study compares the experimental results of Template Attacks (TA) and Deep Learning (DL) techniques, namely the Multilayer Perceptron (MLP) and the Convolutional Neural Network (CNN), on classical use cases often encountered in the side-channel analysis of cryptographic devices (restricted to SK). The starting point is their comparative effectiveness against masked encryption, which appears intrinsically vulnerable. Surprisingly, TA improved with Principal Component Analysis (PCA) and normalization holds its own against the latest DL methods, which demand more computational power. Another result is that both approaches face high difficulties against static targets such as secret data transfers or the key schedule. The explanation for these observations resides in cross-matching. Beyond masking, the effects of other protections, such as jittering, shuffling, and coding size, are also tested. Ultimately, the benefit of DL techniques lies in the better resistance of CNNs to misalignment.

    SoK: Deep Learning-based Physical Side-channel Analysis

    Side-channel attacks have represented a realistic and serious threat to the security of embedded devices for almost three decades. A wide variety of attacks and targets to which they can be applied has been introduced, and while the area of side-channel attacks and mitigations is very well researched, it has yet to be consolidated. Deep learning-based side-channel attacks entered the field in recent years with the promise of more competitive performance and enlarged attacker capabilities compared to other techniques. At the same time, the new attacks bring new challenges and complexities to the domain, making a systematization of the existing knowledge ever more necessary. In this SoK, we do exactly that: by bringing new insights, we systematically structure the current knowledge of deep learning in side-channel analysis. We first dissect deep learning-assisted attacks into different phases and map those phases to the efforts conducted so far in the domain. For each of the phases, we identify the weaknesses and challenges that triggered the known open problems. We connect the attacks to the existing threat models and evaluate their advantages and drawbacks. We finish by discussing other threat models that should be investigated and propose directions for future work.

    Online Performance Evaluation of Deep Learning Networks for Side-Channel Analysis

    Deep learning-based side-channel analysis has seen a rise in popularity over the last few years. Much work has been done to understand the inner workings of the neural networks used to perform the attacks, and much is still left to do. In particular, finding a metric suitable for evaluating the capacity of the neural networks is an open problem discussed in many articles. We propose an answer to this problem by introducing an online evaluation metric dedicated to the context of side-channel analysis and use it to perform early stopping on existing convolutional neural networks found in the literature. This metric compares the performance of a network on the training set and on the validation set to detect underfitting and overfitting. Consequently, we improve the performance of the networks by finding their best training epoch and thus reduce the number of traces used by 30%. The training time is also reduced for most of the networks considered.
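    The early-stopping idea can be sketched as follows, with made-up per-epoch metric values (lower is better, e.g. a guessing-entropy-style score) standing in for the paper's online metric:

```python
# Hypothetical per-epoch metrics: training keeps improving while
# validation bottoms out and then degrades as the network overfits.
train_metric = [9.0, 7.0, 5.5, 4.2, 3.1, 2.3, 1.8, 1.4, 1.1, 0.9]
val_metric   = [9.5, 7.8, 6.2, 5.0, 4.6, 4.4, 4.5, 4.9, 5.6, 6.4]

def best_epoch(train, val, patience=2):
    """Early-stopping sketch: track the best validation value and stop
    once validation has lagged behind training (overfitting) for
    `patience` epochs without a new best."""
    best, best_ep, waited = float('inf'), 0, 0
    for ep, (t, v) in enumerate(zip(train, val)):
        if v < best:
            best, best_ep, waited = v, ep, 0
        elif v > t:          # validation worse than training: overfit region
            waited += 1
            if waited >= patience:
                break
    return best_ep

assert best_epoch(train_metric, val_metric) == 5  # epoch of the 4.4 minimum
```

    Stopping at the validation minimum rather than at a fixed epoch count is what yields both the attack-performance gain and the reported savings in traces and training time.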