380 research outputs found

    Deep learning for the analysis of network traffic measurements

    Get PDF
    The application of machine learning models to the analysis of network traffic measurements has largely increased in recent years. In the networking domain, shallow models are usually applied, where a set of expert handcrafted features are needed to fix the data before training. There are two main problems associated with this approach: firstly, it requires expert domain knowledge to select the input features, and secondly, different sets of custom-made input features are generally needed according to the specific target (e.g., network security, anomaly detection, traffic classification). On the other hand, the power of machine learning models using deep architectures (i.e., deep learning) for networking has not been yet highly explored. These models have had huge success in various domains, notably in computer vision, natural language processing, machine translation, and more recently in gaming. The main goal of this work is to explore the power of deep learning models to enhance the analysis of network tra c measurements. To this end, the specific problem of detection and classi cation of network attacks is studied. As a major advantage with respect to the state-of-the-art in the field, the evaluation of different raw-traffic input representations, including packet and ow-level ones, is considered. Different deep learning architectures are explored, including convolutional neural networks and long short-term memory recurrent neural networks as core layers. In addition, three different datasets are crafted from publicly available network traffic captures and used for calibrating the considered input representations, as well as training and validating the proposed models. Different deep learning models are compared to a random forest model - commonly accepted as a highly accurate model for network traffic analysis, using the same raw input representations. In the malware detection task, a detection accuracy of 77.6% and 98.5% was achieved for packet and ow input representations respectively. For the malware classification task, an overall accuracy of 76.5% was achieved. In all evaluation tasks, the proposed deep learning models outperform the random forest ones. These initial results suggest that deep learning can be used to enhance malware detection without requiring expert domain knowledge to handcraft input features, opening the door to a broad set of potential applications for deep learning in networking

    Using Side Channel Information and Artificial Intelligence for Malware Detection

    Full text link
    Cybersecurity continues to be a difficult issue for society especially as the number of networked systems grows. Techniques to protect these systems range from rules-based to artificial intelligence-based intrusion detection systems and anti-virus tools. These systems rely upon the information contained in the network packets and download executables to function. Side channel information leaked from hardware has been shown to reveal secret information in systems such as encryption keys. This work demonstrates that side channel information can be used to detect malware running on a computing platform without access to the code involved.Comment: 7 page

    Neural malware detection

    Get PDF
    At the heart of today’s malware problem lies theoretically infinite diversity created by metamorphism. The majority of conventional machine learning techniques tackle the problem with the assumptions that a sufficiently large number of training samples exist and that the training set is independent and identically distributed. However, the lack of semantic features combined with the models under these wrong assumptions result largely in overfitting with many false positives against real world samples, resulting in systems being left vulnerable to various adversarial attacks. A key observation is that modern malware authors write a script that automatically generates an arbitrarily large number of diverse samples that share similar characteristics in program logic, which is a very cost-effective way to evade detection with minimum effort. Given that many malware campaigns follow this paradigm of economic malware manufacturing model, the samples within a campaign are likely to share coherent semantic characteristics. This opens up a possibility of one-to-many detection. Therefore, it is crucial to capture this non-linear metamorphic pattern unique to the campaign in order to detect these seemingly diverse but identically rooted variants. To address these issues, this dissertation proposes novel deep learning models, including generative static malware outbreak detection model, generative dynamic malware detection model using spatio-temporal isomorphic dynamic features, and instruction cognitive malware detection. A comparative study on metamorphic threats is also conducted as part of the thesis. Generative adversarial autoencoder (AAE) over convolutional network with global average pooling is introduced as a fundamental deep learning framework for malware detection, which captures highly complex non-linear metamorphism through translation invariancy and local variation insensitivity. Generative Adversarial Network (GAN) used as a part of the framework enables oneshot training where semantically isomorphic malware campaigns are identified by a single malware instance sampled from the very initial outbreak. This is a major innovation because, to the best of our knowledge, no approach has been found to this challenging training objective against the malware distribution that consists of a large number of very sparse groups artificially driven by arms race between attackers and defenders. In addition, we propose a novel method that extracts instruction cognitive representation from uninterpreted raw binary executables, which can be used for oneto- many malware detection via one-shot training against frequency spectrum of the Transformer’s encoded latent representation. The method works regardless of the presence of diverse malware variations while remaining resilient to adversarial attacks that mostly use random perturbation against raw binaries. Comprehensive performance analyses including mathematical formulations and experimental evaluations are provided, with the proposed deep learning framework for malware detection exhibiting a superior performance over conventional machine learning methods. The methods proposed in this thesis are applicable to a variety of threat environments here artificially formed sparse distributions arise at the cyber battle fronts.Doctor of Philosoph

    Hybrid Machine Learning Algorithms for Email and Malware Spam Filtering: A Review

    Get PDF
    In this paper, we presented a review of the state-of-the-art hybrid machine learning algorithms that were being used for email effective computing. For this reason, three research questions were formed, and the questions were answered by studying and analyzing related papers collected from some well-established scientific databases (Springer Link, IEEE Explore, Web of Science, and Scopus) based on some exclusion and inclusion criteria. The result presented the common Hybrid ML algorithms used to enhance email spam filtering. Also, the state-of-the-art datasets used for email and malware spam filtering were presented.&nbsp

    A Review on Cybersecurity based on Machine Learning and Deep Learning Algorithms

    Get PDF
    Machin learning (ML) and Deep Learning (DL) technique have been widely applied to areas like image processing and speech recognition so far. Likewise, ML and DL plays a critical role in detecting and preventing in the field of cybersecurity. In this review, we focus on recent ML and DL algorithms that have been proposed in cybersecurity, network intrusion detection, malware detection. We also discuss key elements of cybersecurity, main principle of information security and the most common methods used to threaten cybersecurity. Finally, concluding remarks are discussed including the possible research topics that can be taken into consideration to enhance various cyber security applications using DL and ML algorithms
    • …
    corecore