4,128 research outputs found

    DEVELOPMENT OF A FEDERATED LEARNING-BASED MALWARE DETECTION MODEL FOR INTERCONNECTED CLOUD INFRASTRUCTURES

    Because many heterogeneous applications share the same infrastructure, enforcing security and reliability in the cloud is a difficult but crucial task. A security analysis system that detects threats such as malicious software (malware) should exist within the cloud infrastructure. Malware techniques that bypass network-based and host-based security protections have driven the development of new methods for analysing and detecting malware, which have evolved over the past decades. Because malware configurations change constantly and are complex to learn, it is challenging for forensic investigators to keep up with the exponential rise in the number and variety of malware species. In this research work, a malware detection model based on federated learning was developed for interconnected cloud infrastructures. This technique enables multiple devices to collaborate in training machine learning models without exchanging data, thereby preserving the privacy of individual users. Three deep-learning algorithms were selected and used in the training, validation, and testing of the models. With eight clients and twenty-five federation rounds, the Feedforward Neural Network (FFNN) model performed best, with a precision of 84%, an F1-score of 84%, and an accuracy of 84%. The Multi-Layer Perceptron (MLP) model achieved 83% precision, 83% F1-score, and 83% accuracy, and the Long Short-Term Memory (LSTM) model achieved 80% precision, 80% F1-score, and 80% accuracy.
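
    The collaboration described in this abstract can be illustrated with a minimal sketch. The abstract does not name the aggregation rule, so weighted federated averaging (FedAvg) is assumed here, and a small linear model trained by gradient descent stands in for the FFNN, MLP, and LSTM models. The names `local_update`, `federated_average`, and `run_federation` are hypothetical. Only model parameters leave each client; the raw samples stay local.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]  # (feature vector, target value)


def local_update(weights: Sequence[float], data: Sequence[Sample],
                 lr: float = 0.5) -> List[float]:
    """One gradient-descent step on a client's private data (squared error)."""
    grad = [0.0] * len(weights)
    for x, y in data:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * err * xi / len(data)
    return [w - lr * g for w, g in zip(weights, grad)]


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameters, weighted by local sample count (assumed FedAvg)."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    averaged = [0.0] * len(client_weights[0])
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


def run_federation(clients: Sequence[Sequence[Sample]], dim: int,
                   rounds: int = 25) -> List[float]:
    """Repeat: broadcast global weights, train locally, aggregate on the server."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_w = federated_average(updates, sizes)
    return global_w
```

    The default of 25 rounds mirrors the figure quoted in the abstract; the learning rate and the toy linear model are illustrative choices only.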

    Learning Fast and Slow: PROPEDEUTICA for Real-time Malware Detection

    In this paper, we introduce and evaluate PROPEDEUTICA, a novel methodology and framework for efficient and effective real-time malware detection, leveraging the best of conventional machine learning (ML) and deep learning (DL) algorithms. In PROPEDEUTICA, all software processes in the system start execution subject to a conventional ML detector for fast classification. If a piece of software receives a borderline classification, it is subjected to further analysis by more computationally expensive but more accurate DL methods, using our newly proposed DL algorithm DEEPMALWARE. Further, we introduce delays to the execution of software subjected to deep learning analysis as a way to "buy time" for DL analysis and to rate-limit the impact of possible malware in the system. We evaluated PROPEDEUTICA with a set of 9,115 malware samples and 877 commonly used benign software samples from various categories for the Windows OS. Our results show that the false positive rate for conventional ML methods can reach 20%, and for modern DL methods it is usually below 6%. However, the classification time for DL can be 100X longer than for conventional ML methods. PROPEDEUTICA improved the detection F1-score from 77.54% (conventional ML method) to 90.25%, and reduced the detection time by 54.86%. The percentage of software subjected to DL analysis was approximately 40% on average. In addition, the application of delays in software subjected to ML reduced the detection time by approximately 10%. Finally, we found and discussed a discrepancy between the detection accuracy offline (analysis after all traces are collected) and on-the-fly (analysis in tandem with trace collection). Our insights show that conventional ML and modern DL-based malware detectors in isolation cannot meet the needs of efficient and effective malware detection: high accuracy, low false positive rate, and short classification time.
    Comment: 17 pages, 7 figures
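
    The two-stage flow described in this abstract can be sketched as a simple cascade. This is an illustration under stated assumptions, not the PROPEDEUTICA implementation: the function name `cascade_classify`, the borderline band (`low`, `high`), and the 0.5 cut-off for the deep model are all hypothetical, and both detectors are passed in as plain callables that return a malware probability. The execution delays mentioned in the abstract are not modelled.

```python
from typing import Any, Callable, Tuple


def cascade_classify(sample: Any,
                     fast_score: Callable[[Any], float],
                     deep_score: Callable[[Any], float],
                     low: float = 0.3,
                     high: float = 0.7) -> Tuple[str, str]:
    """Return (label, stage). Thresholds are hypothetical, not from the paper."""
    p_fast = fast_score(sample)
    if p_fast >= high:
        return "malware", "fast"   # confident verdict from the cheap detector
    if p_fast <= low:
        return "benign", "fast"    # confident verdict from the cheap detector
    # Borderline score: escalate to the slower, more accurate detector.
    p_deep = deep_score(sample)
    return ("malware" if p_deep >= 0.5 else "benign"), "deep"
```

    Widening the band between `low` and `high` sends more samples to the expensive stage, which trades classification time for accuracy.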

    Malware Detection using Machine Learning and Deep Learning

    Research shows that over the last decade, malware has been growing exponentially, causing substantial financial losses to various organizations. Different anti-malware companies have been proposing solutions to defend against attacks from this malware. The velocity, volume, and complexity of malware are posing new challenges to the anti-malware community. Current state-of-the-art research shows that researchers and anti-virus organizations have recently started applying machine learning and deep learning methods for malware analysis and detection. We have used opcode frequency as a feature vector and applied unsupervised learning in addition to supervised learning for malware classification. The focus of this tutorial is to present our work on detecting malware with 1) various machine learning algorithms and 2) deep learning models. Our results show that Random Forest outperforms a Deep Neural Network with opcode frequency as a feature. Also, in feature reduction, Deep Auto-Encoders are overkill for the dataset, and elementary functions like Variance Threshold perform better than the others. In addition to the proposed methodologies, we will also discuss additional issues and unique challenges in the domain, open research problems, limitations, and future directions.
    Comment: 11 pages and 3 figures
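
    The opcode-frequency features and variance-based reduction mentioned in this abstract can be sketched with the standard library alone. The helper names `opcode_frequencies` and `variance_threshold`, the toy opcode vocabulary, and the use of relative (normalised) frequencies are assumptions for illustration; the abstract does not specify them. The filter is similar in spirit to scikit-learn's `VarianceThreshold`, but it is written out here so the block has no dependencies.

```python
from collections import Counter
from statistics import pvariance
from typing import List, Sequence, Tuple


def opcode_frequencies(opcodes: Sequence[str],
                       vocabulary: Sequence[str]) -> List[float]:
    """Relative frequency of each vocabulary opcode in one disassembled sample."""
    counts = Counter(opcodes)
    total = len(opcodes) or 1  # avoid division by zero on an empty sample
    return [counts[op] / total for op in vocabulary]


def variance_threshold(rows: Sequence[Sequence[float]],
                       threshold: float = 0.0) -> Tuple[List[List[float]], List[int]]:
    """Keep only feature columns whose population variance exceeds threshold."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    reduced = [[row[i] for i in keep] for row in rows]
    return reduced, keep
```

    A column that takes the same value in every sample carries no information for a classifier, which is why such a simple filter can serve as a cheap first reduction step before any model is trained.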