DEVELOPMENT OF A FEDERATED LEARNING-BASED MALWARE DETECTION MODEL FOR INTERCONNECTED CLOUD INFRASTRUCTURES
Due to the large number of heterogeneous applications using the same infrastructure, enforcing
security and reliability in the cloud is a difficult but crucial task. A security analysis system that
detects threats such as malicious software (malware) should exist within the cloud
infrastructure. Different malware techniques that bypass network-based and host-based security
protections have led to the development of new methods for analysing and detecting malware,
which have evolved over the past decades. Due to the complexity of learning the ever-changing
configurations of malware, it is challenging for forensics investigators to keep up with the
exponential rise in the number and variety of malware species. In this research work, a malware
detection model was developed for interconnected cloud infrastructures based on federated
learning. This technique enables collaboration between multiple devices in the training of machine
learning models without exchanging data, thereby preserving the privacy of individual users. Three
different deep-learning algorithms were selected and used in the training, validation, and testing of
the models. When trained with eight clients and twenty-five federation rounds, the
FeedForward Neural Network (FFNN) model performed best, with a precision of 84%, an
F1-score of 84%, and an accuracy of 84%. The Multi-Layer Perceptron (MLP) model achieved
83% precision, 83% F1-score, and 83% accuracy, and the Long Short-Term Memory (LSTM)
model achieved 80% precision, 80% F1-score, and 80% accuracy.
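
The federated training loop described in this abstract (clients train locally, only model parameters are shared) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the aggregation rule, so FedAvg-style weighted averaging is assumed here, and the toy linear model, the function names `local_update`, `federated_average`, and `run_federation`, and the learning rate are all hypothetical. Only the default of twenty-five rounds is taken from the abstract.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]  # (features, target)


def local_update(weights: Sequence[float], data: Sequence[Sample],
                 lr: float = 0.1, epochs: int = 5) -> List[float]:
    """Train a toy linear model on one client's private data with plain SGD."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameters, weighted by local dataset size (assumed FedAvg)."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    avg = [0.0] * len(client_weights[0])
    for w, n in zip(client_weights, client_sizes):
        for i, value in enumerate(w):
            avg[i] += value * n / total
    return avg


def run_federation(clients: Sequence[Sequence[Sample]], dim: int,
                   rounds: int = 25) -> List[float]:
    """Run federation rounds; raw samples never leave the client lists."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_w = federated_average(updates, sizes)
    return global_w
```

Only the parameter vectors returned by `local_update` reach the aggregator, which is the privacy property the abstract refers to. A real deployment would replace the toy linear model with the FFNN, MLP, or LSTM models mentioned above.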
Learning Fast and Slow: PROPEDEUTICA for Real-time Malware Detection
In this paper, we introduce and evaluate PROPEDEUTICA, a novel methodology
and framework for efficient and effective real-time malware detection,
leveraging the best of conventional machine learning (ML) and deep learning
(DL) algorithms. In PROPEDEUTICA, all software processes in the system start
execution subjected to a conventional ML detector for fast classification. If a
piece of software receives a borderline classification, it is subjected to
further analysis via more computationally expensive and more accurate DL methods,
using our newly proposed DL algorithm DEEPMALWARE. Further, we introduce delays
to the execution of software subjected to deep learning analysis as a way to
"buy time" for DL analysis and to rate-limit the impact of possible malware in
the system. We evaluated PROPEDEUTICA with a set of 9,115 malware samples and
877 commonly used benign software samples from various categories for the
Windows OS. Our results show that the false positive rate for conventional ML
methods can reach 20%, and for modern DL methods it is usually below 6%.
However, the classification time for DL can be 100X longer than conventional ML
methods. PROPEDEUTICA improved the detection F1-score from 77.54% (conventional
ML method) to 90.25%, and reduced the detection time by 54.86%. Further, the
percentage of software subjected to DL analysis was approximately 40% on
average. Further, the application of delays in software subjected to ML reduced
the detection time by approximately 10%. Finally, we found and discussed a
discrepancy between the detection accuracy offline (analysis after all traces
are collected) and on-the-fly (analysis in tandem with trace collection). Our
insights show that conventional ML and modern DL-based malware detectors in
isolation cannot meet the needs of efficient and effective malware detection:
high accuracy, low false positive rate, and short classification time.
Comment: 17 pages, 7 figures
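
The two-stage idea in this abstract (a fast conventional ML detector first, with borderline cases escalated to a slower DL detector) can be sketched as below. This is an illustrative sketch only: the function name `cascade_classify`, the score callables, and the thresholds 0.3, 0.7, and 0.5 are hypothetical and do not come from the paper, and the execution delays described in the abstract are not modelled.

```python
from typing import Any, Callable, Tuple


def cascade_classify(sample: Any,
                     fast_score: Callable[[Any], float],
                     deep_score: Callable[[Any], float],
                     low: float = 0.3,
                     high: float = 0.7) -> Tuple[str, str]:
    """Return (label, stage) for one sample.

    fast_score and deep_score return an estimated probability of malware.
    Scores in [low, high] from the fast detector count as borderline and
    are escalated to the slower detector. Thresholds are assumptions.
    """
    p_fast = fast_score(sample)
    if p_fast < low:
        return "benign", "fast"
    if p_fast > high:
        return "malware", "fast"
    # Borderline: pay for the slower, more accurate model.
    p_deep = deep_score(sample)
    return ("malware" if p_deep >= 0.5 else "benign"), "deep"
```

For example, `cascade_classify(trace, fast_score=ml_model_score, deep_score=dl_model_score)` would call the expensive scorer only for borderline traces, where `ml_model_score` and `dl_model_score` stand for whatever detectors are available. The width of the `[low, high]` band controls what fraction of samples reach the DL stage.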
Malware Detection using Machine Learning and Deep Learning
Research shows that over the last decade, malware has been growing
exponentially, causing substantial financial losses to various organizations.
Different anti-malware companies have been proposing solutions to defend
against attacks from this malware. The velocity, volume, and complexity of malware
are posing new challenges to the anti-malware community. Current
state-of-the-art research shows that recently, researchers and anti-virus
organizations started applying machine learning and deep learning methods for
malware analysis and detection. We have used opcode frequency as a feature
vector and applied unsupervised learning in addition to supervised learning for
malware classification. The focus of this tutorial is to present our work on
detecting malware with 1) various machine learning algorithms and 2) deep
learning models. Our results show that the Random Forest outperforms Deep
Neural Network with opcode frequency as a feature. Also in feature reduction,
Deep Auto-Encoders are overkill for the dataset, and elementary functions like
Variance Threshold perform better than others. In addition to the proposed
methodologies, we will also discuss the additional issues and the unique
challenges in the domain, open research problems, limitations, and future
directions.
Comment: 11 Pages and 3 Figures
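
The feature pipeline mentioned in this abstract (opcode frequency vectors followed by variance-threshold feature reduction) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the helper names `opcode_frequencies` and `variance_threshold`, the three-opcode vocabulary, and the sample opcode sequences are hypothetical, and the authors' actual tooling is not described in the abstract.

```python
from collections import Counter
from statistics import pvariance
from typing import List, Sequence, Tuple


def opcode_frequencies(opcodes: Sequence[str],
                       vocabulary: Sequence[str]) -> List[float]:
    """Relative frequency of each vocabulary opcode in one disassembled sample."""
    counts = Counter(opcodes)
    total = len(opcodes)
    return [counts[op] / total if total else 0.0 for op in vocabulary]


def variance_threshold(rows: Sequence[Sequence[float]],
                       threshold: float = 0.0) -> Tuple[List[int], List[List[float]]]:
    """Keep only the feature columns whose variance exceeds the threshold."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    reduced = [[row[i] for i in keep] for row in rows]
    return keep, reduced


# Hypothetical toy data: two samples over a three-opcode vocabulary.
vocabulary = ["mov", "push", "nop"]
samples = [["mov", "push", "mov", "nop"], ["push", "push", "mov", "nop"]]
features = [opcode_frequencies(s, vocabulary) for s in samples]
kept_columns, reduced_features = variance_threshold(features)
```

In this toy data the `nop` column has the same frequency in both samples, so it carries no information for a classifier and is dropped. The reduced vectors could then be fed to a classifier such as the Random Forest the abstract reports as performing best.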