4,974 research outputs found
Machine Learning Aided Static Malware Analysis: A Survey and Tutorial
Malware analysis and detection techniques have been evolving during the last
decade as a reflection to development of different malware techniques to evade
network-based and host-based security protections. The fast growth in variety
and number of malware species made it very difficult for forensics
investigators to provide an on time response. Therefore, Machine Learning (ML)
aided malware analysis became a necessity to automate different aspects of
static and dynamic malware investigation. We believe that machine learning
aided static analysis can be used as a methodological approach in technical
Cyber Threats Intelligence (CTI) rather than resource-consuming dynamic malware
analysis that has been thoroughly studied before. In this paper, we address
this research gap by conducting an in-depth survey of different machine
learning methods for classification of static characteristics of 32-bit
malicious Portable Executable (PE32) Windows files and develop taxonomy for
better understanding of these techniques. Afterwards, we offer a tutorial on
how different machine learning techniques can be utilized in extraction and
analysis of a variety of static characteristic of PE binaries and evaluate
accuracy and practical generalization of these techniques. Finally, the results
of experimental study of all the method using common data was given to
demonstrate the accuracy and complexity. This paper may serve as a stepping
stone for future researchers in cross-disciplinary field of machine learning
aided malware forensics.Comment: 37 Page
Malware Detection using Machine Learning and Deep Learning
Research shows that over the last decade, malware has been growing
exponentially, causing substantial financial losses to various organizations.
Different anti-malware companies have been proposing solutions to defend
attacks from these malware. The velocity, volume, and the complexity of malware
are posing new challenges to the anti-malware community. Current
state-of-the-art research shows that recently, researchers and anti-virus
organizations started applying machine learning and deep learning methods for
malware analysis and detection. We have used opcode frequency as a feature
vector and applied unsupervised learning in addition to supervised learning for
malware classification. The focus of this tutorial is to present our work on
detecting malware with 1) various machine learning algorithms and 2) deep
learning models. Our results show that the Random Forest outperforms Deep
Neural Network with opcode frequency as a feature. Also in feature reduction,
Deep Auto-Encoders are overkill for the dataset, and elementary function like
Variance Threshold perform better than others. In addition to the proposed
methodologies, we will also discuss the additional issues and the unique
challenges in the domain, open research problems, limitations, and future
directions.Comment: 11 Pages and 3 Figure
An investigation of a deep learning based malware detection system
We investigate a Deep Learning based system for malware detection. In the
investigation, we experiment with different combination of Deep Learning
architectures including Auto-Encoders, and Deep Neural Networks with varying
layers over Malicia malware dataset on which earlier studies have obtained an
accuracy of (98%) with an acceptable False Positive Rates (1.07%). But these
results were done using extensive man-made custom domain features and investing
corresponding feature engineering and design efforts. In our proposed approach,
besides improving the previous best results (99.21% accuracy and a False
Positive Rate of 0.19%) indicates that Deep Learning based systems could
deliver an effective defense against malware. Since it is good in automatically
extracting higher conceptual features from the data, Deep Learning based
systems could provide an effective, general and scalable mechanism for
detection of existing and unknown malware.Comment: 13 Pages, 4 figure
Survey of Machine Learning Techniques for Malware Analysis
Coping with malware is getting more and more challenging, given their
relentless growth in complexity and volume. One of the most common approaches
in literature is using machine learning techniques, to automatically learn
models and patterns behind such complexity, and to develop technologies for
keeping pace with the speed of development of novel malware. This survey aims
at providing an overview on the way machine learning has been used so far in
the context of malware analysis. We systematize surveyed papers according to
their objectives (i.e., the expected output, what the analysis aims to), what
information about malware they specifically use (i.e., the features), and what
machine learning techniques they employ (i.e., what algorithm is used to
process the input and produce the output). We also outline a number of problems
concerning the datasets used in considered works, and finally introduce the
novel concept of malware analysis economics, regarding the study of existing
tradeoffs among key metrics, such as analysis accuracy and economical costs
- …