Machine Learning Aided Static Malware Analysis: A Survey and Tutorial
Malware analysis and detection techniques have evolved over the last
decade in response to the development of malware techniques that evade
network-based and host-based security protections. The fast growth in the variety
and number of malware species has made it very difficult for forensic
investigators to respond in time. Therefore, Machine Learning (ML)
aided malware analysis has become a necessity to automate different aspects of
static and dynamic malware investigation. We believe that machine learning
aided static analysis can be used as a methodological approach in technical
Cyber Threats Intelligence (CTI) rather than resource-consuming dynamic malware
analysis that has been thoroughly studied before. In this paper, we address
this research gap by conducting an in-depth survey of different machine
learning methods for classification of static characteristics of 32-bit
malicious Portable Executable (PE32) Windows files and develop a taxonomy for
better understanding of these techniques. Afterwards, we offer a tutorial on
how different machine learning techniques can be utilized in extraction and
analysis of a variety of static characteristics of PE binaries and evaluate
the accuracy and practical generalization of these techniques. Finally, the results
of an experimental study of all the methods on common data are given to
demonstrate their accuracy and complexity. This paper may serve as a stepping
stone for future researchers in the cross-disciplinary field of machine learning
aided malware forensics.
Comment: 37 Page
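As a rough illustration of what "static characteristics" of a PE32 file can mean in practice, the sketch below reads a few header fields without executing the binary. It is not taken from the paper: the function names `pe_header_features` and `make_minimal_header` are hypothetical, the choice of fields is an assumption, and the sample bytes are a synthetic header built only so the example runs on its own.

```python
import struct


def pe_header_features(data: bytes) -> dict:
    """Extract a few static header fields from the raw bytes of a PE file."""
    # A PE file starts with a DOS header whose first two bytes are "MZ".
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # The 4-byte little-endian field at offset 0x3C points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # The 20-byte COFF file header follows the signature.
    (machine, n_sections, timestamp, _symtab, _nsyms,
     opt_size, characteristics) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    # The optional header starts with a magic value: 0x10B marks PE32.
    (magic,) = struct.unpack_from("<H", data, pe_offset + 24)
    return {
        "machine": machine,
        "number_of_sections": n_sections,
        "timestamp": timestamp,
        "optional_header_size": opt_size,
        "characteristics": characteristics,
        "is_pe32": magic == 0x10B,
    }


def make_minimal_header() -> bytes:
    """Build a synthetic header for demonstration only (not a loadable program)."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature right after the DOS header
    coff = struct.pack("<HHIIIHH", 0x014C, 3, 0, 0, 0, 224, 0x0102)  # i386, 3 sections
    return bytes(dos) + b"PE\x00\x00" + coff + struct.pack("<H", 0x10B)


features = pe_header_features(make_minimal_header())
print(features)
```

A dictionary like this could be turned into a numeric feature vector for a classifier; real pipelines typically add further static features, which are outside the scope of this sketch.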
PDF-Malware Detection: A Survey and Taxonomy of Current Techniques
Portable Document Format, more commonly known as PDF, has become, in the last 20 years, a standard for document exchange and dissemination due to its portable nature and widespread adoption. The flexibility and power of this format are leveraged not only by benign users but also by hackers, who have been working to exploit various types of vulnerabilities, overcome security restrictions, and turn the PDF format into one of the leading vectors for spreading malicious code. Analyzing the content of malicious PDF files to extract the main features that characterize the malware's identity and behavior is a fundamental task for modern threat intelligence platforms that need to learn how to automatically identify new attacks. This paper surveys the state of the art in systems for detecting malicious PDF files and organizes them in a taxonomy that separately considers the approaches used and the data analyzed to detect the presence of malicious code. © Springer International Publishing AG, part of Springer Nature 2018
Mal-Netminer: Malware Classification Approach based on Social Network Analysis of System Call Graph
As the security landscape evolves over time, with thousands of species of
malicious code seen every day, antivirus vendors strive to detect and
classify malware families for efficient and effective responses against malware
campaigns. To enrich this effort, and by capitalizing on ideas from the social
network analysis domain, we build a tool that can help classify malware
families using features derived from the graph structure of their system calls.
To achieve that, we first construct a system call graph that consists of system
calls found in the execution of the individual malware families. To explore
distinguishing features of various malware species, we study social network
properties as applied to the call graph, including the degree distribution,
degree centrality, average distance, clustering coefficient, network density,
and component ratio. We utilize features derived from those properties to build
a classifier for malware families. Our experimental results show that
influence-based graph metrics such as the degree centrality are effective for
classifying malware, whereas the general structural metrics of malware are less
effective for classifying malware. Our experiments demonstrate that the
proposed system performs well in detecting and classifying malware families
within each malware class with accuracy greater than 96%.
Comment: Mathematical Problems in Engineering, Vol 201
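To make the graph properties named in the abstract concrete, here is a minimal sketch that computes degree centrality, network density, and the average clustering coefficient for a small system call graph. Several things here are assumptions rather than the paper's method: the graph is treated as undirected, an edge is drawn between consecutive calls in a trace, and the trace and function names are hypothetical.

```python
from itertools import combinations


def build_call_graph(trace):
    """Undirected graph: one node per system call, edges between consecutive calls."""
    graph = {call: set() for call in trace}
    for a, b in zip(trace, trace[1:]):
        if a != b:  # ignore self-loops in this sketch
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def network_density(graph):
    """Fraction of possible undirected edges that are present."""
    n = len(graph)
    if n < 2:
        return 0.0
    edges = sum(len(nbrs) for nbrs in graph.values()) / 2
    return 2 * edges / (n * (n - 1))


def average_clustering(graph):
    """Mean of local clustering coefficients over all nodes."""
    if not graph:
        return 0.0
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        linked = sum(1 for u, v in combinations(nbrs, 2) if v in graph[u])
        total += linked / (k * (k - 1) / 2)
    return total / len(graph)


# Hypothetical trace, for illustration only.
trace = ["open", "read", "write", "read", "close", "open"]
graph = build_call_graph(trace)
centrality = degree_centrality(graph)
density = network_density(graph)
avg_clustering = average_clustering(graph)
print(centrality, density, avg_clustering)
```

Values like these, computed per sample, could serve as the feature vector passed to a classifier; how the paper builds its graph and which classifier it uses is not stated in the abstract.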