1,204 research outputs found
Machine Learning Aided Static Malware Analysis: A Survey and Tutorial
Malware analysis and detection techniques have been evolving during the last
decade as a reflection to development of different malware techniques to evade
network-based and host-based security protections. The fast growth in variety
and number of malware species made it very difficult for forensics
investigators to provide an on time response. Therefore, Machine Learning (ML)
aided malware analysis became a necessity to automate different aspects of
static and dynamic malware investigation. We believe that machine learning
aided static analysis can be used as a methodological approach in technical
Cyber Threats Intelligence (CTI) rather than resource-consuming dynamic malware
analysis that has been thoroughly studied before. In this paper, we address
this research gap by conducting an in-depth survey of different machine
learning methods for classification of static characteristics of 32-bit
malicious Portable Executable (PE32) Windows files and develop taxonomy for
better understanding of these techniques. Afterwards, we offer a tutorial on
how different machine learning techniques can be utilized in extraction and
analysis of a variety of static characteristic of PE binaries and evaluate
accuracy and practical generalization of these techniques. Finally, the results
of experimental study of all the method using common data was given to
demonstrate the accuracy and complexity. This paper may serve as a stepping
stone for future researchers in cross-disciplinary field of machine learning
aided malware forensics.Comment: 37 Page
Getting ahead of the arms race: hothousing the coevolution of VirusTotal with a Packer
Malware detection is in a coevolutionary arms race where the attackers and defenders are constantly seeking advantage. This arms race is asymmetric: detection is harder and more expensive than evasion. White hats must be conservative to avoid false positives when searching for malicious behaviour. We seek to redress this imbalance. Most of the time, black hats need only make incremental changes to evade them. On occasion, white hats make a disruptive move and find a new technique that forces black hats to work harder. Examples include system calls, signatures and machine learning. We present a method, called Hothouse, that combines simulation and search to accelerate the white hat’s ability to counter the black hat’s incremental moves, thereby forcing black hats to perform disruptive moves more often. To realise Hothouse, we evolve EEE, an entropy-based polymorphic packer for Windows executables. Playing the role of a black hat, EEE uses evolutionary computation to disrupt the creation of malware signatures. We enter EEE into the detection arms race with VirusTotal, the most prominent cloud service for running anti-virus tools on software. During our 6 month study, we continually improved EEE in response to VirusTotal, eventually learning a packer that produces packed malware whose evasiveness goes from an initial 51.8% median to 19.6%. We report both how well VirusTotal learns to detect EEE-packed binaries and how well VirusTotal forgets in order to reduce false positives. VirusTotal’s tools learn and forget fast, actually in about 3 days. We also show where VirusTotal focuses its detection efforts, by analysing EEE’s variants
A First Look at the Crypto-Mining Malware Ecosystem: A Decade of Unrestricted Wealth
Illicit crypto-mining leverages resources stolen from victims to mine
cryptocurrencies on behalf of criminals. While recent works have analyzed one
side of this threat, i.e.: web-browser cryptojacking, only commercial reports
have partially covered binary-based crypto-mining malware. In this paper, we
conduct the largest measurement of crypto-mining malware to date, analyzing
approximately 4.5 million malware samples (1.2 million malicious miners), over
a period of twelve years from 2007 to 2019. Our analysis pipeline applies both
static and dynamic analysis to extract information from the samples, such as
wallet identifiers and mining pools. Together with OSINT data, this information
is used to group samples into campaigns. We then analyze publicly-available
payments sent to the wallets from mining-pools as a reward for mining, and
estimate profits for the different campaigns. All this together is is done in a
fully automated fashion, which enables us to leverage measurement-based
findings of illicit crypto-mining at scale. Our profit analysis reveals
campaigns with multi-million earnings, associating over 4.4% of Monero with
illicit mining. We analyze the infrastructure related with the different
campaigns, showing that a high proportion of this ecosystem is supported by
underground economies such as Pay-Per-Install services. We also uncover novel
techniques that allow criminals to run successful campaigns.Comment: A shorter version of this paper appears in the Proceedings of 19th
ACM Internet Measurement Conference (IMC 2019). This is the full versio
On the Reverse Engineering of the Citadel Botnet
Citadel is an advanced information-stealing malware which targets financial
information. This malware poses a real threat against the confidentiality and
integrity of personal and business data. A joint operation was recently
conducted by the FBI and the Microsoft Digital Crimes Unit in order to take
down Citadel command-and-control servers. The operation caused some disruption
in the botnet but has not stopped it completely. Due to the complex structure
and advanced anti-reverse engineering techniques, the Citadel malware analysis
process is both challenging and time-consuming. This allows cyber criminals to
carry on with their attacks while the analysis is still in progress. In this
paper, we present the results of the Citadel reverse engineering and provide
additional insight into the functionality, inner workings, and open source
components of the malware. In order to accelerate the reverse engineering
process, we propose a clone-based analysis methodology. Citadel is an offspring
of a previously analyzed malware called Zeus; thus, using the former as a
reference, we can measure and quantify the similarities and differences of the
new variant. Two types of code analysis techniques are provided in the
methodology, namely assembly to source code matching and binary clone
detection. The methodology can help reduce the number of functions requiring
manual analysis. The analysis results prove that the approach is promising in
Citadel malware analysis. Furthermore, the same approach is applicable to
similar malware analysis scenarios.Comment: 10 pages, 17 figures. This is an updated / edited version of a paper
appeared in FPS 201
An analysis of malware evasion techniques against modern AV engines
This research empirically tested the response of antivirus applications to binaries that use virus-like evasion techniques. In order to achieve this, a number of binaries are processed using a number of evasion methods and are then deployed against several antivirus engines. The research also documents the process of setting up an environment for testing antivirus engines, including building the evasion techniques used in the tests. The results of the empirical tests illustrate that an attacker can evade multiple antivirus engines without much effort using well-known evasion techniques. Furthermore, some antivirus engines may respond to the occurrence of an evasion technique instead of the presence of any malicious code. In practical terms, this shows that while antivirus applications are useful for protecting against known threats, their effectiveness against unknown or modified threats is limited
- …