Intra-procedural Path-insensitive Grams (i-grams) and Disassembly Based Features for Packer Tool Classification and Detection
The DoD relies on over seven million computing devices worldwide to accomplish a wide range of goals and missions. Malicious software, or malware, jeopardizes these goals and missions. However, determining whether an arbitrary software executable is malicious can be difficult. Obfuscation tools, called packers, are often used to hide the malicious intent of malware from anti-virus programs. Therefore, detecting whether an arbitrary executable file is packed is a critical step in software security. This research uses machine learning methods to build a system, the Polymorphic and Non-Polymorphic Packer Detection (PNPD) system, that detects whether an executable is packed using both sequences of instructions, called i-grams, and disassembly information as features for machine learning. Both i-grams and disassembly features successfully detect packed executables, with top configurations achieving average accuracies above 99.5%, average true positive rates above 0.977, and average false positive rates below 1.6e-3 when detecting polymorphic packers.
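As a rough illustration of instruction-sequence features, the sketch below counts plain sliding-window n-grams over a linear opcode list. This is a simplified stand-in and not the i-gram construction of the PNPD system, whose details the abstract does not give. The function name `opcode_ngrams` and the sample opcode list are assumptions made for this example.

```python
from collections import Counter


def opcode_ngrams(opcodes, n=2):
    """Count every length-n window over a linear opcode sequence.

    Hypothetical helper: a plain sliding-window n-gram count,
    not the paper's intra-procedural i-gram definition.
    """
    windows = (tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))
    return Counter(windows)


# Made-up opcode sequence standing in for a disassembled routine.
sample = ["push", "mov", "call", "push", "mov", "ret"]
counts = opcode_ngrams(sample, n=2)

# The counts can serve as a sparse feature vector for a classifier.
for gram, freq in sorted(counts.items()):
    print(gram, freq)
```

Counts of this kind are one common way to turn disassembly into fixed features for a machine learning model.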
Malware Resistant Data Protection in Hyper-connected Networks: A survey
Data protection is the process of securing sensitive information from being
corrupted, compromised, or lost. A hyperconnected network, on the other hand,
is a computer networking trend in which communication occurs over a network.
However, what about malware? Malware is malicious software meant to penetrate
private data, threaten a computer system, or gain unauthorised network access
without the user's consent. Due to the increasing applications of computers and
dependency on electronically saved private data, malware attacks on sensitive
information have become a dangerous issue for individuals and organizations
across the world. Hence, malware defense is critical for keeping our computer
systems and data protected. Many recent survey articles have focused on either
malware detection systems or individual attack strategies. To the best
of our knowledge, no survey paper presents malware attack patterns and
defense strategies in combination. Through this survey, this paper aims to address
this issue by merging diverse malicious attack patterns and machine learning
(ML) based detection models for modern and sophisticated malware. In doing so,
we focus on the taxonomy of malware attack patterns based on four fundamental
dimensions: the primary goal of the attack, method of attack, targeted exposure
and execution process, and types of malware that perform each attack. Detailed
information on malware analysis approaches is also investigated. In addition,
existing malware detection techniques employing feature extraction and ML
algorithms are discussed extensively. Finally, the paper discusses research
difficulties and unsolved problems, including future research directions. Comment: 30 pages, 9 figures, 7 tables, nowhere submitted yet
Towards Adversarial Malware Detection: Lessons Learned from PDF-based Attacks
Malware still constitutes a major threat in the cybersecurity landscape, also
due to the widespread use of infection vectors such as documents. These
infection vectors hide embedded malicious code from the victim users,
facilitating the use of social engineering techniques to infect their machines.
Research showed that machine-learning algorithms provide effective detection
mechanisms against such threats, but the existence of an arms race in
adversarial settings has recently challenged such systems. In this work, we
focus on malware embedded in PDF files as a representative case of such an arms
race. We start by providing a comprehensive taxonomy of the different
approaches used to generate PDF malware, and of the corresponding
learning-based detection systems. We then categorize threats specifically
targeted against learning-based PDF malware detectors, using a well-established
framework in the field of adversarial machine learning. This framework allows
us to categorize known vulnerabilities of learning-based PDF malware detectors
and to identify novel attacks that may threaten such systems, along with the
potential defense mechanisms that can mitigate the impact of such threats. We
conclude the paper by discussing how such findings highlight promising research
directions towards tackling the more general challenge of designing robust
malware detectors in adversarial settings.
Adversarial Detection of Flash Malware: Limitations and Open Issues
During the past four years, Flash malware has become one of the most
insidious threats to detect, with almost 600 critical vulnerabilities targeting
Adobe Flash disclosed in the wild. Research has shown that machine learning can
be successfully used to detect Flash malware by leveraging static analysis to
extract information from the structure of the file or its bytecode. However,
the robustness of Flash malware detectors against well-crafted evasion attempts
- also known as adversarial examples - has never been investigated. In this
paper, we propose a security evaluation of a novel, representative Flash
detector that embeds a combination of the prominent, static features employed
by state-of-the-art tools. In particular, we discuss how to craft adversarial
Flash malware examples, showing that it suffices to manipulate the
corresponding source malware samples slightly to evade detection. We then
empirically demonstrate that popular defense techniques proposed to mitigate
evasion attempts, including re-training on adversarial examples, may not always
be sufficient to ensure robustness. We argue that this occurs when the feature
vectors extracted from adversarial examples become indistinguishable from those
of benign data, meaning that the given feature representation is intrinsically
vulnerable. In this respect, we are the first to formally define and
quantitatively characterize this vulnerability, highlighting when an attack can
be countered by solely improving the security of the learning algorithm, or
when it requires also considering additional features. We conclude the paper by
suggesting alternative research directions to improve the security of
learning-based Flash malware detectors.
Effective methods to detect metamorphic malware: A systematic review
The succeeding code for metamorphic malware is routinely rewritten to
remain stealthy and undetected within infected environments. This characteristic is
maintained by means of encryption and decryption methods, obfuscation through
garbage code insertion, code transformation and registry modification, which makes
detection very challenging. The main objective of this study is to contribute an
evidence-based narrative demonstrating the effectiveness of recent proposals. Sixteen
primary studies were included in this analysis based on a pre-defined protocol. The
majority of the reviewed detection methods used Opcode, Control Flow Graph (CFG)
and API Call Graph. Key challenges facing the detection of metamorphic malware
include code obfuscation, lack of dynamic capabilities to analyse code and application
difficulty. Methods were further analysed on the basis of their approach, limitation,
empirical evidence and key parameters such as dataset, Detection Rate (DR) and
False Positive Rate (FPR).
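For readers unfamiliar with the two metrics named above, the following minimal sketch computes them from confusion-matrix counts, using the standard definitions DR = TP / (TP + FN) and FPR = FP / (FP + TN). The function names and the example counts are hypothetical and are not taken from any of the reviewed studies.

```python
def detection_rate(true_pos: int, false_neg: int) -> float:
    """Share of malicious samples that were flagged (also called recall or TPR)."""
    return true_pos / (true_pos + false_neg)


def false_positive_rate(false_pos: int, true_neg: int) -> float:
    """Share of benign samples that were wrongly flagged as malicious."""
    return false_pos / (false_pos + true_neg)


# Made-up counts for illustration only.
dr = detection_rate(true_pos=90, false_neg=10)
fpr = false_positive_rate(false_pos=5, true_neg=95)
print(f"DR={dr:.3f} FPR={fpr:.3f}")
```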
The arms race: adversarial search defeats entropy used to detect malware
Malware creators have been getting their way for too long now. String-based similarity measures can leverage ground truth in a scalable way and can operate at a level of abstraction that is difficult to combat from the code level. At the string level, information theory and, specifically, entropy play an important role in detecting patterns altered by concealment strategies, such as polymorphism or encryption. Controlling the entropy levels in different parts of a disk-resident executable allows an analyst to detect malware or a black hat to evade detection. This paper presents these two perspectives through two scalable entropy-based tools: EnTS and EEE. EnTS, the detection tool, shows the effectiveness of detecting entropy patterns, achieving 100% precision with 82% accuracy. It outperforms VirusTotal for accuracy on combined Kaggle and VirusShare malware. EEE, the evasion tool, shows the effectiveness of entropy as a concealment strategy, attacking binary-based state-of-the-art detectors. It learns their detection patterns in up to 8 generations of its search process, and increases their false negative rate from the range 0–9% up to the range 90–98.7%.
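The idea of measuring entropy in different parts of an executable can be sketched as a per-block Shannon entropy computation over raw bytes. This is only an illustration of the general technique and not the EnTS or EEE implementation. The function names, the 256-byte block size, and the synthetic input are all assumptions.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input, at most 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum((c / total) * math.log2(total / c) for c in Counter(data).values())


def block_entropies(data: bytes, block_size: int = 256) -> list:
    """Entropy of each consecutive block; the block size is an arbitrary choice."""
    return [
        shannon_entropy(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


# Synthetic input: a constant block followed by a block with every byte value once.
blob = b"\x00" * 256 + bytes(range(256))
profile = block_entropies(blob)
print(profile)
```

Blocks that are compressed or encrypted tend to score close to 8 bits per byte, while padding and plain data score lower, which is why such a profile can hint at packing or concealment.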