655 research outputs found

    Metamorphic Code Generation from LLVM IR Bytecode

    Get PDF
    Metamorphic software changes its internal structure across generations with its functionality remaining unchanged. Metamorphism has been employed by malware writers as a means of evading signature detection and other advanced detection strate- gies. However, code morphing also has potential security benefits, since it increases the “genetic diversity” of software. In this research, we have created a metamorphic code generator within the LLVM compiler framework. LLVM is a three-phase compiler that supports multiple source languages and target architectures. It uses a common intermediate representation (IR) bytecode in its optimizer. Consequently, any supported high-level programming language can be transformed to this IR bytecode as part of the LLVM compila- tion process. Our metamorphic generator functions at the IR bytecode level, which provides many advantages over previously developed metamorphic generators. The morphing techniques that we employ include dead code insertion—where the dead code is actually executed within the morphed code—and subroutine permutation. We have tested the effectiveness of our code morphing using hidden Markov model analysis

    Pre-filters in-transit malware packets detection in the network

    Get PDF
    Conventional malware detection systems cannot detect most of the new malware in the network without the availability of their signatures. In order to solve this problem, this paper proposes a technique to detect both metamorphic (mutated malware) and general (non-mutated) malware in the network using a combination of known malware sub-signature and machine learning classification. This network-based malware detection is achieved through a middle path for efficient processing of non-malware packets. The proposed technique has been tested and verified using multiple data sets (metamorphic malware, non-mutated malware, and UTM real traffic), this technique can detect most of malware packets in the network-based before they reached the host better than the previous works which detect malware in host-based. Experimental results showed that the proposed technique can speed up the transmission of more than 98% normal packets without sending them to the slow path, and more than 97% of malware packets are detected and dropped in the middle path. Furthermore, more than 75% of metamorphic malware packets in the test dataset could be detected. The proposed technique is 37 times faster than existing technique

    Effective methods to detect metamorphic malware: A systematic review

    Get PDF
    The succeeding code for metamorphic Malware is routinely rewritten to remain stealthy and undetected within infected environments. This characteristic is maintained by means of encryption and decryption methods, obfuscation through garbage code insertion, code transformation and registry modification which makes detection very challenging. The main objective of this study is to contribute an evidence-based narrative demonstrating the effectiveness of recent proposals. Sixteen primary studies were included in this analysis based on a pre-defined protocol. The majority of the reviewed detection methods used Opcode, Control Flow Graph (CFG) and API Call Graph. Key challenges facing the detection of metamorphic malware include code obfuscation, lack of dynamic capabilities to analyse code and application difficulty. Methods were further analysed on the basis of their approach, limitation, empirical evidence and key parameters such as dataset, Detection Rate (DR) and False Positive Rate (FPR)

    Malware Resistant Data Protection in Hyper-connected Networks: A survey

    Full text link
    Data protection is the process of securing sensitive information from being corrupted, compromised, or lost. A hyperconnected network, on the other hand, is a computer networking trend in which communication occurs over a network. However, what about malware. Malware is malicious software meant to penetrate private data, threaten a computer system, or gain unauthorised network access without the users consent. Due to the increasing applications of computers and dependency on electronically saved private data, malware attacks on sensitive information have become a dangerous issue for individuals and organizations across the world. Hence, malware defense is critical for keeping our computer systems and data protected. Many recent survey articles have focused on either malware detection systems or single attacking strategies variously. To the best of our knowledge, no survey paper demonstrates malware attack patterns and defense strategies combinedly. Through this survey, this paper aims to address this issue by merging diverse malicious attack patterns and machine learning (ML) based detection models for modern and sophisticated malware. In doing so, we focus on the taxonomy of malware attack patterns based on four fundamental dimensions the primary goal of the attack, method of attack, targeted exposure and execution process, and types of malware that perform each attack. Detailed information on malware analysis approaches is also investigated. In addition, existing malware detection techniques employing feature extraction and ML algorithms are discussed extensively. Finally, it discusses research difficulties and unsolved problems, including future research directions.Comment: 30 pages, 9 figures, 7 tables, no where submitted ye

    Neural malware detection

    Get PDF
    At the heart of today’s malware problem lies theoretically infinite diversity created by metamorphism. The majority of conventional machine learning techniques tackle the problem with the assumptions that a sufficiently large number of training samples exist and that the training set is independent and identically distributed. However, the lack of semantic features combined with the models under these wrong assumptions result largely in overfitting with many false positives against real world samples, resulting in systems being left vulnerable to various adversarial attacks. A key observation is that modern malware authors write a script that automatically generates an arbitrarily large number of diverse samples that share similar characteristics in program logic, which is a very cost-effective way to evade detection with minimum effort. Given that many malware campaigns follow this paradigm of economic malware manufacturing model, the samples within a campaign are likely to share coherent semantic characteristics. This opens up a possibility of one-to-many detection. Therefore, it is crucial to capture this non-linear metamorphic pattern unique to the campaign in order to detect these seemingly diverse but identically rooted variants. To address these issues, this dissertation proposes novel deep learning models, including generative static malware outbreak detection model, generative dynamic malware detection model using spatio-temporal isomorphic dynamic features, and instruction cognitive malware detection. A comparative study on metamorphic threats is also conducted as part of the thesis. Generative adversarial autoencoder (AAE) over convolutional network with global average pooling is introduced as a fundamental deep learning framework for malware detection, which captures highly complex non-linear metamorphism through translation invariancy and local variation insensitivity. Generative Adversarial Network (GAN) used as a part of the framework enables oneshot training where semantically isomorphic malware campaigns are identified by a single malware instance sampled from the very initial outbreak. This is a major innovation because, to the best of our knowledge, no approach has been found to this challenging training objective against the malware distribution that consists of a large number of very sparse groups artificially driven by arms race between attackers and defenders. In addition, we propose a novel method that extracts instruction cognitive representation from uninterpreted raw binary executables, which can be used for oneto- many malware detection via one-shot training against frequency spectrum of the Transformer’s encoded latent representation. The method works regardless of the presence of diverse malware variations while remaining resilient to adversarial attacks that mostly use random perturbation against raw binaries. Comprehensive performance analyses including mathematical formulations and experimental evaluations are provided, with the proposed deep learning framework for malware detection exhibiting a superior performance over conventional machine learning methods. The methods proposed in this thesis are applicable to a variety of threat environments here artificially formed sparse distributions arise at the cyber battle fronts.Doctor of Philosoph

    Metamorphic Malware Detection Based on Support Vector Machine Classification of Malware Sub-Signatures

    Get PDF
    Achieving accurate and efficient metamorphic malware detection remains a challenge. Metamorphic malware is able to mutate and alter its code structure in each infection, with some vital functionality and codesegment remain unchanged. We exploit these unchanged features for detecting metamorphic malware detection using Support Vector Machine(SVM) classifier. n-gram features are extracted directly from sample malware binaries to avoid disassembly, which are then masked with the extracted Snort signature n-grams. These masked features reduce considerably the number of selected n-gram features. Our method is capable to accurately detect metamorphic malware with ~99 % accuracy and low false positive rate. The proposed method is also superior than commercially available anti-viruses in detecting metamorphicmalware
    • …
    corecore