551 research outputs found

    JavaScript Metamorphic Malware Detection Using Machine Learning Techniques

    Get PDF
    Various factors like defects in the operating system, email attachments from unknown sources, downloading and installing a software from non-trusted sites make computers vulnerable to malware attacks. Current antivirus techniques lack the ability to detect metamorphic viruses, which vary the internal structure of the original malware code across various versions, but still have the exact same behavior throughout. Antivirus software typically relies on signature detection for identifying a virus, but code morphing evades signature detection quite effectively. JavaScript is used to generate metamorphic malware by changing the code’s Abstract Syntax Tree without changing the actual functionality, making it very difficult to detect by antivirus software. As JavaScript is prevalent almost everywhere, it becomes an ideal candidate language for spreading malware. This research aims to detect metamorphic malware using various machine learning models like K Nearest Neighbors, Random Forest, Support Vector Machine, and Naïve Bayes. It also aims to test the effectiveness of various morphing techniques that can be used to reduce the accuracy of the classification model. Thus, this involves improvement on both fronts of generation and detection of the malware helping antivirus software detect morphed codes with better accuracy. In this research, JavaScript based metamorphic engine reduces the accuracy of a trained malware detector. While N-gram frequency based feature vectors give good accuracy results for classifying metamorphic malware, HMM feature vectors provide the best results

    Measuring Malware Evolution

    Get PDF
    In this research, we simulate the effect of code evolution by applying a variety of code morphing strategies. Specifically, we consider code substitution, transposition, insertion, and deletion. We then analyze the effect of these code morphing strategies relative to a variety of malware scores that have been considered in previous research. Our goal is to gain a better understanding of the strengths and weaknesses of these various malware scoring techniques. This research should prove useful in designing more robust scores for detecting malware

    REDIR: Automated Static Detection of Obfuscated Anti-Debugging Techniques

    Get PDF
    Reverse Code Engineering (RCE) to detect anti-debugging techniques in software is a very difficult task. Code obfuscation is an anti-debugging technique makes detection even more challenging. The Rule Engine Detection by Intermediate Representation (REDIR) system for automated static detection of obfuscated anti-debugging techniques is a prototype designed to help the RCE analyst improve performance through this tedious task. Three tenets form the REDIR foundation. First, Intermediate Representation (IR) improves the analyzability of binary programs by reducing a large instruction set down to a handful of semantically equivalent statements. Next, an Expert System (ES) rule-engine searches the IR and initiates a sensemaking process for anti-debugging technique detection. Finally, an IR analysis process confirms the presence of an anti-debug technique. The REDIR system is implemented as a debugger plug-in. Within the debugger, REDIR interacts with a program in the disassembly view. Debugger users can instantly highlight anti-debugging techniques and determine if the presence of a debugger will cause a program to take a conditional jump or fall through to the next instruction

    Detecting Malicious Software By Dynamicexecution

    Get PDF
    Traditional way to detect malicious software is based on signature matching. However, signature matching only detects known malicious software. In order to detect unknown malicious software, it is necessary to analyze the software for its impact on the system when the software is executed. In one approach, the software code can be statically analyzed for any malicious patterns. Another approach is to execute the program and determine the nature of the program dynamically. Since the execution of malicious code may have negative impact on the system, the code must be executed in a controlled environment. For that purpose, we have developed a sandbox to protect the system. Potential malicious behavior is intercepted by hooking Win32 system calls. Using the developed sandbox, we detect unknown virus using dynamic instruction sequences mining techniques. By collecting runtime instruction sequences in basic blocks, we extract instruction sequence patterns based on instruction associations. We build classification models with these patterns. By applying this classification model, we predict the nature of an unknown program. We compare our approach with several other approaches such as simple heuristics, NGram and static instruction sequences. We have also developed a method to identify a family of malicious software utilizing the system call trace. We construct a structural system call diagram from captured dynamic system call traces. We generate smart system call signature using profile hidden Markov model (PHMM) based on modularized system call block. Smart system call signature weakly identifies a family of malicious software

    An enhanced performance model for metamorphic computer virus classification and detectioN

    Get PDF
    Metamorphic computer virus employs various code mutation techniques to change its code to become new generations. These generations have similar behavior and functionality and yet, they could not be detected by most commercial antivirus because their solutions depend on a signature database and make use of string signature-based detection methods. However, the antivirus detection engine can be avoided by metamorphism techniques. The purpose of this study is to develop a performance model based on computer virus classification and detection. The model would also be able to examine portable executable files that would classify and detect metamorphic computer viruses. A Hidden Markov Model implemented on portable executable files was employed to classify and detect the metamorphic viruses. This proposed model that produce common virus statistical patterns was evaluated by comparing the results with previous related works and famous commercial antiviruses. This was done by investigating the metamorphic computer viruses and their features, and the existing classifications and detection methods. Specifically, this model was applied on binary format of portable executable files and it was able to classify if the files belonged to a virus family. Besides that, the performance of the model, practically implemented and tested, was also evaluated based on detection rate and overall accuracy. The findings indicated that the proposed model is able to classify and detect the metamorphic virus variants in portable executable file format with a high average of 99.7% detection rate. The implementation of the model is proven useful and applicable for antivirus programs

    Exhaustive Statistical Analysis for Detection of Metamorphic Malware

    Get PDF
    Malware is a serious threat to the security of the system. With the widespread use of the World Wide Web, there has been a tremendous increase in virus attacks, making computer security an essential for every personal computer. The rat-race between virus writers and detectors has led to improved viruses and detection techniques. In recent years, metamorphic malwares have posed serious challenge to anti-virus writers. Current signature based detection techniques, heuristic based techniques are not comprehensive solutions. A formidable solution to detection of metamorphic malware is void. This paper investigates the problem of malware detection, specifically metamorphic malwares. The paper proposes a statistical based detection technique as a viable candidate for comprehensive detection of metamorphic malwares. Related work, experimental results and analysis of the results are presented in this paper

    Forensic Memory Classification using Deep Recurrent Neural Networks

    Get PDF
    The goal of this project is to advance the application of machine learning frameworks and tools in the process of malware detection. Specifically, a deep neural network architecture is proposed to classify application modules as benign or malicious, using the lower level memory block patterns that make up these modules. The modules correspond to blocks of functionality within files used in kernel and OS level processes as well as user level applications. The learned model is proposed to reside in an isolated core with strict communication restrictions to achieve incorruptibility as well as efficiency, therefore providing a probabilistic memory-level view of the system that is consistent with the user-level view. The lower level memory blocks are constructed using basic block sequences of varying sizes that are fed as input into Long-Short Term Memory models. Four configurations of the LSTM model are explored, by adding bi-directionality as well as Attention. Assembly level data from 50 PE files are extracted and basic blocks are constructed using the IDA Disassembler toolkit. The results show that longer basic block sequences result in richer LSTM hidden layer representations. The hidden states are fed as features into Max pooling layers or Attention layers, depending on the configuration being tested, and the final classification is performed using Logistic Regression with a single hidden layer. The bidirectional LSTM with Attention proved to be the best model, used on basic block sequences of size 29. The differences between the model’s ROC curves indicate a strong reliance on lower level, instructional features, as opposed to metadata or String features, that speak to the success of using entire assembly instructions as data, as opposed to just opcodes or higher level features

    GUIDE FOR THE COLLECTION OF INSTRUSION DATA FOR MALWARE ANALYSIS AND DETECTION IN THE BUILD AND DEPLOYMENT PHASE

    Get PDF
    During the COVID-19 pandemic, when most businesses were not equipped for remote work and cloud computing, we saw a significant surge in ransomware attacks. This study aims to utilize machine learning and artificial intelligence to prevent known and unknown malware threats from being exploited by threat actors when developers build and deploy applications to the cloud. This study demonstrated an experimental quantitative research design using Aqua. The experiment\u27s sample is a Docker image. Aqua checked the Docker image for malware, sensitive data, Critical/High vulnerabilities, misconfiguration, and OSS license. The data collection approach is experimental. Our analysis of the experiment demonstrated how unapproved images were prevented from running anywhere in our environment based on known vulnerabilities, embedded secrets, OSS licensing, dynamic threat analysis, and secure image configuration. In addition to the experiment, the forensic data collected in the build and deployment phase are exploitable vulnerability, Critical/High Vulnerability Score, Misconfiguration, Sensitive Data, and Root User (Super User). Since Aqua generates a detailed audit record for every event during risk assessment and runtime, we viewed two events on the Audit page for our experiment. One of the events caused an alert due to two failed controls (Vulnerability Score, Super User), and the other was a successful event meaning that the image is secure to deploy in the production environment. The primary finding for our study is the forensic data associated with the two events on the Audit page in Aqua. In addition, Aqua validated our security controls and runtime policies based on the forensic data with both events on the Audit page. Finally, the study’s conclusions will mitigate the likelihood that organizations will fall victim to ransomware by mitigating and preventing the total damage caused by a malware attack
    corecore