118 research outputs found

    Polygraph: Automatically generating signatures for polymorphic worms

    Get PDF
    It is widely believed that content-signature-based intrusion detection systems (IDSes) are easily evaded by polymorphic worms, which vary their payload on every infection attempt. In this paper, we present Polygraph, a signature generation system that successfully produces signatures that match polymorphic worms. Polygraph generates signatures that consist of multiple disjoint content sub-strings. In doing so, Polygraph leverages our insight that for a real-world exploit to function properly, multiple invariant substrings must often be present in all variants of a payload; these substrings typically correspond to protocol framing, return addresses, and in some cases, poorly obfuscated code. We contribute a definition of the polymorphic signature generation problem; propose classes of signature suited for matching polymorphic worm payloads; and present algorithms for automatic generation of signatures in these classes. Our evaluation of these algorithms on a range of polymorphic worms demonstrates that Polygraph produces signatures for polymorphic worms that exhibit low false negatives and false positives. © 2005 IEEE

    Detecting and Modeling Polymorphic Shellcode

    Get PDF
    In this thesis, we address the problem of modeling and detecting polymorphic engines shellcode. By polymorphic engines, we mean programs having the ability to transform any piece of malware into many instances consisting of different code but having the same functionality as the original malware. Typically, polymorphic engines work by encrypting the target malware using various encryption techniques and providing a decryption module in order to execute the newly encrypted instance. Moreover, those engines have the ability to mutate their decryption routine making them unique from one instance to another and hard to detect. Our analysis focuses on polymorphic shellcode, which is shellcode that uses a polymorphic engine to mutate while keeping the original function of the code the same. We propose a new concept of signatures, shape signatures, which cope with the highly mutated nature of those engines. Those signatures try to identify the constant part as well as the mutated part of the deciphering routines. This combination is able to cope with the highly mutated nature of those engines in a much more efficient way compared to traditional signatures used in most intrusion detection systems. The second part of the thesis aims at modeling those polymorphic engines by showing that they exhibit commo

    Detecting Malicious Software By Dynamicexecution

    Get PDF
    Traditional way to detect malicious software is based on signature matching. However, signature matching only detects known malicious software. In order to detect unknown malicious software, it is necessary to analyze the software for its impact on the system when the software is executed. In one approach, the software code can be statically analyzed for any malicious patterns. Another approach is to execute the program and determine the nature of the program dynamically. Since the execution of malicious code may have negative impact on the system, the code must be executed in a controlled environment. For that purpose, we have developed a sandbox to protect the system. Potential malicious behavior is intercepted by hooking Win32 system calls. Using the developed sandbox, we detect unknown virus using dynamic instruction sequences mining techniques. By collecting runtime instruction sequences in basic blocks, we extract instruction sequence patterns based on instruction associations. We build classification models with these patterns. By applying this classification model, we predict the nature of an unknown program. We compare our approach with several other approaches such as simple heuristics, NGram and static instruction sequences. We have also developed a method to identify a family of malicious software utilizing the system call trace. We construct a structural system call diagram from captured dynamic system call traces. We generate smart system call signature using profile hidden Markov model (PHMM) based on modularized system call block. Smart system call signature weakly identifies a family of malicious software

    Malware Resistant Data Protection in Hyper-connected Networks: A survey

    Full text link
    Data protection is the process of securing sensitive information from being corrupted, compromised, or lost. A hyperconnected network, on the other hand, is a computer networking trend in which communication occurs over a network. However, what about malware. Malware is malicious software meant to penetrate private data, threaten a computer system, or gain unauthorised network access without the users consent. Due to the increasing applications of computers and dependency on electronically saved private data, malware attacks on sensitive information have become a dangerous issue for individuals and organizations across the world. Hence, malware defense is critical for keeping our computer systems and data protected. Many recent survey articles have focused on either malware detection systems or single attacking strategies variously. To the best of our knowledge, no survey paper demonstrates malware attack patterns and defense strategies combinedly. Through this survey, this paper aims to address this issue by merging diverse malicious attack patterns and machine learning (ML) based detection models for modern and sophisticated malware. In doing so, we focus on the taxonomy of malware attack patterns based on four fundamental dimensions the primary goal of the attack, method of attack, targeted exposure and execution process, and types of malware that perform each attack. Detailed information on malware analysis approaches is also investigated. In addition, existing malware detection techniques employing feature extraction and ML algorithms are discussed extensively. Finally, it discusses research difficulties and unsolved problems, including future research directions.Comment: 30 pages, 9 figures, 7 tables, no where submitted ye

    Multimodal Approach for Malware Detection

    Get PDF
    Although malware detection is a very active area of research, few works were focused on using physical properties (e.g., power consumption) and multimodal features for malware detection. We designed an experimental testbed that allowed us to run samples of malware and non-malicious software applications and to collect power consumption, network traffic, and system logs data, and subsequently to extract dynamic behavioral-based features. We also extracted code-based static features of both malware and non-malicious software applications. These features were used for malware detection based on: feature level fusion using power consumption and network traffic data, feature level fusion using network traffic data and system logs, and multimodal feature level and decision level fusion. The contributions when using feature level fusion of power consumption and network traffic data are: (1) We focused on detecting real malware using the extracted dynamic behavioral features (both power-based and network traffic-based) and supervised machine learning algorithms, which has not been done by any of the prior works. (2) We ran a large number of machine learning experiments, which allowed us to identify the best performing learner, DC voltage rails that led to the best malware detection performance, and the subset of features that are the best predictors for malware detection. (3) The comparison of malware detection performance was done using a comprehensive set of metrics that reflect different aspects of the quality of malware detection. In the case of the feature level fusion using network traffic data and system logs, the contributions are: (1) Most of the previous works that have used network flows-based features have done classification of the network traffic, while our focus was on classifying the software running in a machine as malware and non-malicious software using the extracted dynamic behavioral features. (2) We experimented with different sizes of the training set (i.e., 90%, 75%, 50%, and 25% of the data) and found that smaller training sets produced very good classification results. This aspect of our work has a practical value because the manual labeling of the training set is a tedious and time consuming process. In this dissertation we present a multimodal deep learning neural network that integrates different modalities (i.e., power consumption, system logs, network traffic, and code-based static data) using decision level fusion. We evaluated the performance of each modality individually, when using feature level fusion, and when using decision level fusion. The contributions of our multimodal approach are as follow: (1) Collecting data from different modalities allowed us to develop a multimodal approach to malware detection, which has not been widely explored by prior works. Even more, none of the previous works compared the performance of feature level fusion with decision level fusion, which is explored in this dissertation. (2) We proposed a multimodal decision level fusion malware detection approach using a deep neural network and compared its performance with the performance of feature level fusion approaches based on deep neural network and standard supervised machine learning algorithms (i.e., Random Forest, J48, JRip, PART, Naive Bayes, and SMO)

    Generating content-based signatures for detecting bot-infected machines

    Get PDF
    Ankara : The Department of Computer Engineering and the Institute of Engineering and Science of Bilkent University, 2008.Thesis (Master's) -- Bilkent University, 2008.Includes bibliographical references leaves 76-79.A botnet is a network of compromised machines that are remotely controlled and commanded by an attacker, who is often called the botmaster. Such botnets are often abused as platforms to launch distributed denial of service attacks, send spam mails or perform identity theft. In recent years, the basic motivations for malicious activity have shifted from script kiddie vandalism in the hacker community, to more organized attacks and intrusions for financial gain. This shift explains the reason for the rise of botnets that have capabilities to perform more sophisticated malicious activities. Recently, researchers have tried to develop botnet detection mechanisms. The botnet detection mechanisms proposed to date have serious limitations, since they either can handle only certain types of botnets or focus on only specific botnet attributes, such as the spreading mechanism, the attack mechanism, etc., in order to constitute their detection models. We present a system that monitors network traffic to identify bot-infected hosts. Our goal is to develop a more general detection model that identifies single infected machines without relying on the bot propagation vector. To this end, we leverage the insight that all of the bots get a command and perform an action as a response, since the command and response behavior is the unique characteristic that distinguishes the bots from other malware. Thus, we examine the network traffic generated by bots to locate command and response behaviors. Afterwards, we generate signatures from the similar commands that are followed by similar bot responses without any explicit knowledge about the command and control protocol. The signatures are deployed to an IDS that monitors the network traffic of a university. Finally, the experiments showed that our system is capable of detecting bot-infected machines with a low false positive rate.Bilge, LeylaM.S

    Dynamic Application Level Security Sensors

    Get PDF
    The battle for cyber supremacy is a cat and mouse game: evolving threats from internal and external sources make it difficult to protect critical systems. With the diverse and high risk nature of these threats, there is a need for robust techniques that can quickly adapt and address this evolution. Existing tools such as Splunk, Snort, and Bro help IT administrators defend their networks by actively parsing through network traffic or system log data. These tools have been thoroughly developed and have proven to be a formidable defense against many cyberattacks. However, they are vulnerable to zero-day attacks, slow attacks, and attacks that originate from within. Should an attacker or some form of malware make it through these barriers and onto a system, the next layer of defense lies on the host. Host level defenses include system integrity verifiers, virus scanners, and event log parsers. Many of these tools work by seeking specific attack signatures or looking for anomalous events. The defenses at the network and host level are similar in nature. First, sensors collect data from the security domain. Second, the data is processed, and third, a response is crafted based on the processing. The application level security domain lacks this three step process. Application level defenses focus on secure coding practices and vulnerability patching, which is ineffective. The work presented in this thesis uses a technique that is commonly employed by malware, dynamic-link library (DLL) injection, to develop dynamic application level security sensors that can extract fine-grain data at runtime. This data can then be processed to provide stronger application level defense by shrinking the vulnerability window. Chapters 5 and 6 give proof of concept sensors and describe the process of developing the sensors in detail
    corecore