247 research outputs found

    Polygraph: Automatically generating signatures for polymorphic worms

    Get PDF
    It is widely believed that content-signature-based intrusion detection systems (IDSes) are easily evaded by polymorphic worms, which vary their payload on every infection attempt. In this paper, we present Polygraph, a signature generation system that successfully produces signatures that match polymorphic worms. Polygraph generates signatures that consist of multiple disjoint content sub-strings. In doing so, Polygraph leverages our insight that for a real-world exploit to function properly, multiple invariant substrings must often be present in all variants of a payload; these substrings typically correspond to protocol framing, return addresses, and in some cases, poorly obfuscated code. We contribute a definition of the polymorphic signature generation problem; propose classes of signature suited for matching polymorphic worm payloads; and present algorithms for automatic generation of signatures in these classes. Our evaluation of these algorithms on a range of polymorphic worms demonstrates that Polygraph produces signatures for polymorphic worms that exhibit low false negatives and false positives. Ā© 2005 IEEE

    Anagram: A Content Anomaly Detector Resistant to Mimicry Attack

    Get PDF
    In this paper, we present Anagram, a content anomaly detector that models a mixture of high-order n-grams (n > 1) designed to detect anomalous and suspicious network packet payloads. By using higher- order n-grams, Anagram can detect significant anomalous byte sequences and generate robust signatures of validated malicious packet content. The Anagram content models are implemented using highly efficient Bloom filters, reducing space requirements and enabling privacy-preserving cross-site correlation. The sensor models the distinct content flow of a network or host using a semi- supervised training regimen. Previously known exploits, extracted from the signatures of an IDS, are likewise modeled in a Bloom filter and are used during training as well as detection time. We demonstrate that Anagram can identify anomalous traffic with high accuracy and low false positive rates. Anagramā€™s high-order n-gram analysis technique is also resilient against simple mimicry attacks that blend exploits with normal appearing byte padding, such as the blended polymorphic attack recently demonstrated in. We discuss randomized n-gram models, which further raises the bar and makes it more difficult for attackers to build precise packet structures to evade Anagram even if they know the distribution of the local site content flow. Finally, Anagram-ā€™s speed and high detection rate makes it valuable not only as a standalone sensor, but also as a network anomaly flow classifier in an instrumented fault-tolerant host-based environment; this enables significant cost amortization and the possibility of a symbiotic feedback loop that can improve accuracy and reduce false positive rates over time

    Toward Machine Intelligence that Learns to Fingerprint Polymorphic Worms in IoT

    Get PDF
    Internet of Things (IoT) is fast growing. Non-PC devices under the umbrella of IoT have been increasingly applied in various fields and will soon account for a significant share of total Internet traffic. However, the security and privacy of IoT and its devices have been challenged by malware, particularly polymorphic worms that rapidly self-propagate once being launched and vary their appearance over each infection to escape from the detection of signature-based intrusion detection systems. It is well recognized that polymorphic worms are one of the most intrusive threats to IoT security. To build an effective, strong defense for IoT networks against polymorphic worms, this research proposes a machine intelligent system, termed Gram-Restricted Boltzmann Machine (Gram-RBM), which automatically generates generic fingerprints/signatures for the polymorphic worm. Two augmented N-gram based methods are designed and applied in derivation of polymorphic wormsequences, also known as fingerprints/signatures. These derived sequences are then optimized using the Gaussian-Bernoulli RBM dimension reduction algorithm. The results, gained from the experiments involved three different types of polymorphicworms, show that the system generates accurate fingerprints/signatures even under "noisy" conditions and outperforms related methods in terms of accuracy and efficiency

    Malware Detection Using Dynamic Analysis

    Get PDF
    In this research, we explore the field of dynamic analysis which has shown promis- ing results in the field of malware detection. Here, we extract dynamic software birth- marks during malware execution and apply machine learning based detection tech- niques to the resulting feature set. Specifically, we consider Hidden Markov Models and Profile Hidden Markov Models. To determine the effectiveness of this dynamic analysis approach, we compare our detection results to the results obtained by using static analysis. We show that in some cases, significantly stronger results can be obtained using our dynamic approach

    Discovering New Vulnerabilities in Computer Systems

    Get PDF
    Vulnerability research plays a key role in preventing and defending against malicious computer system exploitations. Driven by a multi-billion dollar underground economy, cyber criminals today tirelessly launch malicious exploitations, threatening every aspect of daily computing. to effectively protect computer systems from devastation, it is imperative to discover and mitigate vulnerabilities before they fall into the offensive parties\u27 hands. This dissertation is dedicated to the research and discovery of new design and deployment vulnerabilities in three very different types of computer systems.;The first vulnerability is found in the automatic malicious binary (malware) detection system. Binary analysis, a central piece of technology for malware detection, are divided into two classes, static analysis and dynamic analysis. State-of-the-art detection systems employ both classes of analyses to complement each other\u27s strengths and weaknesses for improved detection results. However, we found that the commonly seen design patterns may suffer from evasion attacks. We demonstrate attacks on the vulnerabilities by designing and implementing a novel binary obfuscation technique.;The second vulnerability is located in the design of server system power management. Technological advancements have improved server system power efficiency and facilitated energy proportional computing. However, the change of power profile makes the power consumption subjected to unaudited influences of remote parties, leaving the server systems vulnerable to energy-targeted malicious exploit. We demonstrate an energy abusing attack on a standalone open Web server, measure the extent of the damage, and present a preliminary defense strategy.;The third vulnerability is discovered in the application of server virtualization technologies. Server virtualization greatly benefits today\u27s data centers and brings pervasive cloud computing a step closer to the general public. However, the practice of physical co-hosting virtual machines with different security privileges risks introducing covert channels that seriously threaten the information security in the cloud. We study the construction of high-bandwidth covert channels via the memory sub-system, and show a practical exploit of cross-virtual-machine covert channels on virtualized x86 platforms

    Detecting worm mutations using machine learning

    Get PDF
    Worms are malicious programs that spread over the Internet without human intervention. Since worms generally spread faster than humans can respond, the only viable defence is to automate their detection. Network intrusion detection systems typically detect worms by examining packet or flow logs for known signatures. Not only does this approach mean that new worms cannot be detected until the corresponding signatures are created, but that mutations of known worms will remain undetected because each mutation will usually have a different signature. The intuitive and seemingly most effective solution is to write more generic signatures, but this has been found to increase false alarm rates and is thus impractical. This dissertation investigates the feasibility of using machine learning to automatically detect mutations of known worms. First, it investigates whether Support Vector Machines can detect mutations of known worms. Support Vector Machines have been shown to be well suited to pattern recognition tasks such as text categorisation and hand-written digit recognition. Since detecting worms is effectively a pattern recognition problem, this work investigates how well Support Vector Machines perform at this task. The second part of this dissertation compares Support Vector Machines to other machine learning techniques in detecting worm mutations. Gaussian Processes, unlike Support Vector Machines, automatically return confidence values as part of their result. Since confidence values can be used to reduce false alarm rates, this dissertation determines how Gaussian Process compare to Support Vector Machines in terms of detection accuracy. For further comparison, this work also compares Support Vector Machines to K-nearest neighbours, known for its simplicity and solid results in other domains. The third part of this dissertation investigates the automatic generation of training data. Classifier accuracy depends on good quality training data -- the wider the training data spectrum, the higher the classifier's accuracy. This dissertation describes the design and implementation of a worm mutation generator whose output is fed to the machine learning techniques as training data. This dissertation then evaluates whether the training data can be used to train classifiers of sufficiently high quality to detect worm mutations. The findings of this work demonstrate that Support Vector Machines can be used to detect worm mutations, and that the optimal configuration for detection of worm mutations is to use a linear kernel with unnormalised bi-gram frequency counts. Moreover, the results show that Gaussian Processes and Support Vector Machines exhibit similar accuracy on average in detecting worm mutations, while K-nearest neighbours consistently produces lower quality predictions. The generated worm mutations are shown to be of sufficiently high quality to serve as training data. Combined, the results demonstrate that machine learning is capable of accurately detecting mutations of known worms

    Survey on representation techniques for malware detection system

    Get PDF
    Malicious programs are malignant softwareā€™s designed by hackers or cyber offenders with a harmful intent to disrupt computer operation. In various researches, we found that the balance between designing an accurate architecture that can detect the malware and track several advanced techniques that malware creators apply to get variants of malware are always a difficult line. Hence the study of malware detection techniques has become more important and challenging within the security field. This review paper provides a detailed discussion and full reviews for various types of malware, malware detection techniques, various researches on them, malware analysis methods and different dynamic programmingbased tools that could be used to represent the malware sampled. We have provided a comprehensive bibliography in malware detection, its techniques and analysis methods for malware researchers

    A Praise for Defensive Programming: Leveraging Uncertainty for Effective Malware Mitigation

    Full text link
    A promising avenue for improving the effectiveness of behavioral-based malware detectors would be to combine fast traditional machine learning detectors with high-accuracy, but time-consuming deep learning models. The main idea would be to place software receiving borderline classifications by traditional machine learning methods in an environment where uncertainty is added, while software is analyzed by more time-consuming deep learning models. The goal of uncertainty would be to rate-limit actions of potential malware during the time consuming deep analysis. In this paper, we present a detailed description of the analysis and implementation of CHAMELEON, a framework for realizing this uncertain environment for Linux. CHAMELEON offers two environments for software: (i) standard - for any software identified as benign by conventional machine learning methods and (ii) uncertain - for software receiving borderline classifications when analyzed by these conventional machine learning methods. The uncertain environment adds obstacles to software execution through random perturbations applied probabilistically on selected system calls. We evaluated CHAMELEON with 113 applications and 100 malware samples for Linux. Our results showed that at threshold 10%, intrusive and non-intrusive strategies caused approximately 65% of malware to fail accomplishing their tasks, while approximately 30% of the analyzed benign software to meet with various levels of disruption. With a dynamic, per-system call threshold, CHAMELEON caused 92% of the malware to fail, and only 10% of the benign software to be disrupted. We also found that I/O-bound software was three times more affected by uncertainty than CPU-bound software. Further, we analyzed the logs of software crashed with non-intrusive strategies, and found that some crashes are due to the software bugs
    • ā€¦
    corecore