314 research outputs found

    Approximate Disassembly using Dynamic Programming

    Get PDF
    Most commercial anti-virus software uses signature based techniques to detect whether a file is infected by a virus or not. However, signature based detection systems are unable to detect metamorphic viruses, since such viruses change their internal structure from generation to generation. Previous work has shown that hidden Markov models (HMMs) can be used to detect metamorphic viruses. In this technique, the code is disassembled and the resulting opcode sequences are used for training and detection. Due to the disassembly step, this process is not efficient enough to use when a decision has to be made in real time. In this project, we explore whether dynamic programming can be used to speed up the process of disassembling, with minimal loss of accuracy. Dynamic programming is generally used to solve problems having two key attributes: optimal substructure and overlapping sub problems. During each iteration our algorithm reads part of the input stream from the executable file and determines assembly instructions, thus dividing problems into sub problems. We have created a score matrix representing digraphs of the most common opcode instructions and we have implanted a dynamic program based on this scoring matrix. For various file sizes, we determine the time taken by our dynamic program and we show that our approach is significantly faster than a standard disassembler (OllyDbg). Finally, we analyze the accuracy of our results

    Malicious cryptography techniques for unreversable (malicious or not) binaries

    Full text link
    Fighting against computer malware require a mandatory step of reverse engineering. As soon as the code has been disassemblied/decompiled (including a dynamic analysis step), there is a hope to understand what the malware actually does and to implement a detection mean. This also applies to protection of software whenever one wishes to analyze them. In this paper, we show how to amour code in such a way that reserse engineering techniques (static and dymanic) are absolutely impossible by combining malicious cryptography techniques developped in our laboratory and new types of programming (k-ary codes). Suitable encryption algorithms combined with new cryptanalytic approaches to ease the protection of (malicious or not) binaries, enable to provide both total code armouring and large scale polymorphic features at the same time. A simple 400 Kb of executable code enables to produce a binary code and around 21402^{140} mutated forms natively while going far beyond the old concept of decryptor.Comment: 17 pages, 2 figures, accepted for presentation at H2HC'1

    Precise static analysis of untrusted driver binaries

    Get PDF
    Most closed source drivers installed on desktop systems today have never been exposed to formal analysis. Without vendor support, the only way to make these often hastily written, yet critical programs accessible to static analysis is to directly work at the binary level. In this paper, we describe a full architecture to perform static analysis on binaries that does not rely on unsound external components such as disassemblers. To precisely calculate data and function pointers without any type information, we introduce Bounded Address Tracking, an abstract domain that is tailored towards machine code and is path sensitive up to a tunable bound assuring termination. We implemented Bounded Address Tracking in our binary analysis platform Jakstab and used it to verify API specifications on several Windows device drivers. Even without assumptions about executable layout and procedures as made by state of the art approaches, we achieve more precise results on a set of drivers from the Windows DDK. Since our technique does not require us to compile drivers ourselves, we also present results from analyzing over 300 closed source drivers

    Sans Signature Buffer Overflow Blocker

    Get PDF
    The objective of Sans Signature buffer overflow blocker mainly is to intercept communications between a server and client, analyse the contents for the presence of executable code and prevent the code reaching the server. In this project, Sans Signature is a signature free approach, which can identify and block new and unknown buffer overflow attacks. The system can intercept the data coming via various channels before the server receives the packets. Typically, the data exchanged with a standard application is the data related to the transaction. Therefore, the presence of executable code along with data is something unwarranted. This system will analyze the incoming of the data, check is it contains any executable code. If the executable code is found, the packet is dropped. If the data packet is found to be safe, it is allowed to pass through. The payload or data is analyzed at the application layer called Proxy based Sans Signature. The system has been designed to identify certain executable pattern that is considered harmful.  It also has a thresh hold limit beyond which, the packet will be considered to be discarded. Given the intelligence of the algorithm, it prevents most of the buffer overflow attacks. The system can handle the packet analysis in a transparent manner, thus making it suitable for deployment at Firewall/Application Gateway level. Hence, it is quite powerful and efficient with very low deployment and maintenance cost. Keywords: buffer overflow attacks, code-injection attacks, Defense-side obfuscatio

    The Effect of Code Obfuscation on Authorship Attribution of Binary Computer Files

    Get PDF
    In many forensic investigations, questions linger regarding the identity of the authors of the software specimen. Research has identified methods for the attribution of binary files that have not been obfuscated, but a significant percentage of malicious software has been obfuscated in an effort to hide both the details of its origin and its true intent. Little research has been done around analyzing obfuscated code for attribution. In part, the reason for this gap in the research is that deobfuscation of an unknown program is a challenging task. Further, the additional transformation of the executable file introduced by the obfuscator modifies or removes features from the original executable that would have been used in the author attribution process. Existing research has demonstrated good success in attributing the authorship of an executable file of unknown provenance using methods based on static analysis of the specimen file. With the addition of file obfuscation, static analysis of files becomes difficult, time consuming, and in some cases, may lead to inaccurate findings. This paper presents a novel process for authorship attribution using dynamic analysis methods. A software emulated system was fully instrumented to become a test harness for a specimen of unknown provenance, allowing for supervised control, monitoring, and trace data collection during execution. This trace data was used as input into a supervised machine learning algorithm trained to identify stylometric differences in the specimen under test and provide predictions on who wrote the specimen. The specimen files were also analyzed for authorship using static analysis methods to compare prediction accuracies with prediction accuracies gathered from this new, dynamic analysis based method. Experiments indicate that this new method can provide better accuracy of author attribution for files of unknown provenance, especially in the case where the specimen file has been obfuscated

    Detecting Malicious Software By Dynamicexecution

    Get PDF
    Traditional way to detect malicious software is based on signature matching. However, signature matching only detects known malicious software. In order to detect unknown malicious software, it is necessary to analyze the software for its impact on the system when the software is executed. In one approach, the software code can be statically analyzed for any malicious patterns. Another approach is to execute the program and determine the nature of the program dynamically. Since the execution of malicious code may have negative impact on the system, the code must be executed in a controlled environment. For that purpose, we have developed a sandbox to protect the system. Potential malicious behavior is intercepted by hooking Win32 system calls. Using the developed sandbox, we detect unknown virus using dynamic instruction sequences mining techniques. By collecting runtime instruction sequences in basic blocks, we extract instruction sequence patterns based on instruction associations. We build classification models with these patterns. By applying this classification model, we predict the nature of an unknown program. We compare our approach with several other approaches such as simple heuristics, NGram and static instruction sequences. We have also developed a method to identify a family of malicious software utilizing the system call trace. We construct a structural system call diagram from captured dynamic system call traces. We generate smart system call signature using profile hidden Markov model (PHMM) based on modularized system call block. Smart system call signature weakly identifies a family of malicious software

    A Framework for File Format Fuzzing with Genetic Algorithms

    Get PDF
    Secure software, meaning software free from vulnerabilities, is desirable in today\u27s marketplace. Consumers are beginning to value a product\u27s security posture as well as its functionality. Software development companies are recognizing this trend, and they are factoring security into their entire software development lifecycle. Secure development practices like threat modeling, static analysis, safe programming libraries, run-time protections, and software verification are being mandated during product development. Mandating these practices improves a product\u27s security posture before customer delivery, and these practices increase the difficulty of discovering and exploiting vulnerabilities. Since the 1980\u27s, security researchers have uncovered software defects by fuzz testing an application. In fuzz testing\u27s infancy, randomly generated data could discover multiple defects quickly. However, as software matures and software development companies integrate secure development practices into their development life cycles, fuzzers must apply more sophisticated techniques in order to retain their ability to uncover defects. Fuzz testing must evolve, and fuzz testing practitioners must devise new algorithms to exercise an application in unexpected ways. This dissertation\u27s objective is to create a proof-of-concept genetic algorithm fuzz testing framework to exercise an application\u27s file format parsing routines. The framework includes multiple genetic algorithm variations, provides a configuration scheme, and correlates data gathered from static and dynamic analysis to guide negative test case evolution. Experiments conducted for this dissertation illustrate the effectiveness of a genetic algorithm fuzzer in comparison to standard fuzz testing tools. The experiments showcase a genetic algorithm fuzzer\u27s ability to discover multiple unique defects within a limited number of negative test cases. These experiments also highlight an application\u27s increased execution time when fuzzing with a genetic algorithm. To combat increased execution time, a distributed architecture is implemented and additional experiments demonstrate a decrease in execution time comparable to standard fuzz testing tools. A final set of experiments provide guidance on fitness function selection with a CHC genetic algorithm fuzzer with different population size configurations
    • …
    corecore