7 research outputs found

    Scanning native binaries to resolve unsoundness in static analysis of mixed Java-native code

    Many real-world Java applications contain native code written in C and/or C++, which interacts with the Java code. By analyzing the native files, it is possible to estimate how Java code is called by native code and, furthermore, to resolve the unsoundness in static analyses of such applications. We present an analysis that finds Java method calls in native code by scanning disassembled binary files of native code. The main challenge in analyzing these binary files is the difficulty of finding function boundaries in order to determine which native function calls which Java methods of the same application. The analysis was written in Java and Datalog and is based on the Doop framework. Specifically, the implementation of a Java utility for scanning binary files, combined with dedicated analysis logic in Datalog, demonstrates Doop's capabilities in creating concise and expressive static analyses.
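As a rough illustration of the scanning idea described in this abstract, the sketch below pulls printable strings out of a binary blob and keeps those shaped like JVM method descriptors. It is not the implementation from the paper: the function names `printable_strings` and `find_method_descriptors`, the 4-character minimum run length, and the sample bytes are all assumptions made for the example, and it does not attempt the function-boundary recovery that the abstract identifies as the main challenge.

```python
import re

# Runs of at least 4 printable ASCII bytes (the threshold is an assumption).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")

# JVM field type: optional array dimensions, then a primitive or an object type.
FIELD_TYPE = r"\[*(?:[BCDFIJSZ]|L[\w/$]+;)"
# Method descriptor: "(" parameter types ")" return type (V means void).
METHOD_DESCRIPTOR = re.compile(rf"\((?:{FIELD_TYPE})*\)(?:V|{FIELD_TYPE})")


def printable_strings(data: bytes) -> list[str]:
    """Return every run of printable ASCII found in the raw bytes."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(data)]


def find_method_descriptors(data: bytes) -> list[str]:
    """Keep only the strings that are well-formed JVM method descriptors."""
    return [s for s in printable_strings(data) if METHOD_DESCRIPTOR.fullmatch(s)]


# Hypothetical fragment of a native library's string table.
sample = (
    b"\x00\x01onCreate\x00(Landroid/os/Bundle;)V\x00"
    b"junk\x00(I[Ljava/lang/String;)Z\x00"
)
descriptors = find_method_descriptors(sample)
```

A descriptor found this way only hints that the native code may look up a Java method with that signature; tying it to a specific native function would still require the boundary analysis the abstract describes.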

    BinGold: Towards robust binary analysis by extracting the semantics of binary code as semantic flow graphs (SFGs)

    Binary analysis is useful in many practical applications, such as the detection of malware or vulnerable software components. However, our survey of the literature shows that most existing binary analysis tools and frameworks rely on assumptions about specific compilers and compilation settings. It is well known that techniques such as refactoring and light obfuscation can significantly alter the structure of code, even for simple programs. Applying such techniques or changing the compiler and compilation settings can significantly affect the accuracy of available binary analysis tools, which severely limits their practicability, especially when applied to malware. To address these issues, we propose a novel technique that extracts the semantics of binary code in terms of both data and control flow. Our technique allows more robust binary analysis because the extracted semantics of the binary code is generally immune from light obfuscation, refactoring, and varying the compilers or compilation settings. Specifically, we apply data-flow analysis to extract the semantic flow of the registers as well as the semantic components of the control flow graph, which are then synthesized into a novel representation called the semantic flow graph (SFG). Subsequently, various properties, such as reflexive, symmetric, antisymmetric, and transitive relations, are extracted from the SFG and applied to binary analysis. We implement our system in a tool called BinGold and evaluate it against thirty binary code applications. Our evaluation shows that BinGold successfully determines the similarity between binaries, yielding results that are highly robust against light obfuscation and refactoring. In addition, we demonstrate the application of BinGold to two important binary analysis tasks: binary code authorship attribution, and the detection of clone components across program executables. The promising results suggest that BinGold can be used to enhance existing techniques, making them more robust and practical.

    A Novel Malware Target Recognition Architecture for Enhanced Cyberspace Situation Awareness

    The rapid transition of critical business processes to computer networks potentially exposes organizations to digital theft or corruption by advanced competitors. One tool used for these tasks is malware, because it circumvents legitimate authentication mechanisms. Malware is an epidemic problem for organizations of all types. This research proposes and evaluates a novel Malware Target Recognition (MaTR) architecture for malware detection and identification of propagation methods and payloads to enhance situation awareness in tactical scenarios using non-instruction-based, static heuristic features. MaTR achieves a 99.92% detection accuracy on known malware with false positive and false negative rates of 8.73e-4 and 8.03e-4 respectively. MaTR outperforms leading static heuristic methods with a statistically significant 1% improvement in detection accuracy and 85% and 94% reductions in false positive and false negative rates respectively. Against a set of publicly unknown malware, MaTR detection accuracy is 98.56%, a 65% performance improvement over the combined effectiveness of three commercial antivirus products

    Automatic Detection and Repair of Input Validation and Sanitization Bugs

    A crucial problem in developing dependable web applications is the correctness of the input validation and sanitization. Bugs in string manipulation operations used for validation and sanitization are common, resulting in erroneous application behavior and vulnerabilities that are exploitable by malicious users. In this dissertation, we investigate the problem of automatic detection and repair of validation and sanitization bugs both at the client-side (JavaScript) and the server-side (PHP or Java) code. We first present a formal model for input validation and sanitization functions along with a new domain specific intermediate language to represent them. Then, we show how to extract input validation and sanitization functions in our intermediate language from both client and server-side code in web applications. After the extraction phase, we use automata-based static string-analysis techniques to automatically verify and fix the extracted functions. One of our contributions is the development of efficient automata-based string analysis techniques for frequently used, complex string operations. We developed two basic approaches to bug detection and repair: 1) policy-based, and 2) differential. In the policy-based approach, input validation and sanitization policies are expressed using two regular expressions, one specifying the maximum policy (the upper bound for the set of strings that should be allowed) and the other specifying the minimum policy (the lower bound for the set of strings that should be allowed). Using our string analysis techniques we can identify two types of errors in an input validation and sanitization function: 1) it accepts a set of strings that is not permitted by the maximum policy (i.e., it is under-constrained), or 2) it rejects a set of strings that is permitted by the minimum policy (i.e., it is over-constrained). Our differential bug detection and repair approach does not require any policy specifications. It exploits the fact that, in web applications, developers typically perform redundant input validation and sanitization in both the client and the server-side since client-side checks can be by-passed. Using automata-based string analysis, we compare the input validation and sanitization functions extracted from the client- and server-side code, and identify and report the inconsistencies between them. Finally, we present an automated differential repair technique that can repair client and server-side code with respect to each other, or across applications in order to strengthen the validation and sanitization checks. Given a reference and a target function, our differential repair technique strengthens the validation and sanitization operations in the target function based on the reference function by automatically generating a set of patches. We experimented with a number of real world web applications and found many bugs and vulnerabilities. Our analysis generates counter-example behaviors demonstrating the detected bugs and vulnerabilities to help the developers with the debugging process. Moreover, we automatically generate patches that can be used to mitigate the detected bugs and vulnerabilities until developers write their own patches.
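The two error classes in the policy-based approach can be made concrete with a small sketch. Note the swapped-in technique: the dissertation describes automata-based static string analysis, whereas the code below only tests a finite list of sample inputs, so it can expose a violation but never prove its absence. The validator, both policies, and the sample strings are hypothetical and chosen purely for illustration.

```python
import re


def check_policies(validator, max_policy, min_policy, samples):
    """Split sample inputs into policy violations of the given validator.

    under: accepted by the validator but not permitted by the maximum policy.
    over:  rejected by the validator although permitted by the minimum policy.
    """
    max_re = re.compile(max_policy)
    min_re = re.compile(min_policy)
    under, over = [], []
    for s in samples:
        accepted = validator(s)
        if accepted and not max_re.fullmatch(s):
            under.append(s)  # under-constrained: lets a forbidden string through
        if not accepted and min_re.fullmatch(s):
            over.append(s)   # over-constrained: blocks a string that must pass
    return under, over


# Hypothetical buggy validator: allows "<" and requires at least 4 characters.
def validate_username(s):
    return re.fullmatch(r"[A-Za-z0-9<]{4,8}", s) is not None


MAX_POLICY = r"[A-Za-z0-9_]{1,16}"  # assumed upper bound on allowed strings
MIN_POLICY = r"[a-z]{3,8}"          # assumed lower bound on allowed strings

samples = ["bob", "alice", "a<b>c", "ab<cd", "x"]
under, over = check_policies(validate_username, MAX_POLICY, MIN_POLICY, samples)
```

Here `ab<cd` is accepted even though the maximum policy forbids `<`, and `bob` is rejected even though the minimum policy requires it to be allowed, so the same validator is both under-constrained and over-constrained.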

    String analysis for x86 binaries

    No full text
