3,716 research outputs found

    On the Dissection of Evasive Malware

    Get PDF
    Complex malware samples feature measures to impede automatic and manual analyses, making their investigation cumbersome. While automatic characterization of malware benefits from recently proposed designs for passive monitoring, the subsequent dissection process still sees human analysts struggling with adversarial behaviors, many of which also closely resemble those studied for automatic systems. This gap affects the day-to-day analysis of complex samples and researchers have not yet attempted to bridge it. We make a first step down this road by proposing a design that can reconcile transparency requirements with manipulation capabilities required for dissection. Our open-source prototype BluePill (i) offers a customizable execution environment that remains stealthy when analysts intervene to alter instructions and data or run third-party tools, (ii) is extensible to counteract newly encountered anti-analysis measures using insights from the dissection, and (iii) can accommodate program analyses that aid analysts, as we explore for taint analysis. On a set of highly evasive samples BluePill resulted as stealthy as commercial sandboxes while offering new intervention and customization capabilities for dissection

    Robust and secure monitoring and attribution of malicious behaviors

    Get PDF
    Worldwide computer systems continue to execute malicious software that degrades the systemsâ performance and consumes network capacity by generating high volumes of unwanted traffic. Network-based detectors can effectively identify machines participating in the ongoing attacks by monitoring the traffic to and from the systems. But, network detection alone is not enough; it does not improve the operation of the Internet or the health of other machines connected to the network. We must identify malicious code running on infected systems, participating in global attack networks. This dissertation describes a robust and secure approach that identifies malware present on infected systems based on its undesirable use of network. Our approach, using virtualization, attributes malicious traffic to host-level processes responsible for the traffic. The attribution identifies on-host processes, but malware instances often exhibit parasitic behaviors to subvert the execution of benign processes. We then augment the attribution software with a host-level monitor that detects parasitic behaviors occurring at the user- and kernel-level. User-level parasitic attack detection happens via the system-call interface because it is a non-bypassable interface for user-level processes. Due to the unavailability of one such interface inside the kernel for drivers, we create a new driver monitoring interface inside the kernel to detect parasitic attacks occurring through this interface. Our attribution software relies on a guest kernelâ s data to identify on-host processes. To allow secure attribution, we prevent illegal modifications of critical kernel data from kernel-level malware. Together, our contributions produce a unified research outcome --an improved malicious code identification system for user- and kernel-level malware.Ph.D.Committee Chair: Giffin, Jonathon; Committee Member: Ahamad, Mustaque; Committee Member: Blough, Douglas; Committee Member: Lee, Wenke; Committee Member: Traynor, Patric

    THE SCALABLE AND ACCOUNTABLE BINARY CODE SEARCH AND ITS APPLICATIONS

    Get PDF
    The past decade has been witnessing an explosion of various applications and devices. This big-data era challenges the existing security technologies: new analysis techniques should be scalable to handle “big data” scale codebase; They should be become smart and proactive by using the data to understand what the vulnerable points are and where they locate; effective protection will be provided for dissemination and analysis of the data involving sensitive information on an unprecedented scale. In this dissertation, I argue that the code search techniques can boost existing security analysis techniques (vulnerability identification and memory analysis) in terms of scalability and accuracy. In order to demonstrate its benefits, I address two issues of code search by using the code analysis: scalability and accountability. I further demonstrate the benefit of code search by applying it for the scalable vulnerability identification [57] and the cross-version memory analysis problems [55, 56]. Firstly, I address the scalability problem of code search by learning “higher-level” semantic features from code [57]. Instead of conducting fine-grained testing on a single device or program, it becomes much more crucial to achieve the quick vulnerability scanning in devices or programs at a “big data” scale. However, discovering vulnerabilities in “big code” is like finding a needle in the haystack, even when dealing with known vulnerabilities. This new challenge demands a scalable code search approach. To this end, I leverage successful techniques from the image search in computer vision community and propose a novel code encoding method for scalable vulnerability search in binary code. The evaluation results show that this approach can achieve comparable or even better accuracy and efficiency than the baseline techniques. Secondly, I tackle the accountability issues left in the vulnerability searching problem by designing vulnerability-oriented raw features [58]. The similar code does not always represent the similar vulnerability, so it requires that the feature engineering for the code search should focus on semantic level features rather than syntactic ones. I propose to extract conditional formulas as higher-level semantic features from the raw binary code to conduct the code search. A conditional formula explicitly captures two cardinal factors of a vulnerability: 1) erroneous data dependencies and 2) missing or invalid condition checks. As a result, the binary code search on conditional formulas produces significantly higher accuracy and provides meaningful evidence for human analysts to further examine the search results. The evaluation results show that this approach can further improve the search accuracy of existing bug search techniques with very reasonable performance overhead. Finally, I demonstrate the potential of the code search technique in the memory analysis field, and apply it to address their across-version issue in the memory forensic problem [55, 56]. The memory analysis techniques for COTS software usually rely on the so-called “data structure profiles” for their binaries. Construction of such profiles requires the expert knowledge about the internal working of a specified software version. However, it is still a cumbersome manual effort most of time. I propose to leverage the code search technique to enable a notion named “cross-version memory analysis”, which can update a profile for new versions of a software by transferring the knowledge from the model that has already been trained on its old version. The evaluation results show that the code search based approach advances the existing memory analysis methods by reducing the manual efforts while maintaining the reasonable accuracy. With the help of collaborators, I further developed two plugins to the Volatility memory forensic framework [2], and show that each of the two plugins can construct a localized profile to perform specified memory forensic tasks on the same memory dump, without the need of manual effort in creating the corresponding profile
    • …
    corecore