101 research outputs found

    HyperDbg: Reinventing Hardware-Assisted Debugging (Extended Version)

    Software analysis, debugging, and reverse engineering have a crucial impact in today's software industry. Efficient and stealthy debuggers are especially relevant for malware analysis. However, existing debugging platforms fail to address a transparent, effective, and high-performance low-level debugger due to their detectable fingerprints, complexity, and implementation restrictions. In this paper, we present HyperDbg, a new hypervisor-assisted debugger for high-performance and stealthy debugging of user and kernel applications. To accomplish this, HyperDbg relies on state-of-the-art hardware features available in today's CPUs, such as VT-x and extended page tables. In contrast to other widely used existing debuggers, we design HyperDbg using a custom hypervisor, making it independent of OS functionality or API. We propose hardware-based instruction-level emulation and OS-level API hooking via extended page tables to increase the stealthiness. Our results of the dynamic analysis of 10,853 malware samples show that HyperDbg's stealthiness allows debugging on average 22% and 26% more samples than WinDbg and x64dbg, respectively. Moreover, in contrast to existing debuggers, HyperDbg is not detected by any of the 13 tested packers and protectors. We improve the performance over other debuggers by deploying a VMX-compatible script engine, eliminating unnecessary context switches. Our experiment on three concrete debugging scenarios shows that compared to WinDbg as the only kernel debugger, HyperDbg performs step-in, conditional breaks, and syscall recording, 2.98x, 1319x, and 2018x faster, respectively. We finally show real-world applications, such as a 0-day analysis, structure reconstruction for reverse engineering, software performance analysis, and code-coverage analysis.

    HyperDbg: Reinventing Hardware-Assisted Debugging

    Software analysis, debugging, and reverse engineering have a crucial impact in today's software industry. Efficient and stealthy debuggers are especially relevant for malware analysis. However, existing debugging platforms fail to address a transparent, effective, and high-performance low-level debugger due to their detectable fingerprints, complexity, and implementation restrictions. In this paper, we present HyperDbg, a new hypervisor-assisted debugger for high-performance and stealthy debugging of user and kernel applications. To accomplish this, HyperDbg relies on state-of-the-art hardware features available in today's CPUs, such as VT-x and extended page tables. In contrast to other widely used existing debuggers, we design HyperDbg using a custom hypervisor, making it independent of OS functionality or API. We propose hardware-based instruction-level emulation and OS-level API hooking via extended page tables to increase the stealthiness. Our results of the dynamic analysis of 10,853 malware samples show that HyperDbg's stealthiness allows debugging on average 22% and 26% more samples than WinDbg and x64dbg, respectively. Moreover, in contrast to existing debuggers, HyperDbg is not detected by any of the 13 tested packers and protectors. We improve the performance over other debuggers by deploying a VMX-compatible script engine, eliminating unnecessary context switches. Our experiment on three concrete debugging scenarios shows that compared to WinDbg as the only kernel debugger, HyperDbg performs step-in, conditional breaks, and syscall recording, 2.98x, 1319x, and 2018x faster, respectively. We finally show real-world applications, such as a 0-day analysis, structure reconstruction for reverse engineering, software performance analysis, and code-coverage analysis.

    Low-Impact System Performance Analysis Using Hardware Assisted Tracing Techniques

    Modern applications are becoming hard to diagnose using traditional debugging and profiling tools. For production systems, the first priority is to have minimal disturbance on the target application. To analyze the performance of such systems, tracing tools are imperative, since events can be logged and analyzed post-execution. One of the key requirements for tracing solutions, however, is low overhead. A generic solution can be to select only a few events to trace, but at the cost of trace granularity. In this thesis, we present our research work that deals with the problem of lack of high granularity in traces while maintaining a low overhead on target applications. We present our new techniques and algorithms that approach the problem initially from a software filtering and co-operative tracing approach, and then explore more advanced hardware tracing mechanisms that can be used. We have proposed an efficient kernel and userspace conditional tracing approach, with an enhanced native compiled filtering mechanism. 
Continuing towards our goal to have a detailed trace of a system, we further discuss how modern processors contain new hardware tracing blocks that have not yet been fully explored and exploited in the tracing domain. We characterize their performance and analyze the trace packets, their relation with software executions, and opportunities to utilize them for a detailed trace. We therefore propose low-overhead hardware-assisted techniques that allow a fine-grained, instruction-based interrupt and system call latency detection mechanism. We also present a new algorithm that shows how such low-level trace packets, coming directly from the processor, can be effectively utilized to analyze even the processes or resources consumed inside a VM. We have also identified and improved upon issues related to hardware tracing itself using software assistance from operating systems, thus laying the ground for further research in hardware-software co-operative tracing approaches. As our techniques are focused on the requirements of high-speed tracing in embedded or production systems catering to high-frequency transactions, we have found that our advancements in the hardware-software domain have proved to be invaluable in detecting resource contention and latency in systems.

    OS-level Attacks and Defenses: from Software to Hardware-based Exploits

    Run-time attacks have plagued computer systems for more than three decades, with control-flow hijacking attacks such as return-oriented programming representing the long-standing state-of-the-art in memory-corruption based exploits. These attacks exploit memory-corruption vulnerabilities in widely deployed software, e.g., through malicious inputs, to gain full control over the platform remotely at run time, and many defenses have been proposed and thoroughly studied in the past. Among those defenses, control-flow integrity emerged as a powerful and effective protection against code-reuse attacks in practice. As a result, we now start to see attackers shifting their focus towards novel techniques through a number of increasingly sophisticated attacks that combine software and hardware vulnerabilities to construct successful exploits. These emerging attacks have a high impact on computer security, since they completely bypass existing defenses that assume either hardware or software adversaries. For instance, they leverage physical effects to provoke hardware faults or force the system into transient micro-architectural states. This enables adversaries to exploit hardware vulnerabilities from software without requiring physical presence or software bugs. In this dissertation, we explore the real-world threat of hardware and software-based run-time attacks against operating systems. While memory-corruption-based exploits have been studied for more than three decades, we show that data-only attacks can completely bypass state-of-the-art defenses such as Control-Flow Integrity which are also deployed in practice. Additionally, hardware vulnerabilities such as Rowhammer, CLKScrew, and Meltdown enable sophisticated adversaries to exploit the system remotely at run time without requiring any memory-corruption vulnerabilities in the system’s software. 
We develop novel design strategies to defend the OS against hardware-based attacks such as Rowhammer and Meltdown to tackle the limitations of existing defenses. First, we present two novel data-only attacks that completely break current code-reuse defenses deployed in real-world software and propose a randomization-based defense against such data-only attacks in the kernel. Second, we introduce a compiler-based framework to automatically uncover memory-corruption vulnerabilities in real-world kernel code. Third, we demonstrate the threat of Rowhammer-based attacks in security-sensitive applications and how to enable a partitioning policy in the system’s physical memory allocator to effectively and efficiently defend against such attacks. We demonstrate feasibility and real-world performance through our prototype for the popular and widely used Linux kernel. Finally, we develop a side-channel defense to eliminate Meltdown-style cache attacks by strictly isolating the address space of kernel and user memory.

    ISA Semantics for ARMv8-A, RISC-V, and CHERI-MIPS

    Architecture specifications notionally define the fundamental interface between hardware and software: the envelope of allowed behaviour for processor implementations, and the basic assumptions for software development and verification. But in practice, they are typically prose and pseudocode documents, not rigorous or executable artifacts, leaving software and verification on shaky ground. In this paper, we present rigorous semantic models for the sequential behaviour of large parts of the mainstream ARMv8-A, RISC-V, and MIPS architectures, and the research CHERI-MIPS architecture, that are complete enough to boot operating systems, variously Linux, FreeBSD, or seL4. Our ARMv8-A models are automatically translated from authoritative ARM-internal definitions, and (in one variant) tested against the ARM Architecture Validation Suite. We do this using a custom language for ISA semantics, Sail, with a lightweight dependent type system, that supports automatic generation of emulator code in C and OCaml, and automatic generation of proof-assistant definitions for Isabelle, HOL4, and (currently only for MIPS) Coq. We use the former for validation, and to assess specification coverage. To demonstrate the usability of the latter, we prove (in Isabelle) correctness of a purely functional characterisation of ARMv8-A address translation. We moreover integrate the RISC-V model into the RMEM tool for (user-mode) relaxed-memory concurrency exploration. We prove (on paper) the soundness of the core Sail type system. We thereby take a big step towards making the architectural abstraction actually well-defined, establishing foundations for verification and reasoning.

    TEDDI: Tamper Event Detection on Distributed Cyber-Physical Systems

    Edge devices, or embedded devices installed along the periphery of a power grid SCADA network, pose a significant threat to the grid, as they give attackers a convenient entry point to access and cause damage to other essential equipment in substations and control centers. Grid defenders would like to protect these edge devices from being accessed and tampered with, but they are hindered by the grid defender's dilemma; more specifically, the range and nature of tamper events faced by the grid (particularly distributed events), the prioritization of grid availability, the high costs of improper responses, and the resource constraints of both grid networks and the defenders that run them make prior work in the tamper and intrusion protection fields infeasible to apply. In this thesis, we give a detailed description of the grid defender's dilemma, and introduce TEDDI (Tamper Event Detection on Distributed Infrastructure), a distributed, sensor-based tamper protection system built to solve this dilemma. TEDDI's distributed architecture and use of a factor graph fusion algorithm give grid defenders the power to detect and differentiate between tamper events, and also give defenders the flexibility to tailor specific responses for each event. We also propose the TEDDI Generation Tool, which allows us to capture the defender's intuition about tamper events, and assists defenders in constructing a custom TEDDI system for their network. To evaluate TEDDI, we collected and constructed twelve different tamper scenarios, and show how TEDDI can detect all of these events and solve the grid defender's dilemma. In our experiments, TEDDI demonstrated an event detection accuracy level of over 99% at both the information and decision point levels, and could process a 99-node factor graph in under 233 microseconds. 
We also analyzed the time and resources needed to use TEDDI, and show how it requires less up-front configuration effort than current tamper protection solutions.

    Identifying Code Injection and Reuse Payloads In Memory Error Exploits

    Today's most widely exploited applications are the web browsers and document readers we use every day. The immediate goal of these attacks is to compromise target systems by executing a snippet of malicious code in the context of the exploited application. Technical tactics used to achieve this can be classified as either code injection - wherein malicious instructions are directly injected into the vulnerable program - or code reuse, where bits of existing program code are pieced together to form malicious logic. In this thesis, I present a new code reuse strategy that bypasses existing and up-and-coming mitigations, and two methods for detecting attacks by identifying the presence of code injection or reuse payloads. Fine-grained address space layout randomization efficiently scrambles program code, limiting one's ability to predict the location of useful instructions to construct a code reuse payload. To expose the inadequacy of this exploit mitigation, a technique for "just-in-time" exploitation is developed. This new technique maps memory on-the-fly and compiles a code reuse payload at runtime to ensure it works in a randomized application. The attack also works in face of all other widely deployed mitigations, as demonstrated with a proof-of-concept attack against Internet Explorer 10 in Windows 8. This motivates the need for detection of such exploits rather than solely relying on prevention. Two new techniques are presented for detecting attacks by identifying the presence of a payload. Code reuse payloads are identified by first taking a memory snapshot of the target application, then statically profiling the memory for chains of code pointers that reuse code to implement malicious logic. Code injection payloads are identified with runtime heuristics by leveraging hardware virtualization for efficient sandboxed execution of all buffers in memory. 
Employing both detection methods together to scan program memory takes about a second and produces negligible false positives and false negatives, provided that the given exploit is functional and triggered in the target application version. Compared to other strategies, such as the use of signatures, this approach requires relatively little effort spent on maintenance over time and is capable of detecting never-before-seen attacks. Moving forward, one could use these contributions to form the basis of a unique and effective network intrusion detection system (NIDS) to augment existing systems.