Search CORE

86 research outputs found

Recommended from our members

Improving Security Through Egalitarian Binary Recompilation

Author: Williams-King David Christopher
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2021
Field of study

In this thesis, we try to bridge the gap between which program transformations are possible at source-level and which are possible at binary-level. While binaries are typically seen as opaque artifacts, our binary recompiler Egalito (ASPLOS 2020) enables users to parse and modify stripped binaries on existing systems. Our technique of binary recompilation is not robust to errors in disassembly, but with an accurate analysis, provides near-zero transformation overhead. We wrote several demonstration security tools with Egalito, including code randomization, control-flow integrity, retpoline insertion, and a fuzzing backend. We also wrote Nibbler (ACSAC 2019, DTRAP 2020), which detects unused code and removes it. Many of these features, including Nibbler, can be combined with other defenses resulting in multiplicatively stronger or more effective hardening. Enabled by our recompiler, an overriding theme of this thesis is our focus on deployable software transformation. Egalito has been tested by collaborators across tens of thousands of Debian programs and libraries. We coined this term egalitarian in the context of binary security. Simply put, an egalitarian analysis or security mechanism is one that can operate on itself (and is usually more deployable as a result). As one demonstration of this idea, we created a strong, deployable defense against code reuse attacks. Shuffler (OSDI 2016) randomizes function addresses, moving functions periodically every few milliseconds. This makes an attacker's job extremely difficult, especially if they are located across a network (which necessitates ping time) -- JIT-ROP attacks take 2.3 to 378 seconds to complete. Shuffler is egalitarian and defends its own code and target code simultaneously; Shuffler actually shuffles itself. We hope our deployable, egalitarian binary defenses will allow others to improve upon state-of-the-art and paint binaries as far more malleable than they have been in the past

Columbia University Academic Commons

Mitigating security and privacy threats from untrusted application components on Android

Author: Huang Jie
Publication venue: Saarländische Universitäts- und Landesbibliothek
Publication date: 01/01/2021
Field of study

Aufgrund von Androids datenzentrierter und Open-Source Natur sowie von fehlerhaften/bösartigen Apps durch das lockere Marktzulassungsverfahren, ist die Privatsphäre von Benutzern besonders gefährdet. Diese Dissertation präsentiert eine Reihe von Forschungsarbeiten, die die Bedrohung der Sicherheit/Privatsphäre durch nicht vertrauenswürdige Appkomponenten mindern. Die erste Arbeit stellt eine Compiler-basierte Kompartmentalisierungslösung vor, die Privilegientrennung nutzt, um eine starke Barriere zwischen der Host-App und Bibliothekskomponenten zu etablieren, und somit sensible Daten vor der Kompromittierung durch neugierige/bösartige Werbe-Bibliotheken schützt. Für fehleranfällige Bibliotheken von Drittanbietern implementieren wir in der zweiten Arbeit ein auf API-Kompatibilität basierendes Bibliothek-Update-Framework, das veraltete Bibliotheken durch Drop-Ins aktualisiert, um das durch Bibliotheken verursachte Zeitfenster der Verwundbarkeit zu minimieren. Die neueste Arbeit untersucht die missbräuchliche Nutzung von privilegierten Accessibility(a11y)-Funktionen in bösartigen Apps. Wir zeigen ein datenschutzfreundliches a11y-Framework, das die a11y-Logik wie eine Pipeline behandelt, die aus mehreren Modulen besteht, die in verschiedenen Sandboxen laufen. Weiterhin erzwingen wir eine Flusskontrolle über die Kommunikation zwischen den Modulen, wodurch die Angriffsfläche für den Missbrauch von a11y-APIs verringert wird, während die Vorteile von a11y erhalten bleiben.While Android’s data-intensive and open-source nature, combined with its less-than-strict market approval process, has allowed the installation of flawed and even malicious apps, its coarse-grained security model and update bottleneck in the app ecosystem make the platform’s privacy and security situation more worrying. This dissertation introduces a line of works that mitigate privacy and security threats from untrusted app components. The first work presents a compiler-based library compartmentalization solution that utilizes privilege separation to establish a strong trustworthy boundary between the host app and untrusted lib components, thus protecting sensitive user data from being compromised by curious or malicious ad libraries. While for vulnerable third-party libraries, we then build the second work that implements an API-compatibility-based library update framework using drop-in replacements of outdated libraries to minimize the open vulnerability window caused by libraries and we perform multiple dynamic tests and case studies to investigate its feasibility. Our latest work focuses on the misusing of powerful accessibility (a11y) features in untrusted apps. We present a privacy-enhanced a11y framework that treats the a11y logic as a pipeline composed of multiple modules running in different sandboxes. We further enforce flow control over the communication between modules, thus reducing the attack surface from abusing a11y APIs while preserving the a11y benefits

Universaar

Acronym

Opaque Control-Flow Integrity

Author
Publication venue: 'Internet Society'
Publication date: 01/01/2015
Field of study

Crossref

Opaque control-flow integrity.

Author: Kevin W Hamlen
Michael Franz
Per Larsen † Stefan Brunthaler
Vishwath Mohan
Publication venue
Publication date: 01/01/2015
Field of study

Abstract-A new binary software randomization and ControlFlow Integrity (CFI) enforcement system is presented, which is the first to efficiently resist code-reuse attacks launched by informed adversaries who possess full knowledge of the inmemory code layout of victim programs. The defense mitigates a recent wave of implementation disclosure attacks, by which adversaries can exfiltrate in-memory code details in order to prepare code-reuse attacks (e.g., Return-Oriented Programming (ROP) attacks) that bypass fine-grained randomization defenses. Such implementation-aware attacks defeat traditional fine-grained randomization by undermining its assumption that the randomized locations of abusable code gadgets remain secret. Opaque CFI (O-CFI) overcomes this weakness through a novel combination of fine-grained code-randomization and coarsegrained control-flow integrity checking. It conceals the graph of hijackable control-flow edges even from attackers who can view the complete stack, heap, and binary code of the victim process. For maximal efficiency, the integrity checks are implemented using instructions that will soon be hardware-accelerated on commodity x86-x64 processors. The approach is highly practical since it does not require a modified compiler and can protect legacy binaries without access to source code. Experiments using our fully functional prototype implementation show that O-CFI provides significant probabilistic protection against ROP attacks launched by adversaries with complete code layout knowledge, and exhibits only 4.7% mean performance overhead on current hardware (with further overhead reductions to follow on forthcoming Intel processors). I. MOTIVATION Code-reuse attacks (cf., Permission to freely reproduce all or part of this paper for noncommercial purposes is granted provided that copies bear this notice and the full citation on the first page. Reproduction for commercial purposes is strictly prohibited without the prior written consent of the Internet Society, the first-named author (for reproduction of an entire paper only), and the author's employer if the paper was prepared within the scope of employment. This has motivated copious work on defenses against codereuse threats. Prior defenses can generally be categorized into: CFI [1] and artificial software diversity CFI restricts all of a program's runtime control-flows to a graph of whitelisted control-flow edges. Usually the graph is derived from the semantics of the program source code or a conservative disassembly of its binary code. As a result, CFIprotected programs reject control-flow hijacks that attempt to traverse edges not supported by the original program's semantics. Fine-grained CFI monitors indirect control-flows precisely; for example, function callees must return to their exact callers. Although such precision provides the highest security, it also tends to incur high performance overheads (e.g., 21% for precise caller-callee return-matching [1]). Because this overhead is often too high for industry adoption, researchers have proposed many optimized, coarser-grained variants of CFI. Coarse-grained CFI trades some security for better performance by reducing the precision of the checks. For example, functions must return to valid call sites (but not necessarily to the particular site that invoked the callee). Unfortunately, such relaxations have proved dangerous-a number of recent proof-of-concept exploits have shown how even minor relaxations of the control-flow policy can be exploited to effect attacks Artificial software diversity offers a different but complementary approach that randomizes programs in such a way that attacks succeeding against one program instance have a very low probability of success against other (independently randomized) instances of the same program. Probabilistic defenses rely on memory secrecy-i.e., the effects of randomization must remain hidden from attackers. One of the simplest and most widely adopted forms of artificial diversity is Address Space Layout Randomization (ASLR), which randomizes the base addresses of program segments at loadtime. Unfortunately, merely randomizing the base addresses does not yield sufficient entropy to preserve memory secrecy in many cases; there are numerous successful derandomization attacks against ASLR Recently, a new wave of implementation disclosure attacks Experiments show that O-CFI enjoys performance overheads comparable to standard fine-grained diversity and non-opaque, coarse-grained CFI. Moreover, O-CFI's control-flow checking logic is implemented using Intel x86/x64 memory-protection extensions (MPX) that are expected to be hardware-accelerated in commodity CPUs from 2015 onwards. We therefore expect even better performance for O-CFI in the near future. Our contributions are as follows: • We introduce O-CFI, the first low-overhead code-reuse defense that tolerates implementation disclosures. • We describe our implementation of a fully functional prototype that protects stripped, x86 legacy binaries without source code. II. THREAT MODEL Our work is motivated by the emergence of attacks against fine-grained diversity and coarse-grained control-flow integrity. We therefore introduce these attacks and distill them into a single, unified threat model. A. Bypassing Coarse-Grained CFI Ideally, CFI permits only programmer-intended control-flow transfers during a program's execution. The typical approach is to assign a unique ID to each permissible indirect controlflow target, and check the IDs at runtime. Unfortunately, this introduces performance overhead proportional to the degree of the graph-the more overlaps between valid target sets of indirect branch instructions, the more IDs must be stored and checked at each branch. Moreover, perfect CFI cannot be realized with a purely static control-flow graph; for example, the permissible destinations of function returns depend on the calling context, which is only known at runtime. Fine-grained CFI therefore implements a dynamically computed shadow stack, incurring high overheads To avoid this, coarse-grained CFI implementations resort to a reduced-degree, static approximation of the control-flow graph, and merge identifiers at the cost of reduced security. Typically, attackers need more than a single 4K page worth of code to find enough gadgets to mount a code-reuse attack. To discourage brute-force searches for more code pages, artificial diversity defenses routinely mine the address space with unmapped pages that abort the process if accessed B. Assumptions Given these sobering realities, we adopt a conservative threat model that assumes that attackers will eventually find and disassemble all code pages in victim processes. Our threat model therefore assumes that the adversary knows the complete in-memory code layout-including the locations of any gadgets required to launch a ROP attack. We also assume that the attacker can read and write the full contents of the heap and stack, as well as any data structures used by the dynamic loader. In keeping with common practice, we assume that data execution protection is activated, so that code page permissions can be maintained as either writable or executable but not both. However, we assume that attackers cannot safely perform a comprehensive, linear scan of virtual memory, since defenders may place unmapped guard pages at random locations. Instead, attackers must follow references from one disclosed memory page to another III. O-CFI OVERVIEW O-CFI combines insights from CFI and automated software diversity. It extends CFI with a new, coarse-grained CFI enforcement strategy inspired by bounds-checking, that validates control-flow transfers without divulging the bounds against which their destinations are checked. Bounds-checking is fast, the bounds are easier to conceal than arbitrary gadget locations, and the bounds are randomizable. This imbues CFI and fine-grained software diversity with an additional layer of protection against code-reuse attacks aided by implementation disclosures. As a result, O-CFI enjoys performance similar to coarse-grained CFI, with probabilistic security guarantees similar to fine-grained artificial diversity in the absence of implementation disclosures. Following traditional CFI, an O-CFI policy assigns to each indirect branch site a destination set that captures its set of permissible destination addresses. Such a graph can be derived from the program's source code or (with lesser precision) a conservative disassembly of its object code. We next reformulate this policy as a bounds-checking problem by reducing each destination set to only its minimal and maximal members. This policy approximation can be efficiently enforced by confining each branch to the memory-aligned addresses within its destination set range. All intended destination addresses are aligned within these bounds, so the enforcement conservatively preserves intended control-flows. Code layout is optimized to tighten the bounds, so that the set of unintended, aligned destinations within the bounds remains minimal. These few remaining unintended but reachable destinations are protected by the artificial diversity half of our approach. Our artificial diversity approach probabilistically protects the aligned, in-bounds, but policy-violating control-flows by applying fine-grained randomization to the binary code at load-time. While the overall strategy for implementing this randomization step is based on prior works Reformulating CFI in this way forces attackers to change their plan of attack. The recent attacks against coarse-grained CFI succeed by finding exploitable code that is reachable due to policy-relaxations needed for acceptable performance. These relaxations admit an alarming array of false-positives: instead of identifying the actual caller, all call-preceded instructions are incorrectly identified as permitted branch destinations. Such instructions saturate a typical address space, giving attackers too much wiggle room to build attacks. O-CFI counters this by changing the approximation approach: each branch destination is restricted to a relatively short span of aligned addresses, with all the bounds chosen pseudo-randomly at load-time. This greatly narrows the field of possible hijacks, and it removes the opportunity for attackers to analyze programs ahead of time for viable ROP gadget chains. In O-CFI, no two program instances admit the same set of ROP payloads, since the bounds are all randomized every time the program is loaded. Since the security of coarse-grained CFI depends in part on the precision of its policy approximation, it is worthwhile to improve the precision by tightening the bounds imposed upon each branch. This effectively reduces the space of attacker guesses that might succeed in hijacking any given branch. To reduce this space as much as possible, we introduce a novel binary code optimization, called portals, that minimizes the distance covered by the lowest and greatest element of each indirect branch's destination set. Our fine-grained artificial diversity implementation is an adaptation and extension of binary stirring To protect against information leaks that might disclose bounds information, our implementation is carefully designed to keep all bounds opaque to external threats. They are randomly chosen at load-time (as a side-effect of binary stirring) and stored in a bounds lookup table (BLT) located at a randomly chosen base address. The table size is very small relative to the virtual address space, and attackers cannot safely perform bruteforce scans of the full address space (see §II-B), so guessing the BLT's location is probabilistically infeasible for attackers. No code or data sections contain any pointer references to BLT addresses; all references are computed dynamically at load-time and stored henceforth exclusively in protected registers. A. Bounding the Control Flow For each indirect branch site with (non-empty) destination set D, O-CFI guards the branch instruction with a bounds-check that continues execution only if the impending target t satisfies t ∈ [min D, max D]. Indirect branch instructions include all control-flow transfer instructions that target computed destinations, including return instructions. Failure of the boundscheck solicits immediate process termination with an error code (for easier debugging). Termination could be replaced with a different intervention if desired, such as an automated attack analysis or alarm, followed by restart and re-randomization. The bounds-check implementation first loads the pair (min D, max D) from the BLT into registers via an indirect, indexed memory reference. The load instruction's arguments and syntax are independent of the BLT's location, concealing its address from attackers who can read the checking code. The impending branch target t is then checked against the loaded bounds. If the check succeeds, execution continues; otherwise the process immediately terminates with a bounds range (#BR) exception. The #BR exception helps distinguish between crashes and guessing attacks. To resist guessing attacks (e.g., BROP), web servers and other services should use this exception to trigger re-randomization as they restart. Following the approaches of PittSFIeld To bypass these checks, an attacker must craft a payload whose every gadget is properly aligned and falls within the bounds of the preceding gadget's conclusory indirect branch. The odds of guessing a reachable series of such gadgets decrease exponentially with the number of gadgets in the desired payload. B. Opacifying Control-flow Bounds Diversifying bounds. The bounds introduced by O-CFI constitute a coarse-grained CFI policy. Section II warns that such coarse granularity can lead to vulnerabilities. However, to exploit such vulnerabilities, attackers must discover which control-flows adhere to the CFI policy and which do not. To make the impermissible flows opaque to attackers, we use diversity. Our prototype uses a modified version of the technique outlined by Wartell et al. Performing fine-grain code randomization at load-time indirectly randomizes the ranges used to bound the control-flow. In contrast to other CFI techniques, attackers therefore do not have a priori knowledge of the control-flow bounds. Preventing Information Leaks. Attackers bypass fine grained diversity using information leaks, such as those described in §II-A. Were O-CFI's control-flow bounds expressed as constants in the instruction stream, attackers could bypass our defense via information leaks. To avoid this, we instead confine this sensitive information to an isolated data page, the BLT. The BLT is initialized at a random virtual memory address at load-time, and there are no pointer references (obfuscated or otherwise) to any BLT address in any code or data page in the process. This keeps its location hidden from attackers. Furthermore, we take additional steps to prevent accidental BLT disclosure via pointer leaks. Our prototype stores BLT base addresses in segment selectors-a legacy feature of all x86 processors. In particular, each load from the BLT uses the gs segment selector and a unique index to read the correct bounds. We only use the gs selector for instructions that implement bounds checks, so there are no other instructions that adversaries can reuse to learn its value. Attackers are also prevented from executing instructions that reveal the contents of the segment registers, since such instructions are privileged. To succeed, attackers must therefore (i) guess branch ranges, or (ii) guess the base address of the BLT. The odds of correctly guessing the location of the BLT are low enough to provide probabilistic protection. On 32-bit Windows Systems, for instance, the chances of guessing the base address are or less than one in two billion. Incorrect guesses alert defenders and trigger re-randomization with high probability (by accessing an unallocated memory page). The likelihood of successfully guessing a reachable gadget chain is a function of the length of the chain and the span of the bounds. The next section therefore focuses on reducing the average bounds span. C. Tightening Control-flow Check Bounds The distance between the lowest and highest intended destinations of any given indirect control-flow transfer instruction depends on the code layout. Placing indirect branches close to their targets both reduces bounds and improves locality, elevating both security and efficiency. Therefore we organize the code segment into clusters-one per indirect branch-each containing the basic blocks targeted by a particular branch. To accommodate blocks that are destinations of multiple distinct branch instructions, we consider three options: (i) put the block in one cluster and expand the bounds of other branches to include its address, (ii) create duplicate copies of the block in multiple clusters, or (iii) add a portal block to each cluster, which unconditionally jumps to the block. Each solution incurs a trade-off: expanding bounds reduces security, creating duplicates increases code size, and portals introduce runtime overhead. The options are not mutually exclusive, affording optimizers a range of strategies. Our experiments indicate that portals are often the best choice (see below). The capacity of the portal system limits the number of portals per nexus. Varying nexus capacity allows O-CFI to be tuned to different requirements. Setting it to zero prevents the creation of any portals, forcing the optimizer to choose alternative options. At the other extreme, setting no upper limit allows a portal to be created for every target, reducing all bounds ranges to wt, where w is the alignment width (usually 16 bytes; see §V-A) and t is the number of targets of the branch. At this setting, all indirect branches can only branch into a nexus, and through them, only to exactly those addresses that have been statically identified as targets. Thus, O-CFI with unbounded nexus capacity enforces fine-grained, static CFI. The extra layer of indirection imposed by a portal has a minor impact on runtime; there is thus a trade-off between security and performance. Users may opt for full CFI enforcement with O-CFI for security-critical components, and lower the nexus capacity to a desired performance level for less critical software. In our experiments, we found that a nexus capacity of 12 results in a significant reduction in bounds sizes with imperceptible performance effects. All of our experiments in §V use this nexus capacity. Section V-D details how different nexus capacities affect bound ranges. D. Example Defense against JIT-ROP The following example illustrates how O-CFI secures binaries against disclosure attacks. Consider a binary whose code segment contains five useful gadgets g 1 , . . . , g 5 . Each gadget terminates in an indirect branch protected by a bounds check. Under appropriate conditions, a disclosure attack such as JIT-ROP is able to recover a large portion of the runtime layout of the binary In our example, if g 1 is selected to be part of the payload, it can only be chained with gadget g 4 or g 5 . Attempting to jump from g 1 to any other gadget triggers a bounds violation that stops the attack. Similarly, an attack that hijacks a controlflow to c 1 can only redirect it to gadgets g 1 , g 2 , or g 3 ; all other gadgets are outside cluster c 1 and are therefore detected as impermissible destinations of the hijacked branch. Broadly speaking, all links in a payload's chain must traverse edges in the Cartesian product of the (aligned) gadget sets within the corresponding clusters. A successful attack must therefore limit itself to an extremely sparse graph of available edges. Our experiments (see §V-C) indicate that in practice the probability of successfully chaining gadgets in such a sparse graph is very low-just 0.01% for a four-gadget payload. The entropy of our procedure is further analyzed in §VI-A. IV. O-CFI IMPLEMENTATION We have implemented a fully functional prototype of O-CFI for the Intel x86 architecture. Our implementation uses a binary rewriting framework that secures COTS x86 binaries without source, debug, or relocation information. Like traditional CFI, however, we emphasize that O-CFI is equally suitable for inclusion in a compiler. Our rewriter generates a transformed version of the binary that leverages 1) a coarse-grained CFI policy that bounds control-flows, 2) fine-grained randomization to thwart traditional ROP attacks and diversify control-flow bounds so they become unknown and unreliable for attackers, 3) x86 segmentation registers to prevent accidental leakage of the bounds lookup table (BLT), and 4) an SFI framework similar to PittSFIeld [29] to enforce instruction alignment, denying attackers access to misaligned instructions that bypass bounds checks. The architecture of O-CFI is shown in A. Static Binary Rewriting 1) Conservative Disassembly: We first disassemble the code section using a conservative disassembler. Similar to the approach outlined by Wartell et al. [45], the code section is duplicated, with the old copy (renamed to .told) serving as a read-only data segment and the new copy (called .tnew) containing the rewritten executable code. The .told section is set non-executable, and all code blocks identified as possible targets of indirect jumps are overwritten with a five-byte tagged pointer. The tagged pointer consists of a tag byte (0xF4) followed by the four-byte address of that block in the .tnew section. The tag byte facilitates efficient runtime redirection of stale pointers to their correct targets,

CiteSeerX

Protecting applications using trusted execution environments

Author: Priebe Christian
Publication venue: Computing, Imperial College London
Publication date: 01/11/2020
Field of study

While cloud computing has been broadly adopted, companies that deal with sensitive data are still reluctant to do so due to privacy concerns or legal restrictions. Vulnerabilities in complex cloud infrastructures, resource sharing among tenants, and malicious insiders pose a real threat to the confidentiality and integrity of sensitive customer data. In recent years trusted execution environments (TEEs), hardware-enforced isolated regions that can protect code and data from the rest of the system, have become available as part of commodity CPUs. However, designing applications for the execution within TEEs requires careful consideration of the elevated threats that come with running in a fully untrusted environment. Interaction with the environment should be minimised, but some cooperation with the untrusted host is required, e.g. for disk and network I/O, via a host interface. Implementing this interface while maintaining the security of sensitive application code and data is a fundamental challenge. This thesis addresses this challenge and discusses how TEEs can be leveraged to secure existing applications efficiently and effectively in untrusted environments. We explore this in the context of three systems that deal with the protection of TEE applications and their host interfaces: SGX-LKL is a library operating system that can run full unmodified applications within TEEs with a minimal general-purpose host interface. By providing broad system support inside the TEE, the reliance on the untrusted host can be reduced to a minimal set of low-level operations that cannot be performed inside the enclave. SGX-LKL provides transparent protection of the host interface and for both disk and network I/O. Glamdring is a framework for the semi-automated partitioning of TEE applications into an untrusted and a trusted compartment. Based on source-level annotations, it uses either dynamic or static code analysis to identify sensitive parts of an application. Taking into account the objectives of a small TCB size and low host interface complexity, it defines an application-specific host interface and generates partitioned application code. EnclaveDB is a secure database using Intel SGX based on a partitioned in-memory database engine. The core of EnclaveDB is its logging and recovery protocol for transaction durability. For this, it relies on the database log managed and persisted by the untrusted database server. EnclaveDB protects against advanced host interface attacks and ensures the confidentiality, integrity, and freshness of sensitive data.Open Acces

Spiral - Imperial College Digital Repository

Robust Low-Overhead Binary Rewriting: Design, Extensibility, And Customizability

Author: Majlesi Kupaei Amir
Publication venue
Publication date: 01/01/2021
Field of study

Binary rewriting is the foundation of a wide range of binary analysis tools and techniques, including securing untrusted code, enforcing control-flow integrity, dynamic optimization, profiling, race detection, and taint tracking to prevent data leaks. There are two equally important and necessary criteria that a binary rewriter must have: it must be robust and incur low overhead. First, a binary rewriter must work for different binaries, including those produced by commercial compilers from a wide variety of languages, and possibly modified by obfuscation tools. Second, the binary rewriter must be low overhead. Although the off-line use of programs, such as testing and profiling, can tolerate large overheads, the use of binary rewriters in deployed programs must not introduce significant overheads; typically, it should not be more than a few percent. Existing binary rewriters have their challenges: static rewriters do not reliably work for stripped binaries (i.e., those without relocation information), and dynamic rewriters suffer from high base overhead. Because of this high overhead, existing dynamic rewriters are limited to off-line testing and cannot be practically used in deployment. In the first part, we have designed and implemented a dynamic binary rewriter called RL-Bin, a robust binary rewriter that can instrument binaries reliably with very low overhead. Unlike existing static rewriters, RL-Bin works for all benign binaries, including stripped binaries that do not contain relocation information. In addition, RL-Bin does not suffer from high overhead because its design is not based on the code-cache, which is the primary mechanism for other dynamic rewriters such as Pin, DynamoRIO, and Dyninst. RL-Bin's design and optimization methods have empowered RL-Bin to rewrite binaries with very low overhead (1.04x on average for SPECrate 2017) and very low memory overhead (1.69x for SPECrate 2017). In comparison, existing dynamic rewriters have a high runtime overhead (1.16x for DynamoRIO, 1.29x for Pin, and 1.20x for Dyninst) and have a bigger memory footprint (2.5x for DynamoRIO, 2.73x for Pin, and 2.3x for Dyninst). RL-Bin differentiates itself from other rewriters by having negligible overhead, which is proportional to the added instrumentation. This low overhead is achieved by utilizing an in-place design and applying multiple novel optimization methods. As a result, lightweight instrumentation can be added to applications deployed in live systems for monitoring and analysis purposes. In the second part, we present RL-Bin++, an improved version of RL-Bin, that handles various problematic real-world features commonly found in obfuscated binaries. We demonstrate the effectiveness of RL-Bin++ for the SPECrate 2017 benchmark obfuscated with UPX, PECompact, and ASProtect obfuscation tools. RL-Bin++ can efficiently instrument heavily obfuscated binaries (overhead averaging 2.76x, compared to 4.11x, 4.72x, and 5.31x overhead respectively caused by DynamoRIO, Dyninst, and Pin). However, the major accomplishment is that we achieved this while maintaining the low overhead of RL-Bin for unobfuscated binaries (only 1.04x). The extra level of robustness is achieved by employing dynamic deobfuscation techniques and using a novel hybrid in-place and code-cache design. Finally, to show the efficacy of RL-Bin in the development of sophisticated and efficient analysis tools, we have designed, implemented, and tested two novel applications of RL-Bin; An application-level file access permission system and a security tool for enforcing secure execution of applications. Using RL-Bin's system call instrumentation capability, we developed a fine-grained file access permission system that enables the user to define separate file access policies for each application. The overhead is very low, only 6%, making this tool practical to be used in live systems. Secondly, we designed a security enforcement tool that instruments indirect control transfer instructions to ensure that the program execution follows the predetermined anticipated path. Hence, it would protect the application from being hijacked. Our implementation showed effectiveness in detecting exploits in real-world programs while being practical with a low overhead of only 9%

Digital Repository at the University of Maryland

DETECTION, DIAGNOSIS AND MITIGATION OF MALICIOUS JAVASCRIPT WITH ENRICHED JAVASCRIPT EXECUTIONS

Author: Hu Xunchao
Publication venue: SURFACE at Syracuse University
Publication date: 22/12/2017
Field of study

Malicious JavaScript has become an important attack vector for software exploitation attacks and imposes a severe threat to computer security. In particular, three major class of problems, malware detection, exploit diagnosis, and exploits mitigation, bring considerable challenges to security researchers. Although a lot of research efforts have been made to address these threats, they have fundamental limitations and thus cannot solve the problems. Existing analysis techniques fall into two general categories: static analysis and dynamic analysis. Static analysis tends to produce inaccurate results (both false positive and false negative) and is vulnerable to a wide series of obfuscation techniques. Thus, dynamic analysis is constantly gaining popularity for exposing the typical features of malicious JavaScript. However, existing dynamic analysis techniques possess limitations such as limited code coverage and incomplete environment setup, leaving a broad attack surface for evading the detection. Once a zero-day exploit is captured, it is critical to quickly pinpoint the JavaScript statements that uniquely characterize the exploit and the payload location in the exploit. However, the current diagnosis techniques are inadequate because they approach the problem either from a JavaScript perspective and fail to account for “implicit” data flow invisible at JavaScript level, or from a binary execution perspective and fail to present the JavaScript level view of exploit. Although software vendors have deployed techniques like ASLR, sandbox, etc. to mitigate JavaScript exploits, hacking contests (e.g.,PWN2OWN, GeekPWN) have demonstrated that the latest software (e.g., Chrome, IE, Edge, Safari) can still be exploited. An ideal JavaScript exploit mitigation solution should be flexible and allow for deployment without requiring code changes. To combat malicious JavaScript, this dissertation addresses these problems through enriched executions, which explore arbitrary paths for detection, preserve JS-binary semantics for diagnosis, and perturbs memory with chaff code for mitigation. Firstly, JSForce, a forced execution engine for JavaScript, is proposed and developed to improve the detection results of current malicious JavaScript detection techniques. It drives an arbitrary JavaScript snippet to execute along different paths without any input or environment setup. While increasing code coverage, JSForce can tolerate invalid object accesses while introducing no runtime errors during execution. Secondly, JScalpel, a system that utilizes the JavaScript context information from the JavaScript level to perform context-aware binary analysis, is presented for JavaScript exploit diagnosis. In essence, it performs JS-Binary analysis to (1) generate a minimized exploit script, which in turn helps to generate a signature for the exploit, and (2) precisely locate the payload within the exploit. It replaces the malicious payload with a friendly payload and generates a PoV for the exploit. Thirdly, ChaffyScript, a vulnerability-agnostic mitigation system, is introduced to block JavaScript exploits via undermining the memory preparation stage. Specifically, given suspicious JavaScript, ChaffyScript rewrites the code to insert memory perturbation code, and then generates semantically-equivalent code. JavaScript exploits will fail as a result of unexpected memory states introduced by memory perturbation code, while the benign JavaScript still behaves as expected since the memory perturbation code does not change the JavaScript’s original semantics

Syracuse University Research Facility and Collaborative Environment

Recommended from our members

Repurposing Software Defenses with Specialized Hardware

Author: Sinha Kanad
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2019
Field of study

Computer security has largely been the domain of software for the last few decades. Although this approach has been moderately successful during this period, its problems have started becoming more apparent recently because of one primary reason — performance. Software solutions typically exact a significant toll in terms of program slowdown, especially when applied to large, complex software. In the past, when chips became exponentially faster, this growing burden could be accommodated almost for free. But as Moore’s law winds down, security-related slowdowns become more apparent, increasingly intolerable, and subsequently abandoned. As a result, the community has started looking elsewhere for continued protection, as attacks continue to become progressively more sophisticated. One way to mitigate this problem is to complement these defenses in hardware. Despite lacking the semantic perspective of high-level software, specialized hardware typically is not only faster, but also more energy-efficient. However, hardware vendors also have to factor in the cost of integrating security solutions from the perspective of effectiveness, longevity, and cost of development, while allaying the customer’s concerns of performance. As a result, although numerous hardware solutions have been proposed in the past, the fact that so few of them have actually transitioned into practice implies that they were unable to strike an optimal balance of the above qualities. This dissertation proposes the thesis that it is possible to add hardware features that complement and improve program security, traditionally provided by software, without requiring extensive modifications to existing hardware microarchitecture. As such, it marries the collective concerns of not only users and software developers, who demand performant but secure products, but also that of hardware vendors, since implementation simplicity directly relates to reduction in time and cost of development and deployment. To support this thesis, this dissertation discusses two hardware security features aimed at securing program code and data separately and details their full system implementations, and a study of a negative result where the design was deemed practically infeasible, given its high implementation complexity. Firstly, the dissertation discusses code protection by reviving instruction set randomization (ISR), an idea originally proposed for countering code injection and considered impractical in the face of modern attack vectors that employ reuse of existing program code (also known as code reuse attacks). With Polyglot, we introduce ISR with strong AES encryption along with basic code randomization that disallows code decryption at runtime, thus countering most forms of state-of-the-art dynamic code reuse attacks, that read the code at runtime prior to building the code reuse payload. Through various optimizations and corner case workarounds, we show how Polyglot enables code execution with minimal hardware changes while maintaining a small attack surface and incurring nominal overheads even when the code is strongly encrypted in the binary and memory. Next, the dissertation presents REST, a hardware primitive that allows programs to mark memory regions invalid for regular memory accesses. This is achieved simply by storing a large, pre-determined random value at those locations with a special store instruction and then, detecting incoming values at the data cache for matches to the predetermined value. Subsequently, we show how this primitive can be used to protect data from common forms of spatial and temporal memory safety attacks. Notably, because of the simplicity of the primitive, REST requires trivial microarchitectural modifications and hence, is easy to implement, and exhibits negligible performance overheads. Additionally, we demonstrate how it is able to provide practical heap safety even for legacy binaries. For the above proposals, we also detail their hardware implementations on FPGAs, and discuss how each fits within a complete multiprocess system. This serves to give the reader an idea of usage and deployment challenges on a broader scale that goes beyond just the technique’s effectiveness within the context of a single program. Lastly, the dissertation discusses an alternative to the virtual address space, that randomizes the sequence of addresses in a manner invisible to even the program, thus achieving transparent randomization of the entire address space at a very fine granularity. The biggest challenge is to achieve this with minimal microarchitectural changes while accommodating linear data structures in the program (e.g., arrays, structs), both of which are fundamentally based on a linear address space. As a result, this modified address space subsumes the benefits of most other spatial randomization schemes, with the additional benefit of ideally making traversal from one data structure to another impossible. Our study of this idea concludes that although valuable, current memory safety techniques are cheaper to implement and secure enough, so that there are no perceivable use cases for this model of address space safety

Columbia University Academic Commons