173 research outputs found

    Differential Fuzzing the WebAssembly

    WebAssembly, colloquially known as Wasm, is a specification for an intermediate representation suited to the web environment, particularly on the client side. It provides a machine abstraction and a hardware-agnostic instruction set, so that a high-level programming language can be compiled to Wasm instead of to a specific hardware architecture. The JavaScript engine implements the Wasm specification and recompiles Wasm instructions into the instructions of the target machine on which the program is executed. Technically, Wasm is similar to popular virtual machine bytecode formats, such as Java Virtual Machine (JVM) bytecode or Microsoft Intermediate Language (MSIL). There are two major implementations of Wasm, corresponding to the two most popular web browsers on the market: the V8 engine from the Chromium project and the SpiderMonkey engine from Mozilla. Wasm does not mandate a specific implementation of its specification, so the two engines may employ different mechanisms to realize it. These differences raise a research question: do both engines implement the Wasm specification equivalently? In this thesis, we explore the internal implementation of the JavaScript engines with regard to the Wasm specification. We experimented with a differential fuzzing technique, in which we test two JavaScript engines with a randomly generated Wasm program and compare their behavior. We ran the experiment to identify anomalous behavior, then analyzed each anomaly and identified its root cause. This thesis covers the WebAssembly specification extensively and discusses foundational knowledge about the specification that is currently lacking in references. It also presents the instrumentation added to the JavaScript engines to perform the experiment, which can serve as a foundation for similar experiments. Finally, this thesis analyzes the anomaly identified in the experiment through reverse engineering techniques, such as static and dynamic analysis, combined with white-box analysis of the JavaScript engine source code. In this experiment, we discovered a difference in JavaScript engine behavior that is observable from the perspective of the Wasm program. We created a proof of concept demonstrating this difference that can be executed in recent web browsers as of the writing of this thesis. The experiment also evaluated how both JavaScript engines implement the Wasm specification and concluded that both implement it faithfully.
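
The differential fuzzing loop described in this abstract can be illustrated with a minimal sketch. It does not reproduce the thesis's tooling: the two "engines" below are hypothetical toy stack-machine interpreters standing in for V8 and SpiderMonkey, the instruction names are invented rather than real Wasm opcodes, and the only planted difference is how integer division rounds. The point is the shape of the technique: generate a random program, run it on both implementations, and keep every input on which the observable results differ.

```python
import random

# Hypothetical toy instruction set; these are not real Wasm opcodes.
OPS = ("push", "add", "mul", "div")


def random_program(rng, length=8):
    """Generate a random stack program as a list of (op, arg) tuples."""
    program = []
    for _ in range(length):
        op = rng.choice(OPS)
        arg = rng.randint(-4, 4) if op == "push" else None
        program.append((op, arg))
    return program


def run(program, div):
    """Run a program on a stack machine; `div` is the engine-specific division."""
    stack = []
    try:
        for op, arg in program:
            if op == "push":
                stack.append(arg)
                continue
            b, a = stack.pop(), stack.pop()
            if op == "add":
                stack.append(a + b)
            elif op == "mul":
                stack.append(a * b)
            else:
                stack.append(div(a, b))
    except (IndexError, ZeroDivisionError) as exc:
        # Stack underflow and division by zero are reported as traps.
        return ("trap", type(exc).__name__)
    return ("ok", tuple(stack))


def engine_a(program):
    """Stand-in engine A: division truncates toward zero."""
    return run(program, lambda a, b: int(a / b))


def engine_b(program):
    """Stand-in engine B: division rounds toward negative infinity."""
    return run(program, lambda a, b: a // b)


def differential_fuzz(seed=0, iterations=2000):
    """Return every generated program on which the two engines disagree."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(iterations):
        program = random_program(rng)
        result_a, result_b = engine_a(program), engine_b(program)
        if result_a != result_b:
            mismatches.append((program, result_a, result_b))
    return mismatches
```

For instance, the program `[("push", -3), ("push", 2), ("div", None)]` leaves `-1` on the stack in `engine_a` and `-2` in `engine_b`, which is the kind of observable divergence a differential fuzzer reports for later root-cause analysis.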

    Towards Principled Dynamic Analysis on Android

    The vast amount of information and services accessible through mobile handsets running the Android operating system has led to the tight integration of such devices into our daily routines. However, their capability to capture and operate upon user data provides an unprecedented insight into our private lives that needs to be properly protected, which demands comprehensive analysis and thorough testing. While dynamic analysis has been applied to these problems in the past, the corresponding literature consists of scattered work that often specializes in sub-problems and keeps reinventing the wheel, thus lacking a structured approach. To overcome this unsatisfactory situation, this dissertation introduces two major systems that advance the state of the art in dynamically analyzing the Android platform. First, we introduce a novel, fine-grained and non-intrusive compiler-based instrumentation framework that allows for precise and high-performance modification of Android apps and system components. Second, we present a unifying dynamic analysis platform with a special focus on Android's middleware in order to overcome the common challenges we identified from related work. Together, these two systems allow for a more principled approach to dynamic analysis on Android that enables comparability and composability of both existing and future work.

    Applications of information sharing for code generation in process virtual machines

    As the backbone of many computing environments today, it is important that process virtual machines be both performant and robust in mobile, personal desktop, and enterprise applications. This thesis focusses on code generation within these virtual machines, particularly addressing situations where redundant work is being performed. The goal is to exploit information sharing in order to improve the performance and robustness of virtual machines that are accelerated by native code generation. First, the thesis investigates the potential to share generated code between multiple threads in a dynamic binary translator used to perform instruction set simulation. This is done through a code generation design that allows native code to be executed by any simulated core and adding a mechanism to share native code regions between threads. This is shown to improve the average performance of multi-threaded benchmarks by 1.4x when simulating 128 cores on a quad-core host machine. Secondly, the ahead-of-time code generation system used for executing Android applications is improved through the use of profiling. The thesis investigates the potential for profiles produced by individual users of applications to be shared and merged together to produce a generic profile that still provides a lot of benefit for a new user who is then able to skip the expensive profiling phase. These profiles can not only be used for selective compilation to reduce code-size and installation time, but can also be used for focussed optimisation on vital code regions of an application in order to improve overall performance. With selective compilation applied to a set of popular Android applications, code-size can be reduced by 49.9% on average, while installation time can be reduced by 31.8%, with only an average 8.5% increase in the amount of sequential runtime required to execute the collected profiles. 
The thesis also shows that, among the tested users, the use of a crowd-sourced and merged profile does not significantly affect their estimated performance loss from selective compilation (0.90x-0.92x) in comparison to when they perform selective compilation with their own unique profile (0.93x). Furthermore, by proposing a new, more powerful code generator for Android's virtual machine, these same profiles can be used to perform focussed optimisation, which preliminary results show to increase runtime performance across a set of common Android benchmarks by 1.46x-10.83x. Finally, in such a situation where a new code generator is being added to a virtual machine, it is also important to test the code generator for correctness and robustness. The methods of execution of a virtual machine, such as interpreters and code generators, must share a set of semantics about how programs must be executed, and this can be exploited in order to improve testing. This is done through the application of domain-aware binary fuzzing and differential testing within Android's virtual machine. The thesis highlights a series of actual code generation and verification bugs that were found in Android's virtual machine using this testing methodology, and compares the proposed approach to other state-of-the-art fuzzing techniques.

    Dependability Assessment of Android OS

    In this brave new world of a smartphone-dependent society, dependability is a strong requirement and needs to be addressed properly. Assessing the dependability of these mobile systems is still an open issue, and companies should have the tools to improve their devices and compete with other vendors. The main objective of this dissertation is to provide methods to assess the dependability of mobile OSs, which is fundamental for further improvements. Mobile OSs are threatened mainly by traditional residual faults (when errors spread across components as failures), aging-related faults (when errors accumulate over time), and misuse by users and applications. This thesis addresses these three aspects. First, it presents a qualitative method to define the fault model of a mobile OS, and an exhaustive fault model for Android. I designed and developed AndroFIT, a novel fault injection tool for Android smartphones, and performed an extensive fault injection campaign on three Android devices from different vendors to analyze the impact of component failures on the mobile OS. Second, it presents an experimental methodology to analyze the software aging phenomenon in mobile OSs. I performed a software aging analysis campaign on Android devices to identify the factors that affect performance degradation and resource consumption. Third, it presents the design and implementation of a novel fuzzing tool, namely Chizpurfle, able to automatically test Android vendor customizations by leveraging code coverage information at run-time.

    Directed random testing

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 155-162). Random testing can quickly generate many tests, is easy to implement, scales to large software applications, and reveals software errors. But it tends to generate many tests that are illegal or that exercise the same parts of the code as other tests, thus limiting its effectiveness. Directed random testing is a new approach to test generation that overcomes these limitations by combining a bottom-up generation of tests with runtime guidance. A directed random test generator takes a collection of operations under test and generates new tests incrementally, by randomly selecting operations to apply and finding arguments from among previously constructed tests. As soon as it generates a new test, the generator executes it, and the result determines whether the test is redundant, illegal, error-revealing, or useful for generating more tests. The technique outputs failing tests pointing to potential errors that should be corrected, and passing tests that can be used for regression testing. The thesis also contributes auxiliary techniques that post-process the generated tests, including a simplification technique that transforms a failing test into a smaller one that better isolates the cause of failure, and a branch-directed test generation technique that aims to increase the code coverage achieved by the set of generated tests. Applied to 14 widely used libraries (including the Java JDK and the core .NET framework libraries), directed random testing quickly reveals many serious, previously unknown errors in the libraries. Compared with other test generation tools (model checking, symbolic execution, and traditional random testing), it reveals more errors and achieves higher code coverage. In an industrial case study, a test team at Microsoft using the technique discovered in fifteen hours of human effort as many errors as they typically discover in a person-year of effort using other testing methods. By Carlos Pacheco. Ph.D.
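
The generation loop this abstract describes (pick an operation at random, draw its arguments from previously constructed tests, execute immediately, and classify the outcome) can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's implementation: the function names, the operation table, the deliberately buggy `insert_sorted`, and the "sorted tuple" contract are all hypothetical, and an exception is simply treated as marking an illegal test.

```python
import random


def insert_sorted(xs, n):
    """Hypothetical operation under test, with a planted bug."""
    if not isinstance(xs, tuple) or not isinstance(n, int):
        raise TypeError("illegal arguments")
    for i, x in enumerate(xs):
        if n < x:
            return xs[:i] + (n,) + xs[i:]
    return (n,) + xs  # planted bug: n should be appended at the end


def add(a, b):
    """Hypothetical helper operation that produces new integers."""
    if not isinstance(a, int) or not isinstance(b, int):
        raise TypeError("illegal arguments")
    return a + b


def contract(value):
    """Assumed contract: every tuple produced must be sorted."""
    if isinstance(value, tuple):
        return all(a <= b for a, b in zip(value, value[1:]))
    return True


OPERATIONS = [("insert_sorted", insert_sorted, 2), ("add", add, 2)]
SEEDS = [(), 1, 2, 3]


def directed_random_tests(operations, seeds, contract, rng, steps=200):
    """Grow a pool of (expression, value) pairs using execution feedback."""
    pool = [(repr(s), s) for s in seeds]
    failing = []
    for _ in range(steps):
        name, fn, arity = rng.choice(operations)
        args = [rng.choice(pool) for _ in range(arity)]
        expr = f"{name}({', '.join(e for e, _ in args)})"
        try:
            value = fn(*(v for _, v in args))
        except Exception:
            continue  # illegal test: discard it
        if not contract(value):
            failing.append(expr)  # error-revealing: report, never extend
        elif any(value == v for _, v in pool):
            continue  # redundant: an equal value is already in the pool
        else:
            pool.append((expr, value))  # useful: reuse as a future argument
    return pool, failing
```

Calling `directed_random_tests(OPERATIONS, SEEDS, contract, random.Random(0))` returns the pool of passing tests and a list of failing expressions. Because each expression is recorded as text built from earlier expressions, a failing test can be replayed on its own, and only contract-satisfying values are ever fed back in as arguments.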

    A Framework for File Format Fuzzing with Genetic Algorithms

    Secure software, meaning software free from vulnerabilities, is desirable in today's marketplace. Consumers are beginning to value a product's security posture as well as its functionality. Software development companies are recognizing this trend, and they are factoring security into their entire software development lifecycle. Secure development practices like threat modeling, static analysis, safe programming libraries, run-time protections, and software verification are being mandated during product development. Mandating these practices improves a product's security posture before customer delivery, and these practices increase the difficulty of discovering and exploiting vulnerabilities. Since the 1980s, security researchers have uncovered software defects by fuzz testing an application. In fuzz testing's infancy, randomly generated data could discover multiple defects quickly. However, as software matures and software development companies integrate secure development practices into their development life cycles, fuzzers must apply more sophisticated techniques in order to retain their ability to uncover defects. Fuzz testing must evolve, and fuzz testing practitioners must devise new algorithms to exercise an application in unexpected ways. This dissertation's objective is to create a proof-of-concept genetic algorithm fuzz testing framework to exercise an application's file format parsing routines. The framework includes multiple genetic algorithm variations, provides a configuration scheme, and correlates data gathered from static and dynamic analysis to guide negative test case evolution. Experiments conducted for this dissertation illustrate the effectiveness of a genetic algorithm fuzzer in comparison to standard fuzz testing tools. The experiments showcase a genetic algorithm fuzzer's ability to discover multiple unique defects within a limited number of negative test cases. These experiments also highlight an application's increased execution time when fuzzing with a genetic algorithm. To combat increased execution time, a distributed architecture is implemented, and additional experiments demonstrate a decrease in execution time, making it comparable to standard fuzz testing tools. A final set of experiments provides guidance on fitness function selection with a CHC genetic algorithm fuzzer with different population size configurations.
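
As a rough illustration of how a genetic algorithm can steer file format fuzzing, the sketch below evolves byte strings toward inputs that get deeper into a parser. It is not the dissertation's framework and does not implement the CHC variant: the `MAGIC` header, the `fitness` function (a stand-in for feedback derived from static and dynamic analysis), and all parameter values are assumptions chosen only to keep the example self-contained.

```python
import random

# Hypothetical file header that the toy parser checks byte by byte.
MAGIC = b"FUZZ"


def fitness(data):
    """Toy coverage proxy: number of leading header bytes the parser accepts."""
    score = 0
    for got, want in zip(data, MAGIC):
        if got != want:
            break
        score += 1
    return score


def mutate(data, rng):
    """Replace one randomly chosen byte with a random value."""
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)


def crossover(a, b, rng):
    """Single-point crossover of two equal-length inputs."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(rng, pop_size=30, length=8, generations=50):
    """Evolve test inputs; return the fittest input and the best score per generation."""
    population = [
        bytes(rng.randrange(256) for _ in range(length)) for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # elitism: the fitter half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history
```

Because the fitter half of each generation is carried over unchanged, the best score in `history` never decreases. A real fuzzer would replace `fitness` with measurements taken from the target application, which is also where the extra execution time mentioned in the abstract would come from.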

    SKilL language server

    Language analysis features offered by integrated development environments (IDEs) can ease and accelerate the task of writing code, but are often not available for domain-specific languages. The Language Server Protocol (LSP) aims to solve this problem by allowing language servers that support these features for a certain programming language to be used portably in a number of IDEs. A language server for the Serialization Killer Language (SKilL) was implemented that supports a multitude of language features, including automatic formatting, completion suggestions, and display of references and documentation associated with symbols. This thesis presents how the language server was implemented and discusses the challenges that arose from the nature of the SKilL and LSP specifications.