
    Putting Randomized Compiler Testing into Production (Artifact)

    This artifact accompanies our experience report for our compiler testing technology transfer project: taking the GraphicsFuzz research project on randomized metamorphic testing of graphics shader compilers, and building the necessary tooling around it to provide a highly automated process for improving the Khronos Vulkan Conformance Test Suite (CTS) with test cases that expose fuzzer-found compiler bugs, or that plug gaps in test coverage. The artifact consists of two Dockerfiles and associated files that can be used to build two Docker containers. The containers include our main tool for performing fuzzing: gfauto. The containers allow the user to fuzz SwiftShader, a software Vulkan implementation, finding 4 bugs. The user will also perform some line coverage analysis of SwiftShader using our tools to synthesize a small test that increases line coverage. Ubuntu, gfauto, SwiftShader, and other dependencies inside the Docker containers are fixed at specific versions, and all random seeds are set to specific values. Thus, all examples should reproduce faithfully on any machine.
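    To make the metamorphic idea behind this pipeline concrete, here is a minimal sketch in Java, with a toy arithmetic evaluator standing in for the real shader compiler and GPU; the names (systemUnderTest, addIdentity) are illustrative and not part of gfauto. A semantics-preserving transformation is applied to a test program, and any output difference between the original and the variant points at a compiler bug, with no need for a reference output.

        // Metamorphic compiler testing in miniature: run a program and a
        // semantically equivalent variant; a mismatch indicates a bug in
        // the system under test, sidestepping the oracle problem.
        import java.util.function.DoubleUnaryOperator;

        public class MetamorphicSketch {
            // Stand-in for the system under test (really: compile and run a shader).
            static double systemUnderTest(DoubleUnaryOperator program, double input) {
                return program.applyAsDouble(input);
            }

            // Semantics-preserving transformation: adds a term that provably
            // computes zero, so observable behavior must not change.
            static DoubleUnaryOperator addIdentity(DoubleUnaryOperator original) {
                return x -> original.applyAsDouble(x) + (x - x);
            }

            public static void main(String[] args) {
                DoubleUnaryOperator program = x -> x * x + 1.0;
                DoubleUnaryOperator variant = addIdentity(program);
                for (double input : new double[] {0.0, 1.5, -3.0}) {
                    if (systemUnderTest(program, input) != systemUnderTest(variant, input)) {
                        System.out.println("metamorphic mismatch at input " + input);
                    }
                }
                System.out.println("done");
            }
        }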

    Low-overhead Online Code Transformations.

    The ability to perform online code transformations - to dynamically change the implementation of running native programs - has been shown to be useful in domains as diverse as optimization, security, debugging, resilience, and portability. However, conventional techniques for performing online code transformations carry significant runtime overhead, limiting their applicability for performance-sensitive applications. This dissertation proposes and investigates a novel low-overhead online code transformation technique that works by running the dynamic compiler asynchronously and in parallel to the running program. As a consequence, this technique allows programs to execute with the online code transformation capability at near-native speed, unlocking a host of additional opportunities that can take advantage of the ability to revisit compilation choices as the program runs. This dissertation builds on the low-overhead online code transformation mechanism, describing three novel runtime systems that represent best-in-class solutions to three challenging problems facing modern computer scientists. First, I leverage online code transformations to significantly increase the utilization of multicore datacenter servers by dynamically managing program cache contention. Compared to state-of-the-art prior work that mitigates contention by throttling application execution, the proposed technique achieves a 1.3-1.5x improvement in application performance. Second, I build a technique to automatically configure and parameterize approximate computing techniques for each program input. This technique results in the ability to configure approximate computing to achieve an average performance improvement of 10.2x while maintaining 90% result accuracy, which significantly improves over oracle versions of prior techniques. Third, I build an operating system designed to secure running applications from dynamic return oriented programming attacks by efficiently, transparently, and continuously re-randomizing the code of running programs. The technique is able to re-randomize program code every 300ms with an average overhead of 9%, fast enough to resist state-of-the-art return oriented programming attacks based on memory disclosures and side channels.

    PhD dissertation, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/120775/1/mlaurenz_1.pd
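    As a loose illustration of the core mechanism (not the dissertation's actual runtime, which rewrites native code), the Java sketch below has a background "compiler" thread publish a new implementation through an atomic reference while the program keeps running; the hot path pays only a single reference read, which is the sense in which execution stays near-native.

        // Hypothetical sketch: asynchronous code replacement via indirection.
        import java.util.concurrent.atomic.AtomicReference;
        import java.util.function.IntUnaryOperator;

        public class OnlineSwap {
            // The program calls through this reference instead of a fixed function.
            static final AtomicReference<IntUnaryOperator> impl =
                new AtomicReference<>(x -> x + x);       // initial code version

            public static void main(String[] args) throws InterruptedException {
                Thread compiler = new Thread(() -> {
                    // Simulate asynchronous recompilation, then publish the result.
                    try { Thread.sleep(10); } catch (InterruptedException ignored) {}
                    impl.set(x -> x << 1);               // "recompiled" version
                });
                compiler.start();

                long sum = 0;
                for (int i = 0; i < 1_000_000; i++) {
                    sum += impl.get().applyAsInt(i);     // one reference read per call
                }
                compiler.join();
                System.out.println("sum = " + sum);
            }
        }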

    Software reverse engineering education

    Software Reverse Engineering (SRE) is the practice of analyzing a software system, either in whole or in part, to extract design and implementation information. A typical SRE scenario would involve a software module that has worked for years and carries several rules of a business in its lines of code. Unfortunately, the source code of the application has been lost; what remains is “native” or “binary” code. Reverse engineering skills are also used to detect and neutralize viruses and malware, as well as to protect intellectual property. It became frighteningly apparent during the Y2K crisis that reverse engineering skills were not commonly held amongst programmers. Since that time, much research has been undertaken to formalize the types of activities that fall into the category of reverse engineering so that these skills can be taught to computer programmers and testers. To help address the lack of software reverse engineering education, several peer-reviewed articles on software reverse engineering, re-engineering, reuse, maintenance, evolution, and security were gathered with the objective of developing relevant, practical exercises for instructional purposes. The research revealed that SRE is fairly well described and most of the related activities fall into one of two categories.

    An input centric paradigm for program dynamic optimizations and lifetime evolvement

    Accurately predicting program behaviors (e.g., memory locality, method calling frequency) is fundamental for program optimizations and runtime adaptations. Despite decades of remarkable progress, prior studies have not systematically exploited the use of program inputs, a deciding factor of program behaviors, to help in program dynamic optimizations. Triggered by the strong and predictive correlations between program inputs and program behaviors that recent studies have uncovered, this dissertation aims to bring program inputs into the focus of program behavior analysis and program dynamic optimization, cultivating a new paradigm named input-centric program behavior analysis and dynamic optimization.

    The new optimization paradigm consists of three components, forming a three-layer pyramid. At the base is program input characterization, a component for resolving the complexity in program raw inputs and extracting important features. In the middle is input-behavior modeling, a component for recognizing and modeling the correlations between characterized input features and program behaviors. These two components constitute input-centric program behavior analysis, which (ideally) is able to predict the large-scope behaviors of a program's execution as soon as the execution starts. The top layer is input-centric adaptation, which capitalizes on the novel opportunities created by the first two components to facilitate proactive adaptation for program optimizations.

    This dissertation develops the paradigm in two stages. In the first stage, we concentrate on exploring the implications of program inputs for program behaviors and dynamic optimization. We construct the basic input-centric optimization framework based on offline training to realize the basic functionalities of the three major components of the paradigm. In the second stage, we focus on making the paradigm practical by addressing multi-faceted issues in handling input complexities, transparent training data collection, and predictive model evolvement across production runs. The techniques proposed in this stage together cultivate a lifelong continuous optimization scheme with cross-input adaptivity.

    Fundamentally, the new optimization paradigm provides a brand new solution for program dynamic optimization. The techniques proposed in the dissertation together resolve the adaptivity-proactivity dilemma that has been limiting the effectiveness of existing optimization techniques. Its benefits are demonstrated through proactive dynamic optimizations in Jikes RVM and version selection using the IBM XL C Compiler, yielding significant performance improvement on a set of Java and C/C++ programs, and it may open new opportunities for a broad range of runtime optimizations and adaptations. The evaluation results on both Java and C/C++ applications demonstrate that the new paradigm is promising in advancing the current state of program optimizations.
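    A minimal sketch of the three-layer pyramid, with invented details (the feature, the model, and the two code versions below are not from the dissertation): input characterization extracts a feature from the raw input, an offline-trained model predicts the coming run's behavior from that feature, and the adaptation layer picks a code version proactively, as soon as the input is seen rather than partway through the run.

        import java.util.Arrays;

        public class InputCentric {
            // Layer 1: input characterization -- reduce the raw input to features.
            static double sizeFeature(int[] input) { return input.length; }

            // Layer 2: input-behavior model -- stand-in for a model trained offline
            // on (feature, observed behavior) pairs from earlier runs.
            static boolean predictHeavyRun(double size) { return size > 10_000; }

            // Layer 3: input-centric adaptation -- choose the code version up front.
            static int[] process(int[] input) {
                int[] copy = input.clone();
                if (predictHeavyRun(sizeFeature(input))) {
                    Arrays.parallelSort(copy);   // version predicted to pay off
                } else {
                    Arrays.sort(copy);           // cheap version for small runs
                }
                return copy;
            }

            public static void main(String[] args) {
                System.out.println(Arrays.toString(process(new int[] {3, 1, 2})));
            }
        }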

    Dynamically detecting and tolerating IF-Condition Data Races

    An IF-Condition Invariance Violation (ICIV) occurs when, after a thread has computed the control expression of an IF statement and while it is executing the THEN or ELSE clauses, another thread updates variables in the IF’s control expression. An ICIV can be easily detected, and is likely to be a sign of a concurrency bug in the code. Typically, the ICIV is caused by a data race, which we call an IF-Condition Data Race (ICR). In this paper, we analyze the data races reported in the bug databases of popular software systems and show that ICRs occur relatively often. Then, we present two techniques to handle ICRs dynamically. They rely on simple code transformations and, in one case, additional hardware help. One of them (SW-IF) detects the races, while the other (HW-IF) detects and prevents them. We evaluate SW-IF and HW-IF using a variety of applications. We show that these new techniques are effective at finding new data race bugs and run with low overhead. Specifically, HW-IF finds 5 new (unreported) race bugs and SW-IF finds 3 of them. In addition, 8-threaded executions of SPLASH-2 codes show that, on average, SW-IF adds 2% execution overhead, while HW-IF adds less than 1%.
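    The toy Java program below illustrates the bug pattern and a re-check-style detection in the spirit of SW-IF; the paper's actual code transformations (and HW-IF's hardware support) differ, so this is only a sketch of the idea, not the authors' implementation.

        public class IfConditionRace {
            static volatile int balance = 100;

            // Racy version: another thread may update `balance` between the
            // condition evaluation and the body of the THEN clause (an ICIV).
            static void withdrawRacy(int amount) {
                if (balance >= amount) {
                    balance -= amount;   // may run after the check was invalidated
                }
            }

            // Detection sketch: re-evaluate the condition inside the clause and
            // report a violation if it no longer holds.
            static void withdrawChecked(int amount) {
                if (balance >= amount) {
                    if (balance < amount) {
                        System.err.println("ICIV detected: condition invalidated");
                        return;
                    }
                    balance -= amount;
                }
            }

            public static void main(String[] args) throws InterruptedException {
                Thread a = new Thread(() -> withdrawChecked(100));
                Thread b = new Thread(() -> withdrawChecked(100));
                a.start(); b.start();
                a.join(); b.join();
                System.out.println("final balance = " + balance);
            }
        }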

    An Empirical Evaluation of the Effectiveness of JML Assertions as Test Oracles

    Test oracles remain one of the least understood aspects of the modern testing process. An oracle is a mechanism used by software testers and software engineers for determining whether a test has passed or failed. One widely supported approach to oracles is the use of runtime assertion checking during the testing activity: method invariants and pre- and postconditions help detect bugs at runtime. While assertions are supported by virtually all programming environments, are used widely in practice, and are often assumed to be effective as test oracles, there are few empirical studies of their efficacy in this role. In this thesis, we present the results of an experiment we conducted to help answer this question. To do this, we studied seven of the core Java classes that had been annotated by others with assertions in the Java Modeling Language, used the muJava mutation analysis tool to create mutant implementations of these classes, exercised them with input-only (i.e., no oracle) test suites that achieve branch coverage, and used a machine learning tool, Weka, to determine which annotations were effective at “killing” these mutants. We also evaluate how effective the “null oracle” (in our case, the Java runtime system) is at catching these bugs. The results of our study are interesting, and help provide software engineers with insight into situations in which assertions can be relied upon to find bugs, and situations in which assertions may need to be augmented with other approaches to test oracles.
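    For readers unfamiliar with JML in the oracle role, here is a small illustrative example; the method and its contract are ours, not drawn from the seven classes in the study. Under runtime assertion checking, the postconditions decide pass or fail, so an input-only test kills, for example, a mutant of abs that returns x unchanged.

        public class ContractOracle {
            //@ requires x != Integer.MIN_VALUE;
            //@ ensures \result >= 0;
            //@ ensures \result == x || \result == -x;
            public static int abs(int x) {
                return x < 0 ? -x : x;
            }

            public static void main(String[] args) {
                // Input-only test: no expected values are written down. With JML
                // runtime assertion checking enabled, the contract is the oracle.
                abs(-5);
                abs(7);
                System.out.println("contract checks passed");
            }
        }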

    Using Code Perforation to Improve Performance, Reduce Energy Consumption, and Respond to Failures

    Many modern computations (such as video and audio encoders, Monte Carlo simulations, and machine learning algorithms) are designed to trade off accuracy in return for increased performance. To date, such computations typically use ad hoc, domain-specific techniques developed specifically for the computation at hand. We present a new general technique, code perforation, for automatically augmenting existing computations with the capability of trading off accuracy in return for performance. In contrast to existing approaches, which typically require the manual development of new algorithms, our implemented SpeedPress compiler can automatically apply code perforation to existing computations with no developer intervention whatsoever. The result is a transformed computation that can respond almost immediately to a range of increased performance demands while keeping any resulting output distortion within acceptable user-defined bounds. We have used SpeedPress to automatically apply code perforation to applications from the PARSEC benchmark suite. The results show that the transformed applications can run as much as two to three times faster than the original applications while distorting the output by less than 10%. Because the transformed applications can operate successfully at many points in the performance/accuracy tradeoff space, they can (dynamically and on demand) navigate the tradeoff space to either maximize performance subject to a given accuracy constraint, or maximize accuracy subject to a given performance constraint. We also demonstrate the SpeedGuard runtime system, which uses code perforation to enable applications to automatically adapt to challenging execution environments such as multicore machines that suffer core failures or machines that dynamically adjust the clock speed to reduce power consumption or to protect the machine from overheating.
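    A hand-written sketch of what loop perforation does; SpeedPress applies the transformation automatically in the compiler, and the perforation factor and workload below are invented for illustration. The perforated loop skips a fraction of iterations and rescales the result, trading bounded output distortion for speed.

        public class Perforation {
            // Exact computation: a mean over all samples (stand-in for costly work).
            static double meanExact(double[] xs) {
                double sum = 0;
                for (double x : xs) sum += Math.sin(x);
                return sum / xs.length;
            }

            // Perforated variant: execute every `factor`-th iteration only,
            // then normalize by the number of samples actually used.
            static double meanPerforated(double[] xs, int factor) {
                double sum = 0;
                int used = 0;
                for (int i = 0; i < xs.length; i += factor) {
                    sum += Math.sin(xs[i]);
                    used++;
                }
                return sum / used;
            }

            public static void main(String[] args) {
                double[] xs = new double[1_000_000];
                for (int i = 0; i < xs.length; i++) xs[i] = i * 1e-4;
                System.out.println("exact      = " + meanExact(xs));
                System.out.println("perforated = " + meanPerforated(xs, 4)); // ~4x less work
            }
        }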

    Metamorphic Testing for Software Libraries and Graphics Compilers

    Metamorphic testing is a technique that mutates existing test cases into semantically equivalent forms by making use of metamorphic relations, thereby avoiding the oracle problem. However, the required relations are not readily available for a given system under test. Defining effective metamorphic relations is difficult, and arguably the main obstacle to the adoption of metamorphic testing in production-level software development. One example application is testing graphics compilers, where the approximate and under-specified nature of the domain makes it hard to apply more traditional techniques. We propose an approach with a lower barrier of entry for applying metamorphic testing to a software library. The user must still identify relations that hold over their particular library, but can do so within a development-like environment. We apply methods from the domains of metamorphic testing and fuzzing to produce complex test cases. We consider the user interaction a bonus, as users can control which parts of the target codebase are tested, potentially focusing on less-tested or critical sections of the codebase. We implement our proposed approach in a tool, MF++, which synthesises C++ test cases for a C++ library, defined by user-provided ingredients. We applied MF++ to 7 libraries in the domains of satisfiability modulo theories and Presburger arithmetic; our evaluation of MF++ identified 21 bugs in these tools. We additionally provide an automatic reducer for tests generated by MF++, named MF++R. In addition to minimising tests that expose issues, MF++R can be used to identify incorrect user-provided relations. We also investigate the combined use of MF++ and MF++R to augment the code coverage of library test suites, and assess the utility of this application by contributing 21 tests aimed at improving coverage across 3 libraries.
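    To show the shape of a relation-based library test like those MF++ synthesises (MF++ itself generates C++ against libraries such as SMT solvers; the Java collection below is only a stand-in), here a user-provided metamorphic relation is checked over a randomly mutated input, with no reference output required.

        import java.util.ArrayList;
        import java.util.Collections;
        import java.util.List;

        public class LibraryRelation {
            // User-provided relation: sorting must be insensitive to input order,
            // so the library must give equal results for a list and any permutation.
            static boolean relationHolds(List<Integer> input) {
                List<Integer> a = new ArrayList<>(input);
                List<Integer> b = new ArrayList<>(input);
                Collections.shuffle(b);      // fuzzing step: mutate the test input
                Collections.sort(a);
                Collections.sort(b);         // library call under test
                return a.equals(b);          // metamorphic check, no oracle needed
            }

            public static void main(String[] args) {
                List<Integer> input = List.of(5, 3, 9, 1, 3);
                System.out.println(relationHolds(input) ? "relation holds" : "BUG");
            }
        }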