502 research outputs found

    Badger: Complexity Analysis with Fuzzing and Symbolic Execution

    Full text link
    Hybrid testing approaches that involve fuzz testing and symbolic execution have shown promising results in achieving high code coverage, uncovering subtle errors and vulnerabilities in a variety of software applications. In this paper we describe Badger - a new hybrid approach for complexity analysis, with the goal of discovering vulnerabilities which occur when the worst-case time or space complexity of an application is significantly higher than the average case. Badger uses fuzz testing to generate a diverse set of inputs that aim to increase not only coverage but also a resource-related cost associated with each path. Since fuzzing may fail to execute deep program paths due to its limited knowledge about the conditions that influence these paths, we complement the analysis with a symbolic execution, which is also customized to search for paths that increase the resource-related cost. Symbolic execution is particularly good at generating inputs that satisfy various program conditions but by itself suffers from path explosion. Therefore, Badger uses fuzzing and symbolic execution in tandem, to leverage their benefits and overcome their weaknesses. We implemented our approach for the analysis of Java programs, based on Kelinci and Symbolic PathFinder. We evaluated Badger on Java applications, showing that our approach is significantly faster in generating worst-case executions compared to fuzzing or symbolic execution on their own

    A Survey of Symbolic Execution Techniques

    Get PDF
    Many security and software testing applications require checking whether certain properties of a program hold for any possible usage scenario. For instance, a tool for identifying software vulnerabilities may need to rule out the existence of any backdoor to bypass a program's authentication. One approach would be to test the program using different, possibly random inputs. As the backdoor may only be hit for very specific program workloads, automated exploration of the space of possible inputs is of the essence. Symbolic execution provides an elegant solution to the problem, by systematically exploring many possible execution paths at the same time without necessarily requiring concrete inputs. Rather than taking on fully specified input values, the technique abstractly represents them as symbols, resorting to constraint solvers to construct actual instances that would cause property violations. Symbolic execution has been incubated in dozens of tools developed over the last four decades, leading to major practical breakthroughs in a number of prominent software reliability applications. The goal of this survey is to provide an overview of the main ideas, challenges, and solutions developed in the area, distilling them for a broad audience. The present survey has been accepted for publication at ACM Computing Surveys. If you are considering citing this survey, we would appreciate if you could use the following BibTeX entry: http://goo.gl/Hf5FvcComment: This is the authors pre-print copy. If you are considering citing this survey, we would appreciate if you could use the following BibTeX entry: http://goo.gl/Hf5Fv

    Memoized Symbolic Execution

    Get PDF
    This paper introduces memoized symbolic execution (Memoise), a novel approach for more efficient application of forward symbolic execution, which is a well-studied technique for systematic exploration of program behaviors based on bounded execution paths. Our key insight is that application of symbolic execution often requires several successive runs of the technique on largely similar underlying problems, e.g., running it once to check a program to find a bug, fixing the bug, and running it again to check the modified program. Memoise introduces a trie-based data structure that stores the key elements of a run of symbolic execution. Maintenance of the trie during successive runs allows re-use of previously computed results of symbolic execution without the need for re-computing them as is traditionally done. Experiments using our prototype embodiment of Memoise show the benefits it holds in various standard scenarios of using symbolic execution, e.g., with iterative deepening of exploration depth, to perform regression analysis, or to enhance coverage

    Towards a Framework for Generating Tests to Satisfy Complex Code Coverage in Java Pathfinder

    Get PDF
    We present work on a prototype tool based on the JavaPathfinder (JPF) model checker for automatically generating tests satisfying the MC/DC code coverage criterion. Using the Eclipse IDE, developers and testers can quickly instrument Java source code with JPF annotations covering all MC/DC coverage obligations, and JPF can then be used to automatically generate tests that satisfy these obligations. The prototype extension to JPF enables various tasks useful in automatic test generation to be performed, such as test suite reduction and execution of generated tests

    Feedback-Driven Dynamic Invariant Discovery

    Get PDF
    Program invariants can help software developers identify program properties that must be preserved as the software evolves, however, formulating correct invariants can be challenging. In this work, we introduce iDiscovery, a technique which leverages symbolic execution to improve the quality of dynamically discovered invariants computed by Daikon. Candidate invariants generated by Daikon are synthesized into assertions and instrumented onto the program. The instrumented code is executed symbolically to generate new test cases that are fed back to Daikon to help further re ne the set of candidate invariants. This feedback loop is executed until a x-point is reached. To mitigate the cost of symbolic execution, we present optimizations to prune the symbolic state space and to reduce the complexity of the generated path conditions. We also leverage recent advances in constraint solution reuse techniques to avoid computing results for the same constraints across iterations. Experimental results show that iDiscovery converges to a set of higher quality invariants compared to the initial set of candidate invariants in a small number of iterations

    Enhancing dynamic symbolic execution via loop summarisation, segmented memory and pending constraints

    Get PDF
    Software has become ubiquitous and its impact is still increasing. The more software is created, the more bugs get introduced into it. With software’s increasing omnipresence, these bugs have a high probability of negative impact on everyday life. There are many efforts aimed at improving software correctness, among which symbolic execution, a program analysis technique that aims to systematically explore all program paths. In this thesis we present three techniques for enhancing symbolic execution. We first present a counterexample-guided inductive synthesis approach to summarise a class of loops, called memoryless loops using standard library functions. Our approach can summarize two thirds of memoryless loops we gathered on a set of open-source programs. These loop summaries can be used to: 1) enhance symbolic execution, 2) optimise native code and 3) refactor code. We then propose a technique that avoids expensive forking by using a segmented memory model. In this model, we split memory into segments using pointer alias analysis, so that each symbolic pointer refers to objects in a single segment. This results in a memory model where forking due to symbolic pointer dereferences is reduced. We evaluate our segmented memory model on benchmarks such as SQLite, m4 and make and observe significant decreases in execution time and memory usage. Finally, we present pending constraints, which can enhance scalability of symbolic execution by aggressively prioritising execution paths that are already known to be feasible either via cached solver solutions or seeds. The execution of other paths is deferred until no paths are known to be feasible without using the constraint solver. We evaluate our technique on nine applications, including SQLite3, make and tcpdump, and show it can achieve higher coverage for both seeded and non-seeded exploration.Open Acces

    Directed Incremental Symbolic Execution

    Get PDF
    The last few years have seen a resurgence of interest in the use of symbolic execution -- a program analysis technique developed more than three decades ago to analyze program execution paths. Scaling symbolic execution and other path-sensitive analysis techniques to large systems remains challenging despite recent algorithmic and technological advances. An alternative to solving the problem of scalability is to reduce the scope of the analysis. One approach that is widely studied in the context of regression analysis is to analyze the differences between two related program versions. While such an approach is intuitive in theory, finding efficient and precise ways to identify program differences, and characterize their effects on how the program executes has proved challenging in practice. In this paper, we present Directed Incremental Symbolic Execution (DiSE), a novel technique for detecting and characterizing the effects of program changes. The novelty of DiSE is to combine the efficiencies of static analysis techniques to compute program difference information with the precision of symbolic execution to explore program execution paths and generate path conditions affected by the differences. DiSE is a complementary technique to other reduction or bounding techniques developed to improve symbolic execution. Furthermore, DiSE does not require analysis results to be carried forward as the software evolves -- only the source code for two related program versions is required. A case-study of our implementation of DiSE illustrates its effectiveness at detecting and characterizing the effects of program changes

    Software testing or the bugs’ nightmare

    Get PDF
    Software development is not error-free. For decades, bugs –including physical ones– have become a significant development problem requiring major maintenance efforts. Even in some cases, solving bugs led to increment them. One of the main reasons for bug’s prominence is their ability to hide. Finding them is difficult and costly in terms of time and resources. However, software testing made significant progress identifying them by using different strategies that combine knowledge from every single part of the program. This paper humbly reviews some different approaches from software testing that discover bugs automatically and presents some different state-of-the-art methods and tools currently used in this area. It covers three testing strategies: search-based methods, symbolic execution, and fuzzers. It also provides some income about the application of diversity in these areas, and common and future challenges on automatic test generation that still need to be addressed

    Obtaining Real-World Benchmark Programs from Open-Source Repositories Through Abstract-Semantics Preserving Transformations

    Get PDF
    Benchmark programs are an integral part of program analysis research. Researchers use benchmark programs to evaluate existing techniques and test the feasibility of new approaches. The larger and more realistic the set of benchmarks, the more confident a researcher can be about the correctness and reproducibility of their results. However, obtaining an adequate set of benchmark programs has been a long-standing challenge in the program analysis community. In this thesis, we present the APT tool, a framework we designed and implemented to automate the generation of realistic benchmark programs suitable for program analysis evaluations. Our tool targets intra-procedural analyses that operate on an integer domain, specifically symbolic execution. The framework is composed of three main stages. In the first stage, the tool extracts potential benchmark programs from open-source repositories suitable for symbolic execution. In the second stage, the tool transforms the extracted programs into compilable, stand-alone benchmarks by removing external dependencies and nonlinear expressions. In the third stage, the benchmarks are verified and made available for the user. We have designed our transformation algorithms to remove program dependencies and nonlinear expressions while preserving their semantics-equivalence in the abstraction of symbolic analysis. That is, we want the information the analysis computes on the original program and its transformed version to be equivalent. Our work provides static analysis researchers with concise, compilable benchmark programs that are relevant to symbolic execution, allowing them to focus their efforts on advancing analysis techniques. Furthermore, our work benefits the software engineering community by enabling static analysis researchers to perform benchmarking with a large, realistic set of programs, thus strengthening the empirical evidence of the advancements in static program analysis
    • …
    corecore