20 research outputs found

    A Survey of Symbolic Execution Techniques

    Get PDF
    Many security and software testing applications require checking whether certain properties of a program hold for any possible usage scenario. For instance, a tool for identifying software vulnerabilities may need to rule out the existence of any backdoor to bypass a program's authentication. One approach would be to test the program using different, possibly random inputs. As the backdoor may only be hit for very specific program workloads, automated exploration of the space of possible inputs is of the essence. Symbolic execution provides an elegant solution to the problem, by systematically exploring many possible execution paths at the same time without necessarily requiring concrete inputs. Rather than taking on fully specified input values, the technique abstractly represents them as symbols, resorting to constraint solvers to construct actual instances that would cause property violations. Symbolic execution has been incubated in dozens of tools developed over the last four decades, leading to major practical breakthroughs in a number of prominent software reliability applications. The goal of this survey is to provide an overview of the main ideas, challenges, and solutions developed in the area, distilling them for a broad audience. The present survey has been accepted for publication at ACM Computing Surveys. If you are considering citing this survey, we would appreciate if you could use the following BibTeX entry: http://goo.gl/Hf5FvcComment: This is the authors pre-print copy. If you are considering citing this survey, we would appreciate if you could use the following BibTeX entry: http://goo.gl/Hf5Fv

    Enhancing dynamic symbolic execution via loop summarisation, segmented memory and pending constraints

    Get PDF
    Software has become ubiquitous and its impact is still increasing. The more software is created, the more bugs get introduced into it. With software’s increasing omnipresence, these bugs have a high probability of negative impact on everyday life. There are many efforts aimed at improving software correctness, among which symbolic execution, a program analysis technique that aims to systematically explore all program paths. In this thesis we present three techniques for enhancing symbolic execution. We first present a counterexample-guided inductive synthesis approach to summarise a class of loops, called memoryless loops using standard library functions. Our approach can summarize two thirds of memoryless loops we gathered on a set of open-source programs. These loop summaries can be used to: 1) enhance symbolic execution, 2) optimise native code and 3) refactor code. We then propose a technique that avoids expensive forking by using a segmented memory model. In this model, we split memory into segments using pointer alias analysis, so that each symbolic pointer refers to objects in a single segment. This results in a memory model where forking due to symbolic pointer dereferences is reduced. We evaluate our segmented memory model on benchmarks such as SQLite, m4 and make and observe significant decreases in execution time and memory usage. Finally, we present pending constraints, which can enhance scalability of symbolic execution by aggressively prioritising execution paths that are already known to be feasible either via cached solver solutions or seeds. The execution of other paths is deferred until no paths are known to be feasible without using the constraint solver. We evaluate our technique on nine applications, including SQLite3, make and tcpdump, and show it can achieve higher coverage for both seeded and non-seeded exploration.Open Acces

    Improving Scalability of Symbolic Execution for Software with Complex Environment Interfaces

    Get PDF
    Manual software testing is laborious and prone to human error. Yet, among practitioners, it is the most popular method for quality assurance. Automating the test case generation promises better effectiveness, especially for exposing corner-case bugs. Symbolic execution stands out as an automated testing technique that has no false positives, it eventually enumerates all feasible program executions, and can prioritize executions of interest. However, path explosionâthe fact that the number of program executions is typically at least exponential in the size of the programâhinders the applicability of symbolic execution in the real world, where software commonly reaches millions of lines of code. In practice, large systems can be efficiently executed symbolically by exploiting their modularity and thus symbolically execute the different parts of the system separately. However, a component typically depends on its environment to perform its task. Thus, a symbolic execution engine needs to provide an environment interface that is efficient, while maintaining accuracy and completeness. This conundrum is known as the environment problem. Systematically addressing the environment problem is challenging, as its instantiation depends on the nature of the environment and its interface. This thesis addresses two instances of the environment problem in symbolic execution, which are at opposite ends of the spectrum of interface stability: (1) system software interacting with an operating system with stable and well-documented semantics (e.g., POSIX), and (2) high-level programs written in dynamic languages, such as Python, Ruby, or JavaScript, whose semantics and interfaces are continuously evolving. To address the environment problem for stable operating system interfaces, this thesis introduces the idea of splitting an operating system model into a core set of primitives built into the engine at host level and, on top of it, the full operating system interface emulated inside the guest. As few as two primitives are sufficient to support a complex interface such as POSIX: threads with synchronization and address spaces with shared memory. We prototyped this idea in the Cloud9 symbolic execution platform. Cloud9's accurate and efficient POSIX model exposes hard-to-reproduce bugs in systems such as UNIX utilities, web servers, and distributed systems. Cloud9 is available at http://cloud9.epfl.ch. For programs written in high-level interpreted languages, this thesis introduces the idea of using the language interpreter as an "executable language specification". The interpreter runs inside a low-level (e.g., x86) symbolic execution engine while it executes the target program. The aggregate system acts as a high-level symbolic execution engine for the program. To manage the complexity of symbolically executing the entire interpreter, this thesis introduces Class-Uniform Path Analysis (CUPA), an algorithm for prioritizing paths that groups paths into equivalence classes according to a coverage goal. We built a prototype of these ideas in the form of Chef, a symbolic execution platform for interpreted languages that generates up to 1000 times more tests in popular Python and Lua packages compared to a plain execution of the interpreters. Chef is available at http://dslab.epfl.ch/proj/chef/

    Fundamental Approaches to Software Engineering

    Get PDF
    This open access book constitutes the proceedings of the 23rd International Conference on Fundamental Approaches to Software Engineering, FASE 2020, which took place in Dublin, Ireland, in April 2020, and was held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The 23 full papers, 1 tool paper and 6 testing competition papers presented in this volume were carefully reviewed and selected from 81 submissions. The papers cover topics such as requirements engineering, software architectures, specification, software quality, validation, verification of functional and non-functional properties, model-driven development and model transformation, software processes, security and software evolution

    Proceedings of the First NASA Formal Methods Symposium

    Get PDF
    Topics covered include: Model Checking - My 27-Year Quest to Overcome the State Explosion Problem; Applying Formal Methods to NASA Projects: Transition from Research to Practice; TLA+: Whence, Wherefore, and Whither; Formal Methods Applications in Air Transportation; Theorem Proving in Intel Hardware Design; Building a Formal Model of a Human-Interactive System: Insights into the Integration of Formal Methods and Human Factors Engineering; Model Checking for Autonomic Systems Specified with ASSL; A Game-Theoretic Approach to Branching Time Abstract-Check-Refine Process; Software Model Checking Without Source Code; Generalized Abstract Symbolic Summaries; A Comparative Study of Randomized Constraint Solvers for Random-Symbolic Testing; Component-Oriented Behavior Extraction for Autonomic System Design; Automated Verification of Design Patterns with LePUS3; A Module Language for Typing by Contracts; From Goal-Oriented Requirements to Event-B Specifications; Introduction of Virtualization Technology to Multi-Process Model Checking; Comparing Techniques for Certified Static Analysis; Towards a Framework for Generating Tests to Satisfy Complex Code Coverage in Java Pathfinder; jFuzz: A Concolic Whitebox Fuzzer for Java; Machine-Checkable Timed CSP; Stochastic Formal Correctness of Numerical Algorithms; Deductive Verification of Cryptographic Software; Coloured Petri Net Refinement Specification and Correctness Proof with Coq; Modeling Guidelines for Code Generation in the Railway Signaling Context; Tactical Synthesis Of Efficient Global Search Algorithms; Towards Co-Engineering Communicating Autonomous Cyber-Physical Systems; and Formal Methods for Automated Diagnosis of Autosub 6000

    Improving systems software security through program analysis and instrumentation

    Get PDF
    Security and reliability bugs are prevalent in systems software. Systems code is often written in low-level languages like C/C++, which offer many benefits but also delegate memory management and type safety to programmers. This invites bugs that cause crashes or can be exploited by attackers to take control of the program. This thesis presents techniques to detect and fix security and reliability issues in systems software without burdening the software developers. First, we present code-pointer integrity (CPI), a technique that combines static analysis with compile-time instrumentation to guarantee the integrity of all code pointers in a program and thereby prevent all control-flow hijack attacks. We also present code-pointer separation (CPS), a relaxation of CPI with better performance properties. CPI and CPS offer substantially better security-to-overhead ratios than the state of the art in control flow hijack defense mechanisms, they are practical (we protect a complete FreeBSD system and over 100 packages like apache and postgresql), effective (prevent all attacks in the RIPE benchmark), and efficient: on SPEC CPU2006, CPS averages 1.2% overhead for C and 1.9% for C/C++, while CPIâs overhead is 2.9% for C and 8.4% for C/C++. Second, we present DDT, a tool for testing closed-source device drivers to automatically find bugs like memory errors or race conditions. DDT showcases a combination of a form of program analysis called selective symbolic execution with virtualization to thoroughly exercise tested drivers and produce detailed, executable traces for every path that leads to a failure. We applied DDT to several closed-source Microsoft-certified Windows device drivers and discovered 14 serious new bugs that can cause crashes or compromise security of the entire system. Third, we present a technique for increasing the scalability of symbolic execution by merging states obtained on different execution paths. State merging reduces the number of states to analyze, but the merged states can be more complex and harder to analyze than their individual components. We introduce query count estimation, a technique to reason about the analysis time of merged states and decide which states to merge in order to achieve optimal net performance of symbolic execution. We also introduce dynamic state merging, a technique for merging states that interacts favorably with search strategies employed by practical bug finding tools, such as DDT and KLEE. Experiments on the 96 GNU Coreutils show that our approach consistently achieves several orders of magnitude speedup over previously published results

    Computer Aided Verification

    Get PDF
    This open access two-volume set LNCS 13371 and 13372 constitutes the refereed proceedings of the 34rd International Conference on Computer Aided Verification, CAV 2022, which was held in Haifa, Israel, in August 2022. The 40 full papers presented together with 9 tool papers and 2 case studies were carefully reviewed and selected from 209 submissions. The papers were organized in the following topical sections: Part I: Invited papers; formal methods for probabilistic programs; formal methods for neural networks; software Verification and model checking; hyperproperties and security; formal methods for hardware, cyber-physical, and hybrid systems. Part II: Probabilistic techniques; automata and logic; deductive verification and decision procedures; machine learning; synthesis and concurrency. This is an open access book

    Computer Aided Verification

    Get PDF
    The open access two-volume set LNCS 11561 and 11562 constitutes the refereed proceedings of the 31st International Conference on Computer Aided Verification, CAV 2019, held in New York City, USA, in July 2019. The 52 full papers presented together with 13 tool papers and 2 case studies, were carefully reviewed and selected from 258 submissions. The papers were organized in the following topical sections: Part I: automata and timed systems; security and hyperproperties; synthesis; model checking; cyber-physical systems and machine learning; probabilistic systems, runtime techniques; dynamical, hybrid, and reactive systems; Part II: logics, decision procedures; and solvers; numerical programs; verification; distributed systems and networks; verification and invariants; and concurrency
    corecore