240 research outputs found
IntRepair: Informed Repairing of Integer Overflows
Integer overflows have threatened software applications for decades. Thus, in
this paper, we propose a novel technique to provide automatic repairs of
integer overflows in C source code. Our technique, based on static symbolic
execution, fuses detection, repair generation and validation. This technique is
implemented in a prototype named IntRepair. We applied IntRepair to 2,052C
programs (approx. 1 million lines of code) contained in SAMATE's Juliet test
suite and 50 synthesized programs that range up to 20KLOC. Our experimental
results show that IntRepair is able to effectively detect integer overflows and
successfully repair them, while only increasing the source code (LOC) and
binary (Kb) size by around 1%, respectively. Further, we present the results of
a user study with 30 participants which shows that IntRepair repairs are more
than 10x efficient as compared to manually generated code repairsComment: Accepted for publication at the IEEE TSE journal. arXiv admin note:
text overlap with arXiv:1710.0372
Eunomia: Enabling User-specified Fine-Grained Search in Symbolically Executing WebAssembly Binaries
Although existing techniques have proposed automated approaches to alleviate
the path explosion problem of symbolic execution, users still need to optimize
symbolic execution by applying various searching strategies carefully. As
existing approaches mainly support only coarse-grained global searching
strategies, they cannot efficiently traverse through complex code structures.
In this paper, we propose Eunomia, a symbolic execution technique that allows
users to specify local domain knowledge to enable fine-grained search. In
Eunomia, we design an expressive DSL, Aes, that lets users precisely pinpoint
local searching strategies to different parts of the target program. To further
optimize local searching strategies, we design an interval-based algorithm that
automatically isolates the context of variables for different local searching
strategies, avoiding conflicts between local searching strategies for the same
variable. We implement Eunomia as a symbolic execution platform targeting
WebAssembly, which enables us to analyze applications written in various
languages (like C and Go) but can be compiled into WebAssembly. To the best of
our knowledge, Eunomia is the first symbolic execution engine that supports the
full features of the WebAssembly runtime. We evaluate Eunomia with a dedicated
microbenchmark suite for symbolic execution and six real-world applications.
Our evaluation shows that Eunomia accelerates bug detection in real-world
applications by up to three orders of magnitude. According to the results of a
comprehensive user study, users can significantly improve the efficiency and
effectiveness of symbolic execution by writing a simple and intuitive Aes
script. Besides verifying six known real-world bugs, Eunomia also detected two
new zero-day bugs in a popular open-source project, Collections-C.Comment: Accepted by ACM SIGSOFT International Symposium on Software Testing
and Analysis (ISSTA) 202
A Survey of Symbolic Execution Techniques
Many security and software testing applications require checking whether
certain properties of a program hold for any possible usage scenario. For
instance, a tool for identifying software vulnerabilities may need to rule out
the existence of any backdoor to bypass a program's authentication. One
approach would be to test the program using different, possibly random inputs.
As the backdoor may only be hit for very specific program workloads, automated
exploration of the space of possible inputs is of the essence. Symbolic
execution provides an elegant solution to the problem, by systematically
exploring many possible execution paths at the same time without necessarily
requiring concrete inputs. Rather than taking on fully specified input values,
the technique abstractly represents them as symbols, resorting to constraint
solvers to construct actual instances that would cause property violations.
Symbolic execution has been incubated in dozens of tools developed over the
last four decades, leading to major practical breakthroughs in a number of
prominent software reliability applications. The goal of this survey is to
provide an overview of the main ideas, challenges, and solutions developed in
the area, distilling them for a broad audience.
The present survey has been accepted for publication at ACM Computing
Surveys. If you are considering citing this survey, we would appreciate if you
could use the following BibTeX entry: http://goo.gl/Hf5FvcComment: This is the authors pre-print copy. If you are considering citing
this survey, we would appreciate if you could use the following BibTeX entry:
http://goo.gl/Hf5Fv
Bit-vector Support in Z3-str2 Solver and Automated Exploit Synthesis
Improper string manipulations are an important cause of software defects, which make them a target for program analysis by hackers and developers alike. Symbolic execution based program analysis techniques that systematically explore paths through string-intensive programs require reasoning about string and bit-vector constraints cohesively. The current state of the art symbolic execution engines for programs written in C/C++ languages track constraints on a bit-level and use bit-vector solver to reason about the collected path constraints. However, string functions incur high-performance penalties and lead to path explosion in the symbolic execution engine. The current state of the art string solvers are written primarily for the analysis of web applications with underlying support for the theory of strings and integers, which limits their use in the analysis of low-level programs. Therefore, we designed a decision procedure for the theory of strings and bit-vectors in Z3-str2, a decision procedure for strings and integers, to efficiently solve word equations and length functions over bit-vectors. The new theory combination has a significant role in the detection of integer overflows and memory corruption vulnerabilities associated with string operations. In addition, we introduced a new search space pruning technique for string lengths based on a binary search approach, which enabled our decision procedure to solve constraints involving large strings. We evaluated our decision procedure on a set of real security vulnerabilities collected from Common Vulnerabilities and Exposures (CVE) database and compared the result against the Z3-str2 string-integer solver. The experiments show that our decision procedure is orders of magnitude faster than Z3-str2 string-integer. The techniques we developed have the potential to dramatically improve the efficiency of symbolic execution of string-intensive programs.
In addition to designing and implementing a string bit-vector solver, we also addressed the problem of automated remote exploit construction. In this context, we introduce a practical approach for automating remote exploitation using information leakage vulnerability and show that current protection schemes against control-flow hijack attacks are not always very effective. To demonstrate the efficacy of our technique, we performed an over-the-network format string exploitation followed by a return-to-libc attack against a pre-forking concurrent server to gain remote access to a shell. Our attack managed to defeat various protections including ASLR, DEP, PIE, stack canary and RELRO
Benchmarking Symbolic Execution Using Constraint Problems -- Initial Results
Symbolic execution is a powerful technique for bug finding and program
testing. It is successful in finding bugs in real-world code. The core
reasoning techniques use constraint solving, path exploration, and search,
which are also the same techniques used in solving combinatorial problems,
e.g., finite-domain constraint satisfaction problems (CSPs). We propose CSP
instances as more challenging benchmarks to evaluate the effectiveness of the
core techniques in symbolic execution. We transform CSP benchmarks into C
programs suitable for testing the reasoning capabilities of symbolic execution
tools. From a single CSP P, we transform P depending on transformation choice
into different C programs. Preliminary testing with the KLEE, Tracer-X, and
LLBMC tools show substantial runtime differences from transformation and solver
choice. Our C benchmarks are effective in showing the limitations of existing
symbolic execution tools. The motivation for this work is we believe that
benchmarks of this form can spur the development and engineering of improved
core reasoning in symbolic execution engines
- …