287 research outputs found
Automata-based Model Counting String Constraint Solver for Vulnerability Analysis
Most common vulnerabilities in modern software applications are due to errors in string manipulation code. String constraint solvers are essential components of program analysis techniques for detecting and repairing vulnerabilities that are due to string manipulation errors. In this dissertation, we present an automata-based string constraint solver for vulnerability analysis of string manipulating programs. Given a string constraint, we generate an automaton that accepts all solutions that satisfy the constraint. Our string constraint solver can also map linear arithmetic constraints to automata in order to handle constraints on string lengths. By integrating our string constraint solver into a symbolic execution tool, we can check for string manipulation errors in programs. Recently, quantitative and probabilistic program analysis techniques have been proposed which require counting the number of solutions to string constraints. We extend our string constraint solver with model counting capability based on the observation that, using an automata-based constraint representation, model counting reduces to path counting, which can be solved precisely. Our approach is parameterized in the sense that we do not assume a finite domain size during automata construction, resulting in a potentially infinite set of solutions, and our model counting approach works for arbitrarily large bounds. We have implemented our approach in a tool called ABC (Automata-Based model Counter) using a constraint language that is compatible with the SMTLIB language specification used by satisfiability-modulo-theories solvers. This SMTLIB interface facilitates integration of our constraint solver with existing symbolic execution tools. We demonstrate the effectiveness of ABC on a large set of string constraints extracted from real-world web applications. We also present automata-based testing techniques for string manipulating programs.
A vulnerability signature is a characterization of all user inputs that can be used to exploit a vulnerability. Automata-based static string analysis techniques allow automated computation of vulnerability signatures represented as automata. Given a vulnerability signature represented as an automaton, we present algorithms for test case generation based on state, transition, and path coverage. These automatically generated test cases can be used to test applications that are not analyzable statically, and to discover attack strings that demonstrate how the vulnerabilities can be exploited. We experimentally compare different coverage criteria and demonstrate the effectiveness of our test generation approach.
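The observation that model counting reduces to path counting can be illustrated with a minimal Python sketch. It is not ABC's implementation: the function name `count_accepted`, the dictionary encoding of the automaton, and the example automaton `NO_BB` are all assumptions made for illustration. The sketch assumes a deterministic automaton, so each accepted string corresponds to exactly one path.

```python
from typing import Dict, Hashable, Set, Tuple

State = Hashable
Transitions = Dict[Tuple[State, str], State]


def count_accepted(transitions: Transitions, start: State,
                   accepting: Set[State], max_len: int) -> int:
    """Count the strings of length 0..max_len accepted by a DFA.

    Because the automaton is deterministic, the number of accepted
    strings of a given length equals the number of paths of that
    length from the start state to an accepting state.
    """
    # paths[q] = number of paths of the current length from start to q
    paths: Dict[State, int] = {start: 1}
    total = 0
    for _ in range(max_len + 1):
        total += sum(n for q, n in paths.items() if q in accepting)
        # Extend every counted path by one transition.
        longer: Dict[State, int] = {}
        for (src, _symbol), dst in transitions.items():
            if src in paths:
                longer[dst] = longer.get(dst, 0) + paths[src]
        paths = longer
    return total


# Hypothetical example: strings over {a, b} with no two consecutive b's.
# State 0: last symbol was not 'b'; state 1: last symbol was 'b'.
# The missing (1, 'b') transition leads to an implicit rejecting sink.
NO_BB: Transitions = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0}

print(count_accepted(NO_BB, 0, {0, 1}, 3))  # 1 + 2 + 3 + 5 = 11
```

The length bound is only a parameter of the counting step, which mirrors the abstract's point that the automaton itself is built without assuming a finite domain size.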
Search-driven string constraint solving for vulnerability detection
Constraint solving is an essential technique for detecting vulnerabilities in programs, since it can reason about input sanitization and validation operations performed on user inputs. However, real-world programs typically contain complex string operations that challenge vulnerability detection. State-of-the-art string constraint solvers support only a limited set of string operations and fail when they encounter an unsupported one; this leads to limited effectiveness in finding vulnerabilities.
In this paper we propose a search-driven constraint solving technique that complements the support for complex string operations provided by any existing string constraint solver. Our technique uses a hybrid constraint solving procedure based on the Ant Colony Optimization meta-heuristic. The idea is to execute it as a fallback mechanism, only when a solver encounters a constraint containing an operation that it does not support.
We have implemented the proposed search-driven constraint solving technique in the ACO-Solver tool, which we have evaluated in the context of injection and XSS vulnerability detection for Java Web applications. We have assessed the benefits and costs of combining the proposed technique with two state-of-the-art constraint solvers (Z3-str2 and CVC4). The experimental results, based on a benchmark with 104 constraints derived from nine realistic Web applications, show that our approach, when combined with a state-of-the-art solver, significantly improves the number of detected vulnerabilities (from 4.7% to 71.9% for Z3-str2, from 85.9% to 100.0% for CVC4), and solves several cases on which the solver fails when used stand-alone (46 more solved cases for Z3-str2, and 11 more for CVC4), while still keeping the execution time affordable in practice.
Software Side-Channel Analysis
Software side-channel attacks are able to recover confidential information by observing non-functional computation characteristics of program execution such as elapsed time, amount of allocated memory, or network packet size. The ability to automatically determine the amount of information that a malicious user can gain through side-channel observations allows one to quantitatively assess the security of an application. Since most software that accesses confidential information leaks some amount of information through side channels, it is important to quantify the amount of leakage in order to detect vulnerabilities. In addition, one can prove that a program is vulnerable to side-channel attacks by synthesizing attacks that recover confidential information. In this dissertation, I provide methods for (1) quantifying side-channel vulnerabilities and (2) synthesizing adaptive side-channel attack steps. My approaches advance the state-of-the-art in automatic software side-channel analysis which I summarize as follows. I make use of symbolic execution to extract program constraints that characterize the relationship between secret information, the inputs of a malicious user, and observable program behaviors. By applying model counting constraint solving to these constraints, I compute probabilistic relationships among secrets, attacker inputs, and attacker side-channel observations. These probabilities are used to quantify information leakage for a program by applying methods from the field of quantitative information flow. Moreover, by automatically generating a symbolic expression that quantifies information leakage, I am able to perform numeric maximization over attacker inputs to synthesize optimal attack steps. The sequence of attack steps serves as a proof of exploitability. 
I give two different automatic attack synthesis techniques: a fully static approach and an online dynamic approach that constructs an attack that takes into account system noise and is able to execute the attack through the network. I demonstrate the effectiveness of my approaches on a set of experimental benchmarks.
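The step from model counts to a leakage figure can be sketched in Python under simplifying assumptions that are not taken from the dissertation: the secret is uniformly distributed, the program is deterministic, the attacker input is fixed, and leakage is measured as Shannon mutual information (one of several measures used in quantitative information flow). The function name `shannon_leakage` and the example counts are hypothetical.

```python
import math
from typing import Sequence


def shannon_leakage(class_sizes: Sequence[int]) -> float:
    """Expected leakage in bits for a uniformly distributed secret.

    class_sizes[i] is the model count of the constraint describing
    observation i, i.e. the number of secrets that produce it.
    Leakage = H(secret) - H(secret | observation).
    """
    total = sum(class_sizes)
    prior_entropy = math.log2(total)
    # After seeing observation i the attacker still faces class_sizes[i]
    # equally likely secrets; observation i occurs with probability c/total.
    remaining_entropy = sum(
        (c / total) * math.log2(c) for c in class_sizes if c > 0
    )
    return prior_entropy - remaining_entropy


# Hypothetical counts: 8 possible secrets, two distinguishable timings,
# each consistent with 4 secrets. One observation reveals one bit.
print(shannon_leakage([4, 4]))  # 1.0
```

Under these assumptions an attack step that maximizes this quantity over attacker inputs is the one that splits the remaining secrets most evenly across observations.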
A Decision Procedure for Path Feasibility of String Manipulating Programs with Integer Data Type
Strings are widely used in programs, especially in web applications. Integer
data type occurs naturally in string-manipulating programs, and is frequently
used to refer to lengths of, or positions in, strings. Analysis and testing of
string-manipulating programs can be formulated as the path feasibility problem:
given a symbolic execution path, does there exist an assignment to the inputs
that yields a concrete execution that realizes this path? Such a problem can
naturally be reformulated as a string constraint solving problem. Although
state-of-the-art string constraint solvers usually provide support for both
string and integer data types, they mainly resort to heuristics without
completeness guarantees. In this paper, we propose a decision procedure for a
class of string-manipulating programs which includes not only a wide range of
string operations such as concatenation, replaceAll, reverse, and finite
transducers, but also those involving the integer data-type such as length,
indexof, and substring. To the best of our knowledge, this represents one of
the most expressive string constraint languages that is currently known to be
decidable. Our decision procedure is based on a variant of cost register
automata. We implement the decision procedure, giving rise to a new solver
OSTRICH+. We evaluate the performance of OSTRICH+ on a wide range of existing
and new benchmarks. The experimental results show that OSTRICH+ is the first
string decision procedure capable of tackling finite transducers and integer
constraints, whilst its overall performance is comparable with the
state-of-the-art string constraint solvers.
Bit-Vector Model Counting using Statistical Estimation
Approximate model counting for bit-vector SMT formulas (generalizing #SAT)
has many applications such as probabilistic inference and quantitative
information-flow security, but it is computationally difficult. Adding random
parity constraints (XOR streamlining) and then checking satisfiability is an
effective approximation technique, but it requires a prior hypothesis about the
model count to produce useful results. We propose an approach inspired by
statistical estimation to continually refine a probabilistic estimate of the
model count for a formula, so that each XOR-streamlined query yields as much
information as possible. We implement this approach, with an approximate
probability model, as a wrapper around an off-the-shelf SMT solver or SAT
solver. Experimental results show that the implementation is faster than the
most similar previous approaches which used simpler refinement strategies. The
technique also lets us model count formulas over floating-point constraints,
which we demonstrate with an application to a vulnerability in differential
privacy mechanisms.
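The effect of random parity constraints can be illustrated with a brute-force Python sketch. It is not the paper's method: it enumerates all values instead of calling an SMT or SAT solver, it counts survivors instead of only checking satisfiability, and it does not perform the statistical refinement described above. The names `xor_streamlined_count` and `estimate_count` and the example predicate are assumptions.

```python
import random
from typing import Callable


def parity(x: int) -> int:
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def xor_streamlined_count(predicate: Callable[[int], bool], width: int,
                          num_xors: int, rng: random.Random) -> int:
    """Count models of predicate over width-bit values that also satisfy
    num_xors random parity constraints of the form parity(x & mask) == bit."""
    constraints = [(rng.getrandbits(width), rng.getrandbits(1))
                   for _ in range(num_xors)]
    return sum(
        1 for x in range(1 << width)
        if predicate(x)
        and all(parity(x & mask) == bit for mask, bit in constraints)
    )


def estimate_count(predicate: Callable[[int], bool], width: int,
                   num_xors: int, trials: int, seed: int = 0) -> float:
    """Average the surviving model count over several random constraint
    sets and scale by 2**num_xors. Each constraint keeps any fixed value
    with probability 1/2, so the scaled average is an unbiased estimate."""
    rng = random.Random(seed)
    survivors = [xor_streamlined_count(predicate, width, num_xors, rng)
                 for _ in range(trials)]
    return (2 ** num_xors) * sum(survivors) / trials


def is_multiple_of_three(x: int) -> bool:
    return x % 3 == 0


# Hypothetical formula: 8-bit values divisible by 3 (exact count is 86).
print(estimate_count(is_multiple_of_three, width=8, num_xors=3, trials=50))
```

The sketch also shows why a prior hypothesis about the count matters: with too few parity constraints the formula almost always stays satisfiable, and with too many it almost never does, so a satisfiability-only query is informative mainly when `2**num_xors` is near the true count.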
What is decidable about string constraints with the ReplaceAll function
The theory of strings with concatenation has been widely argued as the basis of constraint solving for verifying string-manipulating programs. However, this theory is far from adequate for expressing many string constraints that are also needed in practice; for example, the use of regular constraints (pattern matching against a regular expression), and the string-replace function (replacing either the first occurrence or all occurrences of a "pattern" string constant/variable/regular expression by a "replacement" string constant/variable), among many others. Both regular constraints and the string-replace function are crucial for such applications as analysis of JavaScript (or more generally HTML5 applications) against cross-site scripting (XSS) vulnerabilities, which motivates us to consider a richer class of string constraints. The importance of the string-replace function (especially the replace-all facility) is increasingly recognised, which can be witnessed by the incorporation of the function in the input languages of several string constraint solvers.
Recently, it was shown that any theory of strings containing the string-replace function (even the most restricted version where pattern/replacement strings are both constant strings) becomes undecidable if we do not impose some kind of straight-line (aka acyclicity) restriction on the formulas. Despite this, the straight-line restriction is still practically sensible since this condition is typically met by string constraints that are generated by symbolic execution. In this paper, we provide the first systematic study of straight-line string constraints with the string-replace function and the regular constraints as the basic operations. We show that a large class of such constraints (i.e. when only a constant string or a regular expression is permitted in the pattern) is decidable. We note that the string-replace function, even under this restriction, is sufficiently powerful for expressing the concatenation operator and much more (e.g. extensions of regular expressions with string variables). This gives us the most expressive decidable logic containing concatenation, replace, and regular constraints under the same umbrella. Our decision procedure for the straight-line fragment follows an automata-theoretic approach, and is modular in the sense that the string-replace terms are removed one by one to generate more and more regular constraints, which can then be discharged by the state-of-the-art string constraint solvers. We also show that this fragment is, in a way, a maximal decidable subclass of the straight-line fragment with string-replace and regular constraints. To this end, we show undecidability results for the following two extensions: (1) variables are permitted in the pattern parameter of the replace function, (2) length constraints are permitted.