Search-driven string constraint solving for vulnerability detection
Constraint solving is an essential technique for detecting vulnerabilities in programs, since it can reason about input sanitization and validation operations performed on user inputs. However, real-world programs typically contain complex string operations that challenge vulnerability detection. State-of-the-art string constraint solvers support only a limited set of string operations and fail when they encounter an unsupported one; this leads to limited effectiveness in finding vulnerabilities.
In this paper we propose a search-driven constraint solving technique that complements the support for complex string operations provided by any existing string constraint solver. Our technique uses a hybrid constraint solving procedure based on the Ant Colony Optimization meta-heuristic. The idea is to execute it as a fallback mechanism, only when a solver encounters a constraint containing an operation that it does not support.
We have implemented the proposed search-driven constraint solving technique in the ACO-Solver tool, which we have evaluated in the context of injection and XSS vulnerability detection for Java Web applications. We have assessed the benefits and costs of combining the proposed technique with two state-of-the-art constraint solvers (Z3-str2 and CVC4). The experimental results, based on a benchmark with 104 constraints derived from nine realistic Web applications, show that our approach, when combined with a state-of-the-art solver, significantly improves the number of detected vulnerabilities (from 4.7% to 71.9% for Z3-str2, from 85.9% to 100.0% for CVC4) and solves several cases on which the solver fails when used stand-alone (46 more solved cases for Z3-str2, and 11 more for CVC4), while still keeping the execution time affordable in practice.
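The fallback idea in the abstract above can be illustrated with a minimal sketch. It is not ACO-Solver's implementation: a plain hill-climbing search stands in for the Ant Colony Optimization procedure, and `UnsupportedOperation`, `primary_solver`, `fitness`, `hill_climb`, and `solve_with_fallback` are hypothetical names, as is the example constraint (reversing the input and upper-casing it must yield "DROP").

```python
import random
import string


class UnsupportedOperation(Exception):
    """Hypothetical signal that the primary solver cannot handle an operation."""


def primary_solver(constraint_name):
    # Stand-in for a real solver: assume it rejects the operation outright.
    raise UnsupportedOperation(constraint_name)


def fitness(candidate, target="DROP"):
    # Distance to satisfying candidate[::-1].upper() == target; 0 means solved.
    transformed = candidate[::-1].upper()
    return sum(a != b for a, b in zip(transformed, target))


def hill_climb(length, rng, max_steps=20000):
    # Mutate one character at a time and keep any change that does not
    # increase the distance. This is a simple substitute for ACO.
    alphabet = string.ascii_lowercase
    candidate = [rng.choice(alphabet) for _ in range(length)]
    best = fitness("".join(candidate))
    for _ in range(max_steps):
        if best == 0:
            break
        pos = rng.randrange(length)
        old = candidate[pos]
        candidate[pos] = rng.choice(alphabet)
        score = fitness("".join(candidate))
        if score <= best:
            best = score
        else:
            candidate[pos] = old  # revert a mutation that made things worse
    return "".join(candidate) if best == 0 else None


def solve_with_fallback(constraint_name, length, seed=0):
    # Run the search only when the primary solver gives up.
    try:
        return primary_solver(constraint_name)
    except UnsupportedOperation:
        return hill_climb(length, random.Random(seed))


solution = solve_with_fallback("reverse-then-uppercase", length=4)
```

The search is only invoked on the exception path, so constraints the primary solver already handles would incur no extra cost in this sketch.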
Bit-vector Support in Z3-str2 Solver and Automated Exploit Synthesis
Improper string manipulations are an important cause of software defects, which makes them a target for program analysis by hackers and developers alike. Symbolic-execution-based program analysis techniques that systematically explore paths through string-intensive programs require reasoning about string and bit-vector constraints cohesively. The current state-of-the-art symbolic execution engines for programs written in C/C++ track constraints at the bit level and use a bit-vector solver to reason about the collected path constraints. However, string functions incur high performance penalties and lead to path explosion in the symbolic execution engine. The current state-of-the-art string solvers are written primarily for the analysis of web applications, with underlying support for the theory of strings and integers, which limits their use in the analysis of low-level programs. Therefore, we designed a decision procedure for the theory of strings and bit-vectors in Z3-str2, a decision procedure for strings and integers, to efficiently solve word equations and length functions over bit-vectors. The new theory combination has a significant role in the detection of integer overflows and memory corruption vulnerabilities associated with string operations. In addition, we introduced a new search space pruning technique for string lengths based on a binary search approach, which enabled our decision procedure to solve constraints involving large strings. We evaluated our decision procedure on a set of real security vulnerabilities collected from the Common Vulnerabilities and Exposures (CVE) database and compared the results against the Z3-str2 string-integer solver. The experiments show that our decision procedure is orders of magnitude faster than the Z3-str2 string-integer solver. The techniques we developed have the potential to dramatically improve the efficiency of symbolic execution of string-intensive programs.
In addition to designing and implementing a string bit-vector solver, we also addressed the problem of automated remote exploit construction. In this context, we introduce a practical approach for automating remote exploitation using an information leakage vulnerability and show that current protection schemes against control-flow hijack attacks are not always effective. To demonstrate the efficacy of our technique, we performed an over-the-network format string exploitation followed by a return-to-libc attack against a pre-forking concurrent server to gain remote access to a shell. Our attack managed to defeat various protections including ASLR, DEP, PIE, stack canary and RELRO.
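The abstract above does not spell out how its binary-search pruning of string lengths works, so the following is only a sketch of the general idea under stated assumptions. It assumes a hypothetical oracle `is_sat_with_bound(b)` that answers whether the constraints have a solution with string length at most `b`; that question is monotone in `b`, which is what makes binary search applicable. The names and the 16-bit overflow example are illustrative and not taken from the work.

```python
def smallest_feasible_length(is_sat_with_bound, max_len):
    """Return the smallest bound b in [0, max_len] for which the oracle
    reports satisfiable, or None if even max_len is infeasible."""
    if not is_sat_with_bound(max_len):
        return None
    lo, hi = 0, max_len
    while lo < hi:
        mid = (lo + hi) // 2
        if is_sat_with_bound(mid):
            hi = mid        # a solution exists at or below mid
        else:
            lo = mid + 1    # every solution is longer than mid
    return lo


queries = []


def toy_oracle(bound):
    # Hypothetical constraint: the string must be long enough that its
    # length no longer fits in 16 bits, i.e. len(s) >= 65536.
    queries.append(bound)
    return bound >= 65536


shortest = smallest_feasible_length(toy_oracle, max_len=2**20)
```

A linear scan over candidate lengths would need tens of thousands of oracle calls here, whereas the binary search needs only a couple of dozen.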
Dynamic Protocol Reverse Engineering a Grammatical Inference Approach
Round-trip engineering of software from source code and reverse engineering of software from binary files have both been extensively studied, and the state of practice has documented tools and techniques. Forward engineering of protocols has also been extensively studied, and there are firmly established techniques for generating correct protocols. While observation of protocol behavior for performance testing has been studied and techniques established, reverse engineering of protocol control flow from observations of protocol behavior has not received the same level of attention. The state of practice in reverse engineering the control flow of computer network protocols consists mostly of ad hoc approaches. We examine state-of-practice tools and techniques used in three open source projects: Pidgin, Samba, and rdesktop. We examine techniques proposed by computational learning researchers for grammatical inference. We propose to extend the state of the art by inferring protocol control flow using grammatical-inference-inspired techniques to reverse engineer automata representations from captured data flows. We present evidence that grammatical inference is applicable to the problem domain under consideration.
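As a rough illustration of inferring an automaton from captured flows, the sketch below builds a prefix tree acceptor, a common starting point in grammatical inference before any state merging is applied. It is not the method of the work above. The message-type names in `captured` are invented, and `build_pta` and `accepts` are hypothetical helpers.

```python
def build_pta(traces):
    """Build a prefix tree acceptor: one state per distinct observed prefix."""
    transitions = {0: {}}   # state -> {symbol: next_state}
    accepting = set()
    for trace in traces:
        state = 0
        for symbol in trace:
            nxt = transitions[state].get(symbol)
            if nxt is None:
                nxt = len(transitions)      # allocate a fresh state id
                transitions[nxt] = {}
                transitions[state][symbol] = nxt
            state = nxt
        accepting.add(state)                # end of an observed session
    return transitions, accepting


def accepts(pta, trace):
    """Check whether the automaton accepts a sequence of message types."""
    transitions, accepting = pta
    state = 0
    for symbol in trace:
        if symbol not in transitions[state]:
            return False
        state = transitions[state][symbol]
    return state in accepting


# Invented message-type sequences standing in for captured sessions.
captured = [
    ["HELLO", "AUTH", "DATA", "BYE"],
    ["HELLO", "AUTH", "BYE"],
]
pta = build_pta(captured)
```

The acceptor recognizes exactly the observed sessions. Generalizing beyond them would require a state-merging step, which is omitted here.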
Saggitarius: A DSL for Specifying Grammatical Domains
Common data types like dates, addresses, phone numbers and tables can have
multiple textual representations, and many heavily-used languages, such as SQL,
come in several dialects. These variations can cause data to be misinterpreted,
leading to silent data corruption, failure of data processing systems, or even
security vulnerabilities. Saggitarius is a new language and system designed to
help programmers reason about the format of data, by describing grammatical
domains -- that is, sets of context-free grammars that describe the many
possible representations of a datatype. We describe the design of Saggitarius
via example and provide a relational semantics. We show how Saggitarius may be
used to analyze a data set: given example data, it uses an algorithm based on
semi-ring parsing and MaxSAT to infer which grammar in a given domain best
matches that data. We evaluate the effectiveness of the algorithm on a
benchmark suite of 110 example problems, and we demonstrate that our system
typically returns a satisfying grammar within a few seconds with only a small
number of examples. We also delve deeper into a more extensive case study on
using Saggitarius for CSV dialect detection. Despite being general-purpose, we
find that Saggitarius offers comparable results to hand-tuned, specialized
tools; in the case of CSV, it infers grammars for 84% of benchmarks within 60
seconds, and has comparable accuracy to custom-built dialect detection tools. Comment: OOPSLA 202
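To give a feel for the CSV dialect detection task mentioned above, here is a deliberately simple sketch. It does not use semi-ring parsing or MaxSAT and is not how Saggitarius works: it merely picks, from a fixed set of candidate delimiters, the one under which every row has the same number of columns. `score_dialect`, `infer_delimiter`, and the sample data are hypothetical.

```python
import csv
import io


def score_dialect(text, delimiter):
    """Return the column count if every non-empty row agrees on it, else 0."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter=delimiter) if r]
    widths = {len(r) for r in rows}
    if len(widths) != 1:
        return 0            # no rows, or rows disagree on width
    return widths.pop()


def infer_delimiter(text, candidates=",;\t|"):
    """Pick the candidate delimiter with the highest consistent column count."""
    best = max(candidates, key=lambda d: score_dialect(text, d))
    # A single column means the delimiter never split anything.
    return best if score_dialect(text, best) > 1 else None


# Semicolon-separated data whose values contain decimal commas.
sample = "id;amount\n1;3,50\n2;4,25\n"
delimiter = infer_delimiter(sample)
```

In the sample, splitting on commas yields rows of different widths, so that candidate scores zero and the semicolon is chosen.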
Computational aerodynamics and artificial intelligence
The general principles of artificial intelligence are reviewed, and speculations are made concerning how knowledge-based systems can accelerate the process of acquiring new knowledge in aerodynamics, how computational fluid dynamics may use expert systems, and how expert systems may speed the design and development process. In addition, the anatomy of an idealized expert system called AERODYNAMICIST is discussed. Resource requirements for using artificial intelligence in computational fluid dynamics and aerodynamics are examined. Three main conclusions are presented. First, there are two related aspects of computational aerodynamics: reasoning and calculating. Second, a substantial portion of reasoning can be achieved with artificial intelligence, which offers the opportunity to use computers as reasoning machines that set the stage for efficient calculating. Third, expert systems are likely to be new assets of institutions involved in aeronautics for various tasks of computational aerodynamics.