307 research outputs found
Interprocedural Data Flow Analysis in Soot using Value Contexts
An interprocedural analysis is precise if it is flow sensitive and fully
context-sensitive even in the presence of recursion. Many methods of
interprocedural analysis sacrifice precision for scalability while some are
precise but limited to only a certain class of problems.
Soot currently supports interprocedural analysis of Java programs using graph
reachability. However, this approach is restricted to IFDS/IDE problems, and is
not suitable for general data flow frameworks such as heap reference analysis
and points-to analysis which have non-distributive flow functions.
We describe a general-purpose interprocedural analysis framework for Soot
using data flow values for context-sensitivity. This framework is not
restricted to problems with distributive flow functions, although the lattice
must be finite. It combines the key ideas of the tabulation method of the
functional approach and the technique of value-based termination of call string
construction.
The efficiency and precision of interprocedural analyses is heavily affected
by the precision of the underlying call graph. This is especially important for
object-oriented languages like Java where virtual method invocations cause an
explosion of spurious call edges if the call graph is constructed naively. We
have instantiated our framework with a flow and context-sensitive points-to
analysis in Soot, which enables the construction of call graphs that are far
more precise than those constructed by Soot's SPARK engine.Comment: SOAP 2013 Final Versio
Parameterized Algorithms for Scalable Interprocedural Data-flow Analysis
Data-flow analysis is a general technique used to compute information of
interest at different points of a program and is considered to be a cornerstone
of static analysis. In this thesis, we consider interprocedural data-flow
analysis as formalized by the standard IFDS framework, which can express many
widely-used static analyses such as reaching definitions, live variables, and
null-pointer. We focus on the well-studied on-demand setting in which queries
arrive one-by-one in a stream and each query should be answered as fast as
possible. While the classical IFDS algorithm provides a polynomial-time
solution to this problem, it is not scalable in practice. Specifically, it
either requires a quadratic-time preprocessing phase or takes linear time per
query, both of which are untenable for modern huge codebases with hundreds of
thousands of lines. Previous works have already shown that parameterizing the
problem by the treewidth of the program's control-flow graph is promising and
can lead to significant gains in efficiency. Unfortunately, these results were
only applicable to the limited special case of same-context queries.
In this work, we obtain significant speedups for the general case of
on-demand IFDS with queries that are not necessarily same-context. This is
achieved by exploiting a new graph sparsity parameter, namely the treedepth of
the program's call graph. Our approach is the first to exploit the sparsity of
control-flow graphs and call graphs at the same time and parameterize by both
treewidth and treedepth. We obtain an algorithm with a linear preprocessing
phase that can answer each query in constant time with respect to the input
size. Finally, we show experimental results demonstrating that our approach
significantly outperforms the classical IFDS and its on-demand variant
Precise Null Pointer Analysis Through Global Value Numbering
Precise analysis of pointer information plays an important role in many
static analysis techniques and tools today. The precision, however, must be
balanced against the scalability of the analysis. This paper focusses on
improving the precision of standard context and flow insensitive alias analysis
algorithms at a low scalability cost. In particular, we present a
semantics-preserving program transformation that drastically improves the
precision of existing analyses when deciding if a pointer can alias NULL. Our
program transformation is based on Global Value Numbering, a scheme inspired
from compiler optimizations literature. It allows even a flow-insensitive
analysis to make use of branch conditions such as checking if a pointer is NULL
and gain precision. We perform experiments on real-world code to measure the
overhead in performing the transformation and the improvement in the precision
of the analysis. We show that the precision improves from 86.56% to 98.05%,
while the overhead is insignificant.Comment: 17 pages, 1 section in Appendi
Static Analysis for Discovering Security Vulnerabilities in Web Applications on the Asp.Net Platform
Tato bakalářská práce popisuje jak teoretické základy, tak způsob vytvoření statického analyzátoru založeném na platformě .NET Framework a službách poskytnutých prostřednictvím .NET Compiler Platform. Tento analyzátor detekuje bezpečnostní slabiny typu SQL injection na platformě ASP.NET MVC. Analyzátor nejdříve sestrojuje grafy řízení toku jako abstraktní reprezentaci analyzovaného programu. Poté využívá statické analýzy pro sledování potenciálně nedůvěryhodných dat. Nakonec jsou výsledky analýzy prezentovány uživateli.This Bachelor thesis is intended to describe theoretical foundations as well as the construction of a static taint analyser based on the .NET Framework and the analysis services provided by the .NET Compiler Platform. This analyser detects SQL injection security vulnerabilities on the ASP.NET MVC platform. Firstly, the analyser constructs control flow graphs as an abstract representation of the analysed program. Then, it uses a static taint analysis to track potentially distrusted and tainted data values. Finally, analysis results are presented to the user.
Incremental Analysis of Programs
Algorithms used to determine the control and data flow properties of computer programs are generally designed for one-time analysis of an entire new input. Application of such algorithms when the input is only slightly modified results in an inefficient system. In this theses a set of incremental update algorithms are presented for data flow analysis. These algorithms update the solution from a previous analysis to reflect changes in the program. Thus, extensive reanalysis to reflect changes in the program. Thus, extensive reanalysis of programs after each program modification can be avoided. The incremental update algorithms presented for global flow analysis are based on Hecht/Ullman iterative algorithms. Banning\u27s interprocedural data flow analysis algorithms form the basis for the incremental interprocedural algorithms
Systematic Approaches to Advanced Information Flow Analysis – and Applications to Software Security
In dieser Arbeit berichte ich über Anwendungen von Slicing und Programmabhängigkeitsgraphen (PAG) in der Softwaresicherheit. Außerdem schlage ich ein Analyse-Rahmenwerk vor, welches Datenflussanalyse auf Kontrollflussgraphen und Slicing auf Programmabhängigkeitsgraphen verallgemeinert. Mit einem solchen Rahmenwerk lassen sich neue PAG-basierte Analysen systematisch ableiten, die über Slicing hinausgehen.
Die Hauptthesen meiner Arbeit lauten wie folgt:
(1) PAG-basierte Informationsflusskontrolle ist nützlich, praktisch anwendbar und relevant.
(2) Datenflussanalyse kann systematisch auf Programmabhängigkeitsgraphen angewendet werden.
(3) Datenflussanalyse auf Programmabhängigkeitsgraphen ist praktisch durchführbar
SMT-Based Refutation of Spurious Bug Reports in the Clang Static Analyzer
We describe and evaluate a bug refutation extension for the Clang Static
Analyzer (CSA) that addresses the limitations of the existing built-in
constraint solver. In particular, we complement CSA's existing heuristics that
remove spurious bug reports. We encode the path constraints produced by CSA as
Satisfiability Modulo Theories (SMT) problems, use SMT solvers to precisely
check them for satisfiability, and remove bug reports whose associated path
constraints are unsatisfiable. Our refutation extension refutes spurious bug
reports in 8 out of 12 widely used open-source applications; on average, it
refutes ca. 7% of all bug reports, and never refutes any true bug report. It
incurs only negligible performance overheads, and on average adds 1.2% to the
runtime of the full Clang/LLVM toolchain. A demonstration is available at {\tt
https://www.youtube.com/watch?v=ylW5iRYNsGA}.Comment: 4 page
Lazy model checking for recursive state machines
Recursive state machines (RSMs) are state-based models for procedural programs with wide-ranging applications in program verification and interprocedural analysis. Model-checking algorithms for RSMs and related formalisms have been intensively studied in the literature. In this article, we devise a new model-checking algorithm for RSMs and requirements in computation tree logic (CTL) that exploits the compositional structure of RSMs by ternary model checking in combination with a lazy evaluation scheme. Specifically, a procedural component is only analyzed in those cases in which it might influence the satisfaction of the CTL requirement. We implemented our model-checking algorithms and evaluate them on randomized scalability benchmarks and on an interprocedural data-flow analysis of Java programs, showing both practical applicability and significant speedups in comparison to state-of-the-art model-checking tools for procedural programs.</p
- …