5,208 research outputs found
Pruning, Pushdown Exception-Flow Analysis
Statically reasoning in the presence of exceptions and about the effects of
exceptions is challenging: exception-flows are mutually determined by
traditional control-flow and points-to analyses. We tackle the challenge of
analyzing exception-flows from two angles. First, from the angle of pruning
control-flows (both normal and exceptional), we derive a pushdown framework for
an object-oriented language with full-featured exceptions. Unlike traditional
analyses, it allows precise matching of throwers to catchers. Second, from the
angle of pruning points-to information, we generalize abstract garbage
collection to object-oriented programs and enhance it with liveness analysis.
We then seamlessly weave the techniques into enhanced reachability computation,
yielding highly precise exception-flow analysis, without becoming intractable,
even for large applications. We evaluate our pruned, pushdown exception-flow
analysis, comparing it with an established analysis on large scale standard
Java benchmarks. The results show that our analysis significantly improves
analysis precision over traditional analysis within a reasonable analysis time.Comment: 14th IEEE International Working Conference on Source Code Analysis
and Manipulatio
Sound and Precise Malware Analysis for Android via Pushdown Reachability and Entry-Point Saturation
We present Anadroid, a static malware analysis framework for Android apps.
Anadroid exploits two techniques to soundly raise precision: (1) it uses a
pushdown system to precisely model dynamically dispatched interprocedural and
exception-driven control-flow; (2) it uses Entry-Point Saturation (EPS) to
soundly approximate all possible interleavings of asynchronous entry points in
Android applications. (It also integrates static taint-flow analysis and least
permissions analysis to expand the class of malicious behaviors which it can
catch.) Anadroid provides rich user interface support for human analysts which
must ultimately rule on the "maliciousness" of a behavior.
To demonstrate the effectiveness of Anadroid's malware analysis, we had teams
of analysts analyze a challenge suite of 52 Android applications released as
part of the Auto- mated Program Analysis for Cybersecurity (APAC) DARPA
program. The first team analyzed the apps using a ver- sion of Anadroid that
uses traditional (finite-state-machine-based) control-flow-analysis found in
existing malware analysis tools; the second team analyzed the apps using a
version of Anadroid that uses our enhanced pushdown-based
control-flow-analysis. We measured machine analysis time, human analyst time,
and their accuracy in flagging malicious applications. With pushdown analysis,
we found statistically significant (p < 0.05) decreases in time: from 85
minutes per app to 35 minutes per app in human plus machine analysis time; and
statistically significant (p < 0.05) increases in accuracy with the
pushdown-driven analyzer: from 71% correct identification to 95% correct
identification.Comment: Appears in 3rd Annual ACM CCS workshop on Security and Privacy in
SmartPhones and Mobile Devices (SPSM'13), Berlin, Germany, 201
Pushdown Control-Flow Analysis of Higher-Order Programs
Context-free approaches to static analysis gain precision over classical
approaches by perfectly matching returns to call sites---a property that
eliminates spurious interprocedural paths. Vardoulakis and Shivers's recent
formulation of CFA2 showed that it is possible (if expensive) to apply
context-free methods to higher-order languages and gain the same boost in
precision achieved over first-order programs.
To this young body of work on context-free analysis of higher-order programs,
we contribute a pushdown control-flow analysis framework, which we derive as an
abstract interpretation of a CESK machine with an unbounded stack. One
instantiation of this framework marks the first polyvariant pushdown analysis
of higher-order programs; another marks the first polynomial-time analysis. In
the end, we arrive at a framework for control-flow analysis that can
efficiently compute pushdown generalizations of classical control-flow
analyses.Comment: The 2010 Workshop on Scheme and Functional Programmin
Heap Abstractions for Static Analysis
Heap data is potentially unbounded and seemingly arbitrary. As a consequence,
unlike stack and static memory, heap memory cannot be abstracted directly in
terms of a fixed set of source variable names appearing in the program being
analysed. This makes it an interesting topic of study and there is an abundance
of literature employing heap abstractions. Although most studies have addressed
similar concerns, their formulations and formalisms often seem dissimilar and
some times even unrelated. Thus, the insights gained in one description of heap
abstraction may not directly carry over to some other description. This survey
is a result of our quest for a unifying theme in the existing descriptions of
heap abstractions. In particular, our interest lies in the abstractions and not
in the algorithms that construct them.
In our search of a unified theme, we view a heap abstraction as consisting of
two features: a heap model to represent the heap memory and a summarization
technique for bounding the heap representation. We classify the models as
storeless, store based, and hybrid. We describe various summarization
techniques based on k-limiting, allocation sites, patterns, variables, other
generic instrumentation predicates, and higher-order logics. This approach
allows us to compare the insights of a large number of seemingly dissimilar
heap abstractions and also paves way for creating new abstractions by
mix-and-match of models and summarization techniques.Comment: 49 pages, 20 figure
On Verifying Complex Properties using Symbolic Shape Analysis
One of the main challenges in the verification of software systems is the
analysis of unbounded data structures with dynamic memory allocation, such as
linked data structures and arrays. We describe Bohne, a new analysis for
verifying data structures. Bohne verifies data structure operations and shows
that 1) the operations preserve data structure invariants and 2) the operations
satisfy their specifications expressed in terms of changes to the set of
objects stored in the data structure. During the analysis, Bohne infers loop
invariants in the form of disjunctions of universally quantified Boolean
combinations of formulas. To synthesize loop invariants of this form, Bohne
uses a combination of decision procedures for Monadic Second-Order Logic over
trees, SMT-LIB decision procedures (currently CVC Lite), and an automated
reasoner within the Isabelle interactive theorem prover. This architecture
shows that synthesized loop invariants can serve as a useful communication
mechanism between different decision procedures. Using Bohne, we have verified
operations on data structures such as linked lists with iterators and back
pointers, trees with and without parent pointers, two-level skip lists, array
data structures, and sorted lists. We have deployed Bohne in the Hob and Jahob
data structure analysis systems, enabling us to combine Bohne with analyses of
data structure clients and apply it in the context of larger programs. This
report describes the Bohne algorithm as well as techniques that Bohne uses to
reduce the ammount of annotations and the running time of the analysis
CFA2: a Context-Free Approach to Control-Flow Analysis
In a functional language, the dominant control-flow mechanism is function
call and return. Most higher-order flow analyses, including k-CFA, do not
handle call and return well: they remember only a bounded number of pending
calls because they approximate programs with control-flow graphs. Call/return
mismatch introduces precision-degrading spurious control-flow paths and
increases the analysis time. We describe CFA2, the first flow analysis with
precise call/return matching in the presence of higher-order functions and tail
calls. We formulate CFA2 as an abstract interpretation of programs in
continuation-passing style and describe a sound and complete summarization
algorithm for our abstract semantics. A preliminary evaluation shows that CFA2
gives more accurate data-flow information than 0CFA and 1CFA.Comment: LMCS 7 (2:3) 201
The Transitivity of Trust Problem in the Interaction of Android Applications
Mobile phones have developed into complex platforms with large numbers of
installed applications and a wide range of sensitive data. Application security
policies limit the permissions of each installed application. As applications
may interact, restricting single applications may create a false sense of
security for the end users while data may still leave the mobile phone through
other applications. Instead, the information flow needs to be policed for the
composite system of applications in a transparent and usable manner. In this
paper, we propose to employ static analysis based on the software architecture
and focused data flow analysis to scalably detect information flows between
components. Specifically, we aim to reveal transitivity of trust problems in
multi-component mobile platforms. We demonstrate the feasibility of our
approach with Android applications, although the generalization of the analysis
to similar composition-based architectures, such as Service-oriented
Architecture, can also be explored in the future
On abstraction refinement for program analyses in Datalog
A central task for a program analysis concerns how to efficiently find a program abstraction that keeps only information relevant for proving properties of interest. We present a new approach for finding such abstractions for program analyses written in Datalog. Our approach is based on counterexample-guided abstraction refinement: when a Datalog analysis run fails using an abstraction, it seeks to generalize the cause of the failure to other abstractions, and pick a new abstraction that avoids a similar failure. Our solution uses a boolean satisfiability formulation that is general, complete, and optimal: it is independent of the Datalog solver, it generalizes the failure of an abstraction to as many other abstractions as possible, and it identifies the cheapest refined abstraction to try next. We show the performance of our approach on a pointer analysis and a typestate analysis, on eight real-world Java benchmark programs
- …