Context-Free Path Querying by Matrix Multiplication
Graph data models are widely used in many areas, for example, bioinformatics
and graph databases. In these areas, queries over large graphs often need to
be processed. Some of the most common graph queries are navigational queries.
The result of query evaluation is a set of implicit relations between nodes of
the graph, i.e. paths in the graph. A natural way to specify these relations is
by specifying paths using formal grammars over the alphabet of edge labels. An
answer to a context-free path query in this approach is usually a set of
triples (A, m, n) such that there is a path from the node m to the node n,
whose labeling is derived from a non-terminal A of the given context-free
grammar. Queries of this type are evaluated using the relational query
semantics. Another example of path query semantics is the single-path query
semantics which requires presenting a single path from the node m to the node
n, whose labeling is derived from a non-terminal A for all triples (A, m, n)
evaluated using the relational query semantics. There are a number of algorithms
for query evaluation which use these semantics, but all of them perform poorly
on large graphs. One of the most common techniques for efficient big data
processing is the use of a graphics processing unit (GPU) to perform
computations, but these algorithms do not allow this technique to be used
efficiently. In this paper, we show how the context-free path query evaluation
using these query semantics can be reduced to the calculation of the matrix
transitive closure. Also, we propose an algorithm for context-free path query
evaluation which uses relational query semantics and is based on matrix
operations that make it possible to speed up computations by using a GPU.
Comment: 9 pages, 11 figures, 2 tables
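The reduction described in this abstract can be illustrated with a minimal sketch. It assumes the grammar is given in Chomsky normal form and keeps one Boolean matrix per non-terminal, repeatedly applying Boolean matrix products until nothing changes. The function names, the rule encoding, and the example graph are illustrative assumptions, not the paper's implementation, and plain Python lists stand in for the GPU matrix kernels the abstract refers to.

```python
from typing import Dict, List, Set, Tuple

Matrix = List[List[bool]]
Edge = Tuple[int, str, int]  # (source node, edge label, target node)


def bool_matmul(x: Matrix, y: Matrix) -> Matrix:
    """Boolean product of two square matrices of equal size."""
    n = len(x)
    return [
        [any(x[i][k] and y[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def cfpq_relational(
    n: int,
    edges: List[Edge],
    terminal_rules: Dict[str, Set[str]],       # A -> a   (hypothetical encoding)
    binary_rules: List[Tuple[str, str, str]],  # A -> B C (hypothetical encoding)
) -> Set[Tuple[str, int, int]]:
    """Return all triples (A, m, n) such that some path m -> n derives from A."""
    nonterminals = set(terminal_rules)
    for head, left, right in binary_rules:
        nonterminals.update((head, left, right))

    # One n x n Boolean matrix per non-terminal.
    t = {a: [[False] * n for _ in range(n)] for a in nonterminals}

    # Initialise from single edges using the terminal rules.
    for src, label, dst in edges:
        for a, terminals in terminal_rules.items():
            if label in terminals:
                t[a][src][dst] = True

    # Apply A -> B C as T_A |= T_B x T_C until a fixpoint is reached.
    changed = True
    while changed:
        changed = False
        for head, left, right in binary_rules:
            product = bool_matmul(t[left], t[right])
            for i in range(n):
                for j in range(n):
                    if product[i][j] and not t[head][i][j]:
                        t[head][i][j] = True
                        changed = True

    return {
        (a, i, j)
        for a in nonterminals
        for i in range(n)
        for j in range(n)
        if t[a][i][j]
    }


# Example: S -> a S b | a b, written in Chomsky normal form.
example_terminal_rules = {"A": {"a"}, "B": {"b"}}
example_binary_rules = [("S", "A", "B"), ("S", "A", "S1"), ("S1", "S", "B")]
# Chain graph: 0 -a-> 1 -a-> 2 -b-> 3 -b-> 4
example_edges = [(0, "a", 1), (1, "a", 2), (2, "b", 3), (3, "b", 4)]
example_result = cfpq_relational(
    5, example_edges, example_terminal_rules, example_binary_rules
)
```

On the chain graph above, the only paths whose labels derive from S are 1 to 3 (labelled "ab") and 0 to 4 (labelled "aabb").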
Fully Dynamic Single-Source Reachability in Practice: An Experimental Study
Given a directed graph and a source vertex, the fully dynamic single-source
reachability problem is to maintain the set of vertices that are reachable from
the given vertex, subject to edge deletions and insertions. It is one of the
most fundamental problems on graphs and appears directly or indirectly in many
and varied applications. While there has been theoretical work on this problem,
showing both linear conditional lower bounds for the fully dynamic problem and
insertions-only and deletions-only upper bounds beating these conditional lower
bounds, there has been no experimental study that compares the performance of
fully dynamic reachability algorithms in practice. Previous experimental
studies in this area concentrated only on the more general all-pairs
reachability or transitive closure problem and did not use real-world dynamic
graphs.
In this paper, we bridge this gap by empirically studying an extensive set of
algorithms for the single-source reachability problem in the fully dynamic
setting. In particular, we design several fully dynamic variants of well-known
approaches to obtain and maintain reachability information with respect to a
distinguished source. Moreover, we extend the existing insertions-only or
deletions-only upper bounds into fully dynamic algorithms. Even though the
worst-case time per operation of all the fully dynamic algorithms we evaluate
is at least linear in the number of edges in the graph (as is to be expected
given the conditional lower bounds), we show in our extensive experimental
evaluation that their performance differs greatly, on both generated and
real-world instances.
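To make the problem statement concrete, here is a naive baseline for fully dynamic single-source reachability. It is a sketch under assumptions and not one of the algorithms evaluated in the study: an insertion extends the reachable set with a partial breadth-first search, and a deletion recomputes the set from scratch, so the worst-case cost per operation is linear in the size of the graph. The class and method names are hypothetical.

```python
from collections import defaultdict, deque


class DynamicReachability:
    """Maintains the set of vertices reachable from a fixed source."""

    def __init__(self, source):
        self.source = source
        self.succ = defaultdict(set)  # adjacency: vertex -> successors
        self.reachable = {source}

    def _extend_from(self, start):
        # Breadth-first search that only visits vertices not yet marked reachable.
        self.reachable.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in self.succ[u]:
                if v not in self.reachable:
                    self.reachable.add(v)
                    queue.append(v)

    def insert_edge(self, u, v):
        self.succ[u].add(v)
        # Only an edge leaving the reachable set can add new reachable vertices.
        if u in self.reachable and v not in self.reachable:
            self._extend_from(v)

    def delete_edge(self, u, v):
        self.succ[u].discard(v)
        # Deleting an edge between two reachable vertices may disconnect some
        # of them, so this baseline simply recomputes from the source.
        if u in self.reachable and v in self.reachable:
            self.reachable = set()
            self._extend_from(self.source)

    def is_reachable(self, v):
        return v in self.reachable
```

The asymmetry between the two update operations is the design point: insertions can be handled incrementally, while this baseline falls back to full recomputation on deletions.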
Pruning, Pushdown Exception-Flow Analysis
Statically reasoning in the presence of exceptions and about the effects of
exceptions is challenging: exception-flows are mutually determined by
traditional control-flow and points-to analyses. We tackle the challenge of
analyzing exception-flows from two angles. First, from the angle of pruning
control-flows (both normal and exceptional), we derive a pushdown framework for
an object-oriented language with full-featured exceptions. Unlike traditional
analyses, it allows precise matching of throwers to catchers. Second, from the
angle of pruning points-to information, we generalize abstract garbage
collection to object-oriented programs and enhance it with liveness analysis.
We then seamlessly weave the techniques into enhanced reachability computation,
yielding highly precise exception-flow analysis, without becoming intractable,
even for large applications. We evaluate our pruned, pushdown exception-flow
analysis, comparing it with an established analysis on large scale standard
Java benchmarks. The results show that our analysis significantly improves
analysis precision over traditional analysis within a reasonable analysis time.
Comment: 14th IEEE International Working Conference on Source Code Analysis
and Manipulation
Sound and Precise Malware Analysis for Android via Pushdown Reachability and Entry-Point Saturation
We present Anadroid, a static malware analysis framework for Android apps.
Anadroid exploits two techniques to soundly raise precision: (1) it uses a
pushdown system to precisely model dynamically dispatched interprocedural and
exception-driven control-flow; (2) it uses Entry-Point Saturation (EPS) to
soundly approximate all possible interleavings of asynchronous entry points in
Android applications. (It also integrates static taint-flow analysis and least
permissions analysis to expand the class of malicious behaviors which it can
catch.) Anadroid provides rich user interface support for human analysts who
must ultimately rule on the "maliciousness" of a behavior.
To demonstrate the effectiveness of Anadroid's malware analysis, we had teams
of analysts analyze a challenge suite of 52 Android applications released as
part of the Automated Program Analysis for Cybersecurity (APAC) DARPA
program. The first team analyzed the apps using a version of Anadroid that
uses traditional (finite-state-machine-based) control-flow-analysis found in
existing malware analysis tools; the second team analyzed the apps using a
version of Anadroid that uses our enhanced pushdown-based
control-flow-analysis. We measured machine analysis time, human analyst time,
and their accuracy in flagging malicious applications. With pushdown analysis,
we found statistically significant (p < 0.05) decreases in time: from 85
minutes per app to 35 minutes per app in human plus machine analysis time; and
statistically significant (p < 0.05) increases in accuracy with the
pushdown-driven analyzer: from 71% correct identification to 95% correct
identification.
Comment: Appears in 3rd Annual ACM CCS workshop on Security and Privacy in
SmartPhones and Mobile Devices (SPSM'13), Berlin, Germany, 2013
Introspective Pushdown Analysis of Higher-Order Programs
In the static analysis of functional programs, pushdown flow analysis and
abstract garbage collection skirt just inside the boundaries of soundness and
decidability. Alone, each method reduces analysis times and boosts precision by
orders of magnitude. This work illuminates and conquers the theoretical
challenges that stand in the way of combining the power of these techniques.
The challenge in marrying these techniques is not subtle: computing the
reachable control states of a pushdown system relies on limiting access during
transition to the top of the stack; abstract garbage collection, on the other
hand, needs full access to the entire stack to compute a root set, just as
concrete collection does. Introspective pushdown systems resolve this
conflict. Introspective pushdown systems provide enough access to the stack to
allow abstract garbage collection, but they remain restricted enough to compute
control-state reachability, thereby enabling the sound and precise product of
pushdown analysis and abstract garbage collection. Experiments reveal
synergistic interplay between the techniques, and the fusion demonstrates
"better-than-both-worlds" precision.
Comment: Proceedings of the 17th ACM SIGPLAN International Conference on
Functional Programming, 2012, ACM
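The abstract's point that abstract garbage collection needs the whole stack to compute a root set can be sketched as follows. This is a minimal illustration under assumed data shapes, not the paper's formalization: the abstract store maps addresses to sets of abstract values, each value is a hypothetical pair of a tag and the addresses it touches, and each stack frame is the set of addresses it references.

```python
def abstract_gc(store, stack, env_roots):
    """Drop store entries unreachable from the environment and the full stack.

    store:     dict mapping address -> set of (tag, frozenset_of_addresses)
    stack:     list of frames, each a frozenset of addresses (assumed encoding)
    env_roots: addresses referenced by the current environment
    """
    # The root set needs every frame on the stack, not only the top one.
    roots = set(env_roots)
    for frame in stack:
        roots |= frame

    # Standard reachability over the "touches" relation.
    live = set()
    worklist = list(roots)
    while worklist:
        addr = worklist.pop()
        if addr in live:
            continue
        live.add(addr)
        for _tag, touched in store.get(addr, ()):
            worklist.extend(touched)

    return {addr: values for addr, values in store.items() if addr in live}
```

For example, if only address "x" is referenced from the stack and the closure stored at "x" touches "y", then "x" and "y" survive collection while an unrelated address "z" is dropped.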
Pushdown Control-Flow Analysis of Higher-Order Programs
Context-free approaches to static analysis gain precision over classical
approaches by perfectly matching returns to call sites, a property that
eliminates spurious interprocedural paths. Vardoulakis and Shivers's recent
formulation of CFA2 showed that it is possible (if expensive) to apply
context-free methods to higher-order languages and gain the same boost in
precision achieved over first-order programs.
To this young body of work on context-free analysis of higher-order programs,
we contribute a pushdown control-flow analysis framework, which we derive as an
abstract interpretation of a CESK machine with an unbounded stack. One
instantiation of this framework marks the first polyvariant pushdown analysis
of higher-order programs; another marks the first polynomial-time analysis. In
the end, we arrive at a framework for control-flow analysis that can
efficiently compute pushdown generalizations of classical control-flow
analyses.
Comment: The 2010 Workshop on Scheme and Functional Programming
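The spurious interprocedural paths mentioned in this abstract can be shown on a toy first-order example. The sketch below compares label-blind graph reachability with an exploration of (node, call-stack) configurations in which a return edge is followed only if its call site matches the top of the stack. The edge encoding and node names are assumptions, and the explicit stack exploration is a simplification: it is bounded by `max_depth` and is not the summarization-based algorithm a real pushdown analysis would use.

```python
from collections import deque

# Each edge is (source, kind, call_site, target); kind is "call", "ret" or "step".
EDGES = [
    ("f_entry", "call", "c1", "id_entry"),
    ("g_entry", "call", "c2", "id_entry"),
    ("id_entry", "step", None, "id_exit"),
    ("id_exit", "ret", "c1", "f_after"),
    ("id_exit", "ret", "c2", "g_after"),
]


def finite_state_reachable(edges, start):
    """Plain graph reachability that ignores call/return labels."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for src, _kind, _site, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen


def stack_matched_reachable(edges, start, max_depth=32):
    """Reachability over (node, stack) pairs with returns matched to calls."""
    initial = (start, ())
    seen = {initial}
    queue = deque([initial])
    while queue:
        node, stack = queue.popleft()
        for src, kind, site, dst in edges:
            if src != node:
                continue
            if kind == "call":
                if len(stack) >= max_depth:
                    continue  # depth bound: an assumption to force termination
                successor = (dst, stack + (site,))
            elif kind == "ret":
                if not stack or stack[-1] != site:
                    continue  # return does not match the pending call site
                successor = (dst, stack[:-1])
            else:
                successor = (dst, stack)
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return {node for node, _stack in seen}
```

Starting from `f_entry`, the label-blind search also reports `g_after`, because it lets the shared function return to the wrong caller; the stack-matched search does not.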