Scalability-First Pointer Analysis with Self-Tuning Context-Sensitivity
Context-sensitivity is important in pointer analysis to ensure high
precision, but existing techniques suffer from unpredictable scalability.
Many variants of context-sensitivity exist, and it is difficult to choose
one that leads to reasonable analysis time and obtains high precision
without running the analysis multiple times.
We present the Scaler framework that addresses this problem.
Scaler efficiently estimates the amount of points-to information
that would be needed to analyze each method with different variants
of context-sensitivity. It then selects an appropriate variant for
each method so that the total amount of points-to information is
bounded, while utilizing the available space to maximize precision.
Our experimental results demonstrate that Scaler achieves predictable
scalability for all the evaluated programs (e.g., speedups can reach 10x
for 2-object-sensitivity), while providing a precision that matches or
even exceeds that of the best alternative techniques.
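The per-method selection described above can be sketched as a threshold search: assume (hypothetically) a pre-computed cost estimate for each method under each context-sensitivity variant, pick the most precise variant that fits under a per-method threshold, and binary-search that threshold so the total estimated points-to size stays within a budget. All names and the cost model below are illustrative, not Scaler's actual implementation:

```python
# Candidate variants, ordered from most to least precise.
# "ci" = context-insensitive. Hypothetical encoding.
VARIANTS = ["2obj", "2type", "1type", "ci"]

def pick(costs, st):
    """Most precise variant whose estimated cost fits under threshold st."""
    for v in VARIANTS:
        if costs[v] <= st:
            return v
    return "ci"  # floor: even the cheapest estimate exceeds the threshold

def select_variants(method_costs, total_budget):
    """Binary-search the per-method threshold so that the summed
    points-to estimates of all chosen variants stay within budget."""
    lo = 0
    hi = max(c for costs in method_costs.values() for c in costs.values())
    best = {m: "ci" for m in method_costs}  # fallback: cheapest configuration
    while lo <= hi:
        mid = (lo + hi) // 2
        choice = {m: pick(c, mid) for m, c in method_costs.items()}
        total = sum(method_costs[m][v] for m, v in choice.items())
        if total <= total_budget:
            best, lo = choice, mid + 1  # affordable: try a looser threshold
        else:
            hi = mid - 1  # too expensive: tighten the threshold
    return best
```

The greedy structure relies on a natural monotonicity assumption: for any given method, a more precise variant never has a smaller cost estimate than a less precise one.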
AutoPruner: Transformer-Based Call Graph Pruning
Constructing a static call graph requires trade-offs between soundness and
precision. Program analysis techniques for constructing call graphs are
unfortunately usually imprecise. To address this problem, researchers have
recently proposed call graph pruning empowered by machine learning to
post-process call graphs constructed by static analysis. A machine learning
model is built to capture information from the call graph by extracting
structural features for use in a random forest classifier. It then removes
edges that are predicted to be false positives. Despite the improvements
brought by machine learning models, they are still limited because they do
not consider source code semantics and thus often cannot effectively
distinguish true and false positives. In this paper, we present a novel call graph pruning
technique, AutoPruner, for eliminating false positives in call graphs via both
statistical semantic and structural analysis. Given a call graph constructed by
traditional static analysis tools, AutoPruner takes a Transformer-based
approach to capture the semantic relationships between the caller and callee
functions associated with each edge in the call graph. To do so, AutoPruner
fine-tunes a model of code that was pre-trained on a large corpus to represent
source code based on descriptions of its semantics. Next, the model is used to
extract semantic features from the functions related to each edge in the call
graph. AutoPruner uses these semantic features together with the structural
features extracted from the call graph to classify each edge via a feed-forward
neural network. Our empirical evaluation on a benchmark dataset of real-world
programs shows that AutoPruner outperforms the state-of-the-art baselines,
improving on F-measure by up to 13% in identifying false-positive edges in a
static call graph.
Comment: Accepted to ESEC/FSE 2022, Research Track
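The classification step can be illustrated with a deliberately tiny stand-in: a single logistic layer over the concatenation of semantic and structural feature vectors. The real AutoPruner obtains semantic features from a fine-tuned pre-trained code model and classifies with a feed-forward network; every feature value, weight, and function name here is hypothetical:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def classify_edge(semantic_vec, structural_vec, weights, bias):
    """Concatenate semantic and structural features for one call-graph edge
    and score it with a single linear layer (a stand-in for AutoPruner's
    feed-forward network)."""
    features = semantic_vec + structural_vec
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)

def prune(call_graph, feats, weights, bias, threshold=0.5):
    """Keep only the edges predicted to be true calls."""
    return [e for e in call_graph
            if classify_edge(*feats[e], weights, bias) >= threshold]
```

A usage sketch: with weights `[1.0, 1.0]` and bias `0.0`, an edge whose combined features sum to a negative score is classified as a false positive and removed.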
Precision-guided context sensitivity for pointer analysis
Context sensitivity is an essential technique for ensuring high precision in Java pointer analyses. It has been
observed that applying context sensitivity partially, only on a select subset of the methods, can improve the
balance between analysis precision and speed. However, existing techniques are based on heuristics that
do not provide much insight into what characterizes this method subset. In this work, we present a more
principled approach for identifying precision-critical methods, based on general patterns of value flows that
explain where most of the imprecision arises in context-insensitive pointer analysis. Accordingly, we provide
an efficient algorithm to recognize these flow patterns in a given program and exploit them to yield good
tradeoffs between analysis precision and speed.
Our experimental results on standard benchmark and real-world programs show that a pointer analysis that
applies context sensitivity partially, only on the identified precision-critical methods, preserves effectively all
(98.8%) of the precision of a highly-precise conventional context-sensitive pointer analysis (2-object-sensitive
with a context-sensitive heap), with a substantial speedup (on average 3.4X, and up to 9.2X).
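Partial context-sensitivity of this kind can be sketched as a context-selection function that, given an already-computed set of precision-critical methods, assigns k-limited contexts only to those methods. The flow-pattern analysis that identifies the set is the paper's actual contribution and is not shown; the names and the call-site-based context below are illustrative:

```python
def select_context(method, caller_context, call_site, critical_methods, k=2):
    """Analyze precision-critical methods with a k-limited call-site
    context; analyze everything else context-insensitively."""
    if method in critical_methods:
        # Extend the caller's context with this call site, keeping
        # only the k most recent sites.
        return (caller_context + (call_site,))[-k:]
    return ()  # one shared (insensitive) context for non-critical methods
```

Because non-critical methods all share the empty context, their points-to facts are merged exactly as in a context-insensitive analysis, which is where the speedup comes from.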
Faster and More Precise Pointer Analysis Algorithms for Automatic Bug Detection
Pointer analysis is a fundamental technique with numerous applications, such as value-flow analysis and bug detection. It is also a prerequisite of many compiler optimizations. However, despite decades of research, the scalability and precision of pointer analysis remain open questions. In this dissertation, I introduce my research effort to apply pointer analysis to detect vulnerabilities in software and, more importantly, to design and implement a faster and more precise pointer analysis algorithm.
In this dissertation, I present my work on improving both the precision and the performance of inclusion-based pointer analysis. I propose two fundamental algorithms, origin-sensitive pointer analysis and the partial update solver (PUS), and show their practicality by building two tools, O2 and XRust, on top of them. Origin-sensitive pointer analysis unifies two widely used concurrent programming models, events and threads, and analyzes data sharing (which is essential for static data race detection) with thread/event spawning sites as the context. PUS, a new solving algorithm for inclusion-based pointer analysis, advances the state of the art by operating on a small subgraph of the entire points-to constraint graph at each iteration while still guaranteeing correctness. Our experimental results show that PUS is 2x faster in solving context-insensitive points-to constraints and 7x faster in solving context-sensitive constraints. Meanwhile, O2, backed by origin-sensitive pointer analysis, was able to detect many previously unknown data races in real-world applications including Linux, Redis, and memcached; XRust can also isolate memory errors in unsafe Rust from safe Rust, utilizing data-sharing information computed by pointer analysis with negligible overhead.
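The idea of updating only a small part of the constraint graph per iteration can be sketched with a difference-propagating worklist solver for Andersen-style subset constraints. This is greatly simplified relative to PUS itself (no load/store constraints, no context handling, and a hypothetical graph encoding):

```python
from collections import defaultdict

def solve(subset_edges, initial_pts):
    """Solve inclusion constraints: each (src, dst) edge means
    pts(src) must be a subset of pts(dst). Only nodes whose points-to
    sets actually changed are revisited, rather than the whole graph."""
    succs = defaultdict(list)
    for s, d in subset_edges:
        succs[s].append(d)
    pts = {n: set(objs) for n, objs in initial_pts.items()}
    worklist = list(pts)
    while worklist:
        n = worklist.pop()
        for d in succs[n]:
            new = pts.get(n, set()) - pts.setdefault(d, set())
            if new:                 # propagate only the difference
                pts[d] |= new
                worklist.append(d)  # d changed: revisit its successors
    return pts
```

On the chain a ⊆ b ⊆ c with pts(a) = {o1}, the solver touches each node only when its set grows, eventually giving pts(c) = {o1}.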
Scalability-First Pointer Analysis with Self-Tuning Context-Sensitivity (Artifact)
<p>This artifact is provided to reproduce the results of all four research questions (RQ1 -- RQ4) in our companion paper "Scalability-First Pointer Analysis with Self-Tuning Context-Sensitivity", i.e., the results in Table 1, Table 2, Figure 7 and Figure 8 of the paper. The artifact contains Scaler (the implementation of our tool), Doop (a state-of-the-art whole-program pointer analysis framework for Java), and the Java programs and the library used in our evaluation. It also contains comprehensive documentation with step-by-step instructions for installing and using this artifact.</p>
<p>To use this artifact, please start by reading <strong>README.pdf</strong> and <strong>INSTALL.pdf</strong> in the artifact package.</p>