868 research outputs found
Gradual Program Analysis
Dataflow analysis and gradual typing are both well-studied methods to gain information about computer programs in a finite amount of time. The gradual program analysis project seeks to combine those two techniques in order to gain the benefits of both. This thesis explores the background information necessary to understand gradual program analysis, and then briefly discusses the research itself, with reference to publication of work done so far. The background topics include essential aspects of programming language theory, such as syntax, semantics, and static typing; dataflow analysis concepts, such as abstract interpretation, semilattices, and fixpoint computations; and gradual typing theory, such as the concept of an unknown type, liftings of predicates, and liftings of functions.
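The semilattice-and-fixpoint machinery mentioned in this abstract can be illustrated with a small sketch. The lattice, transfer function, and worklist loop below are our own toy example of a nullness dataflow analysis, not code from the thesis:

```python
# Toy dataflow fixpoint on a finite nullness semilattice:
# BOTTOM < {NONNULL, NULL} < TOP (unknown). Illustrative only.
BOTTOM, NONNULL, NULL, TOP = "bot", "nonnull", "null", "top"

def join(a, b):
    """Least upper bound in the nullness semilattice."""
    if a == b:
        return a
    if a == BOTTOM:
        return b
    if b == BOTTOM:
        return a
    return TOP  # joining NONNULL with NULL loses precision

def transfer(node, inp):
    """Made-up transfer function: two definitions flow into a use."""
    if node == "a":
        return NONNULL   # e.g. x = new Object()
    if node == "b":
        return NULL      # e.g. x = null
    return inp           # node "c" just propagates its input

def fixpoint(cfg, transfer):
    """Worklist iteration until the per-node facts stabilise."""
    facts = {n: BOTTOM for n in cfg}
    work = list(cfg)
    while work:
        n = work.pop()
        inp = BOTTOM
        for pred, succs in cfg.items():   # join facts of all predecessors
            if n in succs:
                inp = join(inp, facts[pred])
        out = transfer(n, inp)
        if out != facts[n]:
            facts[n] = out
            work.extend(cfg[n])           # successors must be revisited
    return facts
```

Because the lattice is finite and `join` only moves facts upward, the iteration is guaranteed to terminate; gradual program analysis additionally lifts the unknown type into such lattices.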
NPEFix: Automatic Runtime Repair of Null Pointer Exceptions in Java
Null pointer exceptions, also known as null dereferences, are the number one
exception type in the field. In this paper, we propose 9 alternative execution
semantics that can be applied when a null pointer exception is about to happen.
We implement those alternative execution strategies using code transformation
in a tool called NPEfix. We evaluate our prototype implementation on 11 field
null dereference bugs and 519 seeded failures and show that NPEfix is able to
repair at runtime 10/11 bugs and 318/519 failures.
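The idea of swapping in an alternative execution semantics at the point of a would-be null dereference can be sketched in a few lines. The strategies and the `safe_call` helper below are our own illustration in Python, not NPEfix's actual Java code transformation:

```python
# Illustrative recovery strategies for a null dereference, in the spirit
# of alternative execution semantics (names and strategies are ours).

def safe_call(obj, method, *args, strategy="skip", default=None):
    """If obj is None, apply a recovery strategy instead of crashing."""
    if obj is not None:
        return getattr(obj, method)(*args)
    if strategy == "skip":      # skip the failing statement entirely
        return None
    if strategy == "default":   # substitute a well-typed default value
        return default
    if strategy == "fresh":     # materialise a fresh object and proceed
        return default() if callable(default) else default
    raise AttributeError("null dereference")  # normal semantics
```

A runtime-repair tool rewrites each dereference site into such a guarded form, then picks a strategy that lets the execution continue.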
A Non-Null Annotation Inferencer for Java Bytecode
We present a non-null annotations inferencer for the Java bytecode language.
We previously proposed an analysis to infer non-null annotations and proved its
soundness and completeness with respect to a state-of-the-art type system. This
paper proposes extensions to our former analysis in order to deal with the Java
bytecode language. We have implemented both analyses and compared their
behaviour on several benchmarks. The results show a substantial improvement in
precision and, even though the analysis is whole-program, production
applications can be analyzed within minutes.
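A minimal way to picture non-null inference is to report a parameter as non-null only when no call site can pass it a possibly-null argument. The toy model below is our own sketch; the paper's analysis operates on Java bytecode with a far richer abstract domain:

```python
# Toy non-null annotation inference: a parameter is @NonNull only if
# every observed call site passes a provably non-null argument.
# (Illustrative model, not the paper's bytecode analysis.)

def infer_nonnull(params, call_sites):
    """call_sites: list of dicts mapping parameter name -> may_be_null flag."""
    nonnull = set(params)
    for site in call_sites:
        for p in params:
            # Unknown arguments are conservatively treated as nullable,
            # which keeps the inference sound.
            if site.get(p, True):
                nonnull.discard(p)
    return nonnull
```

Soundness here means the analysis never annotates a parameter that can in fact receive null; completeness relative to a type system means it finds every annotation that the type system could check.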
AutoPruner: Transformer-Based Call Graph Pruning
Constructing a static call graph requires trade-offs between soundness and
precision. Program analysis techniques for constructing call graphs are
unfortunately usually imprecise. To address this problem, researchers have
recently proposed call graph pruning empowered by machine learning to
post-process call graphs constructed by static analysis. A machine learning
model is built to capture information from the call graph by extracting
structural features for use in a random forest classifier. It then removes
edges that are predicted to be false positives. Despite the improvements shown
by machine learning models, they are still limited as they do not consider the
source code semantics and thus often are not able to effectively distinguish
true and false positives. In this paper, we present a novel call graph pruning
technique, AutoPruner, for eliminating false positives in call graphs via both
statistical semantic and structural analysis. Given a call graph constructed by
traditional static analysis tools, AutoPruner takes a Transformer-based
approach to capture the semantic relationships between the caller and callee
functions associated with each edge in the call graph. To do so, AutoPruner
fine-tunes a model of code that was pre-trained on a large corpus to represent
source code based on descriptions of its semantics. Next, the model is used to
extract semantic features from the functions related to each edge in the call
graph. AutoPruner uses these semantic features together with the structural
features extracted from the call graph to classify each edge via a feed-forward
neural network. Our empirical evaluation on a benchmark dataset of real-world
programs shows that AutoPruner outperforms the state-of-the-art baselines,
improving on F-measure by up to 13% in identifying false-positive edges in a
static call graph.
Comment: Accepted to ESEC/FSE 2022, Research Track
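The classification step this abstract describes, combining semantic features from a code model with structural features from the graph, can be sketched very compactly. The single-layer scorer, feature vectors, and weights below are made up for illustration; AutoPruner itself uses a fine-tuned Transformer and a trained feed-forward network:

```python
import math

# Minimal sketch of call-graph edge pruning: concatenate semantic
# features (e.g. from a code model) with structural features (e.g.
# node degrees) and score each edge. Weights here are illustrative.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def score_edge(semantic, structural, weights, bias):
    features = semantic + structural                  # concatenation
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return sigmoid(z)                                 # P(edge is a true call)

def prune(edges, weights, bias, threshold=0.5):
    """Keep only edges whose predicted probability clears the threshold."""
    return [name for name, sem, struct in edges
            if score_edge(sem, struct, weights, bias) >= threshold]
```

Edges scored below the threshold are the predicted false positives that static analysis over-approximated into the graph.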
Performance Evaluation of Automated Static Analysis Tools
Automated static analysis tools can perform efficient, thorough checking of important properties of, and extract and summarize critical information about, a source program. This paper evaluates three open-source static analysis tools: Flawfinder, Cppcheck, and Yasca. Each tool is analyzed with regards to usability, IDE integration, performance, and accuracy. Special emphasis is placed on the integration of these tools into the development environment to enable analysis during all phases of development, as well as to enable extension of rules and other improvements within the tools. It is shown that Flawfinder is the easiest to modify and extend, Cppcheck is the most inviting to novices, and Yasca is the most accurate and versatile.
Evaluating Pre-trained Language Models for Repairing API Misuses
API misuses often lead to software bugs, crashes, and vulnerabilities. While
several API misuse detectors have been proposed, there are no automatic repair
tools specifically designed for this purpose. In a recent study,
test-suite-based automatic program repair (APR) tools were found to be
ineffective in repairing API misuses. Still, since the study focused on
non-learning-aided APR tools, it remains unknown whether learning-aided APR
tools are capable of fixing API misuses. In recent years, pre-trained language
models (PLMs) have succeeded greatly in many natural language processing tasks.
There is a rising interest in applying PLMs to APR. However, there has not been
any study that investigates the effectiveness of PLMs in repairing API misuses.
To fill this gap, we conduct a comprehensive empirical study on 11
learning-aided APR tools, which include 9 of the state-of-the-art
general-purpose PLMs and two APR tools. We evaluate these models with an
API-misuse repair dataset, consisting of two variants. Our results show that
PLMs perform better than the studied APR tools in repairing API misuses. Among
the 9 pre-trained models tested, CodeT5 is the best performer in terms of exact
match. We also offer insights and potential exploration directions for future
research.
Comment: Under review by TOSE
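The exact-match criterion used to rank the models can be made concrete with a short sketch. The whitespace normalisation below is a common convention in APR evaluations; it is our assumption, not a detail taken from this paper:

```python
# Sketch of an exact-match metric for repair candidates versus the
# developer's reference patch (normalisation choice is ours).

def normalise(code):
    """Compare patches modulo runs of whitespace."""
    return " ".join(code.split())

def exact_match_rate(predictions, references):
    hits = sum(normalise(p) == normalise(r)
               for p, r in zip(predictions, references))
    return hits / len(references)
```

Exact match is a strict lower bound on repair quality: a semantically correct patch that differs textually from the reference still counts as a miss.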