1,286 research outputs found
Precise Null Pointer Analysis Through Global Value Numbering
Precise analysis of pointer information plays an important role in many
static analysis techniques and tools today. The precision, however, must be
balanced against the scalability of the analysis. This paper focusses on
improving the precision of standard context and flow insensitive alias analysis
algorithms at a low scalability cost. In particular, we present a
semantics-preserving program transformation that drastically improves the
precision of existing analyses when deciding if a pointer can alias NULL. Our
program transformation is based on Global Value Numbering, a scheme inspired
from compiler optimizations literature. It allows even a flow-insensitive
analysis to make use of branch conditions such as checking if a pointer is NULL
and gain precision. We perform experiments on real-world code to measure the
overhead in performing the transformation and the improvement in the precision
of the analysis. We show that the precision improves from 86.56% to 98.05%,
while the overhead is insignificant.Comment: 17 pages, 1 section in Appendi
Faults in Linux 2.6
In August 2011, Linux entered its third decade. Ten years before, Chou et al.
published a study of faults found by applying a static analyzer to Linux
versions 1.0 through 2.4.1. A major result of their work was that the drivers
directory contained up to 7 times more of certain kinds of faults than other
directories. This result inspired numerous efforts on improving the reliability
of driver code. Today, Linux is used in a wider range of environments, provides
a wider range of services, and has adopted a new development and release model.
What has been the impact of these changes on code quality? To answer this
question, we have transported Chou et al.'s experiments to all versions of
Linux 2.6; released between 2003 and 2011. We find that Linux has more than
doubled in size during this period, but the number of faults per line of code
has been decreasing. Moreover, the fault rate of drivers is now below that of
other directories, such as arch. These results can guide further development
and research efforts for the decade to come. To allow updating these results as
Linux evolves, we define our experimental protocol and make our checkers
available
Parameterized Construction of Program Representations for Sparse Dataflow Analyses
Data-flow analyses usually associate information with control flow regions.
Informally, if these regions are too small, like a point between two
consecutive statements, we call the analysis dense. On the other hand, if these
regions include many such points, then we call it sparse. This paper presents a
systematic method to build program representations that support sparse
analyses. To pave the way to this framework we clarify the bibliography about
well-known intermediate program representations. We show that our approach, up
to parameter choice, subsumes many of these representations, such as the SSA,
SSI and e-SSA forms. In particular, our algorithms are faster, simpler and more
frugal than the previous techniques used to construct SSI - Static Single
Information - form programs. We produce intermediate representations isomorphic
to Choi et al.'s Sparse Evaluation Graphs (SEG) for the family of data-flow
problems that can be partitioned per variables. However, contrary to SEGs, we
can handle - sparsely - problems that are not in this family
Generalized Points-to Graphs: A New Abstraction of Memory in the Presence of Pointers
Flow- and context-sensitive points-to analysis is difficult to scale; for
top-down approaches, the problem centers on repeated analysis of the same
procedure; for bottom-up approaches, the abstractions used to represent
procedure summaries have not scaled while preserving precision.
We propose a novel abstraction called the Generalized Points-to Graph (GPG)
which views points-to relations as memory updates and generalizes them using
the counts of indirection levels leaving the unknown pointees implicit. This
allows us to construct GPGs as compact representations of bottom-up procedure
summaries in terms of memory updates and control flow between them. Their
compactness is ensured by the following optimizations: strength reduction
reduces the indirection levels, redundancy elimination removes redundant memory
updates and minimizes control flow (without over-approximating data dependence
between memory updates), and call inlining enhances the opportunities of these
optimizations. We devise novel operations and data flow analyses for these
optimizations.
Our quest for scalability of points-to analysis leads to the following
insight: The real killer of scalability in program analysis is not the amount
of data but the amount of control flow that it may be subjected to in search of
precision. The effectiveness of GPGs lies in the fact that they discard as much
control flow as possible without losing precision (i.e., by preserving data
dependence without over-approximation). This is the reason why the GPGs are
very small even for main procedures that contain the effect of the entire
program. This allows our implementation to scale to 158kLoC for C programs
JITed: A Framework for JIT Education in the Classroom
The study of programming languages is a rich field within computer science, incorporating both the abstract theoretical portions of computer science and the platform specific details. Topics studied in programming languages, chiefly compilers or interpreters, are permanent fixtures in programming that students will interact with throughout their career. These systems are, however, considerably complicated, as they must cover a wide range of functionality in order to enable languages to be created and run. The process of educating students thus requires that the demanding workload of creating one of the systems be balanced against the time and resources present in a university classroom setting. Systems building upon these fundamental systems can become out of reach when the number of preceding concepts and thus classes are taken into account. Among these is the study of just-in-time (JIT) compilers, which marry the processes of interpreters and compilers for the purposes of a flexible and fast runtime.
The purpose of this thesis is to present JITed, a framework within which JIT compilers can be developed with a time commitment and workload befitting of a classroom setting, specifically one as short as ten weeks. A JIT compiler requires the development of both an interpreter and a compiler. This poses a problem, as classes teaching compilers and interpreters typically feature the construction of one of those systems as their term project. This makes the construction of both within the same time span as is usually allotted for a single system infeasible. To remedy this, JITed features a prebuilt interpreter, that provides the runtime environment necessary for the compiler portion of a JIT compiler to be built. JITed includes an interface for students to provide both their own compiler and the functionality to determine which portions of code should be compiled. The framework allows for important concepts of both compilers in general and JIT compilers to be taught in a reasonable timeframe
- …