614 research outputs found
Heap Abstractions for Static Analysis
Heap data is potentially unbounded and seemingly arbitrary. As a consequence,
unlike stack and static memory, heap memory cannot be abstracted directly in
terms of a fixed set of source variable names appearing in the program being
analysed. This makes it an interesting topic of study and there is an abundance
of literature employing heap abstractions. Although most studies have addressed
similar concerns, their formulations and formalisms often seem dissimilar and
some times even unrelated. Thus, the insights gained in one description of heap
abstraction may not directly carry over to some other description. This survey
is a result of our quest for a unifying theme in the existing descriptions of
heap abstractions. In particular, our interest lies in the abstractions and not
in the algorithms that construct them.
In our search of a unified theme, we view a heap abstraction as consisting of
two features: a heap model to represent the heap memory and a summarization
technique for bounding the heap representation. We classify the models as
storeless, store based, and hybrid. We describe various summarization
techniques based on k-limiting, allocation sites, patterns, variables, other
generic instrumentation predicates, and higher-order logics. This approach
allows us to compare the insights of a large number of seemingly dissimilar
heap abstractions and also paves way for creating new abstractions by
mix-and-match of models and summarization techniques.Comment: 49 pages, 20 figure
Compile-Time Analysis on Programs with Dynamic Pointer-Linked Data Structures
This paper studies static analysis on programs
that create and traverse dynamic pointer-linked data structures.
It introduces a new type of auxiliary structures, called {\em link graphs},
to depict the alias information of pointers and connection relationships
of dynamic pointer-linked data structures.
The link graphs can be used by compilers to detect side effects,
to identify the patterns of traversal, and to gather the
DEF-USE information of dynamic pointer-linked data structures.
The results of the above compile-time analysis are essential
for parallelization and optimizations on communication and
synchronization overheads.
Algorithms that perform compile-time analysis on side effects
and DEF-USE information using link graphs will be proposed
An Integrated Program Representation for Loop Optimizations
Inspite of all the advances, automatic parallelization has not entered the general purpose compiling environment for several reasons.
There have been two distinct schools of thought in parallelization domain namely, affine and non-affine which have remained incompatible with each other over the years. Thus, a good practical compiler will have to be able to analyze and parallelize any type of code - affine or non-affine or a mix of both.
To be able to achieve the best performance, compilers will have to derive the order of transformations best suitable for a given program on a given system. This problem, known as "Phase Ordering", is a very crucial impedance for practical compilers, more so for parallelizing compilers. The ideal compiler should be able to consider various orders of transformations and reason about the performance benefits of the same.
In order to achieve such a compiler, in this paper, we propose a unified program representation which has the following characteristics:
a) Modular in nature.
b) Ability to represent both ane and non-ane transformations.
c) Ability to use detailed static run-time estimators directly on the representation
Multi-core computation of transfer matrices for strip lattices in the Potts model
The transfer-matrix technique is a convenient way for studying strip lattices
in the Potts model since the compu- tational costs depend just on the periodic
part of the lattice and not on the whole. However, even when the cost is
reduced, the transfer-matrix technique is still an NP-hard problem since the
time T(|V|, |E|) needed to compute the matrix grows ex- ponentially as a
function of the graph width. In this work, we present a parallel
transfer-matrix implementation that scales performance under multi-core
architectures. The construction of the matrix is based on several repetitions
of the deletion- contraction technique, allowing parallelism suitable to
multi-core machines. Our experimental results show that the multi-core
implementation achieves speedups of 3.7X with p = 4 processors and 5.7X with p
= 8. The efficiency of the implementation lies between 60% and 95%, achieving
the best balance of speedup and efficiency at p = 4 processors for actual
multi-core architectures. The algorithm also takes advantage of the lattice
symmetry, making the transfer matrix computation to run up to 2X faster than
its non-symmetric counterpart and use up to a quarter of the original space
- …