Recent years have seen increasing interest in systems that reason about and manipulate executable code. Such systems can generally benefit from information about aliasing.
Introduction
Recent years have seen increasing interest in reasoning about and manipulating executable files [5, 15, 20, 25, 27, 30, 31, 33]. When working with an executable file, we typically have information about the entire program (including, potentially, library functions) that is usually not available at compile time. Because of this, code manipulation and optimization at this level offers benefits that are difficult or impossible to obtain using traditional compilers. As with the compilation of source-level programs, code transformations on executable code can benefit greatly from pointer alias information. For example, inlining library routines may open up opportunities for moving invariant load instructions out of loops, but alias information is needed in order to identify such invariant load instructions. To obtain the full benefits of a superscalar architecture such as the DEC Alpha, link-time optimizers such as Spike [5], alto [10], and OM [30] need to carry out instruction scheduling again after link-time optimizations; without pointer alias information, however, the scheduler must be conservative in its treatment of loads and stores, and this can limit the amount of code reordering that is possible. As a final example, it may be possible to scavenge registers at link time, e.g., by examining the register usage of library functions, but the ability to use such scavenged registers effectively is likely to be limited in the absence of pointer alias information.
There is an extensive body of work on pointer alias analysis of various kinds (see Section 6). In almost all cases, these are high-level analyses, carried out on representations of source programs in terms of source language constructs, and typically disregarding "nasty" features such as type casts, pointer arithmetic, and out-of-bounds array accesses. Such analyses turn out, unfortunately, to be of limited utility at the machine code level, because at this level all we have are the nasty features. The contents of registers and memory words are untyped bit-strings, so the issue of type casts is in some sense moot: everything is potentially an address. Memory accesses typically involve some address arithmetic to compute a base address into a register, followed by the use of a displacement off the base address to carry out the actual memory reference. Address arithmetic may also arise due to particular language features, e.g., the use of "tag bits" in dynamically typed languages to indicate the type of the value pointed at. Dereferencing operations in the executable code for such programs will involve nontrivial arithmetic involving the tag bits that is invisible, and irrelevant, at the source level. At the level of executable programs, we cannot tell what source language a particular piece of code was derived from, and different components of a program might have been written in different source languages, so we must be able to deal with all such address arithmetic in a reasonable way. If the number of arguments to a function is large enough, some of the arguments may have to be passed on the stack. In such a case, the arguments passed on the stack will typically reside at the top of the caller's stack frame, and the callee will "reach into" the caller's frame to access them: this is nothing but an out-of-bounds array reference.
Finally, executable programs may include library functions, in hand-written assembly code, that violate familiar and comfortable source-level assumptions, e.g., that execution does not jump out of the middle of one function and into the middle of another (this happens, for example, in some Fortran library routines). To illustrate some of the problems that arise, consider the fragment of C code shown in Figure 1, together with the corresponding assembly code.¹ The point to note is the extensive use of address arithmetic to access memory, even in this very simple program fragment. For example, in order to determine whether instructions (10) and (11) might write to the same memory location, we need to be able to reason about the contents of registers r16 and r17, which are defined through the arithmetic operations in instructions (5) and (6). As this example illustrates, pointer arithmetic cannot be ignored during alias analysis at the machine code level.
In this paper, we describe a low-level, flow-sensitive, context-insensitive interprocedural pointer alias analysis algorithm, designed and implemented in the context of the alto link-time optimizer [10], that can handle significant pointer arithmetic and features, such as out-of-bounds references, that are ignored by most existing alias analysis algorithms.
For simplicity in the discussion that follows, we assume a more or less canonical RISC instruction set. Memory is accessed only through explicit load and store instructions, which have the form load $reg_1$, $k(reg_2)$ and store $reg_1$, $k(reg_2)$, where $k$ is a constant, and have the effect of reading from, or writing to, the location whose address is $k$ + contents-of($reg_2$). To model arithmetic we assume the instructions add $src_1$, $src_2$, $dest$ and mult $src_1$, $src_2$, $dest$, where $dest$ is a destination register and $src_1$ and $src_2$ are source registers; to simplify the discussion we abuse notation and allow either $src_1$ or $src_2$ to be an integer constant, denoting an immediate operand. These instructions compute, respectively, the sum and product of $src_1$ and $src_2$ into $dest$. Many other operations can be expressed in terms of these: for example, subtraction and register-to-register moves can be modelled in terms of addition, so we do not consider them separately. In addition to these we assume the usual complement of tests, conditional jumps, and direct and indirect unconditional jumps: the only effect of these instructions is to determine the control flow graph of the program, so we do not consider them explicitly in the context of alias analysis. We also ignore operations on floating point registers, since it seems unlikely that such operations would be used for address computations.
¹ The assembly code shown corresponds to that obtained using gcc -II on a DEC Alpha workstation, with some edits to enhance readability. On the Alpha, arguments to functions are typically passed in registers 16...21, and register 30 is used as the stack pointer.
Local Alias Analysis
A technique called instruction inspection, commonly used in compile-time instruction schedulers, can be used to reason about memory references within a basic block. Here, two memory reference instructions $i_1$ and $i_2$ are taken to be non-conflicting if either of the following conditions holds:
1. they use distinct offsets from the same base register $r$, and $r$ is not redefined between $i_1$ and $i_2$; or
2. one of the instructions uses a register known to point to the stack and the other uses a register known to point to the global data area.
Unfortunately, this simple approach does not work if information about address arithmetic needs to be propagated across basic block boundaries. In the next section we describe a global analysis that can be used to handle this.
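The two inspection conditions above can be sketched in a few lines of Python. This is a minimal illustration, not alto's implementation: the `MemRef` record, the `redefs` map, and the `stack_regs`/`global_regs` sets are hypothetical names introduced here, and access widths are ignored (all references are assumed to have the same size).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MemRef:
    """A load or store of the form k(base). All field names are hypothetical."""
    index: int    # position of the instruction within its basic block
    base: str     # base register name, e.g. "r30"
    offset: int   # constant displacement k

def non_conflicting(a, b, redefs, stack_regs, global_regs):
    """Return True only if inspection can show that a and b do not conflict.

    redefs maps a register name to the block positions at which it is redefined.
    """
    lo, hi = sorted((a.index, b.index))
    # Condition 1: same base register, distinct offsets, and the base is not
    # redefined by the earlier instruction or by anything before the later one.
    if a.base == b.base and a.offset != b.offset:
        if not any(lo <= p < hi for p in redefs.get(a.base, ())):
            return True
    # Condition 2: one reference goes through a stack pointer and the other
    # through a pointer into the global data area.
    if (a.base in stack_regs and b.base in global_regs) or \
       (a.base in global_regs and b.base in stack_regs):
        return True
    return False
```

Returning `False` here means only "could not prove independence", which is the conservative answer a scheduler needs.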
Residue-Based Global Alias Analysis

The Basic Idea
Figure 1: A fragment of a C program and the corresponding assembly code
An alias analysis will in general associate each register with a set of possible addresses at each program point, so we need to abstract sets of addresses to descriptions, or "abstract address sets." These need to be easy to compute and compactly representable, with operations such as union, intersection, checking containment, etc., that are cheap enough to be practical for the analysis of large programs. A simple way to satisfy these criteria is to consider only some fixed number, say $m$, of the low-order bits of an address. That is, addresses are represented by their mod-$k$ residues, where $k = 2^m$. The set of all mod-$k$ residues is $Z_k = \{0, \ldots, k-1\}$. An abstract address set can then be represented as a bit vector of length $k$; since $m$ (and, therefore, $k = 2^m$) is fixed, set operations such as union, intersection, checking containment, etc., can be carried out in $O(1)$ bit-vector operations. This representation can cope with address arithmetic, e.g., as illustrated in Figure 1, since such arithmetic translates in a straightforward way to mod-$k$ arithmetic (see, for example, [17]). Finally, since $x \bmod k \neq (x \pm \delta) \bmod k$ for $0 < \delta < 2^m$, the representation can distinguish between addresses involving distinct "small" displacements (i.e., less than $2^m$) from a base register.
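As a rough illustration of the bit-vector representation described above, the following Python sketch stores a set of mod-$k$ residues in a single integer, with bit $r$ set when residue $r$ is present. The helper names are hypothetical, and $m = 6$ is chosen only to match the mod-64 configuration reported later in the paper.

```python
M_BITS = 6
K = 1 << M_BITS        # k = 2^m = 64 residues
FULL = (1 << K) - 1    # bit vector for Z_k, i.e. every residue is possible

def residue_set(addresses):
    """Abstract a collection of concrete addresses to a mod-K bit vector."""
    bits = 0
    for a in addresses:
        bits |= 1 << (a % K)
    return bits

def union(x, y):
    return x | y

def intersect(x, y):
    return x & y

def contains(x, y):
    """True if every residue in y is also in x."""
    return (y & ~x) == 0

def members(bits):
    """List the residues present in a bit vector, in increasing order."""
    return [r for r in range(K) if (bits >> r) & 1]
```

Each set operation is a single bitwise operation on a machine word when $k$ equals the word size, which is the property the text relies on.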
It turns out that mod-$k$ residues are not, by themselves, adequate for our purposes. The problem is that in many cases we won't be able to predict the actual value of a register $r$ (e.g., the stack pointer) at a program point, which means we won't be able to say anything about a displacement $k$ from $r$, i.e., the address corresponding to $k(r)$, either. To deal with this problem we extend abstract address sets to address descriptors, which take an additional component that refers to an instruction: an address descriptor is a pair $\langle I, M \rangle$, where $I$ is either an instruction or one of the distinguished values {NONE, ANY}, and $M$ is a set of mod-$k$ residues. Given an address descriptor $A = \langle I, M \rangle$, the instruction $I$ is said to be the defining instruction of $A$, while $M$ is called the residue set of $A$.
The intuition is that given an address descriptor $\langle I, M \rangle$, $M$ denotes a set of mod-$k$ residues relative to whatever value is computed by instruction $I$. A value of NONE indicates that the corresponding residue set represents mod-$k$ residues of absolute addresses, while a value of ANY indicates that the address descriptor denotes all possible addresses. More formally, suppose that we are given an operational semantics for the instruction set under consideration (such a semantics is conceptually simple, if somewhat tedious, to specify for the simple instruction set considered here: we omit a formal specification due to space constraints, and rely instead on the informal description of the instructions given at the end of Section 1). Given a program $P$ and an instruction $I$ in $P$, let $val_P(I)$ denote the set of values $w$ such that, for some input to $P$, there is an execution path from the entry point of $P$ to the instruction $I$ that causes $I$ to compute $w$ into its destination register ($val_P(I) = \emptyset$ if $I$ does not compute a value into a register, or if control never reaches $I$). Extend this to the special values NONE and ANY as follows: for any program $P$, $val_P(\mathrm{NONE}) = \{0\}$, and $val_P(\mathrm{ANY})$ is the set of all values. Then, for an analysis using mod-$k$ residues, the set of addresses denoted by an address descriptor $A = \langle I, M \rangle$ in $P$ (that is, the "concretization" of $A$ in the context of $P$) is:
$$\{\, w + ik + x \mid w \in val_P(I),\ x \in M,\ i \geq 0 \,\}.$$
As this indicates, different values may be computed by different executions of a particular instruction. This implies that, for the purposes of alias analysis, it is not enough to consider address descriptors in isolation. This issue is addressed in more detail in Section 3.3.
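To make the concretization concrete, here is a small Python sketch that tests whether a given address belongs to the set denoted by a descriptor. It is an illustration only: the `Descriptor` class and the `val_p` dictionary (mapping an instruction id to the set of values it may compute, standing in for $val_P$) are assumptions made for this example; a real analysis never has $val_P$ available and works purely on the abstract side.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union

NONE, ANY = "NONE", "ANY"   # the two distinguished defining-instruction values

@dataclass(frozen=True)
class Descriptor:
    instr: Union[int, str]     # an instruction id, or NONE / ANY
    residues: FrozenSet[int]   # the residue set M, a subset of {0, ..., k-1}

def may_denote(desc, addr, k, val_p):
    """True if addr = w + i*k + x for some w in val_p(I), x in M, i >= 0."""
    if desc.instr == ANY:
        return True            # val_P(ANY) is the set of all values
    bases = {0} if desc.instr == NONE else val_p.get(desc.instr, set())
    # With 0 <= x < k, addr = w + i*k + x (i >= 0) holds exactly when
    # addr >= w and (addr - w) mod k is one of the residues in M.
    return any(addr >= w and (addr - w) % k in desc.residues for w in bases)
```

For instance, with `val_p = {1: {1000}}` and the descriptor `Descriptor(1, frozenset({8, 40}))` under mod-64 residues, address 1008 is denoted but address 1016 is not.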
The relative precision of different address descriptors can be characterized via a binary relation $\preceq$ on address descriptors. It is straightforward to show that $\preceq$ is reflexive and transitive, i.e., a preorder. It can be extended to a partial order in the usual way: define the relation $\sim$ as $A_1 \sim A_2$ if and only if $A_1 \preceq A_2$ and $A_2 \preceq A_1$ (it is easy to show that this is an equivalence relation), and consider the quotient of $\preceq$ with respect to $\sim$. The set of address descriptors forms a lattice with respect to this partial order. In the remainder of this discussion, we abuse notation and write $\preceq$ to refer to the resulting partial order. In particular, the equivalence class containing $\langle I, Z_k \rangle$ for all $I$, as well as $\langle \mathrm{ANY}, M \rangle$ for all $M$, denotes a total lack of information, and is written as $\bot$; the equivalence class containing $\langle I, \emptyset \rangle$ for all $I$ denotes the empty set of addresses and is written as $\top$. Our analysis associates an address descriptor with each register at each program point of interest. If a register $r$ has an associated address descriptor $\langle I, M \rangle$ at a program point, we will sometimes abuse terminology and refer to instruction $I$ as the defining instruction for $r$ at that point.
The Analysis Algorithm
Effects of Individual Instructions
As mentioned earlier, the defining instruction component of an address descriptor allows us to refer to mod-$k$ residues relative to "whatever value is computed by the defining instruction." When examining an instruction $I$ with destination register $r$, if we can't say anything about the value of $r$ after instruction $I$, then instead of setting the address descriptor for $r$ to $\bot$, we use $I$ as the defining instruction for $r$ and associate the address descriptor $\langle I, \{0\} \rangle$ with $r$ at the point immediately after $I$. To simplify the discussion, we assume that an immediate operand $c$ yields an address descriptor $\langle \mathrm{NONE}, \{c \bmod k\} \rangle$ in an analysis based on mod-$k$ residues. Individual instructions are analyzed as shown in Figure 2. The reasoning behind these operations is as follows:
- For load instructions, our analysis currently doesn't keep track of the contents of memory locations, except for read-only sections of the text and data segments. Otherwise, we can say nothing about the contents of $r$ after the load instruction $I$, so the resulting address descriptor is $\langle I, \{0\} \rangle$.
- A store instruction does not affect address descriptors since it does not affect the contents of any register.
- For an instruction add $src_a$, $src_b$, $dest$, Figure 2 shows two cases. The correctness of the first case follows straightforwardly from the rules for mod-$k$ arithmetic [17]; the second case is obviously safe, but merits some discussion: if $A_a \sim \bot$, $A_b \sim \bot$, or $I_a \neq I_b$, it's easy to see that we can't say anything about the result of the operation; if $I_a = I_b = I_0$ for some $I_0$, it's tempting to think that the resulting address descriptor could be given as $\langle I_0, M' \rangle$, where $M' = \{(x_a + x_b) \bmod k \mid x_a \in M_a, x_b \in M_b\}$, but this is not the case, since $M'$ doesn't account for the fact that the values being added have, as components, two (possibly different) values from $val_P(I_0)$.
- For an instruction mult $src_a$, $src_b$, $dest$, Figure 2 shows three cases. The correctness of the first case follows easily from the rules for mod-$k$ arithmetic; the second case can be thought of as "widening" $A_b$ to $\langle \mathrm{NONE}, Z_k \rangle$, which is obviously safe, and then applying the first case; the reasoning for the third case is analogous to that for the add instruction above.
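The case analysis above can be sketched as transfer functions over descriptors. Figure 2 itself is not reproduced in this text, so the exact side conditions below are assumptions inferred from the discussion: the first add case is taken to apply when at least one operand is an absolute (NONE) descriptor and neither operand is $\bot$, the first mult case when both operands are absolute, and the second mult case when exactly one is. Descriptors are plain `(instr, residues)` tuples and all function names are hypothetical.

```python
NONE, ANY = "NONE", "ANY"   # distinguished defining-instruction values
K = 64                      # modulus k = 2^m (mod-64, as in the reported implementation)

def fresh(instr):
    """Fallback case: 'whatever instr computes', i.e. the descriptor <instr, {0}>."""
    return (instr, frozenset({0}))

def is_bottom(d):
    """<ANY, M> and <I, Z_k> both denote a total lack of information."""
    return d[0] == ANY or len(d[1]) == K

def transfer_add(instr, a, b):
    # Assumed first case: adding an absolute quantity shifts the residues
    # relative to the other operand's defining instruction.
    if NONE in (a[0], b[0]) and not (is_bottom(a) or is_bottom(b)):
        base = b[0] if a[0] == NONE else a[0]
        return (base, frozenset((x + y) % K for x in a[1] for y in b[1]))
    # Otherwise (including two operands relative to the same instruction).
    return fresh(instr)

def transfer_mult(instr, a, b):
    if a[0] != NONE:
        a, b = b, a          # put an absolute operand first, if there is one
    if a[0] == NONE:
        # If b is not absolute, widen it to <NONE, Z_k> before multiplying.
        other = b[1] if b[0] == NONE else range(K)
        return (NONE, frozenset((x * y) % K for x in a[1] for y in other))
    return fresh(instr)
```

Under these assumptions, multiplying an unknown value by the constant 8 still yields the useful fact that the result is a multiple of 8 modulo 64, while adding two values that are both relative to the same instruction falls back to a fresh descriptor, as the text argues it must.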
Figure 2 (fragments): in the final "otherwise" case for add, we can't say anything about the result of the operation, so the address descriptor for $dest$ after instruction $I$ is taken to be $\langle I, \{0\} \rangle$; likewise, in the final "otherwise" case for mult, we can't say much about the result of the multiplication, so the address descriptor for $dest$ after instruction $I$ is $\langle I, \{0\} \rangle$.
In typical RISC code, the most commonly encountered address expression by far involves a fixed displacement off a base register, which corresponds to the add instruction discussed above. As such it is especially important that this case be handled efficiently. Suppose that the instruction under consideration is add $reg_a$, $c$, $reg_b$. It turns out that given an address descriptor $\langle I, M \rangle$ for $reg_a$, with $M$ represented as a bit vector, the bit vector $M'$ in the descriptor $\langle I, M' \rangle$ for $reg_b$ can be obtained simply by "rotating up" the bit vector for $M$ by $c$ bits, and this is easy to implement efficiently. As an example, suppose that $M = \{1, 5, 6\}$ in a mod-8 residue analysis, and $c = 3$; then $M' = \{4, 8, 9\} \bmod 8 = \{4, 0, 1\}$. If we represent these sets as bit vectors with the smallest element on the right, then $M$ = 01100010; rotating up (i.e., to the left) by 3 bits gives us the vector 00010011, which is precisely the bit vector for $M'$.
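The rotation described above is short enough to write out. The sketch below is one possible Python rendering, with `rotate_up` a hypothetical helper name; it reproduces the mod-8 example from the text.

```python
def rotate_up(bits, c, k):
    """Rotate a k-bit residue bit vector left by c positions.

    Bit r of the input (residue r) moves to bit (r + c) mod k, which is the
    effect of adding the constant c to every residue in the set.
    """
    c %= k                       # also maps negative displacements into range
    mask = (1 << k) - 1
    return ((bits << c) | (bits >> (k - c))) & mask

m = 0b01100010                   # M = {1, 5, 6} in a mod-8 analysis
m_prime = rotate_up(m, 3, 8)     # M' = {4, 0, 1}
print(format(m_prime, "08b"))    # prints 00010011
```

Because Python integers are unbounded, the explicit mask is needed to discard the bits shifted past position $k - 1$; in C on a 64-bit word with $k = 64$ the same effect comes from the word width.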
Propagating Address Descriptors
Conceptually, if we consider all possible execution paths through a program, each register at each program point will correspond to a set of values; abstracting from this, one would expect an analysis to map each register to a set of address descriptors at each program point. Given the handling of individual instructions as described in the previous section, the analysis is now a conceptually straightforward forward dataflow analysis where we compute the meet-over-all-paths solution, with union as the meet operator [1].
It turns out that if each register, at each program point, is mapped to a set of address descriptors, the memory requirements for the analysis can become excessive for large programs. This is partly because fully linked executables tend to be considerably larger than source language modules, and partly because reasoning about address arithmetic is usually less precise than, say, reasoning about aliasing at the source level. As a pragmatic measure, therefore, a widening operation [8] is used to ensure that at each program point, each register is mapped to a singleton set of address descriptors or, equivalently, a single address descriptor. As mentioned in Section 3.1, the set of address descriptors forms a lattice with respect to the precision ordering $\preceq$. The widening operation $\nabla$ is defined to be simply the meet operation with respect to $\preceq$. In effect, what this does is that if a program point $B$ has two predecessors $B_0$ and $B_1$, such that the address descriptors for a register $r$ at $B_0$ and $B_1$ are $A_0 = \langle I_0, M_0 \rangle$ and $A_1 = \langle I_1, M_1 \rangle$ respectively, where neither $A_0$ nor $A_1$ is $\top$, and $I_0 \neq I_1$, then the address descriptor for $r$ at $B$ is $A_0 \nabla A_1 = \bot$.
While this widening results in "less accurate" information in some sense (this is reflected in the experimental results on the precision of our analysis shown in Table 1), it doesn't really change the alias relationships that are determined. To see this, consider a basic block $B$ with two predecessors $B_0$ and $B_1$. Suppose that we have a register $r_a$ whose address descriptors at the exit from $B_0$ and $B_1$ are given by $\langle I_a^0, M_a^0 \rangle$ and $\langle I_a^1, M_a^1 \rangle$ respectively, and we want to determine whether this is possibly aliased to a register $r_b$, with address descriptor $\langle I_b, M_b \rangle$, at the entry to $B$. If the defining instructions from two address descriptors are different, we can't say much about any relationship that may hold between them. This means that if $I_a^0 \neq I_a^1$ it will necessarily be the case that $I_b$ will be different from at least one of $I_a^0$ and $I_a^1$, leading us to conclude that we cannot rule out aliasing between $r_a$ and $r_b$: this is the same conclusion as that from the result of the widening operation. Conversely, if $I_a^0 = I_a^1 = I_b$, then whether or not $r_a$ and $r_b$ are possible aliases depends on whether or not $M_b$ has a non-empty intersection with $M_a^0 \cup M_a^1$; again, this is the same as with the widening operation.
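A minimal sketch of the widening step at a join point, written from the behaviour described above rather than from alto's source: descriptors with the same defining instruction merge by taking the union of their residue sets, $\top$ is the identity, and differing defining instructions collapse to $\bot$. Descriptors are `(instr, residues)` tuples, and the helper names and the choice of a single canonical `BOTTOM` value are assumptions made for the example.

```python
ANY = "ANY"
K = 64                                   # mod-64 residues
BOTTOM = (ANY, frozenset(range(K)))      # one representative of the class "no information"

def is_top(d):
    """<I, {}> denotes the empty set of addresses."""
    return d[0] != ANY and len(d[1]) == 0

def is_bottom(d):
    """<ANY, M> and <I, Z_k> both denote a total lack of information."""
    return d[0] == ANY or len(d[1]) == K

def widen(a, b):
    """Combine the descriptors reaching a join point from two predecessors."""
    if is_top(a):
        return b
    if is_top(b):
        return a
    if is_bottom(a) or is_bottom(b) or a[0] != b[0]:
        return BOTTOM                    # different defining instructions: give up
    merged = (a[0], a[1] | b[1])         # same defining instruction: union of residues
    return BOTTOM if is_bottom(merged) else merged
```

For example, `widen((1, frozenset({8})), (1, frozenset({40})))` gives `(1, frozenset({8, 40}))`, whereas replacing the second defining instruction by `2` gives `BOTTOM`.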
The resulting analysis is reasonably memory-efficient: for each basic block we need two address descriptors per register, one for the IN set, at the entry to the block, and one for the OUT set, at the exit. Thus, for a given choice of $k$, the analysis requires $2RN(k+w)$ bits of memory for a program with $N$ basic blocks on a machine with $R$ registers, where $w$ is the number of bits per machine word.⁵
Reasoning about Alias Relationships
Given two address descriptors $A_1 = \langle I_1, M_1 \rangle$ and $A_2 = \langle I_2, M_2 \rangle$ at two points in a program, under what conditions can we conclude that they definitely do not refer to the same address? If $I_1 \neq I_2$ we cannot say much about any relationship that may hold between $A_1$ and $A_2$, and so have to assume that they may refer to the same location. However, it is not sufficient to require that $I_1 = I_2$ and $M_1 \cap M_2 = \emptyset$, since the value computed by a particular instruction may be different when that instruction is executed at different times. Proposition 3.1 gives a simple sufficient condition for determining that two address expressions denote disjoint sets of addresses.
⁵ This can be reduced to $RN(k+w)$ bits, as in our implementation, by storing only OUT sets, since the IN set of a block can be computed fairly easily from the OUT sets of its predecessors.
Proof. Conditions (i) and (ii) ensure that both the program points $p_1$ and $p_2$ see the same value computed by instruction $I$. Condition (iii) then ensures that relative to this value, the set of addresses referred to at $p_1$ is disjoint from that referred to at $p_2$.
Example 3.1 As an example of the application of this analysis to a real program, Figure 3 shows the flow graph of the function jpeg_idct_ifast(), which implements a fast integer inverse discrete cosine transform, from the SPEC95 benchmark program ijpeg. To reduce clutter, only a few relevant instructions are shown explicitly: the number in brackets at the lower left-hand corner of each basic block indicates the total number of instructions in that basic block. Register r30 is the stack pointer, while r21 is used to walk through a local array of structures with a stride of 32 bytes.
Using the current implementation of our analysis, which uses mod-64 residues, the address descriptor for register r21 immediately after instruction (2) in block B6 is computed as $\langle (1), \{8\} \rangle$, where (1) is the instruction in block B1 that defines the value of r30. Each iteration of the loop B7-B8-B9-B10 increments r21 by 32, so the address descriptor for r21 on entry to block B9 is $\langle (1), \{8, 40\} \rangle$; however, register r30 is not changed in the loop, so its address descriptor in B9 is $\langle (1), \{0\} \rangle$. Since the requirements of Proposition 3.1 are trivially satisfied within block B9, we can conclude from this that the store instruction (4), which assigns to location 80(r30), refers to a different location than instruction (5), which accesses location 0(r21).
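The arithmetic behind Example 3.1 can be replayed in a few lines of Python. This is only a check of the residue computation, under the assumption (stated in the example) that both descriptors share defining instruction (1) and that the conditions of Proposition 3.1 hold inside block B9; access sizes are ignored, and `displace` is a hypothetical helper.

```python
K = 64  # mod-64 residues

def displace(residues, c):
    """Residues after adding the constant displacement c."""
    return frozenset((x + c) % K for x in residues)

# r21 starts with residue 8 relative to instruction (1); each loop iteration
# adds 32. Iterate until the residue set at the loop head stops growing.
r21 = frozenset({8})
while True:
    widened = r21 | displace(r21, 32)
    if widened == r21:
        break
    r21 = widened

r30 = frozenset({0})                 # r30 is the value defined by instruction (1)
store_addr = displace(r30, 80)       # store to 80(r30)
load_addr = displace(r21, 0)         # access to 0(r21)

print(sorted(r21))                   # prints [8, 40]
print(sorted(store_addr))            # prints [16]
print(store_addr.isdisjoint(load_addr))   # prints True
```

The fixpoint is reached after one step because $40 + 32 = 72 \equiv 8 \pmod{64}$, and the store's residue 16 is in neither position, so the two references cannot overlap under the stated assumptions.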
Alias Analysis in alto
Alto ("Another Link-Time Optimizer"), a prototype link-time optimizer we have implemented [10], uses a combination of an extended version of the local analysis described in Section 2, and the global analysis described in Section 3, to reason about aliases in executable code: we conclude that a pair of memory references will not access overlapping sets of locations if either analysis is able to determine that this is so. We first carry out context-insensitive interprocedural constant propagation to identify references to global addresses, followed by the global alias analysis described earlier. The extended local analysis proceeds as follows: two memory reference instructions $i_1$ and $i_2$ do not conflict if one of the following holds:
1. one of the instructions uses a register known to point to the stack and the other uses a register known to point to the global data area (note that because of the constant propagation carried out earlier, in this case $i_1$ and $i_2$ need not belong to the same basic block); or
2. $i_1$ and $i_2$, which use address expressions $k_1(r_1)$ and $k_2(r_2)$ respectively, are both in the same basic block $B$; and there are two (possibly empty) chains of instructions whose effects are to compute the value $c_1$ + contents-of($r_0$) into register $r_1$ and $c_2$ + contents-of($r_0$) into $r_2$, for some register $r_0$, such that either both chains use the same definition of $r_0$ in the block $B$, or neither uses any definition of $r_0$ in $B$; and $c_1 + k_1 \neq c_2 + k_2$.
Experimental Results
We evaluated our analysis on the SPEC95 benchmarks as well as some non-SPEC applications: agrep, a pattern matching utility [37]; appbt and appsp, computational fluid dynamics codes originally from NAS; barnes-hut, a simulation program to compute n-body gravitational interactions [2]; latex, a popular document formatting tool; and pseudoknot, a numerical benchmark that finds the 3-dimensional structure of a nucleic acid molecule. The input programs were compiled with the DEC C compiler V5.2-023 invoked as cc -O4 -Wl,-r -Wl,-d -Wl,-z -non_shared (for the C programs), and the DEC Fortran compiler version 3.8 invoked as f77 -O4 -Wl,-r -Wl,-d -Wl,-z -non_shared (for the Fortran programs), resulting in statically linked executables. The measurements reported here were carried out after first removing dead and unreachable code from these executables, as well as trivial loads, no-ops inserted for scheduling and alignment purposes, and redundant loads of the gp register, using alto [10]. The timings were obtained on a DEC Alpha workstation, with a 300 MHz Alpha 21164 processor with 512 Mbytes of main memory, running Digital Unix 4.0. Table 1 shows the precision of our analysis, while Table 2 shows its time and space requirements.
Precision
Traditionally, the precision of alias analysis algorithms is often presented in terms of the average size of points-to sets or alias sets. In our context, however, there are no points-to or alias sets: a more meaningful measure, perhaps, is the (relative) number of memory references (i.e., load and store instructions) for which the analysis is able to provide information that would not have been available otherwise. This information is presented in Table 1. The numbers presented correspond to mod-$k$ residues with $k = 64$ (this choice was determined in part by the fact that the set of mod-$k$ residues for this choice of $k$ corresponds to a bit vector that fits exactly in one 64-bit machine word), combined with the local analysis described in Section 2.
It can be seen that in the programs tested, the analysis is able to provide information for roughly 30%-60% of the memory reference instructions. Preliminary investigations indicate that much of the loss in precision occurs due to three reasons. First, since we don't keep track of the contents of memory, information about a register is lost if it is saved to memory and subsequently restored. Second, the widening operation described in Section 3.2.2 causes information to be lost if a register can have different defining instructions at different predecessors of a join point in the control flow graph. The third reason, which is related to the second, is that since our analysis is context-insensitive at the inter-procedural level, pointer arguments to a procedure with multiple call sites will become widened to $\bot$.
Cost
Table 2: Cost of Analysis. Columns: Program; Basic Blocks; Instructions; Analysis Time (sec); Memory Used. Panel (b): Non-SPEC applications.
Table 2 gives the time and space costs of our analysis. Columns 2 and 3 give the size of each benchmark, measured, respectively, in the total number of basic blocks and instructions in the program. Column 4 then gives the total analysis time in seconds, while column 5 gives the total memory requirements of the analysis in Mbytes. The analysis times range from about 2 seconds to 29 seconds, with the gcc program an outlier with a total analysis time of a little over a minute. These numbers are somewhat higher than we would like, but the reason for this is that every instruction within a basic block is examined whenever that basic block is processed. Because of the widening operation described in Section 3.2.2, the memory requirements of the analysis are linear in the number of basic blocks in the input program: we feel that this is essential if the analysis is to be usable for large programs.
Utility
At this point, the only optimization for which we have had the time to evaluate the utility of the alias analysis described here involves reducing the number of load operations executed: by using scavenged registers to eliminate some unnecessary load instructions, by moving loop-invariant load instructions (typically arising due to inlining) out of loops, and via partial redundancy elimination. Preliminary results are shown in Table 3, which gives dynamic counts of the number of load instructions for some of our benchmarks. The column NOALIAS gives the number of load operations executed in the absence of any alias analysis at all, i.e., where any pair of references to memory were considered to potentially access the same locations; the INSPECT column gives the number of load operations when we used simple inspection, as described in Section 2, for intra-block load optimizations; and the ALTO column gives the number of load operations executed when programs were optimized using our analyses to disambiguate memory references. Since all other optimizations, such as deletion of dead/unreachable code, inlining, etc., are carried out in the same way by all three versions considered, with the only difference arising out of the way in which potential conflicts in memory accesses were identified, NOALIAS forms a fair basis for comparisons. The last two columns give the percentage reduction in the number of load operations obtained using local inspection, measured as (NOALIAS - INSPECT)/NOALIAS, and global analysis, measured as (NOALIAS - ALTO)/NOALIAS, respectively.
Table 3: Utility of Analysis: Deletion of unnecessary load instructions
It can be seen, from Table 3, that improvements due to purely local alias analysis are small to nonexistent. This does not come as a surprise, since at optimization level -O4, global register allocation has already been carried out by the compiler, leaving few loads available for easy removal. Global analysis gives better results, including 4.7% of the total number of load instructions removed for the fpppp benchmark, and 6.7% for appbt. The reason for the improvement for fpppp is that it contains a very heavily executed basic block that is so large that the register pressure forces the compiler to spill a number of variables to memory; alto is able to scavenge some registers at link time and use them to retain some of the spilled variables in registers, thereby allowing the spill code to be deleted. The overall percentage improvements are, nevertheless, relatively modest; this is consistent with the results of Cooper and Lu [7]. To a great extent, the reasons for this are twofold: first, the compiler has already done a good job of removing memory operations via global register allocation; and second, in many cases, a lack of free registers prevented us from optimizing away load operations that our alias analysis had inferred as optimizable. To some extent, imprecision in our analysis, arising from the sources discussed in Section 5.1, also affected the number of memory operations deemed suitable for optimization.
Related Work
While a number of systems have been described for link-time code optimization [5, 15, 16, 27, 30, 31, 33], to the best of our knowledge, any alias analysis carried out by these systems is limited to fairly simple local analyses.
There is an extensive body of work on pointer alias analysis of various kinds (see, for example, [3, 4, 6, 9, 11, 12, 13, 14, 18, 19, 21, 22, 23, 24, 26, 28, 29, 32, 34, 35]). The work most closely related to ours is that of Wilson and Lam [35], who describe a low-level pointer alias analysis for C programs. Their work attempts to deal with "nasty" features of real programs and can handle simple pointer increments and decrements, but is unable to cope with the more complex address arithmetic common in executable code (see Example 3.1). Also, it restricts itself to C language features, and so cannot handle arithmetic arising from idiosyncrasies of other languages, e.g., manipulation of pointers with "tag bits," that may be encountered in executable code. Their algorithm is context-sensitive at the inter-procedural level, however, while our current implementation is context-insensitive (conceptually, it would not be too difficult to obtain a context-sensitive version of our algorithm, but we have not had time to implement this yet). The remaining analyses cited are all high-level analyses that typically disregard type casts, pointer arithmetic, out-of-bounds array accesses, etc. As argued earlier, such analyses are of limited utility at the machine code level.
Also related is the work on dependence analysis in the scientific computing literature (see, for example, [36, 38]). While the goals of this work are conceptually similar to ours (namely, disambiguating array references whose indices can involve arithmetic expressions), the algorithms used for dependence analysis are very different from that described here. Since dependence analysis is typically formulated as a source-level intra-procedural analysis, the analysis problems tend to be relatively small in size. Because of this, dependence analyses are able to use relatively more sophisticated, but also more expensive, algorithms than ours. We do not know of any attempts to apply such algorithms for whole-program analysis, and it is not obvious to us that the algorithms involved would scale up to problems of this size.
Conclusions
Recent years have seen increasing interest in reasoning about and manipulating executable files. Such manipulations can benefit greatly from information about aliasing. Unfortunately, there is a fundamental mismatch between the features present in executable programs and the features handled by existing pointer alias analyses: such analyses are typically formulated in terms of source-level constructs, and do not handle features such as pointer arithmetic and out-of-bounds array references, whereas these are precisely the features encountered in executable programs. This paper describes a simple algorithm that can handle these features, and which can be used for alias analysis of executable programs. In order to be practical, the algorithm is careful to keep its memory requirements low, sacrificing precision where necessary to achieve this goal. Experimental results indicate that it is nevertheless able to provide nontrivial information about roughly 30%-60% of the memory references across a variety of benchmark programs.
