54 research outputs found

    Transparent pointer compression for linked data structures

    Full text link
    64-bit address spaces are increasingly important for modern applications, but they come at a price: pointers use twice as much memory, reducing the effective cache capacity and memory bandwidth of the system (compared to 32-bit ad-dress spaces). This paper presents a sophisticated, auto-matic transformation that shrinks pointers from 64-bits to 32-bits. The approach is “macroscopic, ” i.e., it operates on an entire logical data structure in the program at a time. It allows an individual data structure instance or even a subset thereof to grow up to 232 bytes in size, and can compress pointers to some data structures but not others. Together, these properties allow efficient usage of a large (64-bit) ad-dress space. We also describe (but have not implemented) a dynamic version of the technique that can transparently expand the pointers in an individual data structure if it ex-ceeds the 4GB limit. For a collection of pointer-intensive benchmarks, we show that the transformation reduces peak heap sizes substantially by (20 % to 2x) for several of these benchmarks, and improves overall performance significantly in some cases

    Technical Report: Region and Effect Inference for Safe Parallelism

    Get PDF
    In this paper, we present the first full regions-and-effects inference algorithm for explicitly parallel fork-join programs. We infer annotations inspired by Deterministic Parallel Java (DPJ) for a type-safe subset of C++. We chose the DPJ annotations because they give the \emph{strongest} safety guarantees of any existing concurrency-checking approach we know of, static or dynamic, and it is also the most expressive static checking system we know of that gives strong safety guarantees. This expressiveness, however, makes manual annotation difficult and tedious, which motivates the need for automatic inference, but it also makes the inference problem very challenging: the code may use region polymorphism, imperative updates with complex aliasing, arbitrary recursion, hierarchical region specifications, and wildcard elements to describe potentially infinite sets of regions. We express the inference as a constraint satisfaction problem and develop, implement, and evaluate an algorithm for solving it. The region and effect annotations inferred by the algorithm constitute a checkable proof of safe parallelism, and it can be recorded both for documentation and for fast and modular safety checking.Ope

    The influence of random delays on parallel execution times

    Full text link

    Automatic Pool Allocation: Compile-Time Control of Data Structure Layout in the Heap

    Get PDF
    Despite the potential importance of data structure layouts and traversal patterns, compiler transformations on pointer-intensive programs are performed primarily using pointer analysis, and not by controlling and using information about the layout of high-level data structures. This paper describes a compiler transformation called \emph{Automatic Pool Allocation} that segregates instances of ``logical'' data structures in the heap into distinct pools, and allows different heuristics to be used to partially control the internal layout of those data structures. Because these are rigorous transformations, their results, combined with pointer analysis information, can be used to perform further compiler analyses and transformations, and we briefly list a few examples. Automatic Pool Allocation also provides several direct performance benefits for pointer intensive programs, most importantly, that traversals of a logical data structure allocated to a separate pool can have better spatial locality and smaller working sets. We evaluate the performance and cache behavior of the code transformed by the automatic pool allocation transformation on a series of heap-intensive and general-purpose benchmarks, and find that it speeds up several C programs by 10-40\% percent or more, and does not hurt (or help) other programs

    Types, Regions, and Effects for Safe Programming with Object-Oriented Parallel Frameworks

    Get PDF
    Object-oriented frameworks can make parallel programming easier by providing generic parallel algorithms such as map, reduce, or scan, and letting the user fill in the details with sequential code. However, such frameworks can produce incorrect behavior if they are not carefully used, e.g., if a user-supplied function performs an unsynchronized access to a global variable. We develop novel techniques that a framework designer can use to prevent such errors. Building on a language (Deterministic Parallel Java, or DPJ) with an expressive region-based type and effect system, we show how to write a framework API that enables sound reasoning about the effects of unknown user-supplied methods. We also describe novel extensions to DPJ that enable generic types and effects --- essential for flexible frameworks --- while retaining soundness. Finally, we show how to make the reasoning modular: using any desired testing or verification technique, the framework author can guarantee noninterference subject to the API constraints; and the compiler can check the constraints to provide a noninterference guarantee for the entire user program. We evaluate our technique by using it to write two parallel frameworks and two realistic parallel algorithms.NSF grant CCF 07-02724NSF grant CNS 07-20772Microsoft and Intel through UPCRC Illinoisunpublishednot peer reviewe

    Parallel Programming Must Be Deterministic By Default

    Get PDF
    We examine the problem of providing a parallel programming model that guarantees deterministic semantics. We propose a research agenda focusing on the following questions: 1. How to guarantee determinism in a modern object-oriented language; 2. How to provide sound guarantees when parts of the program either cannot be proved deterministic or have "harmless" nondeterminism; 3. How to specify explicit non-determinism when needed; and 4. How to make it easier to port programs to the language

    An Empirical Study of Reported Bugs in Server Software with Implications for Automated Bug Diagnosis

    Get PDF
    Reproducing bug symptoms is a prerequisite for performing automatic bug diagnosis. Do bugs have characteristics that ease or hinder automatic bug diagnosis? In this paper, we conduct a thorough empirical study of several key characteristics of bugs that affect reproducibility at the production site. We examine randomly selected bug reports of six server applications and consider their implications on automatic bug diagnosis tools. Our results are promising. From the study, we find that nearly 82% of bug symptoms can be reproduced deterministically by re-running with the same set of inputs at the production site. We further find that very few input requests are needed to reproduce most failures; in fact, just one input request after session establishment suffices to reproduce the failure in nearly 77% of the cases. We describe the implications of the results on reproducing software failures and designing automated diagnosis tools for production runs.published or accepted for publicatio

    Enforcing Alias Analysis for Weakly Typed Languages

    Get PDF
    Static analysis of programs in weakly typed languages such as C and C++ generally is not guaranteed to be sound because programs in these languages may have undetected memory errors such as dangling pointer references, references using uninitialized pointers, and array bounds overflow. Optimizing compilers can produce undefined results for programs with memory access errors, but this is quite undesirable for many tools that aim to analyze security and reliability properties with guarantees of soundness. We describe a relatively simple compilation strategy for standard C programs that guarantees sound semantics for a call graph, an aggressive interprocedural pointer analysis, and type information for a subset of memory. These provide the foundation for sophisticated static analyses to be applied to such programs with a guarantee of soundness. Our work builds on a previously published transformation called Automatic Pool Allocation to ensure that hard-to-detect memory errors (dangling pointer references and certain array bounds errors) cannot invalidate the call graph, points-to information or type information. The key insight behind our approach is that pool allocation can be used to create a run-time partitioning of memory that matches the compile-time memory partitioning in a points-to graph, and efficient checks can be used to isolate the run-time partitions. Furthermore, we show that the sound analysis information enable static checking techniques that reliably eliminate many run-time checks. Our approach requires no source code changes, allows memory to be managed explicitly, and do not use meta-data on pointers or individual tag bits for memory thus avoiding any external library compatibility issues. Using many benchmarks and system codes, we show experimentally that the run-time overheads are low (less than 10% in nearly all cases and 30% in the worst case we have seen). We also show the effectiveness of reliable static analyses for eliminating run-time checks
    corecore