119 research outputs found

    Transparent pointer compression for linked data structures

    Full text link
    64-bit address spaces are increasingly important for modern applications, but they come at a price: pointers use twice as much memory, reducing the effective cache capacity and memory bandwidth of the system (compared to 32-bit ad-dress spaces). This paper presents a sophisticated, auto-matic transformation that shrinks pointers from 64-bits to 32-bits. The approach is “macroscopic, ” i.e., it operates on an entire logical data structure in the program at a time. It allows an individual data structure instance or even a subset thereof to grow up to 232 bytes in size, and can compress pointers to some data structures but not others. Together, these properties allow efficient usage of a large (64-bit) ad-dress space. We also describe (but have not implemented) a dynamic version of the technique that can transparently expand the pointers in an individual data structure if it ex-ceeds the 4GB limit. For a collection of pointer-intensive benchmarks, we show that the transformation reduces peak heap sizes substantially by (20 % to 2x) for several of these benchmarks, and improves overall performance significantly in some cases

    Semi-Supervised Object Detection in the Open World

    Full text link
    Existing approaches for semi-supervised object detection assume a fixed set of classes present in training and unlabeled datasets, i.e., in-distribution (ID) data. The performance of these techniques significantly degrades when these techniques are deployed in the open-world, due to the fact that the unlabeled and test data may contain objects that were not seen during training, i.e., out-of-distribution (OOD) data. The two key questions that we explore in this paper are: can we detect these OOD samples and if so, can we learn from them? With these considerations in mind, we propose the Open World Semi-supervised Detection framework (OWSSD) that effectively detects OOD data along with a semi-supervised learning pipeline that learns from both ID and OOD data. We introduce an ensemble based OOD detector consisting of lightweight auto-encoder networks trained only on ID data. Through extensive evalulation, we demonstrate that our method performs competitively against state-of-the-art OOD detection algorithms and also significantly boosts the semi-supervised learning performance in open-world scenarios

    Technical Report: Region and Effect Inference for Safe Parallelism

    Get PDF
    In this paper, we present the first full regions-and-effects inference algorithm for explicitly parallel fork-join programs. We infer annotations inspired by Deterministic Parallel Java (DPJ) for a type-safe subset of C++. We chose the DPJ annotations because they give the \emph{strongest} safety guarantees of any existing concurrency-checking approach we know of, static or dynamic, and it is also the most expressive static checking system we know of that gives strong safety guarantees. This expressiveness, however, makes manual annotation difficult and tedious, which motivates the need for automatic inference, but it also makes the inference problem very challenging: the code may use region polymorphism, imperative updates with complex aliasing, arbitrary recursion, hierarchical region specifications, and wildcard elements to describe potentially infinite sets of regions. We express the inference as a constraint satisfaction problem and develop, implement, and evaluate an algorithm for solving it. The region and effect annotations inferred by the algorithm constitute a checkable proof of safe parallelism, and it can be recorded both for documentation and for fast and modular safety checking.Ope

    The influence of random delays on parallel execution times

    Full text link

    Automatic Pool Allocation: Compile-Time Control of Data Structure Layout in the Heap

    Get PDF
    Despite the potential importance of data structure layouts and traversal patterns, compiler transformations on pointer-intensive programs are performed primarily using pointer analysis, and not by controlling and using information about the layout of high-level data structures. This paper describes a compiler transformation called \emph{Automatic Pool Allocation} that segregates instances of ``logical'' data structures in the heap into distinct pools, and allows different heuristics to be used to partially control the internal layout of those data structures. Because these are rigorous transformations, their results, combined with pointer analysis information, can be used to perform further compiler analyses and transformations, and we briefly list a few examples. Automatic Pool Allocation also provides several direct performance benefits for pointer intensive programs, most importantly, that traversals of a logical data structure allocated to a separate pool can have better spatial locality and smaller working sets. We evaluate the performance and cache behavior of the code transformed by the automatic pool allocation transformation on a series of heap-intensive and general-purpose benchmarks, and find that it speeds up several C programs by 10-40\% percent or more, and does not hurt (or help) other programs

    Types, Regions, and Effects for Safe Programming with Object-Oriented Parallel Frameworks

    Get PDF
    Object-oriented frameworks can make parallel programming easier by providing generic parallel algorithms such as map, reduce, or scan, and letting the user fill in the details with sequential code. However, such frameworks can produce incorrect behavior if they are not carefully used, e.g., if a user-supplied function performs an unsynchronized access to a global variable. We develop novel techniques that a framework designer can use to prevent such errors. Building on a language (Deterministic Parallel Java, or DPJ) with an expressive region-based type and effect system, we show how to write a framework API that enables sound reasoning about the effects of unknown user-supplied methods. We also describe novel extensions to DPJ that enable generic types and effects --- essential for flexible frameworks --- while retaining soundness. Finally, we show how to make the reasoning modular: using any desired testing or verification technique, the framework author can guarantee noninterference subject to the API constraints; and the compiler can check the constraints to provide a noninterference guarantee for the entire user program. We evaluate our technique by using it to write two parallel frameworks and two realistic parallel algorithms.NSF grant CCF 07-02724NSF grant CNS 07-20772Microsoft and Intel through UPCRC Illinoisunpublishednot peer reviewe

    Parallel Programming Must Be Deterministic By Default

    Get PDF
    We examine the problem of providing a parallel programming model that guarantees deterministic semantics. We propose a research agenda focusing on the following questions: 1. How to guarantee determinism in a modern object-oriented language; 2. How to provide sound guarantees when parts of the program either cannot be proved deterministic or have "harmless" nondeterminism; 3. How to specify explicit non-determinism when needed; and 4. How to make it easier to port programs to the language
    corecore