2,735 research outputs found

    A Scalable, Portable, and Memory-Efficient Lock-Free FIFO Queue

    Get PDF
    We present a new lock-free multiple-producer and multiple-consumer (MPMC) FIFO queue design which is scalable and, unlike existing high-performant queues, very memory efficient. Moreover, the design is ABA safe and does not require any external memory allocators or safe memory reclamation techniques, typically needed by other scalable designs. In fact, this queue itself can be leveraged for object allocation and reclamation, as in data pools. We use FAA (fetch-and-add), a specialized and more scalable than CAS (compare-and-set) instruction, on the most contended hot spots of the algorithm. However, unlike prior attempts with FAA, our queue is both lock-free and linearizable. We propose a general approach, SCQ, for bounded queues. This approach can easily be extended to support unbounded FIFO queues which can store an arbitrary number of elements. SCQ is portable across virtually all existing architectures and flexible enough for a wide variety of uses. We measure the performance of our algorithm on the x86-64 and PowerPC architectures. Our evaluation validates that our queue has exceptional memory efficiency compared to other algorithms and its performance is often comparable to, or exceeding that of state-of-the-art scalable algorithms

    Description and Optimization of Abstract Machines in a Dialect of Prolog

    Full text link
    In order to achieve competitive performance, abstract machines for Prolog and related languages end up being large and intricate, and incorporate sophisticated optimizations, both at the design and at the implementation levels. At the same time, efficiency considerations make it necessary to use low-level languages in their implementation. This makes them laborious to code, optimize, and, especially, maintain and extend. Writing the abstract machine (and ancillary code) in a higher-level language can help tame this inherent complexity. We show how the semantics of most basic components of an efficient virtual machine for Prolog can be described using (a variant of) Prolog. These descriptions are then compiled to C and assembled to build a complete bytecode emulator. Thanks to the high level of the language used and its closeness to Prolog, the abstract machine description can be manipulated using standard Prolog compilation and optimization techniques with relative ease. We also show how, by applying program transformations selectively, we obtain abstract machine implementations whose performance can match and even exceed that of state-of-the-art, highly-tuned, hand-crafted emulators.Comment: 56 pages, 46 figures, 5 tables, To appear in Theory and Practice of Logic Programming (TPLP

    CONFIGEN: A tool for managing configuration options

    Full text link
    This paper introduces CONFIGEN, a tool that helps modularizing software. CONFIGEN allows the developer to select a set of elementary components for his software through an interactive interface. Configuration files for use by C/assembly code and Makefiles are then automatically generated, and we successfully used it as a helper tool for complex system software refactoring. CONFIGEN is based on propositional logic, and its implementation faces hard theoretical problems.Comment: In Proceedings LoCoCo 2010, arXiv:1007.083

    A formally verified compiler back-end

    Get PDF
    This article describes the development and formal verification (proof of semantic preservation) of a compiler back-end from Cminor (a simple imperative intermediate language) to PowerPC assembly code, using the Coq proof assistant both for programming the compiler and for proving its correctness. Such a verified compiler is useful in the context of formal methods applied to the certification of critical software: the verification of the compiler guarantees that the safety properties proved on the source code hold for the executable compiled code as well

    Processor Models For Instruction Scheduling using Constraint Programming

    Get PDF
    Instruction scheduling is one of the most important optimisations performed when producing code in a compiler. The problem consists of finding a minimum length schedule subject to latency and different resource constraints. This is a hard problem, classically approached by heuristic algorithms. In the last decade, research interest has shifted from heuristic to potentially optimal methods. When using optimal methods, a lot of compilation time is spent searching for an optimal solution. This makes it important that the problem definition reflects the reality of the processor. In this work, a constraint programming approach was used to study the impact that the model detail has on performance. Several models of a superscalar processor were embedded in LLVM and evaluated using SPEC CPU2000. The result shows that there is substantial performance to be gained, over 5% for some programs. The stability of the improvement is heavily dependent on the accuracy of the model

    Solving package dependencies: from EDOS to Mancoosi

    Get PDF
    Mancoosi (Managing the Complexity of the Open Source Infrastructure) is an ongoing research project funded by the European Union for addressing some of the challenges related to the "upgrade problem" of interdependent software components of which Debian packages are prototypical examples. Mancoosi is the natural continuation of the EDOS project which has already contributed tools for distribution-wide quality assurance in Debian and other GNU/Linux distributions. The consortium behind the project consists of several European public and private research institutions as well as some commercial GNU/Linux distributions from Europe and South America. Debian is represented by a small group of Debian Developers who are working in the ranks of the involved universities to drive and integrate back achievements into Debian. This paper presents relevant results from EDOS in dependency management and gives an overview of the Mancoosi project and its objectives, with a particular focus on the prospective benefits for Debian

    Using Relocatable Bitstreams for Fault Tolerance

    Get PDF
    This research develops a method for relocating reconfigurable modules on the Virtex-II (Pro) family of Field Programmable Gate Arrays (FPGAs). A bitstream translation program is developed which correctly changes the location of a partial bitstream that implements a module on the FPGA. To take advantage of relocatable modules, three fault-tolerance circuit designs are developed and tested. This circuit can operate through a fault by efficiently removing the faulty module and replacing it with a relocated module without faults. The FPGA can recover from faults at a known location, without the need for external intervention using an embedded fault recovery system. The recovery system uses an internal PowerPC to relocate the modules and reprogram the FPGA. Due to the limited architecture of the target FPGA and Xilinx tool errors, an FPGA with automatic fault recovery could not be demonstrated. However, the various components needed to do this type of recovery have been implemented and demonstrated individually
    • …
    corecore