4,587 research outputs found

    Description and Optimization of Abstract Machines in a Dialect of Prolog

    In order to achieve competitive performance, abstract machines for Prolog and related languages end up being large and intricate, and incorporate sophisticated optimizations, both at the design and at the implementation levels. At the same time, efficiency considerations make it necessary to use low-level languages in their implementation. This makes them laborious to code, optimize, and, especially, maintain and extend. Writing the abstract machine (and ancillary code) in a higher-level language can help tame this inherent complexity. We show how the semantics of most basic components of an efficient virtual machine for Prolog can be described using (a variant of) Prolog. These descriptions are then compiled to C and assembled to build a complete bytecode emulator. Thanks to the high level of the language used and its closeness to Prolog, the abstract machine description can be manipulated using standard Prolog compilation and optimization techniques with relative ease. We also show how, by applying program transformations selectively, we obtain abstract machine implementations whose performance can match and even exceed that of state-of-the-art, highly-tuned, hand-crafted emulators. Comment: 56 pages, 46 figures, 5 tables. To appear in Theory and Practice of Logic Programming (TPLP).
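    The core idea above can be sketched in miniature: instruction semantics are stated declaratively, one small definition per opcode, and a generic dispatch loop turns those definitions into a working emulator. This is a toy illustration only; the paper's descriptions are written in a Prolog dialect and compiled to C, whereas plain Python stands in for both levels here.

    ```python
    # Toy sketch (assumed simplification): declarative per-opcode semantics
    # plus a generic dispatch loop that acts as the bytecode emulator.

    # Semantics of each opcode, one small definition per instruction.
    SEMANTICS = {
        "push": lambda st, arg: st.append(arg),
        "add":  lambda st, _:   st.append(st.pop() + st.pop()),
        "mul":  lambda st, _:   st.append(st.pop() * st.pop()),
    }

    def run(bytecode):
        """Generic emulator loop driven by the declarative semantics table."""
        stack = []
        for op, arg in bytecode:
            SEMANTICS[op](stack, arg)
        return stack[-1]

    # (2 + 3) * 4
    program = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]
    print(run(program))  # 20
    ```

    Because the semantics live in an ordinary data structure, they can be inspected and transformed by other programs, which is the property the paper exploits when applying Prolog compilation and optimization techniques to the machine description.
    
    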

    Simple and Effective Type Check Removal through Lazy Basic Block Versioning

    Dynamically typed programming languages such as JavaScript and Python defer type checking to run time. In order to maximize performance, dynamic language VM implementations must attempt to eliminate redundant dynamic type checks. However, type inference analyses are often costly and involve tradeoffs between compilation time and resulting precision. This has led to the creation of increasingly complex multi-tiered VM architectures. This paper introduces lazy basic block versioning, a simple JIT compilation technique which effectively removes redundant type checks from critical code paths. This novel approach lazily generates type-specialized versions of basic blocks on the fly while propagating context-dependent type information. It does not require the use of costly program analyses, is not restricted by the precision limitations of traditional type analyses, and avoids the implementation complexity of speculative optimization techniques. We have implemented intraprocedural lazy basic block versioning in a JavaScript JIT compiler. This approach is compared with a classical flow-based type analysis. Lazy basic block versioning performs as well or better on all benchmarks. On average, 71% of type tests are eliminated, yielding speedups of up to 50%. We also show that our implementation generates more efficient machine code than TraceMonkey, a tracing JIT compiler for JavaScript, on several benchmarks. The combination of implementation simplicity, low algorithmic complexity and good run time performance makes basic block versioning attractive for baseline JIT compilers.
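    The mechanism can be sketched in a few lines: each basic block is specialized on demand for the type context observed at run time, and once a version exists for that context, subsequent executions dispatch to it without repeating the type checks. Everything below is a hypothetical simplification, not the paper's JavaScript implementation; the block name and context encoding are illustrative.

    ```python
    # Toy sketch of lazy basic block versioning: specialized block versions
    # are generated lazily, keyed by the incoming type context.

    versions = {}  # (block_name, type_context) -> specialized code

    def specialize(block_name, type_context):
        """Generate (once) a version of the block for a known type context."""
        key = (block_name, type_context)
        if key not in versions:
            if type_context == ("int", "int"):
                # Types known from context: the generated version carries
                # no runtime type checks at all.
                versions[key] = lambda a, b: a + b
            else:
                # Fallback version keeps the generic, checked path.
                def generic(a, b):
                    if not isinstance(a, (int, float)):
                        raise TypeError("unsupported operand")
                    return a + b
                versions[key] = generic
        return versions[key]

    def execute_add(a, b):
        # The runtime observes the operand types and lazily requests the
        # matching block version; no whole-program analysis is needed.
        ctx = (type(a).__name__, type(b).__name__)
        return specialize("add_block", ctx)(a, b)

    print(execute_add(1, 2))  # 3, via the unchecked int/int version
    ```

    The cache is the key design point: versioning trades a small amount of code duplication for the removal of type tests on hot paths, without speculative deoptimization machinery.
    
    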

    Language processors for the Motorola M68000 microprocessor

    A set of software tools for the Motorola M68000 microprocessor was developed to run under the UNIX operating system. A C language cross compiler was created by modifying the UNIX ‘C’ compiler for the PDP-11. A macro cross assembler was designed and implemented to produce relocatable object modules for the M68000 in the a.out format of PDP-11 UNIX object modules. The UNIX loader for the PDP-11 was changed to allow relocation of 32-bit quantities as required by the M68000. A small set of utility routines was also written to assist in the implementation effort. The language processors and utilities provide the means by which high-level ‘C’ programs can produce executable images for the M68000. All of the programs are currently running on a PDP-11/70 UNIX system.
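    The loader change mentioned above amounts to patching 32-bit words at relocation sites, something the 16-bit PDP-11 loader did not need. A minimal sketch, assuming a big-endian word (as on the M68000) and an illustrative layout rather than the actual a.out relocation records:

    ```python
    # Hedged sketch of 32-bit relocation: add the load base to the 32-bit
    # big-endian quantity stored at a relocation site in the image.
    import struct

    def relocate32(image, offset, base):
        """Patch the 32-bit big-endian word at `offset` by the load base."""
        (value,) = struct.unpack_from(">I", image, offset)
        struct.pack_into(">I", image, offset, (value + base) & 0xFFFFFFFF)

    image = bytearray(struct.pack(">I", 0x00001000))  # addend stored in the image
    relocate32(image, 0, 0x00400000)                  # hypothetical load base
    print(hex(struct.unpack_from(">I", image, 0)[0]))  # 0x401000
    ```
    
    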

    Just-In-Time Compilation of NumPy Vector Operations

    In this paper, we introduce JIT compilation for the high-productivity framework Python/NumPy in order to boost performance significantly. The JIT compilation of Python/NumPy is completely transparent to the user – the runtime system will automatically JIT compile and execute the NumPy instructions encountered in a Python application. In other words, we introduce a framework that provides the high productivity of Python while maintaining the high performance of a low-level, compiled language. We transform NumPy vector instructions into an Abstract Syntax Tree representation that creates the basis for further optimizations. From the AST we auto-generate C code which we compile into computational kernels and execute. These incorporate temporary array removal and loop fusion, which are the main benefactors in the achieved speedups. In order to amortize the overhead of creation, we also implement a cache for the compiled kernels. We evaluate the JIT compilation by executing several scientific computing benchmarks on an AMD processor. Compared to NumPy, we achieve speedups of a factor 4.72 for an N-Body application and 7.51 for a Jacobi Stencil application executing on a single CPU core.
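    The two main optimizations named in the abstract, temporary array removal with loop fusion and a kernel cache, can be sketched together. This is an assumed simplification using plain Python lists and functions in place of generated, compiled C kernels; the cache key and kernel name are hypothetical.

    ```python
    # Toy sketch: fuse the elementwise expression a*x + y into one loop so
    # the temporary array for a*x never exists, and cache the kernel so the
    # generation cost is paid only once per distinct expression.

    kernel_cache = {}  # expression key -> compiled kernel (stand-in)

    def fused_axpy(a, x, y):
        """Compute a*x + y in a single fused pass over the operands."""
        key = "a*x+y"  # illustrative cache key derived from the AST
        if key not in kernel_cache:
            # Stand-in for auto-generating and compiling a C kernel.
            def kernel(a, x, y):
                # One fused loop: no temporary array holding a*x.
                return [a * xi + yi for xi, yi in zip(x, y)]
            kernel_cache[key] = kernel
        return kernel_cache[key](a, x, y)

    print(fused_axpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))  # [12.0, 24.0, 36.0]
    ```

    An unfused evaluation would first materialize `a*x` as a full array and then traverse it again to add `y`; fusion halves the memory traffic, which is why the abstract credits it as a main source of the speedups.
    
    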