209 research outputs found

    Incremental copying garbage collection for WAM-based Prolog systems

    Full text link
    The design and implementation of an incremental copying heap garbage collector for WAM-based Prolog systems is presented. Its heap layout consists of a number of equal-sized blocks. Other changes to the standard WAM allow these blocks to be garbage collected independently. The independent collection of heap blocks forms the basis of an incremental collecting algorithm which employs copying without marking (contrary to the more frequently used mark&copy or mark&slide algorithms in the context of Prolog). Compared to standard semi-space copying collectors, this approach to heap garbage collection lowers in many cases the memory usage and reduces pause times. The algorithm also allows for a wide variety of garbage collection policies including generational ones. The algorithm is implemented and evaluated in the context of hProlog.Comment: 33 pages, 22 figures, 5 tables. To appear in Theory and Practice of Logic Programming (TPLP

    Optimizing the SICStus Prolog virtual machine instruction set

    Get PDF
    The Swedish Institute of Computer Science (SICS) is the vendor of SICStus Prolog. To decrease execution time and reduce space requirements, variants of SICStus Prolog's virtual instruction set were investigated. Semi-automatic ways of finding candidate sets of instructions to combine or specialize were developed and used. Several virtual machines were implemented and the relationship between improvements by combinations and by specializations were investigated. The benefits of specializations and combinations of instructions to the performance of the emulator is on the average of the order of 10%. The code size reduction is 15%

    Heap Reference Analysis Using Access Graphs

    Full text link
    Despite significant progress in the theory and practice of program analysis, analysing properties of heap data has not reached the same level of maturity as the analysis of static and stack data. The spatial and temporal structure of stack and static data is well understood while that of heap data seems arbitrary and is unbounded. We devise bounded representations which summarize properties of the heap data. This summarization is based on the structure of the program which manipulates the heap. The resulting summary representations are certain kinds of graphs called access graphs. The boundedness of these representations and the monotonicity of the operations to manipulate them make it possible to compute them through data flow analysis. An important application which benefits from heap reference analysis is garbage collection, where currently liveness is conservatively approximated by reachability from program variables. As a consequence, current garbage collectors leave a lot of garbage uncollected, a fact which has been confirmed by several empirical studies. We propose the first ever end-to-end static analysis to distinguish live objects from reachable objects. We use this information to make dead objects unreachable by modifying the program. This application is interesting because it requires discovering data flow information representing complex semantics. In particular, we discover four properties of heap data: liveness, aliasing, availability, and anticipability. Together, they cover all combinations of directions of analysis (i.e. forward and backward) and confluence of information (i.e. union and intersection). Our analysis can also be used for plugging memory leaks in C/C++ languages.Comment: Accepted for printing by ACM TOPLAS. This version incorporates referees' comment

    Description and Optimization of Abstract Machines in a Dialect of Prolog

    Full text link
    In order to achieve competitive performance, abstract machines for Prolog and related languages end up being large and intricate, and incorporate sophisticated optimizations, both at the design and at the implementation levels. At the same time, efficiency considerations make it necessary to use low-level languages in their implementation. This makes them laborious to code, optimize, and, especially, maintain and extend. Writing the abstract machine (and ancillary code) in a higher-level language can help tame this inherent complexity. We show how the semantics of most basic components of an efficient virtual machine for Prolog can be described using (a variant of) Prolog. These descriptions are then compiled to C and assembled to build a complete bytecode emulator. Thanks to the high level of the language used and its closeness to Prolog, the abstract machine description can be manipulated using standard Prolog compilation and optimization techniques with relative ease. We also show how, by applying program transformations selectively, we obtain abstract machine implementations whose performance can match and even exceed that of state-of-the-art, highly-tuned, hand-crafted emulators.Comment: 56 pages, 46 figures, 5 tables, To appear in Theory and Practice of Logic Programming (TPLP

    Parallel backtracking with answer memoing for independent and-parallelism

    Full text link
    Goal-level Independent and-parallelism (IAP) is exploited by scheduling for simultaneous execution two or more goals which will not interfere with each other at run time. This can be done safely even if such goals can produce mĂșltiple answers. The most successful IAP implementations to date have used recomputation of answers and sequentially ordered backtracking. While in principie simplifying the implementation, recomputation can be very inefficient if the granularity of the parallel goals is large enough and they produce several answers, while sequentially ordered backtracking limits parallelism. And, despite the expected simplification, the implementation of the classic schemes has proved to involve complex engineering, with the consequent difficulty for system maintenance and extensiĂłn, while still frequently running into the well-known trapped goal and garbage slot problems. This work presents an alternative parallel backtracking model for IAP and its implementation. The model features parallel out-of-order (i.e., non-chronological) backtracking and relies on answer memoization to reuse and combine answers. We show that this approach can bring significant performance advantages. Also, it can bring some simplification to the important engineering task involved in implementing the backtracking mechanism of previous approaches

    Region-based memory management for Mercury programs

    Full text link
    Region-based memory management (RBMM) is a form of compile time memory management, well-known from the functional programming world. In this paper we describe our work on implementing RBMM for the logic programming language Mercury. One interesting point about Mercury is that it is designed with strong type, mode, and determinism systems. These systems not only provide Mercury programmers with several direct software engineering benefits, such as self-documenting code and clear program logic, but also give language implementors a large amount of information that is useful for program analyses. In this work, we make use of this information to develop program analyses that determine the distribution of data into regions and transform Mercury programs by inserting into them the necessary region operations. We prove the correctness of our program analyses and transformation. To execute the annotated programs, we have implemented runtime support that tackles the two main challenges posed by backtracking. First, backtracking can require regions removed during forward execution to be "resurrected"; and second, any memory allocated during a computation that has been backtracked over must be recovered promptly and without waiting for the regions involved to come to the end of their life. We describe in detail our solution of both these problems. We study in detail how our RBMM system performs on a selection of benchmark programs, including some well-known difficult cases for RBMM. Even with these difficult cases, our RBMM-enabled Mercury system obtains clearly faster runtimes for 15 out of 18 benchmarks compared to the base Mercury system with its Boehm runtime garbage collector, with an average runtime speedup of 24%, and an average reduction in memory requirements of 95%. In fact, our system achieves optimal memory consumption in some programs.Comment: 74 pages, 23 figures, 11 tables. A shorter version of this paper, without proofs, is to appear in the journal Theory and Practice of Logic Programming (TPLP

    Automatic goal distribution strategies for the execution of committed choice logic languages on distributed memory parallel computers

    Get PDF
    There has been much research interest in efficient implementations of the Committed Choice Non-Deterministic (CCND) logic languages on parallel computers. To take full advantage of the speed gains of parallel computers, methods need to be found to automatically distribute goals over the machine processors, ideally with as little involvement from the user as possible.In this thesis we explore some automatic goal distribution strategies for the execu¬ tion of the CCND languages on commercially available distributed memory parallel computers.There are two facets to the goal distribution strategies we have chosen to explore:DEMAND DRIVEN: An idle processor requests work from other processors. We describe two strategies in this class: one in which an idle processor asks only neighbouring processors for spare work, the nearest-neighbour strategy; and one where an idle processor may ask any other processor in the machine for spare work, the allprocessors strategy.WEIGHTS: Using a program analysis technique devised by Tick, weights are attached to goals; the weights can be used to order the goals so that they can be executed and distributed out in weighted order, possibly increasing performance.We describe a framework in which to implement and analyse goal distribution strategies, and then go on to describe experiments with demand driven strategies, both with and without weights. The experiments were made using two of our own implementations of Flat Guarded Horn Clauses — an interpreter and a WAM-like system — executing on a MEIKO T800 Transputer Array configured in a 2-D mesh topology.Analysis of the results show that the all-processors strategies are promising (AP-NW), adding weights had little positive effect on performance, and that nearest-neighbours strategies can reduce performance due to bad load balancing.We also describe some preliminary experiments for a variant of the AP-NW strategy: goals which suspend on one variable are sent to the processor that controls that variable, the processes-to-data strategy. And we briefly look at some preliminary results of executing programs on large numbers of processors (> 30)

    Towards flexible goal-oriented logic programming

    Get PDF
    • 

    corecore