40,772 research outputs found

    Optimizing Abstract Abstract Machines

    Full text link
    The technique of abstracting abstract machines (AAM) provides a systematic approach for deriving computable approximations of evaluators that are easily proved sound. This article contributes a complementary step-by-step process for subsequently going from a naive analyzer derived under the AAM approach, to an efficient and correct implementation. The end result of the process is a two to three order-of-magnitude improvement over the systematically derived analyzer, making it competitive with hand-optimized implementations that compute fundamentally less precise results.Comment: Proceedings of the International Conference on Functional Programming 2013 (ICFP 2013). Boston, Massachusetts. September, 201

    PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation

    Full text link
    High-performance computing has recently seen a surge of interest in heterogeneous systems, with an emphasis on modern Graphics Processing Units (GPUs). These devices offer tremendous potential for performance and efficiency in important large-scale applications of computational science. However, exploiting this potential can be challenging, as one must adapt to the specialized and rapidly evolving computing environment currently exhibited by GPUs. One way of addressing this challenge is to embrace better techniques and develop tools tailored to their needs. This article presents one simple technique, GPU run-time code generation (RTCG), along with PyCUDA and PyOpenCL, two open-source toolkits that support this technique. In introducing PyCUDA and PyOpenCL, this article proposes the combination of a dynamic, high-level scripting language with the massive performance of a GPU as a compelling two-tiered computing platform, potentially offering significant performance and productivity advantages over conventional single-tier, static systems. The concept of RTCG is simple and easily implemented using existing, robust infrastructure. Nonetheless it is powerful enough to support (and encourage) the creation of custom application-specific tools by its users. The premise of the paper is illustrated by a wide range of examples where the technique has been applied with considerable success.Comment: Submitted to Parallel Computing, Elsevie

    (Leftmost-Outermost) Beta Reduction is Invariant, Indeed

    Get PDF
    Slot and van Emde Boas' weak invariance thesis states that reasonable machines can simulate each other within a polynomially overhead in time. Is lambda-calculus a reasonable machine? Is there a way to measure the computational complexity of a lambda-term? This paper presents the first complete positive answer to this long-standing problem. Moreover, our answer is completely machine-independent and based over a standard notion in the theory of lambda-calculus: the length of a leftmost-outermost derivation to normal form is an invariant cost model. Such a theorem cannot be proved by directly relating lambda-calculus with Turing machines or random access machines, because of the size explosion problem: there are terms that in a linear number of steps produce an exponentially long output. The first step towards the solution is to shift to a notion of evaluation for which the length and the size of the output are linearly related. This is done by adopting the linear substitution calculus (LSC), a calculus of explicit substitutions modeled after linear logic proof nets and admitting a decomposition of leftmost-outermost derivations with the desired property. Thus, the LSC is invariant with respect to, say, random access machines. The second step is to show that LSC is invariant with respect to the lambda-calculus. The size explosion problem seems to imply that this is not possible: having the same notions of normal form, evaluation in the LSC is exponentially longer than in the lambda-calculus. We solve such an impasse by introducing a new form of shared normal form and shared reduction, deemed useful. Useful evaluation avoids those steps that only unshare the output without contributing to beta-redexes, i.e. the steps that cause the blow-up in size. The main technical contribution of the paper is indeed the definition of useful reductions and the thorough analysis of their properties.Comment: arXiv admin note: substantial text overlap with arXiv:1405.331

    First Class Call Stacks: Exploring Head Reduction

    Get PDF
    Weak-head normalization is inconsistent with functional extensionality in the call-by-name λ\lambda-calculus. We explore this problem from a new angle via the conflict between extensionality and effects. Leveraging ideas from work on the λ\lambda-calculus with control, we derive and justify alternative operational semantics and a sequence of abstract machines for performing head reduction. Head reduction avoids the problems with weak-head reduction and extensionality, while our operational semantics and associated abstract machines show us how to retain weak-head reduction's ease of implementation.Comment: In Proceedings WoC 2015, arXiv:1606.0583
    • …
    corecore