40,772 research outputs found
Optimizing Abstract Abstract Machines
The technique of abstracting abstract machines (AAM) provides a systematic
approach for deriving computable approximations of evaluators that are easily
proved sound. This article contributes a complementary step-by-step process for
subsequently going from a naive analyzer derived under the AAM approach, to an
efficient and correct implementation. The end result of the process is a two to
three order-of-magnitude improvement over the systematically derived analyzer,
making it competitive with hand-optimized implementations that compute
fundamentally less precise results.Comment: Proceedings of the International Conference on Functional Programming
2013 (ICFP 2013). Boston, Massachusetts. September, 201
PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation
High-performance computing has recently seen a surge of interest in
heterogeneous systems, with an emphasis on modern Graphics Processing Units
(GPUs). These devices offer tremendous potential for performance and efficiency
in important large-scale applications of computational science. However,
exploiting this potential can be challenging, as one must adapt to the
specialized and rapidly evolving computing environment currently exhibited by
GPUs. One way of addressing this challenge is to embrace better techniques and
develop tools tailored to their needs. This article presents one simple
technique, GPU run-time code generation (RTCG), along with PyCUDA and PyOpenCL,
two open-source toolkits that support this technique.
In introducing PyCUDA and PyOpenCL, this article proposes the combination of
a dynamic, high-level scripting language with the massive performance of a GPU
as a compelling two-tiered computing platform, potentially offering significant
performance and productivity advantages over conventional single-tier, static
systems. The concept of RTCG is simple and easily implemented using existing,
robust infrastructure. Nonetheless it is powerful enough to support (and
encourage) the creation of custom application-specific tools by its users. The
premise of the paper is illustrated by a wide range of examples where the
technique has been applied with considerable success.Comment: Submitted to Parallel Computing, Elsevie
(Leftmost-Outermost) Beta Reduction is Invariant, Indeed
Slot and van Emde Boas' weak invariance thesis states that reasonable
machines can simulate each other within a polynomially overhead in time. Is
lambda-calculus a reasonable machine? Is there a way to measure the
computational complexity of a lambda-term? This paper presents the first
complete positive answer to this long-standing problem. Moreover, our answer is
completely machine-independent and based over a standard notion in the theory
of lambda-calculus: the length of a leftmost-outermost derivation to normal
form is an invariant cost model. Such a theorem cannot be proved by directly
relating lambda-calculus with Turing machines or random access machines,
because of the size explosion problem: there are terms that in a linear number
of steps produce an exponentially long output. The first step towards the
solution is to shift to a notion of evaluation for which the length and the
size of the output are linearly related. This is done by adopting the linear
substitution calculus (LSC), a calculus of explicit substitutions modeled after
linear logic proof nets and admitting a decomposition of leftmost-outermost
derivations with the desired property. Thus, the LSC is invariant with respect
to, say, random access machines. The second step is to show that LSC is
invariant with respect to the lambda-calculus. The size explosion problem seems
to imply that this is not possible: having the same notions of normal form,
evaluation in the LSC is exponentially longer than in the lambda-calculus. We
solve such an impasse by introducing a new form of shared normal form and
shared reduction, deemed useful. Useful evaluation avoids those steps that only
unshare the output without contributing to beta-redexes, i.e. the steps that
cause the blow-up in size. The main technical contribution of the paper is
indeed the definition of useful reductions and the thorough analysis of their
properties.Comment: arXiv admin note: substantial text overlap with arXiv:1405.331
First Class Call Stacks: Exploring Head Reduction
Weak-head normalization is inconsistent with functional extensionality in the
call-by-name -calculus. We explore this problem from a new angle via
the conflict between extensionality and effects. Leveraging ideas from work on
the -calculus with control, we derive and justify alternative
operational semantics and a sequence of abstract machines for performing head
reduction. Head reduction avoids the problems with weak-head reduction and
extensionality, while our operational semantics and associated abstract
machines show us how to retain weak-head reduction's ease of implementation.Comment: In Proceedings WoC 2015, arXiv:1606.0583
- …