1,643 research outputs found

    A Rational Deconstruction of Landin's SECD Machine with the J Operator

    Full text link
    Landin's SECD machine was the first abstract machine for applicative expressions, i.e., functional programs. Landin's J operator was the first control operator for functional languages, and was specified by an extension of the SECD machine. We present a family of evaluation functions corresponding to this extension of the SECD machine, using a series of elementary transformations (transformation into continu-ation-passing style (CPS) and defunctionalization, chiefly) and their left inverses (transformation into direct style and refunctionalization). To this end, we modernize the SECD machine into a bisimilar one that operates in lockstep with the original one but that (1) does not use a data stack and (2) uses the caller-save rather than the callee-save convention for environments. We also identify that the dump component of the SECD machine is managed in a callee-save way. The caller-save counterpart of the modernized SECD machine precisely corresponds to Thielecke's double-barrelled continuations and to Felleisen's encoding of J in terms of call/cc. We then variously characterize the J operator in terms of CPS and in terms of delimited-control operators in the CPS hierarchy. As a byproduct, we also present several reduction semantics for applicative expressions with the J operator, based on Curien's original calculus of explicit substitutions. These reduction semantics mechanically correspond to the modernized versions of the SECD machine and to the best of our knowledge, they provide the first syntactic theories of applicative expressions with the J operator

    A C++11 implementation of arbitrary-rank tensors for high-performance computing

    Full text link
    This article discusses an efficient implementation of tensors of arbitrary rank by using some of the idioms introduced by the recently published C++ ISO Standard (C++11). With the aims at providing a basic building block for high-performance computing, a single Array class template is carefully crafted, from which vectors, matrices, and even higher-order tensors can be created. An expression template facility is also built around the array class template to provide convenient mathematical syntax. As a result, by using templates, an extra high-level layer is added to the C++ language when dealing with algebraic objects and their operations, without compromising performance. The implementation is tested running on both CPU and GPU.Comment: 21 pages, 6 figures, 1 tabl

    Parallel VLSI architecture emulation and the organization of APSA/MPP

    Get PDF
    The Applicative Programming System Architecture (APSA) combines an applicative language interpreter with a novel parallel computer architecture that is well suited for Very Large Scale Integration (VLSI) implementation. The Massively Parallel Processor (MPP) can simulate VLSI circuits by allocating one processing element in its square array to an area on a square VLSI chip. As long as there are not too many long data paths, the MPP can simulate a VLSI clock cycle very rapidly. The APSA circuit contains a binary tree with a few long paths and many short ones. A skewed H-tree layout allows every processing element to simulate a leaf cell and up to four tree nodes, with no loss in parallelism. Emulation of a key APSA algorithm on the MPP resulted in performance 16,000 times faster than a Vax. This speed will make it possible for the APSA language interpreter to run fast enough to support research in parallel list processing algorithms

    Stream Fusion, to Completeness

    Full text link
    Stream processing is mainstream (again): Widely-used stream libraries are now available for virtually all modern OO and functional languages, from Java to C# to Scala to OCaml to Haskell. Yet expressivity and performance are still lacking. For instance, the popular, well-optimized Java 8 streams do not support the zip operator and are still an order of magnitude slower than hand-written loops. We present the first approach that represents the full generality of stream processing and eliminates overheads, via the use of staging. It is based on an unusually rich semantic model of stream interaction. We support any combination of zipping, nesting (or flat-mapping), sub-ranging, filtering, mapping-of finite or infinite streams. Our model captures idiosyncrasies that a programmer uses in optimizing stream pipelines, such as rate differences and the choice of a "for" vs. "while" loops. Our approach delivers hand-written-like code, but automatically. It explicitly avoids the reliance on black-box optimizers and sufficiently-smart compilers, offering highest, guaranteed and portable performance. Our approach relies on high-level concepts that are then readily mapped into an implementation. Accordingly, we have two distinct implementations: an OCaml stream library, staged via MetaOCaml, and a Scala library for the JVM, staged via LMS. In both cases, we derive libraries richer and simultaneously many tens of times faster than past work. We greatly exceed in performance the standard stream libraries available in Java, Scala and OCaml, including the well-optimized Java 8 streams

    The Environment as an Argument

    Get PDF
    Context-awareness as defined in the setting of Ubiquitous Computing [3] is all about expressing the dependency of a specific computation upon some implicit piece of information. The manipulation and expression of such dependencies may thus be neatly encapsulated in a language where computations are first-class values. Perhaps surprisingly however, context-aware programming has not been explored in a functional setting, where first-class computations and higher-order functions are commonplace. In this paper we present an embedded domain-specific language (EDSL) for constructing context-aware applications in the functional programming language Haskell. © 2012 Springer-Verlag

    Beta Reduction is Invariant, Indeed (Long Version)

    Full text link
    Slot and van Emde Boas' weak invariance thesis states that reasonable machines can simulate each other within a polynomially overhead in time. Is λ\lambda-calculus a reasonable machine? Is there a way to measure the computational complexity of a λ\lambda-term? This paper presents the first complete positive answer to this long-standing problem. Moreover, our answer is completely machine-independent and based over a standard notion in the theory of λ\lambda-calculus: the length of a leftmost-outermost derivation to normal form is an invariant cost model. Such a theorem cannot be proved by directly relating λ\lambda-calculus with Turing machines or random access machines, because of the size explosion problem: there are terms that in a linear number of steps produce an exponentially long output. The first step towards the solution is to shift to a notion of evaluation for which the length and the size of the output are linearly related. This is done by adopting the linear substitution calculus (LSC), a calculus of explicit substitutions modelled after linear logic and proof-nets and admitting a decomposition of leftmost-outermost derivations with the desired property. Thus, the LSC is invariant with respect to, say, random access machines. The second step is to show that LSC is invariant with respect to the λ\lambda-calculus. The size explosion problem seems to imply that this is not possible: having the same notions of normal form, evaluation in the LSC is exponentially longer than in the λ\lambda-calculus. We solve such an impasse by introducing a new form of shared normal form and shared reduction, deemed useful. Useful evaluation avoids those steps that only unshare the output without contributing to β\beta-redexes, i.e., the steps that cause the blow-up in size.Comment: 29 page

    Monadic Deep Learning

    Full text link
    The Java and Scala community has built a very successful big data ecosystem. However, most of neural networks running on it are modeled in dynamically typed programming languages. These dynamically typed deep learning frameworks treat neural networks as differentiable expressions that contain many trainable variable, and perform automatic differentiation on those expressions when training them. Until 2019, none of the learning frameworks in statically typed languages provided the expressive power of traditional frameworks. Their users are not able to use custom algorithms unless creating plenty of boilerplate code for hard-coded back-propagation. We solved this problem in DeepLearning.scala 2. Our contributions are: 1. We discovered a novel approach to perform automatic differentiation in reverse mode for statically typed functions that contain multiple trainable variable, and can interoperate freely with the metalanguage. 2. We designed a set of monads and monad transformers, which allow users to create monadic expressions that represent dynamic neural networks. 3. Along with these monads, we provide some applicative functors, to perform multiple calculations in parallel. With these features, users of DeepLearning.scala were able to create complex neural networks in an intuitive and concise way, and still maintain type safety.Comment: 27 pages, 7 figures, 3 table
    corecore