A Rational Deconstruction of Landin's SECD Machine with the J Operator
Landin's SECD machine was the first abstract machine for applicative
expressions, i.e., functional programs. Landin's J operator was the first
control operator for functional languages, and was specified by an extension of
the SECD machine. We present a family of evaluation functions corresponding to
this extension of the SECD machine, using a series of elementary
transformations (transformation into continuation-passing style (CPS) and
defunctionalization, chiefly) and their left inverses (transformation into
direct style and refunctionalization). To this end, we modernize the SECD
machine into a bisimilar one that operates in lockstep with the original one
but that (1) does not use a data stack and (2) uses the caller-save rather than
the callee-save convention for environments. We also identify that the dump
component of the SECD machine is managed in a callee-save way. The caller-save
counterpart of the modernized SECD machine precisely corresponds to Thielecke's
double-barrelled continuations and to Felleisen's encoding of J in terms of
call/cc. We then variously characterize the J operator in terms of CPS and in
terms of delimited-control operators in the CPS hierarchy. As a byproduct, we
also present several reduction semantics for applicative expressions with the J
operator, based on Curien's original calculus of explicit substitutions. These
reduction semantics mechanically correspond to the modernized versions of the
SECD machine and, to the best of our knowledge, provide the first syntactic
theories of applicative expressions with the J operator.
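The two transformations named above can be illustrated on a tiny example. The sketch below (in Python, purely illustrative; it is not Landin's machine) CPS-transforms a direct-style function and then defunctionalizes its continuations, which turns them into the explicit stack of an abstract machine:

```python
# Factorial, first in direct style, then CPS-transformed, then
# defunctionalized -- a toy instance of the transformations used
# to derive abstract machines from evaluators.

def fact_direct(n):
    return 1 if n == 0 else n * fact_direct(n - 1)

# CPS transformation: every call takes an explicit continuation k.
def fact_cps(n, k):
    if n == 0:
        return k(1)
    return fact_cps(n - 1, lambda r: k(n * r))

# Defunctionalization: replace the function-valued continuations with
# first-order frames plus an apply function. The resulting data
# structure is exactly the control stack of an abstract machine.
def fact_defun(n, stack=()):
    if n == 0:
        return apply_stack(stack, 1)
    return fact_defun(n - 1, ("Mul", n, stack))

def apply_stack(stack, r):
    while stack:
        _, n, stack = stack
        r = n * r
    return r
```

Running each version on the same input yields the same result; the defunctionalized stack makes the machine's dump-like component visible as plain data.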
A C++11 implementation of arbitrary-rank tensors for high-performance computing
This article discusses an efficient implementation of tensors of arbitrary
rank by using some of the idioms introduced by the recently published C++ ISO
Standard (C++11). With the aim of providing a basic building block for
high-performance computing, a single Array class template is carefully crafted,
from which vectors, matrices, and even higher-order tensors can be created. An
expression template facility is also built around the array class template to
provide convenient mathematical syntax. As a result, by using templates, an
extra high-level layer is added to the C++ language when dealing with algebraic
objects and their operations, without compromising performance. The
implementation is tested running on both CPU and GPU.
Comment: 21 pages, 6 figures, 1 table
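The core idea of the expression-template facility can be sketched outside C++. The Python classes below are illustrative only (the article's implementation is C++11 class templates): arithmetic on arrays builds a lazy expression tree, and evaluation walks that tree once per element, so no temporary arrays are materialized:

```python
# A hedged sketch of the expression-template idea: operations return
# lazy expression nodes, and evaluation computes each element on
# demand from the whole tree, in a single pass.

class Expr:
    def __add__(self, other):
        return Add(self, other)
    def evaluate(self):
        # One loop over all elements, no intermediate arrays.
        return Array([self[i] for i in range(len(self))])

class Array(Expr):
    def __init__(self, data):
        self.data = list(data)
    def __len__(self):
        return len(self.data)
    def __getitem__(self, i):
        return self.data[i]

class Add(Expr):
    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs
    def __len__(self):
        return len(self.lhs)
    def __getitem__(self, i):
        # Each element is computed on demand from the whole tree.
        return self.lhs[i] + self.rhs[i]

a, b, c = Array([1, 2]), Array([3, 4]), Array([5, 6])
# a + b + c builds Add(Add(a, b), c); nothing is computed until
# evaluate() is called.
result = (a + b + c).evaluate()
```

In C++ the same structure is resolved at compile time via templates, which is what lets the extra abstraction layer avoid any runtime cost.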
Parallel VLSI architecture emulation and the organization of APSA/MPP
The Applicative Programming System Architecture (APSA) combines an applicative language interpreter with a novel parallel computer architecture that is well suited for Very Large Scale Integration (VLSI) implementation. The Massively Parallel Processor (MPP) can simulate VLSI circuits by allocating one processing element in its square array to an area on a square VLSI chip. As long as there are not too many long data paths, the MPP can simulate a VLSI clock cycle very rapidly. The APSA circuit contains a binary tree with a few long paths and many short ones. A skewed H-tree layout allows every processing element to simulate a leaf cell and up to four tree nodes, with no loss in parallelism. Emulation of a key APSA algorithm on the MPP resulted in performance 16,000 times faster than a VAX. This speed will make it possible for the APSA language interpreter to run fast enough to support research in parallel list processing algorithms.
Stream Fusion, to Completeness
Stream processing is mainstream (again): Widely-used stream libraries are now
available for virtually all modern OO and functional languages, from Java to C#
to Scala to OCaml to Haskell. Yet expressivity and performance are still
lacking. For instance, the popular, well-optimized Java 8 streams do not
support the zip operator and are still an order of magnitude slower than
hand-written loops. We present the first approach that represents the full
generality of stream processing and eliminates overheads, via the use of
staging. It is based on an unusually rich semantic model of stream interaction.
We support any combination of zipping, nesting (or flat-mapping), sub-ranging,
filtering, and mapping, over finite or infinite streams. Our model captures
idiosyncrasies that a programmer uses in optimizing stream pipelines, such as
rate differences and the choice of "for" vs. "while" loops. Our approach
delivers hand-written-like code, but automatically. It explicitly avoids the
reliance on black-box optimizers and sufficiently-smart compilers, offering
highest, guaranteed and portable performance. Our approach relies on high-level
concepts that are then readily mapped into an implementation. Accordingly, we
have two distinct implementations: an OCaml stream library, staged via
MetaOCaml, and a Scala library for the JVM, staged via LMS. In both cases, we
derive libraries richer and simultaneously many tens of times faster than past
work. We greatly exceed in performance the standard stream libraries available
in Java, Scala and OCaml, including the well-optimized Java 8 streams.
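The combinators the abstract lists can be sketched with Python generators (illustrative only; the paper's libraries are in OCaml and Scala, and use staging to compile such pipelines into hand-written-style loops rather than running them lazily):

```python
# A hedged sketch of the stream operations discussed above: map,
# filter, and the zip operator that Java 8 streams lack, including
# zipping an infinite stream with a finite one.

def smap(f, s):
    for x in s:
        yield f(x)

def sfilter(p, s):
    for x in s:
        if p(x):
            yield x

def szip(s1, s2):
    # Rate differences are handled by stopping at the shorter stream.
    return zip(s1, s2)

def naturals():
    n = 0
    while True:          # infinite streams are fine: everything is lazy
        yield n
        n += 1

# Zip an infinite stream with a finite, filtered one.
pairs = list(szip(naturals(), sfilter(lambda x: x % 2 == 0, range(10))))
# pairs == [(0, 0), (1, 2), (2, 4), (3, 6), (4, 8)]
```

Generator pipelines like this carry per-element overhead; the point of the paper's staged approach is to eliminate exactly that overhead while keeping this compositional surface syntax.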
The Environment as an Argument
Context-awareness as defined in the setting of Ubiquitous Computing [3] is all about expressing the dependency of a specific computation upon some implicit piece of information. The manipulation and expression of such dependencies may thus be neatly encapsulated in a language where computations are first-class values. Perhaps surprisingly however, context-aware programming has not been explored in a functional setting, where first-class computations and higher-order functions are commonplace. In this paper we present an embedded domain-specific language (EDSL) for constructing context-aware applications in the functional programming language Haskell. © 2012 Springer-Verlag
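The "environment as an argument" idea can be sketched concretely. The snippet below is a Python rendering of the Reader-style pattern the abstract describes (the paper's EDSL is in Haskell; these names are illustrative, not the paper's API): a context-aware computation is a function from the implicit context to a value, and combinators thread the context automatically:

```python
# A hedged sketch: context-aware computations as functions from an
# ambient context (a dict here) to a value, composed so the context
# is passed implicitly.

def pure(x):
    # A computation that ignores the context.
    return lambda ctx: x

def ask(key):
    # Read one piece of implicit context.
    return lambda ctx: ctx[key]

def bind(m, f):
    # Sequence two context-aware computations over the same context.
    return lambda ctx: f(m(ctx))(ctx)

# A computation whose result depends on the ambient user and location:
greeting = bind(ask("user"), lambda u:
           bind(ask("location"), lambda loc:
           pure(f"Hello {u} in {loc}")))

print(greeting({"user": "ada", "location": "Lisbon"}))
# prints "Hello ada in Lisbon"
```

The computation itself never mentions the context explicitly; only the final application supplies it, which is the encapsulation of context dependency the paper builds on.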
Beta Reduction is Invariant, Indeed (Long Version)
Slot and van Emde Boas' weak invariance thesis states that reasonable
machines can simulate each other within a polynomial overhead in time. Is the
λ-calculus a reasonable machine? Is there a way to measure the
computational complexity of a λ-term? This paper presents the first
complete positive answer to this long-standing problem. Moreover, our answer is
completely machine-independent and based on a standard notion in the theory
of the λ-calculus: the length of a leftmost-outermost derivation to normal
form is an invariant cost model. Such a theorem cannot be proved by directly
relating the λ-calculus with Turing machines or random access machines,
because of the size explosion problem: there are terms that in a linear number
of steps produce an exponentially long output. The first step towards the
solution is to shift to a notion of evaluation for which the length and the
size of the output are linearly related. This is done by adopting the linear
substitution calculus (LSC), a calculus of explicit substitutions modelled
after linear logic and proof-nets and admitting a decomposition of
leftmost-outermost derivations with the desired property. Thus, the LSC is
invariant with respect to, say, random access machines. The second step is to
show that the LSC is invariant with respect to the λ-calculus. The size
explosion problem seems to imply that this is not possible: having the same
notions of normal form, evaluation in the LSC is exponentially longer than in
the λ-calculus. We solve such an impasse by introducing a new form of
shared normal form and shared reduction, deemed useful. Useful evaluation
avoids those steps that only unshare the output without contributing to
β-redexes, i.e., the steps that cause the blow-up in size.
Comment: 29 pages
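The cost model above counts leftmost-outermost steps. A minimal reducer makes this concrete; the sketch below uses naive, non-capture-avoiding substitution, which is adequate for the variable-disjoint example shown but not for arbitrary terms:

```python
# A toy leftmost-outermost beta reducer over terms encoded as tuples:
# ('var', x), ('lam', x, body), ('app', f, a).

def subst(t, x, s):
    """Naive substitution of s for x in t (no capture avoidance)."""
    tag = t[0]
    if tag == 'var':
        return s if t[1] == x else t
    if tag == 'lam':
        return t if t[1] == x else ('lam', t[1], subst(t[2], x, s))
    return ('app', subst(t[1], x, s), subst(t[2], x, s))

def step(t):
    """One leftmost-outermost step, or None if t is in normal form."""
    if t[0] == 'app':
        f, a = t[1], t[2]
        if f[0] == 'lam':                    # outermost redex first
            return subst(f[2], f[1], a)
        r = step(f)
        if r is not None:
            return ('app', r, a)
        r = step(a)
        if r is not None:
            return ('app', f, r)
    elif t[0] == 'lam':
        r = step(t[2])
        if r is not None:
            return ('lam', t[1], r)
    return None

def normalize(t):
    steps = 0
    while (r := step(t)) is not None:
        t, steps = r, steps + 1
    return t, steps

def size(t):
    if t[0] == 'var':
        return 1
    if t[0] == 'lam':
        return 1 + size(t[2])
    return size(t[1]) + size(t[2])

# dup = \x. x x duplicates its argument, so nesting it grows the
# normal form faster than the step count -- a small-scale hint at the
# size explosion problem the paper tackles with sharing.
dup = ('lam', 'x', ('app', ('var', 'x'), ('var', 'x')))
term = ('app', dup, ('app', dup, ('var', 'y')))   # dup (dup y)
nf, steps = normalize(term)
```

Here `dup (dup y)` normalizes to `(y y) (y y)` in 3 steps, with an output of size 4; nesting more copies of `dup` makes the normal form grow exponentially, which is why the paper's useful evaluation must avoid mere unsharing steps.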
Monadic Deep Learning
The Java and Scala community has built a very successful big data ecosystem.
However, most of the neural networks running on it are modeled in dynamically typed
programming languages. These dynamically typed deep learning frameworks treat
neural networks as differentiable expressions that contain many trainable
variables, and perform automatic differentiation on those expressions when
training them.
Until 2019, none of the learning frameworks in statically typed languages
provided the expressive power of traditional frameworks. Their users could not
implement custom algorithms without writing plenty of boilerplate code for
hard-coded back-propagation.
We solved this problem in DeepLearning.scala 2. Our contributions are:
1. We discovered a novel approach to perform automatic differentiation in
reverse mode for statically typed functions that contain multiple trainable
variables, and can interoperate freely with the metalanguage.
2. We designed a set of monads and monad transformers, which allow users to
create monadic expressions that represent dynamic neural networks.
3. Along with these monads, we provide some applicative functors, to perform
multiple calculations in parallel.
With these features, users of DeepLearning.scala were able to create complex
neural networks in an intuitive and concise way, and still maintain type
safety.
Comment: 27 pages, 7 figures, 3 tables
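The reverse-mode automatic differentiation at the heart of contribution 1 can be sketched in a few lines. The Python below is illustrative only (it is not the DeepLearning.scala API): each operation records its inputs and local gradients, and a backward pass propagates gradients from the output to every trainable variable:

```python
# A hedged sketch of tape-free reverse-mode automatic differentiation:
# each Var remembers (parent, local_gradient) pairs, and backward()
# pushes gradients back through the recorded graph.

class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents      # (parent, local_gradient) pairs
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def backward(out):
    """Propagate gradients from the output back to every input.
    A simple stack traversal suffices for this small example; a full
    implementation would process nodes in reverse topological order."""
    out.grad = 1.0
    stack = [out]
    while stack:
        node = stack.pop()
        for parent, local in node.parents:
            parent.grad += local * node.grad
            stack.append(parent)

x, y = Var(3.0), Var(2.0)
z = x * y + x           # dz/dx = y + 1 = 3, dz/dy = x = 3
backward(z)
```

The paper's contribution is wrapping exactly this style of differentiable computation in statically typed monadic expressions, so that the graph construction above type-checks and composes with ordinary Scala code.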